Uncontested Lock Performance

On this page I attempt to measure the overhead of acquiring an uncontested lock, using various locking mechanisms in .NET and Java.

Overview

The .NET Framework and the Java Runtime Environment both provide a variety of standard locking mechanisms to synchronize concurrent data access. Modern high-level locks are designed to avoid costly operating system calls when the lock is uncontested, but even so, lock acquisitions are not free in terms of runtime performance.

When a method definitely requires synchronization, this is simply a cost that must be paid, unless you wish to explore the murky depths of lock-free algorithms. But what about methods that need synchronization only in some cases? Is uncontested locking cheap enough to perform preemptively, like checking for null references? We’ll examine this question for the standard monitor and reader/writer locks in .NET and Java.

Locking Test Programs

All results shown below were obtained with a suite of small test programs. The download package LockTest.zip (8.92 KB, ZIP archive) comprises the precompiled executables and their complete source code. Please refer to the enclosed ReadMe.txt file and the various batch files for the required development tools and expected file paths.

Each test runs 100,000,000 loop iterations, and each iteration makes the default random number generator’s simplest call to obtain the next integer value. While the .NET and Java implementations are quite different, their execution times are roughly comparable. The test methods are identified as follows in the result tables:

  • Unlocked — Simplest variant, with no locking statements whatsoever
  • Monitor — Platform’s standard monitor lock: lock in C#, synchronized in Java
  • RWLS — C# only: ReaderWriterLockSlim, standard reader/writer lock since .NET 3.5
  • RWL — C# only: ReaderWriterLock, .NET’s older and slower reader/writer lock
  • RRWL — Java only: ReentrantReadWriteLock, Java’s standard reader/writer lock

All reader/writer locks are tested using both read and write access. As it turns out, however, the differences between the two modes are never as significant as the differences between platforms or, in the case of .NET, between locking mechanisms.
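The actual test programs are in the download package. The following is only a rough sketch of what the Java variants of such a test might look like, not the code that produced the results. The class name LockTimingSketch, the reduced iteration count, and the sink field are assumptions made for illustration, and the sketch does no JIT warm-up, so its timings are indicative at best.

```java
import java.util.Random;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockTimingSketch {
    // Assumed, reduced iteration count; the tests described above use 100,000,000.
    static final int ITERATIONS = 1_000_000;

    static final Random random = new Random();
    static final Object monitor = new Object();
    static final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    // Accumulates results so the JIT cannot discard the loop bodies.
    static int sink;

    // Unlocked: no locking statements at all.
    static long timeUnlocked() {
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            sink += random.nextInt();
        }
        return System.nanoTime() - start;
    }

    // Monitor: Java's synchronized statement on a private lock object.
    static long timeMonitor() {
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            synchronized (monitor) {
                sink += random.nextInt();
            }
        }
        return System.nanoTime() - start;
    }

    // RRWL-Read: acquire and release the read lock around each call.
    static long timeReadLock() {
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            rwLock.readLock().lock();
            try {
                sink += random.nextInt();
            } finally {
                rwLock.readLock().unlock();
            }
        }
        return System.nanoTime() - start;
    }

    // RRWL-Write: acquire and release the write lock around each call.
    static long timeWriteLock() {
        long start = System.nanoTime();
        for (int i = 0; i < ITERATIONS; i++) {
            rwLock.writeLock().lock();
            try {
                sink += random.nextInt();
            } finally {
                rwLock.writeLock().unlock();
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long unlocked = timeUnlocked();
        long[] locked = { timeMonitor(), timeReadLock(), timeWriteLock() };
        String[] names = { "Monitor", "RRWL-Read", "RRWL-Write" };
        System.out.printf("%-10s %6d ms%n", "Unlocked", unlocked / 1_000_000);
        for (int i = 0; i < names.length; i++) {
            // Total time, then the difference from the unlocked case.
            System.out.printf("%-10s %6d ms  %+d ms%n", names[i],
                    locked[i] / 1_000_000, (locked[i] - unlocked) / 1_000_000);
        }
    }
}
```

Only one thread ever touches the locks here, so every acquisition is uncontested, which is the case this page measures.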

Sample Test Results

The following tables show two sets of sample test results on my systems. All times are in milliseconds, showing both the total execution time and the difference from the unlocked case. The programs were tested with Microsoft Visual C# and Oracle Java, in both 32-bit and 64-bit versions.

Visual C# used full optimization (/o) with unchecked arithmetic and no debug information. Oracle currently provides both the Client and Server VM on 32-bit Windows but only the Server VM on 64-bit Windows, so these are the three Java test cases.

The first set of results was obtained on 19 March 2014. The system was Windows 8.1 (64 bit) with Visual Studio 2013 (.NET Framework 4.5.1) and Java SE 8, running on an Intel DX58SO motherboard with an Intel Core i7 920 CPU (2.67 GHz) and 6 GB RAM (DDR3-1333).

             Visual C#                       Java SE
             32 bit          64 bit          Client/32       Server/32       Server/64
Unlocked     1,200      +0   1,080      +0   1,815      +0   1,625      +0   1,175      +0
Monitor      2,660  +1,460   2,530  +1,450   3,300  +1,485   3,100  +1,475   3,025  +1,850
RWLS-Read    7,105  +5,905   7,510  +6,430       -               -               -
RWLS-Write   5,970  +4,770   7,105  +6,025       -               -               -
RWL-Read    11,205 +10,005   9,365  +8,285       -               -               -
RWL-Write   10,465  +9,265   8,230  +7,150       -               -               -
RRWL-Read        -               -           4,535  +2,720   3,265  +1,640   3,190  +2,015
RRWL-Write       -               -           4,180  +2,365   3,000  +1,375   2,815  +1,640

The second set of results was obtained on 21 September 2015. The system was Windows 10 Pro (64 bit) with Visual Studio 2015 Community (.NET Framework 4.6) and Java SE 8u60, running on a Dell XPS 15 notebook (model 9530) with an Intel Core i7 4712HQ CPU (2.30 GHz) and 16 GB RAM (DDR3-1600).

             Visual C#                       Java SE
             32 bit          64 bit          Client/32       Server/32       Server/64
Unlocked     1,358      +0   1,341      +0   2,120      +0   1,962      +0   1,576      +0
Monitor      3,277  +1,919   3,278  +1,936   4,313  +2,192   4,570  +2,608   3,547  +1,970
RWLS-Read    7,296  +5,938   7,485  +6,143       -               -               -
RWLS-Write   6,355  +4,997   6,603  +5,261       -               -               -
RWL-Read    12,177 +10,819  10,469  +9,127       -               -               -
RWL-Write   11,367 +10,009   9,382  +8,040       -               -               -
RRWL-Read        -               -           5,322  +3,201   4,077  +2,115   3,716  +1,803
RRWL-Write       -               -           5,492  +3,371   3,765  +1,803   3,390  +1,813

The relative timings were quite similar across systems, so we can summarize both tables together.
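To compare the two systems more directly, the table values can be converted into a per-iteration overhead and a slowdown factor. The small sketch below does this for the Monitor rows of two columns, using the 100,000,000 iterations stated in the test description. The class and method names are hypothetical, and the millisecond figures are copied from the tables above.

```java
class LockOverhead {
    // Loop iterations per test run, as stated in the test description.
    static final long ITERATIONS = 100_000_000L;

    /** Extra nanoseconds per iteration relative to the unlocked run. */
    static double overheadNanos(long lockedMillis, long unlockedMillis) {
        return (lockedMillis - unlockedMillis) * 1_000_000.0 / ITERATIONS;
    }

    /** Total time of the locked run as a multiple of the unlocked run. */
    static double slowdown(long lockedMillis, long unlockedMillis) {
        return (double) lockedMillis / unlockedMillis;
    }

    static void report(String label, long lockedMillis, long unlockedMillis) {
        System.out.printf("%-30s %5.1f ns/iteration, %.2fx unlocked time%n",
                label,
                overheadNanos(lockedMillis, unlockedMillis),
                slowdown(lockedMillis, unlockedMillis));
    }

    public static void main(String[] args) {
        // Monitor vs. Unlocked totals in milliseconds, taken from the tables.
        report("C# 32 bit Monitor (2014)", 2660, 1200);
        report("C# 32 bit Monitor (2015)", 3277, 1358);
        report("Java Server/32 Monitor (2014)", 3100, 1625);
        report("Java Server/32 Monitor (2015)", 4570, 1962);
    }
}
```

For the 32-bit C# monitor this works out to about 14.6 ns per lock/unlock pair on the 2014 system and about 19.2 ns on the 2015 system. In all four cases the locked loop takes roughly twice as long as the unlocked one.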

Test Conclusions

On all tested platforms, a basic monitor lock is roughly as expensive as a basic random number generator call. So an uncontested lock, while not hideously slow, can still double the execution time of a small non-trivial algorithm. Our first conclusion: locks should not be applied to short methods as a mere precaution. Only use locks if synchronization is definitely required.
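One way to act on this conclusion is to leave a short method unsynchronized and add locking only where the object is known to be shared between threads, much as the wrappers returned by Java's Collections.synchronizedList do. The following is a minimal sketch of that idea. Counter, PlainCounter, SynchronizedCounter, and OptInLockingDemo are hypothetical names introduced purely for illustration.

```java
// Hypothetical interface for a short operation that rarely needs locking.
interface Counter {
    void increment();
    long value();
}

// Unsynchronized implementation: callers pay no locking cost.
final class PlainCounter implements Counter {
    private long count;

    public void increment() { count++; }

    public long value() { return count; }
}

// Opt-in wrapper: only code that shares the counter across threads pays for the monitor.
final class SynchronizedCounter implements Counter {
    private final Counter inner;
    private final Object lock = new Object();

    SynchronizedCounter(Counter inner) { this.inner = inner; }

    public void increment() {
        synchronized (lock) { inner.increment(); }
    }

    public long value() {
        synchronized (lock) { return inner.value(); }
    }
}

class OptInLockingDemo {
    public static void main(String[] args) {
        Counter local = new PlainCounter();                            // confined to one thread
        Counter shared = new SynchronizedCounter(new PlainCounter());  // safe to share
        local.increment();
        shared.increment();
        System.out.println(local.value() + " " + shared.value());
    }
}
```

The wrapper is only safe if no other code keeps a direct reference to the wrapped PlainCounter.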

Two other noteworthy results are evident. First, as Microsoft claims, .NET’s old ReaderWriterLock is indeed much slower than the new ReaderWriterLockSlim, so you should certainly use the new variant if possible. Second, however, a loop using ReaderWriterLockSlim still takes roughly 2–3x the total execution time of one using a standard monitor lock, whereas Java’s ReentrantReadWriteLock takes less than 1.5x the time of a monitor in the Client VM, and about the same time in the Server VM.

This somewhat disappointing result yields a second conclusion: do not preemptively replace .NET monitors with reader/writer locks unless you actually have multiple concurrent readers. Java developers on the Server VM need no such warning, as uncontested reader/writer locks carry no performance penalty relative to monitors there.

Contested Locks in Java — Martin Thompson compares Lock-Based vs Lock-Free Concurrent Algorithms with multiple readers and writers, using the Java 7u25 Server VM on Linux. Interestingly, ReentrantReadWriteLock shows significantly inferior read performance in all test cases, relative to any other synchronization method. That shouldn’t be happening. I recommend you read Thompson’s article and follow his advice to test your concrete application with several locking variants.