Throughout this book, we’ve mostly avoided the low-level details of the Java Memory Model (JMM) and instead focused on higher-level design issues such as safe publication and the specification of, and adherence to, synchronization policies. These derive their safety from the JMM, and you may find it easier to use these mechanisms effectively when you understand why they work.
This chapter pulls back the curtain to reveal the low-level requirements and guarantees of the Java Memory Model and the reasoning behind some of the higher-level design rules offered in this book.
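Suppose one thread assigns a value to aVariable, for example aVariable = 3. A memory model addresses the question “Under what conditions does a thread that reads aVariable see the value 3?” This may sound like a dumb question, but in the absence of synchronization, there are a number of reasons a thread might not immediately, or ever, see the results of an operation in another thread. Compilers may reorder instructions or keep variables in registers, processors may execute instructions out of order, and caches may delay or reorder the point at which writes reach main memory.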
In a single-threaded environment, all the tricks played on our program by the environment are hidden from us and have no effect other than to speed up execution. The Java Language Specification requires the JVM to maintain within-thread as-if-serial semantics: any such optimization is permissible as long as the program has the same result as if it were executed in program order in a strictly sequential environment.
In a multithreaded environment, the illusion of sequentiality cannot be maintained without significant performance cost. Since most of the time threads within a concurrent application are each “doing their own thing”, excessive inter-thread coordination would only slow down the application to no real benefit.
The JMM specifies the minimal guarantees the JVM must make about when writes to variables become visible to other threads. It was designed to balance the need for predictability and ease of program development with the realities of implementing high-performance JVMs on a wide range of popular processor architectures.
In a shared-memory multiprocessor architecture, each processor has its own cache that is periodically reconciled with main memory. Processor architectures provide varying degrees of cache coherence; some provide minimal guarantees that allow different processors to see different values for the same memory location at virtually any time. The operating system, compiler, and runtime (and sometimes, the program, too) must make up the difference between what the hardware provides and what thread safety requires.
Ensuring that every processor knows what every other processor is doing at all times is expensive. Most of the time this information is not needed, so processors relax their memory-coherency guarantees to improve performance. An architecture’s memory model tells programs what guarantees they can expect from the memory system, and specifies the special instructions required (called memory barriers or fences) to get the additional memory coordination guarantees required when sharing data.
In order to shield the Java developer from the differences between memory models across architectures, Java provides its own memory model, and the JVM deals with the differences between the JMM and the underlying platform’s memory model by inserting memory barriers at the appropriate places.
One convenient mental model for program execution is to imagine that there is a single order in which the operations happen in a program, regardless of what processor they execute on, and that each read of a variable will see the last write in the execution order to that variable by any processor. This happy, if unrealistic, model is called sequential consistency. Software developers often mistakenly assume sequential consistency, but no modern multiprocessor offers sequential consistency and the JMM does not either. The classic sequential computing model, the von Neumann model, is only a vague approximation of how modern multiprocessors behave.
The bottom line is that modern shared-memory multiprocessors (and compilers) can do some surprising things when data is shared across threads, unless you’ve told them not to through the use of memory barriers. Fortunately, Java programs need not specify the placement of memory barriers; they need only identify when shared state is being accessed, through the proper use of synchronization.
To make matters worse, the JMM can permit actions to appear to execute in different orders from the perspective of different threads, making reasoning about ordering in the absence of synchronization even more complicated. The various reasons why operations might be delayed or appear to execute out of order can all be grouped into the general category of reordering.
PossibleReordering illustrates how difficult it is to reason about the behavior of even the simplest concurrent programs unless they are correctly synchronized. It is fairly easy to imagine how PossibleReordering could print (1, 0), or (0, 1), or (1, 1): thread A could run to completion before B starts, B could run to completion before A starts, or their actions could be interleaved. But, strangely, PossibleReordering can also print (0, 0).
The actions in each thread have no dataflow dependence on each other, and accordingly can be executed out of order. (Even if they are executed in order, the timing by which caches are flushed to main memory can make it appear, from the perspective of B, that the assignments in A occurred in the opposite order.) The figure below shows a possible interleaving with reordering that results in printing (0, 0).
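The PossibleReordering listing itself is not reproduced in this text. A minimal sketch of such a program, assuming two threads that each write one shared variable and then read the other, might look like the following. The field names and the runOnce helper are assumptions made for illustration.

```java
// Sketch only: a program with two unsynchronized threads, in the spirit of
// the PossibleReordering example discussed above. Names are assumptions.
class PossibleReordering {
    static int x = 0, y = 0;
    static int a = 0, b = 0;

    /** Runs the two racing threads once and returns the observed pair {x, y}. */
    static int[] runOnce() {
        x = 0; y = 0; a = 0; b = 0;
        // Thread A: the write to a and the read of b have no dataflow dependence.
        Thread one = new Thread(() -> { a = 1; x = b; });
        // Thread B: the write to b and the read of a have no dataflow dependence.
        Thread other = new Thread(() -> { b = 1; y = a; });
        one.start();
        other.start();
        try {
            one.join();
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Reading x and y here is safe: join orders the threads' writes before it.
        return new int[] { x, y };
    }

    public static void main(String[] args) {
        int[] r = runOnce();
        System.out.println("(" + r[0] + ", " + r[1] + ")");
    }
}
```

Any single run will most likely print one of the intuitive results. Whether and how often the surprising result shows up depends on the hardware and the JIT compiler, so a run that never prints it proves nothing about correctness.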
Reordering at the memory level can make programs behave unexpectedly. It is prohibitively difficult to reason about ordering in the absence of synchronization; it is much easier to ensure that your program uses synchronization appropriately.
The Java Memory Model is specified in terms of actions, which include reads and writes to variables, locks and unlocks of monitors, and starting and joining with threads. The JMM defines a partial ordering called happens-before on all actions within the program. To guarantee that the thread executing action B can see the results of action A (whether or not A and B occur in different threads), there must be a happens-before relationship between A and B. In the absence of a happens-before ordering between two operations, the JVM is free to reorder them as it pleases.
A data race occurs when a variable is read by more than one thread, and written by at least one thread, but the reads and writes are not ordered by happens-before. A correctly synchronized program is one with no data races; correctly synchronized programs exhibit sequential consistency, meaning that all actions within the program appear to happen in a fixed, global order.
The rules for happens-before are:
- Program order rule: Each action in a thread happens-before every action in that thread that comes later in the program order.
- Monitor lock rule: An unlock on a monitor lock happens-before every subsequent lock on that same monitor lock.
- Volatile variable rule: A write to a volatile field happens-before every subsequent read of that same field.
- Thread start rule: A call to Thread.start on a thread happens-before every action in the started thread.
- Thread termination rule: Any action in a thread happens-before any other thread detects that the thread has terminated, either by successfully returning from Thread.join or by Thread.isAlive returning false.
- Interruption rule: A thread calling interrupt on another thread happens-before the interrupted thread detects the interrupt (either by having InterruptedException thrown, or by invoking isInterrupted or interrupted).
- Finalizer rule: The end of a constructor for an object happens-before the start of the finalizer for that object.
- Transitivity: If A happens-before B, and B happens-before C, then A happens-before C.
Even though actions are only partially ordered, synchronization actions (lock acquisition and release, and reads and writes of volatile variables) are totally ordered. This makes it sensible to describe happens-before in terms of “subsequent” lock acquisitions and reads of volatile variables.
Figure above illustrates the happens-before relation when two threads synchronize using a common lock. All the actions within thread A are ordered by the program order rule, as are the actions within thread B. Because A releases lock M and B subsequently acquires M, all the actions in A before releasing the lock are therefore ordered before the actions in B after acquiring the lock. When two threads synchronize on different locks, we can’t say anything about the ordering of actions between them—there is no happens-before relation between the actions in the two threads.
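The ordering described here can be sketched in code. The following is a minimal illustration, assuming a hypothetical class LockOrdering in which thread A writes one variable before acquiring lock M and another while holding it, and thread B reads them in the mirror-image order. All names are assumptions.

```java
// Sketch only: two threads synchronizing on a common lock M.
class LockOrdering {
    private final Object m = new Object(); // the common lock M
    private int x; // written by A while holding M
    private int y; // written by A before acquiring M

    void threadA() {
        y = 1;                 // ordered before the unlock by the program order rule
        synchronized (m) {
            x = 1;
        }                      // unlock of M
    }

    /** Thread B: returns the observed pair {x, y}. */
    int[] threadB() {
        int i;
        synchronized (m) {     // lock of M
            i = x;
        }
        int j = y;             // ordered after the lock by the program order rule
        return new int[] { i, j };
    }

    /** Retries threadB until its lock acquisition follows A's unlock. */
    static int[] demo() {
        LockOrdering o = new LockOrdering();
        new Thread(o::threadA).start();
        int[] seen;
        do {
            seen = o.threadB();
        } while (seen[0] == 0);
        return seen;
    }
}
```

Once B observes x == 1, its lock acquisition must have followed A's unlock, so the write to y is also guaranteed to be visible. If B had read y under a different lock, or under no lock at all, no such conclusion could be drawn.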
Because of the strength of the happens-before ordering, you can sometimes piggyback on the visibility properties of an existing synchronization. This entails combining the program order rule for happens-before with one of the other ordering rules (usually the monitor lock or volatile variable rule) to order accesses to a variable not otherwise guarded by a lock. This technique is very sensitive to the order in which statements occur and is therefore quite fragile; it is an advanced technique that should be reserved for squeezing the last drop of performance out of the most performance-critical classes.
The implementation of the protected AbstractQueuedSynchronizer (AQS) methods in FutureTask illustrates piggybacking. AQS maintains an integer of synchronizer state that FutureTask uses to store the task state: running, completed, or cancelled. But FutureTask also maintains additional variables, such as the result of the computation. When one thread calls set to save the result and another thread calls get to retrieve it, the two had better be ordered by happens-before. This could be done by making the reference to the result volatile, but it is possible to exploit existing synchronization to achieve the same result at lower cost.
FutureTask is carefully crafted to ensure that a successful call to tryReleaseShared always happens-before a subsequent call to tryAcquireShared; tryReleaseShared always writes to a volatile variable that is read by tryAcquireShared. The innerSet and innerGet methods are called when the result is saved or retrieved; since innerSet writes result before calling releaseShared (which calls tryReleaseShared) and innerGet reads result after calling acquireShared (which calls tryAcquireShared), the program order rule combines with the volatile variable rule to ensure that the write of result in innerSet happens-before the read of result in innerGet.
We call this technique “piggybacking” because it uses an existing happens-before ordering that was created for some other reason to ensure the visibility of object X, rather than creating a happens-before ordering specifically for publishing X.
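The same idea can be shown in a much smaller setting. The sketch below is a simplified, hypothetical class, not FutureTask's actual code: a plain result field piggybacks on a volatile flag that exists to signal completion.

```java
// Sketch only: piggybacking a plain field on an existing volatile write.
class PiggybackResult<V> {
    private V result;              // deliberately NOT volatile
    private volatile boolean done; // the existing synchronization

    void set(V v) {
        result = v;   // (1) plain write, earlier in program order
        done = true;  // (2) volatile write
    }

    /** Returns the result, or null if it has not been set yet. */
    V poll() {
        if (!done) {  // (3) volatile read
            return null;
        }
        return result; // (4) plain read, later in program order
    }
}
```

If (3) observes the write made at (2), the program order and volatile variable rules together order (1) before (4). Swapping statements (1) and (2), or (3) and (4), silently removes the guarantee, which is why the technique is fragile.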
In some cases piggybacking is perfectly reasonable, such as when a class commits to a happens-before ordering between methods as part of its specification. For example, safe publication using a BlockingQueue is a form of piggybacking. One thread putting an object on a queue and another thread subsequently retrieving it constitutes safe publication because there is guaranteed to be sufficient internal synchronization in a BlockingQueue implementation to ensure that the enqueue happens-before the dequeue.
Other happens-before orderings guaranteed by the class library include:
- Placing an item in a thread-safe collection happens-before another thread retrieves that item from the collection.
- Counting down on a CountDownLatch happens-before a thread returns from await on that latch.
- Releasing a permit to a Semaphore happens-before acquiring a permit from that same Semaphore.
- Actions taken by the task represented by a Future happens-before another thread successfully returns from Future.get.
- Submitting a Runnable or Callable to an Executor happens-before the task begins execution.
- A thread arriving at a CyclicBarrier or Exchanger happens-before the other threads are released from that same barrier or exchange point. If CyclicBarrier uses a barrier action, arriving at the barrier happens-before the barrier action, which in turn happens-before threads are released from the barrier.
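As one illustration of these library guarantees, the following sketch relies on the CountDownLatch rule to make a plain, unsynchronized write visible. The class and variable names are assumptions.

```java
import java.util.concurrent.CountDownLatch;

// Sketch only: visibility provided by CountDownLatch.
class LatchVisibility {
    static int demo() {
        CountDownLatch ready = new CountDownLatch(1);
        int[] box = new int[1];       // plain, unsynchronized storage
        Thread worker = new Thread(() -> {
            box[0] = 42;              // written before counting down
            ready.countDown();
        });
        worker.start();
        try {
            ready.await();            // returning from await is ordered after countDown
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return box[0];                // guaranteed to observe the worker's write
    }
}
```

No volatile or lock appears in the code; the ordering comes entirely from the latch.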
The safe publication techniques described in Chapter 3 derive their safety from guarantees provided by the JMM; the risks of improper publication are consequences of the absence of a happens-before ordering between publishing a shared object and accessing it from another thread.
The possibility of reordering in the absence of a happens-before relationship explains why publishing an object without adequate synchronization can allow another thread to see a partially constructed object.
Initializing a new object involves writing to variables — the new object’s fields. Similarly, publishing a reference involves writing to another variable — the reference to the new object. If you do not ensure that publishing the shared reference happens-before another thread loads that shared reference, then the write of the reference to the new object can be reordered (from the perspective of the thread consuming the object) with the writes to its fields. In that case, another thread could see an up-to-date value for the object reference but out-of-date values for some or all of that object’s state — a partially constructed object.
Unsafe publication can happen as a result of an incorrect lazy initialization, as shown above. At first glance, the only problem here seems to be the race condition described in Section 2.2.2. Under certain circumstances, such as when all instances of the Resource are identical, you might be willing to overlook this (along with the inefficiency of possibly creating the Resource more than once). Unfortunately, even if these defects are overlooked, UnsafeLazyInitialization is still not safe, because another thread could observe a reference to a partially constructed Resource.
Suppose thread A is the first to invoke getInstance. It sees that resource is null, instantiates a new Resource, and sets resource to reference it. When thread B later calls getInstance, it might see that resource already has a non-null value and just use the already constructed Resource. This might look harmless at first, but there is no happens-before ordering between the writing of resource in A and the reading of resource in B. A data race has been used to publish the object, and therefore B is not guaranteed to see the correct state of the Resource.
The Resource constructor changes the fields of the freshly allocated Resource from their default values (written by the Object constructor) to their initial values. Since neither thread used synchronization, B could possibly see A’s actions in a different order than A performed them. So even though A initialized the Resource before setting resource to reference it, B could see the write to resource as occurring before the writes to the fields of the Resource. B could thus see a partially constructed Resource that may well be in an invalid state, and whose state may unexpectedly change later.
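The listing for UnsafeLazyInitialization is not reproduced in this text. A minimal sketch consistent with the description, assuming a placeholder Resource class with a single nonfinal field, might look like this:

```java
// Placeholder for an expensive object (an assumption of this sketch).
class Resource {
    int value;                 // holds the default 0 until the constructor assigns it
    Resource() { value = 42; }
}

// Sketch only: NOT thread-safe.
class UnsafeLazyInitialization {
    private static Resource resource;

    public static Resource getInstance() {
        if (resource == null) {
            // Without a happens-before ordering, another thread may observe
            // this reference write before the constructor's field writes.
            resource = new Resource();
        }
        return resource;
    }
}
```

Called from a single thread this behaves as expected. The hazard is that a second thread may see a non-null resource whose value field still appears to be 0.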
With the exception of immutable objects, it is not safe to use an object that has been initialized by another thread unless the publication happens-before the consuming thread uses it.
The safe-publication idioms described in Chapter 3 ensure that the published object is visible to other threads because they ensure the publication happens-before the consuming thread loads a reference to the published object. If thread A places X on a BlockingQueue (and no thread subsequently modifies it) and thread B retrieves it from the queue, B is guaranteed to see X as A left it. This is because the BlockingQueue implementations have sufficient internal synchronization to ensure that the put happens-before the take. Similarly, using a shared variable guarded by a lock or a shared volatile variable ensures that reads and writes of that variable are ordered by happens-before.
This happens-before guarantee is actually a stronger promise of visibility and ordering than made by safe publication. When X is safely published from A to B, the safe publication guarantees visibility of the state of X, but not of the state of other variables A may have touched. But if A putting X on a queue happens-before B fetches X from that queue, not only does B see X in the state that A left it (assuming that X has not been subsequently modified by A or anyone else), but B sees everything A did before the handoff (again, subject to the same caveat).
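The difference between seeing X and seeing everything A did before the handoff can be sketched as follows. The Message class and the sideEffect field are hypothetical names introduced for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch only: a queue handoff orders more than the handed-off object.
class QueueHandoff {
    static final class Message { int payload; } // mutable, unsynchronized X
    private static int sideEffect;              // a variable other than X

    /** Returns {payload seen by B, sideEffect seen by B}. */
    static int[] demo() {
        BlockingQueue<Message> queue = new ArrayBlockingQueue<>(1);
        Thread a = new Thread(() -> {
            Message x = new Message();
            x.payload = 7;
            sideEffect = 99;    // unrelated write made before the handoff
            try {
                queue.put(x);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        a.start();
        try {
            Message x = queue.take(); // the put is ordered before this take
            return new int[] { x.payload, sideEffect };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Thread B is guaranteed to see both the payload and the unrelated sideEffect write, provided neither is modified after the put.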
Why did we focus so heavily on @GuardedBy and safe publication, when the JMM already provides us with the more powerful happens-before? Thinking in terms of handing off object ownership and publication fits better into most program designs than thinking in terms of visibility of individual memory writes. The happens-before ordering operates at the level of individual memory accesses; it is a sort of “concurrency assembly language”. Safe publication operates at a level closer to that of your program’s design.
It sometimes makes sense to defer initialization of objects that are expensive to initialize until they are actually needed, but we have seen how the misuse of lazy initialization can lead to trouble.
UnsafeLazyInitialization can be fixed by making the getInstance method synchronized, as shown above. Because the code path through getInstance is fairly short (a test and a predicted branch), if getInstance is not called frequently by many threads, there is little enough contention for the SafeLazyInitialization lock that this approach offers adequate performance.
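A minimal sketch of the synchronized variant, again assuming a placeholder Resource class, might look like this:

```java
// Placeholder for an expensive object (an assumption of this sketch).
class Resource {
    int value;
    Resource() { value = 42; }
}

// Sketch only: thread-safe lazy initialization using the class lock.
class SafeLazyInitialization {
    private static Resource resource;

    // A static synchronized method locks on SafeLazyInitialization.class,
    // so every write of resource is ordered before every later read.
    public static synchronized Resource getInstance() {
        if (resource == null) {
            resource = new Resource();
        }
        return resource;
    }
}
```

Every caller pays for a lock acquisition, including callers that arrive long after initialization has completed.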
The treatment of static fields with initializers (or fields whose value is initialized in a static initialization block) is somewhat special and offers additional thread-safety guarantees. Static initializers are run by the JVM at class initialization time, after class loading but before the class is used by any thread.
Because the JVM acquires a lock during initialization and this lock is acquired by each thread at least once to ensure that the class has been loaded, memory writes made during static initialization are automatically visible to all threads. Thus statically initialized objects require no explicit synchronization either during construction or when being referenced. However, this applies only to the as-constructed state—if the object is mutable, synchronization is still required by both readers and writers to make subsequent modifications visible and to avoid data corruption.
Using eager initialization, shown above, eliminates the synchronization cost incurred on each call to getInstance in SafeLazyInitialization. This technique can be combined with the JVM’s lazy class loading to create a lazy initialization technique that does not require synchronization on the common code path. The lazy initialization holder class idiom uses a class whose only purpose is to initialize the Resource. The JVM defers initializing the ResourceHolder class until it is actually used, and because the Resource is initialized with a static initializer, no additional synchronization is needed. The first call to getResource by any thread causes ResourceHolder to be loaded and initialized, at which time the initialization of the Resource happens through the static initializer.
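Both techniques can be sketched briefly. The code below assumes a placeholder Resource class that counts its instances so that the laziness is observable. The created counter and the name ResourceFactory are assumptions of this sketch.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Placeholder for an expensive object; counts how many have been created.
class Resource {
    static final AtomicInteger created = new AtomicInteger();
    Resource() { created.incrementAndGet(); }
}

// Eager initialization: the Resource is created when this class is initialized.
class EagerInitialization {
    private static final Resource resource = new Resource();
    public static Resource getResource() { return resource; }
}

// Lazy initialization holder class idiom.
class ResourceFactory {
    // ResourceHolder is not initialized until getResource first touches it.
    private static class ResourceHolder {
        static final Resource resource = new Resource();
    }

    public static Resource getResource() {
        return ResourceHolder.resource; // no explicit synchronization needed
    }
}
```

In a program that uses only ResourceFactory, Resource.created stays at 0 until the first call to getResource and is 1 afterward, however many threads make that first call concurrently.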
No book on concurrency would be complete without a discussion of the infamous double-checked locking (DCL) antipattern, shown above. In very early JVMs, synchronization, even uncontended synchronization, had a significant performance cost. As a result, many clever (or at least clever-looking) tricks were invented to reduce the impact of synchronization—some good, some bad, and some ugly. DCL falls into the “ugly” category.
The real problem with DCL is the assumption that the worst thing that can happen when reading a shared object reference without synchronization is to erroneously see a stale value (in this case, null); in that case the DCL idiom compensates for this risk by trying again with the lock held. But the worst case is actually considerably worse: it is possible to see a current value of the reference but stale values for the object’s state, meaning that the object could be seen to be in an invalid or incorrect state.
Subsequent changes in the JMM (Java 5.0 and later) have enabled DCL to work if resource is made volatile, and the performance impact of this is small since volatile reads are usually only slightly more expensive than nonvolatile reads. However, this is an idiom whose utility has largely passed: the forces that motivated it (slow uncontended synchronization, slow JVM startup) are no longer in play, making it less effective as an optimization. The lazy initialization holder idiom offers the same benefits and is easier to understand.
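For reference, a sketch of the repaired form of DCL, with resource declared volatile, might look like the following. It assumes a placeholder Resource class. The local variable r is an addition of this sketch that avoids a second volatile read on the fast path; it is not required for correctness.

```java
// Placeholder for an expensive object (an assumption of this sketch).
class Resource {
    int value;
    Resource() { value = 42; }
}

// Sketch only: double-checked locking made correct with volatile (Java 5.0+).
class DoubleCheckedLocking {
    // Without volatile, a reader could see a non-null reference to a
    // partially constructed Resource.
    private static volatile Resource resource;

    public static Resource getInstance() {
        Resource r = resource;                 // first check, no lock held
        if (r == null) {
            synchronized (DoubleCheckedLocking.class) {
                r = resource;                  // second check, lock held
                if (r == null) {
                    r = new Resource();
                    resource = r;              // volatile write publishes safely
                }
            }
        }
        return r;
    }
}
```

Removing the volatile modifier turns this back into the broken idiom, even though the code would still appear to work in most test runs.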
The guarantee of initialization safety allows properly constructed immutable objects to be safely shared across threads without synchronization, regardless of how they are published, even if published using a data race. (This means that UnsafeLazyInitialization is actually safe if Resource is immutable.)
Without initialization safety, supposedly immutable objects like String can appear to change their value if synchronization is not used by both the publishing and consuming threads. The security architecture relies on the immutability of String; the lack of initialization safety could create security vulnerabilities that allow malicious code to bypass security checks.
Initialization safety guarantees that for properly constructed objects, all threads will see the correct values of final fields that were set by the constructor, regardless of how the object is published. Further, any variables that can be reached through a final field of a properly constructed object (such as the elements of a final array or the contents of a HashMap referenced by a final field) are also guaranteed to be visible to other threads.
For objects with final fields, initialization safety prohibits reordering any part of construction with the initial load of a reference to that object. All writes to final fields made by the constructor, as well as to any variables reachable through those fields, become “frozen” when the constructor completes, and any thread that obtains a reference to that object is guaranteed to see a value that is at least as up to date as the frozen value. Writes that initialize variables reachable through final fields are not reordered with operations following the post-construction freeze.
Initialization safety means that SafeStates could be safely published even through unsafe lazy initialization or by stashing a reference to a SafeStates in a public static field with no synchronization, even though it uses no synchronization and relies on a non-thread-safe collection.
However, a number of small changes to SafeStates would take away its thread safety. If states were not final, or if any method other than the constructor modified its contents, initialization safety would not be strong enough to safely access SafeStates without synchronization. If SafeStates had other nonfinal fields, other threads might still see incorrect values of those fields. And allowing the object to escape during construction invalidates the initialization-safety guarantee.
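The SafeStates listing is not reproduced in this text. A minimal sketch consistent with the description, assuming a final states field that refers to a HashMap populated only in the constructor, might look like this. The map entries are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: relies on initialization safety for final fields.
class SafeStates {
    // final: the map and its contents are frozen when the constructor completes.
    private final Map<String, String> states;

    public SafeStates() {
        states = new HashMap<String, String>();
        states.put("alaska", "AK");
        states.put("alabama", "AL");
        states.put("wyoming", "WY");
        // "this" must not escape here, and nothing may modify the map later.
    }

    public String getAbbreviation(String s) {
        return states.get(s); // read-only access, safe without synchronization
    }
}
```

Adding a method that calls states.put after construction, or dropping the final modifier, would remove the guarantee that the class depends on.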
Initialization safety makes visibility guarantees only for the values that are reachable through final fields as of the time the constructor finishes. For values reachable through nonfinal fields, or values that may change after construction, you must use synchronization to ensure visibility.
The Java Memory Model specifies when the actions of one thread on memory are guaranteed to be visible to another. The specifics involve ensuring that operations are ordered by a partial ordering called happens-before, which is specified at the level of individual memory and synchronization operations.
In the absence of sufficient synchronization, some very strange things can happen when threads access shared data. However, the higher-level rules offered in Chapters 2 and 3, such as @GuardedBy and safe publication, can be used to ensure thread safety without resorting to the low-level details of happens-before.