Java Concurrency in Practice Notes: Sharing Objects

(These notes are from reading Brian Goetz’s Java Concurrency in Practice.)

Chapter 3 Sharing Objects

The last chapter was about using synchronization to prevent multiple threads from accessing the same data at the same time. This chapter examines techniques for sharing and publishing objects so they can be safely accessed by multiple threads. Together, they lay the foundation for building thread-safe classes and safely structuring concurrent applications using java.util.concurrent.

It is a common misconception that synchronized is only about atomicity or demarcating “critical sections”. Synchronization also has another significant, and subtle, aspect: memory visibility.

We want not only to prevent one thread from modifying the state of an object when another is using it, but also to ensure that when a thread modifies the state of an object, other threads can actually see the changes that were made.

Visibility

In a single-threaded environment, if you write a value to a variable and later read that variable with no intervening writes, you can expect to get the same value back. But when reads and writes occur in different threads, this is simply not the case. There is no guarantee that the reading thread will see a value written by another thread on a timely basis, or even at all.

In order to ensure visibility of memory writes across threads, you must use synchronization.

public class NoVisibility {
    private static boolean ready;
    private static int number;

    private static class ReaderThread extends Thread {
        public void run() {
            while (!ready)
                Thread.yield();
            System.out.println(number);
        }
    }

    public static void main(String[] args) {
        new ReaderThread().start();
        number = 42;
        ready = true;
    }
}

The above program could loop forever because the value of ready might never become visible to the reader thread. Even more strangely, NoVisibility could print zero because the write to ready might be made visible to the reader thread before the write to number, a phenomenon known as reordering.

In the absence of synchronization, the compiler, processor, and runtime can do some downright weird things to the order in which operations appear to execute. Attempts to reason about the order in which memory actions “must” happen in insufficiently synchronized multithreaded programs will almost certainly be incorrect.

So, always use proper synchronization whenever data is shared across threads.
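One possible repair of NoVisibility, as a minimal sketch: declaring ready volatile is enough here, because a volatile write happens-before any later read that observes it, so the earlier write to number becomes visible too. The class name VisibleFlag and the method awaitNumber are made up for this illustration and are not from the book.

```java
// Sketch only: VisibleFlag and awaitNumber are hypothetical names.
class VisibleFlag {
    // The volatile write to ready happens-before a read that sees true,
    // so the write to number made just before it is visible as well.
    private static volatile boolean ready;
    private static int number;

    static int awaitNumber() {
        ready = false;
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                Thread.yield();
            }
            seen[0] = number;
        });
        reader.start();
        number = 42;
        ready = true;
        try {
            reader.join(); // join makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(awaitNumber()); // prints 42
    }
}
```

Guarding both fields with a common lock would work as well; volatile is simply the smallest change to the original listing.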

Stale data

NoVisibility demonstrated one of the ways that insufficiently synchronized programs can cause surprising results: stale data. When the reader thread examines ready, it may see an out-of-date value. Unless synchronization is used every time a variable is accessed, it is possible to see a stale value for that variable.

Stale data can cause serious and confusing failures such as unexpected exceptions, corrupted data structures, inaccurate computations, and infinite loops.

@NotThreadSafe
public class MutableInteger {
    private int value;

    public int get() {
        return value;
    }

    public void set(int value) {
        this.value = value;
    }
}

MutableInteger is not thread-safe because the value field is accessed from both get and set without synchronization. We can make MutableInteger thread safe by synchronizing the getter and setter as shown in SynchronizedInteger below:

@ThreadSafe
public class SynchronizedInteger {
    @GuardedBy("this") private int value;

    public synchronized int get() {
        return value;
    }

    public synchronized void set(int value) {
        this.value = value;
    }
}

Note that synchronizing only the setter would not be sufficient: threads calling get would still be able to see stale values.

Nonatomic 64-bit operations

For 64-bit numeric variables (double and long) that are not declared volatile, the JVM is permitted to treat a 64-bit read or write as two separate 32-bit operations.

Thus, even if you don’t care about stale values, it is not safe to use shared mutable long and double variables in multithreaded programs unless they are declared volatile or guarded by a lock.
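As a small sketch of the two usual remedies: a volatile long gives atomic, visible reads and writes, and AtomicLong additionally makes read-modify-write updates atomic. The class Timestamps and its methods are hypothetical names chosen for this example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: Timestamps is a hypothetical class.
class Timestamps {
    // volatile makes every read and write of this long atomic and visible.
    private volatile long lastSeenMillis;
    // AtomicLong also makes compound updates such as addAndGet atomic.
    private final AtomicLong total = new AtomicLong();

    void record(long millis) {
        lastSeenMillis = millis;
        total.addAndGet(millis);
    }

    long lastSeen() {
        return lastSeenMillis;
    }

    long total() {
        return total.get();
    }
}
```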

Locking and visibility

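Intrinsic locking can be used to guarantee that one thread sees the effects of another in a predictable manner. When thread A executes a synchronized block and thread B subsequently enters a synchronized block guarded by the same lock, everything that was visible to A before it released the lock is guaranteed to be visible to B once it acquires the lock.

(Figure: Visibility guarantees for synchronization)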

Locking is not just about mutual exclusion; it is also about memory visibility. To ensure that all threads see the most up-to-date values of shared mutable variables, the reading and writing threads must synchronize on a common lock.
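A minimal sketch of "synchronize on a common lock": both the writer and the reader take the same lock, so a reader never sees a stale or half-updated pair. SharedBox and its members are illustrative names only.

```java
// Sketch only: SharedBox is a hypothetical class.
class SharedBox {
    private final Object lock = new Object();
    private int x; // guarded by lock
    private int y; // guarded by lock

    void update(int newX, int newY) {
        synchronized (lock) {
            x = newX;
            y = newY;
        }
    }

    int sum() {
        // Same lock as update, so the latest x and y are visible together.
        synchronized (lock) {
            return x + y;
        }
    }
}
```

If sum used a different lock, or none, it would get neither mutual exclusion nor the visibility guarantee.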

volatile variables

When a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it should not be reordered with other memory operations.

Use volatile variables only when they simplify implementing and verifying your synchronization policy; avoid using volatile variables when verifying correctness would require subtle reasoning about visibility. Good uses of volatile variables include ensuring the visibility of their own state, that of the object they refer to, or indicating that an important life-cycle event (such as initialization or shutdown) has occurred.

volatile boolean asleep;
...
while (!asleep) {
    countSomeSheep();
}

Volatile variables can be used for other kinds of state information, but more care is required when attempting this. For example, the semantics of volatile are not strong enough to make the increment operation count++ atomic, unless you can guarantee that the variable is written only from a single thread.

Locking can guarantee both visibility and atomicity; volatile variables can only guarantee visibility.

Use volatile variables only when all following criteria are met:

  • Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value
  • The variable does not participate in invariants with other state variables
  • Locking is not required for any other reason while the variable is being accessed
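The criteria above can be sketched with a flag and a counter side by side. The flag meets all three criteria, so volatile is enough; the counter's new value depends on its current one, so this sketch uses AtomicInteger for it instead. SheepCounter and countFromThreads are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: SheepCounter is a hypothetical class.
class SheepCounter {
    // A status flag whose writes do not depend on its current value.
    private volatile boolean asleep;
    // A counter updated by many threads needs an atomic read-modify-write.
    private final AtomicInteger sheep = new AtomicInteger();

    void countSomeSheep() { sheep.incrementAndGet(); }
    void fallAsleep()     { asleep = true; }
    boolean isAsleep()    { return asleep; }
    int counted()         { return sheep.get(); }

    // Starts the given number of workers, each counting perThread sheep.
    static int countFromThreads(int threads, int perThread) {
        SheepCounter counter = new SheepCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.countSomeSheep();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        counter.fallAsleep();
        return counter.counted();
    }
}
```

With a plain volatile int and count++ in place of the AtomicInteger, concurrent increments could be lost and the total could come out lower than expected.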

Publication and escape

Publishing an object means making it available to code outside of its current scope.

Publishing internal state variables can compromise encapsulation and make it more difficult to preserve invariants.

An object that is published when it should not have been is said to have escaped.

public static Set<Secret> knownSecrets;

public void initialize() {
    knownSecrets = new HashSet<Secret>();
}

class UnsafeStates {
    private String[] states = new String[] {
        "AK", "AL" ...
    };

    public String[] getStates() { return states; }
}

Publishing states in this way is problematic because any caller can modify its contents. In this case, the states array has escaped its intended scope, because what was supposed to be private state has been effectively made public.
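One way to keep the array from escaping, as a sketch, is to return a defensive copy instead of the internal reference. SafeStates is a hypothetical name, and the array holds only the two entries visible in the listing above.

```java
// Sketch only: SafeStates is a hypothetical counterpart to UnsafeStates.
class SafeStates {
    // Only the two entries shown in the original listing; the rest is elided there.
    private final String[] states = new String[] { "AK", "AL" };

    // Callers receive a copy, so they cannot modify the internal array.
    String[] getStates() {
        return states.clone();
    }
}
```

Copying is safe here because String is immutable; an array of mutable elements would also need its elements copied or otherwise protected.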

Another mechanism by which an object or its internal state can be published is to publish an inner class instance:

public class ThisEscape {
    public ThisEscape(EventSource source) {
        source.registerListener(
            new EventListener() {
                public void onEvent(Event e) {
                    doSomething(e);
                }
            });
    }
}

When ThisEscape publishes the EventListener, it implicitly publishes the enclosing ThisEscape instance as well, because inner class instances contain a hidden reference to the enclosing instance.

Safe construction practices

Do not allow the this reference to escape during construction. (A common mistake that lets the this reference escape during construction is to start a thread from a constructor.)

If you are tempted to register an event listener or start a thread from a constructor, you can avoid the improper construction by using a private constructor and a public factory method:

public class SafeListener {
    private final EventListener listener;

    private SafeListener() {
        listener = new EventListener() {
            public void onEvent(Event e) {
                doSomething(e);
            }
        };
    }

    public static SafeListener newInstance(EventSource source) {
        SafeListener safe = new SafeListener();
        source.registerListener(safe.listener);
        return safe;
    }
}
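The same private-constructor-plus-factory shape can be sketched for the thread case: the constructor only creates the thread, and the factory starts it once construction has finished. SafeWorker and its members are hypothetical names, not from the book.

```java
// Sketch only: SafeWorker is a hypothetical class.
class SafeWorker {
    private final Thread worker;
    private volatile boolean started;

    private SafeWorker() {
        // Creating the thread here is fine; starting it here would let the
        // new thread observe a partially constructed SafeWorker.
        worker = new Thread(() -> started = true);
    }

    static SafeWorker newInstance() {
        SafeWorker safe = new SafeWorker();
        safe.worker.start(); // the constructor has already returned
        return safe;
    }

    // Waits for the worker to finish and reports whether it ran.
    boolean awaitStarted() {
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return started;
    }
}
```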

Thread confinement

Accessing shared, mutable data requires using synchronization; one way to avoid this requirement is to not share. If data is only accessed from a single thread, no synchronization is needed. This technique, thread confinement, is one of the simplest ways to achieve thread safety.

Thread confinement is an element of your program’s design that must be enforced by its implementation. The language and core libraries provide mechanisms that can help in maintaining thread confinement (local variables and the ThreadLocal class), but even with these, it is still the programmer’s responsibility to ensure that thread-confined objects do not escape from their intended thread.

Ad-hoc thread confinement

Ad-hoc thread confinement describes when the responsibility for maintaining thread confinement falls entirely on the implementation. (e.g. GUI as a single-threaded sub-system)

It is fragile, but sometimes it can be beneficial.

Stack confinement

Stack confinement is a special case of thread confinement in which an object can only be reached through local variables. It’s simpler to maintain and less fragile than ad-hoc thread confinement.

public class Animals {
    Ark ark;
    Species species;
    Gender gender;

    public int loadTheArk(Collection<Animal> candidates) {
        SortedSet<Animal> animals;
        int numPairs = 0;
        Animal candidate = null;

        // animals confined to method, don't let them escape!
        animals = new TreeSet<Animal>(new SpeciesGenderComparator());
        animals.addAll(candidates);
        for (Animal a : animals) {
            if (candidate == null || !candidate.isPotentialMate(a))
                candidate = a;
            else {
                ark.load(new AnimalPair(candidate, a));
                ++numPairs;
                candidate = null;
            }
        }
        return numPairs;
    }
}

Maintaining stack confinement for object references requires a little more assistance from the programmer to ensure that the referent does not escape.

Using a non-thread-safe object in a within-thread context is still thread-safe. However, be careful: the design requirement that the object be confined to the executing thread, or the awareness that the confined object is not thread-safe, often exists only in the head of the developer when the code is written. If the assumption of within-thread usage is not clearly documented, future maintainers might mistakenly allow the object to escape.

ThreadLocal

A more formal means of maintaining thread confinement is ThreadLocal, which allows you to associate a per-thread value with a value-holding object.

ThreadLocal provides get and set accessor methods that maintain a separate copy of the value for each thread that uses it, so a get returns the most recent value passed to set from the currently executing thread.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class ConnectionDispenser {
    static String DB_URL = "jdbc:mysql://localhost/mydatabase";

    private ThreadLocal<Connection> connectionHolder
        = new ThreadLocal<Connection>() {
            public Connection initialValue() {
                try {
                    return DriverManager.getConnection(DB_URL);
                } catch (SQLException e) {
                    // Pass e as the cause rather than embedding it in the message
                    throw new RuntimeException("Unable to acquire Connection", e);
                }
            }
        };

    public Connection getConnection() {
        return connectionHolder.get();
    }
}

This technique can also be used when a frequently used operation requires a temporary object such as a buffer and wants to avoid reallocating the temporary object on each invocation.
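The temporary-buffer case might look like the following sketch, which keeps one StringBuilder per thread and resets it on each call. PairFormatter and join are hypothetical names; ThreadLocal.withInitial requires Java 8 or later.

```java
// Sketch only: PairFormatter is a hypothetical class.
class PairFormatter {
    // One buffer per thread, created lazily on that thread's first get().
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(() -> new StringBuilder(64));

    static String join(String left, String right) {
        StringBuilder sb = BUFFER.get();
        sb.setLength(0); // clear whatever the previous call on this thread left behind
        return sb.append(left).append('-').append(right).toString();
    }
}
```

The buffer never leaves join, so it stays confined to the calling thread; only the immutable String result is returned.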

Immutability

The other end-run around the need to synchronize is to use immutable objects.

An immutable object is one whose state can not be changed after construction. Immutable objects are inherently thread-safe; their invariants are established by the constructor, and if their state can not be changed, these invariants always hold.

An object is immutable if:

  • Its state can not be modified after construction
  • All its fields are final
  • It is properly constructed (the this reference does not escape during construction)

Immutable objects can still use mutable objects internally to manage their state:

@Immutable
public final class ThreeStooges {
    private final Set<String> stooges = new HashSet<String>();

    public ThreeStooges() {
        stooges.add("Moe");
        stooges.add("Larry");
        stooges.add("Curly");
    }

    public boolean isStooge(String name) {
        return stooges.contains(name);
    }

    public String getStoogeNames() {
        List<String> stooges = new Vector<String>();
        stooges.add("Moe");
        stooges.add("Larry");
        stooges.add("Curly");
        return stooges.toString();
    }
}

Program state stored in immutable objects can still be updated by “replacing” immutable objects with a new instance holding new state.
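A short sketch of "replacing" rather than mutating: an update method returns a new instance and leaves the original untouched. ImmutablePoint and withX are hypothetical names used only for illustration.

```java
// Sketch only: ImmutablePoint is a hypothetical class.
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // "Updates" by building a new instance; this object never changes.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }
}
```

A holder of the current point would then swap its reference to the new instance, for example through a volatile field.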

Final fields

Final fields make possible the guarantee of initialization safety that lets immutable objects be freely accessed and shared without synchronization.

Just as it is a good practice to make all fields private unless they need greater visibility, it is a good practice to make all fields final unless they need to be mutable.

Using volatile to publish immutable objects

Whenever a group of related data items must be acted on atomically, consider creating an immutable holder class for them. See below:

@Immutable
public class OneValueCache {
    private final BigInteger lastNumber;
    private final BigInteger[] lastFactors;

    public OneValueCache(BigInteger i,
                         BigInteger[] factors) {
        lastNumber = i;
        // Guard against null so that new OneValueCache(null, null) does not throw
        lastFactors = (factors == null) ? null : Arrays.copyOf(factors, factors.length);
    }

    public BigInteger[] getFactors(BigInteger i) {
        if (lastNumber == null || !lastNumber.equals(i))
            return null;
        else
            return Arrays.copyOf(lastFactors, lastFactors.length);
    }
}

Then we could use it to store the cached number and factors:

@ThreadSafe
public class VolatileCachedFactorizer extends GenericServlet implements Servlet {
    private volatile OneValueCache cache = new OneValueCache(null, null);

    public void service(ServletRequest req, ServletResponse resp) {
        BigInteger i = extractFromRequest(req);
        BigInteger[] factors = cache.getFactors(i);
        if (factors == null) {
            factors = factor(i);
            cache = new OneValueCache(i, factors);
        }
        encodeIntoResponse(resp, factors);
    }
}

The cache-related operations can not interfere with each other because OneValueCache is immutable and the cache field is accessed only once in each of the relevant code paths.

This combination of an immutable holder object for multiple state variables related by an invariant, and a volatile reference used to ensure its timely visibility, allows VolatileCachedFactorizer to be thread-safe even though it does no explicit locking.

Safe publication

Sometimes we do want to share objects across threads, and in this case we must do so safely.

The code below does not guarantee safety:

public class StuffIntoPublic {
    public Holder holder;

    public void initialize() {
        holder = new Holder(42);
    }
}

This improper publication could allow another thread to observe a partially constructed object.

Improper publication: when good objects go bad

public class Holder {
    private int n;

    public Holder(int n) {
        this.n = n;
    }

    public void assertSanity() {
        if (n != n)
            throw new AssertionError("This statement is false.");
    }
}

This might fail since other threads might see a partially constructed object, or stale values.

Immutable objects and initialization safety

The Java Memory Model offers a special guarantee of initialization safety for sharing immutable objects. As we’ve seen, the fact that an object reference becomes visible to another thread does not necessarily mean that the state of that object is visible to the consuming thread. Immutable objects, on the other hand, can be safely accessed even when synchronization is not used to publish the object reference.
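As a sketch of what this guarantee buys, here is a variant of Holder whose field is final. Any thread that obtains a reference to it sees n as set by the constructor, even if the reference was published without synchronization. FinalHolder and value are hypothetical names.

```java
// Sketch only: FinalHolder is a hypothetical variant of Holder.
final class FinalHolder {
    // final gives initialization safety: once the constructor finishes,
    // every thread that sees the object sees this value.
    private final int n;

    FinalHolder(int n) {
        this.n = n;
    }

    int value() {
        return n;
    }

    void assertSanity() {
        if (n != n) {
            throw new AssertionError("This statement is false.");
        }
    }
}
```

The guarantee covers the object's own final state; a non-final field holding the FinalHolder reference can still appear null or stale to another thread.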

Safe publication idioms

To publish an object safely, both the reference to the object and the object’s state must be made visible to other threads at the same time. A properly constructed object can be safely published by:

  • Initializing an object reference from a static initializer
  • Storing a reference to it into a volatile field or AtomicReference
  • Storing a reference to it into a final field of a properly constructed object
  • Storing a reference to it into a field that is properly guarded by a lock
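The four idioms above can be sketched side by side. Settings and SafePublication are hypothetical classes invented for this illustration; Settings deliberately has a non-final field, so its visibility depends entirely on how the reference is published.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch only: a hypothetical payload with no final fields of its own.
class Settings {
    private int size;
    Settings(int size) { this.size = size; }
    int size() { return size; }
}

// Sketch only: a hypothetical class showing each publication idiom.
class SafePublication {
    // 1. Initialized from a static initializer.
    static final Settings DEFAULT = new Settings(42);

    // 2. Stored in a volatile field or an AtomicReference.
    private volatile Settings current = DEFAULT;
    private final AtomicReference<Settings> latest = new AtomicReference<>(DEFAULT);

    // 3. Stored in a final field of a properly constructed object.
    private final Settings initial;

    // 4. Stored in a field guarded by a lock (here, the intrinsic lock on this).
    private Settings guarded;

    SafePublication(Settings initial) {
        this.initial = initial;
    }

    void publish(Settings s) {
        current = s;
        latest.set(s);
        synchronized (this) {
            guarded = s;
        }
    }

    Settings current() { return current; }
    Settings latest()  { return latest.get(); }
    Settings initial() { return initial; }
    synchronized Settings guarded() { return guarded; }
}
```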

Effectively immutable objects

Objects that are not technically immutable, but whose state will not be modified after publication, are called effectively immutable.

Safely published effectively immutable objects can be used safely by any thread without additional synchronization.

For example, Date is mutable, but if you use it as if it were immutable, you may be able to eliminate the locking that would otherwise be required when sharing a Date across threads. Suppose you want to maintain a Map storing the last login time of each user:

public Map<String, Date> lastLogin =
    Collections.synchronizedMap(new HashMap<String, Date>());

If the Date values are not modified after they are placed in the Map, then the synchronization in the synchronizedMap implementation is sufficient to publish the Date values safely, and no additional synchronization is needed when accessing them.
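A usage sketch of this idea: each Date is created, put into the synchronized map, and never modified afterwards. LoginTracker and its methods are hypothetical names; returning the time as a long is a choice made here so that the mutable Date itself is never handed out.

```java
import java.util.Collections;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Sketch only: LoginTracker is a hypothetical class.
class LoginTracker {
    private final Map<String, Date> lastLogin =
            Collections.synchronizedMap(new HashMap<>());

    void recordLogin(String user, long epochMillis) {
        // The Date is treated as effectively immutable: nothing calls
        // setTime on it after it has been placed in the map.
        lastLogin.put(user, new Date(epochMillis));
    }

    // Returns the last login time in milliseconds, or -1 if the user is unknown.
    long lastLoginMillis(String user) {
        Date when = lastLogin.get(user);
        return when == null ? -1L : when.getTime();
    }
}
```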

Mutable objects

The publication requirements of an object depend on its mutability:

  • Immutable objects can be published through any mechanism
  • Effectively immutable objects must be safely published
  • Mutable objects must be safely published, and must be either thread-safe or guarded by a lock

Sharing objects safely

The most useful policies for using and sharing objects in a concurrent program are:

  • Thread-confined: A thread-confined object is owned exclusively by and confined to one thread, and can be modified by its owning thread
  • Shared read-only: A shared read-only object can be accessed concurrently by multiple threads without additional synchronization, but cannot be modified by any thread. Shared read-only objects include immutable and effectively immutable objects
  • Shared thread-safe: A thread-safe object performs synchronization internally, so multiple threads can freely access it through its public interface without further synchronization
  • Guarded: A guarded object can be accessed only with a specific lock held. Guarded objects include those that are encapsulated within other thread-safe objects and published objects that are known to be guarded by a specific lock

(To be continued)