There is often a tension between safety and liveness. We use locking to ensure thread safety, but indiscriminate of locking can cause lock-ordering deadlocks. Similarly, we use thread pool and semaphores to bound resource consumption, but failure to understand the activities being bounded cause resource deadloacks.
When thread A holds lock L and tries to acquire lock M, but at the same time thread B holds M and tries to acquire L, both threads will wait forever. This situation is the simplest case of deadlock, when multiple threads wait forever due to a cyclic locking dependency (Think of the treads as the nodes of a directed graph whose edges represent the relation “Thread A is waiting for a resource held by thread B”. If this graph is cyclical, there is a deadlock)
Like many other concurrency hazards, deadlocks rarely manifest themselves immediately. The fact that a class has a potential deadlock doesn’t mean that it ever will deadlock, just that it can. When deadlocks do manifest themselves, it is often as the worst possible time - under heavy production load.
LeftRightDeadlock is at the risk for deadlock. If one thread calls
leftRight and another calls
rightLeft, and their actions are interleaved as shown below, they will deadlock.
The deadlock in
LeftRightDeadLock came about because the two threads attempted to acquire the same locks in a different order. If they asked for the locks in the same order, there would be no cyclic locking dependency and therefor no deadlock.
A program will be free of lock-ordering deadlocks if all threads acquire the locks they need in a fixed global order.
Sometimes it is not obvious that you have sufficient control over lock ordering to prevent deadlocks.
transferMoney acquires the locks on both
Account objects before executing the transfer, ensuring that the balances are updated atomically and without violating invariants such as “an account can not have a negative balance”.
Deadlock can occur if two threads call
transferMoney at the same time, one transferring X to Y, and the other doing the opposite:
Since the order of arguments is out of our control, to fix the problem we must induce an ordering on the locks and acquire them according to the induced ordering consistently throughout the application.
System.identityHashCode to generate an ordering on the input arguments to avoid deadlock. But if two objects has the same hascode, then a third “tie breaking” lock is used.
Account has a unique, immutable, comparable key such as an account number, inducing a lock ordering is even easier: order objects by their key, thus eliminating the need for the tie-breaking lock.
Deadlocks are a serious problem in real systems. A production application may perform billions of lock acquire-release cycles per day. Only one of those needs to be timed just wrong to bring the application to deadlock.
DemonstratedDeadLock deadlocks fairly on most systems:
Multiple lock acquisition is not always as obvious as in
transferMoney, the two locks need not be acquired by the same method.
While no method explicitly acquires two locks, callers of
getImage can acquire two locks just the same.
If a thread calls
setLocation in response to an update from a GPS receiver, it first updates the taxi’s location and then checks to see if it has reached its destination. If it has, it informs the dispatcher that it needs a new destination. Since both
synchronized, the thread calling
setLocation acquires the
Taxi lock and then the
Dispatcher lock. Similarly, a thread calling
getImage acquires the
Dispatcher lock and then each
Taxi lock. Two locks are acquired by two threads in different orders, risking deadlock.
Invoking an alien method with a lock held is asking for liveness trouble. The alien method might acquire other locks (risking deadloack) or block for an unexpectedly long time, stalling other threads that need the lock you hold.
Calling an alien method with a lock held is difficult to analyze and therefore risky.
Calling a method with no locks held is called an open call, and classes that rely on open calls are more well-behaved and composable than classes that make calls with locks held. Using open calls to avoid deadlock is analogous to using encapsulation to provide thread safety: while one can certainly construct a thread-safe program without any encapsulation, the thread safety analysis of a program that makes effective use of encapsulation is far easier than that of one that does not. Similarly, the liveness analysis of a program that relies exclusively on open calls is far easier than that of one that does not.
Strive to use open calls throughout your program. Programs that rely on open calls are far easier to analyze for deadlock-freedom than those that allow calls to alien methods with locks held.
synchronized block to allow open calls can sometimes have undesirable consequences, since it takes an operation that was atomic and makes it not atomic. In some cases, the loss of atomicity is a problem, and here you will have to use another technique to achieve atomicity. One such technique is to structure a concurrent object so that only one thread can execute the code path following the open call.
Just as threads can deadlock when they are each waiting for a lock that the other holds and will not release, they can also deadlock when waiting for resources.
e.g. If a task requires connections to both databases and the two resources are not always requested in the same order, thread A could be holding a connection do database D1 while waiting for a connection to database D2, and thread B could be holding a connection to D2 while waiting for a connection to D1.
A lock that never acquires more than one lock at a time cannot experience lock-ordering deadlock. Of course, this is not always practical, but if you can get away with it, it’s a lot less work. If you must acquire multiple locks, lock ordering must be a part of your design: try to minimize the number of potential locking interactions, and follow and document a lock-ordering protocol for locks that may be acquired together.
Another technique for detecting and recovering from deadlocks is to use the timed
tryLock feature of the explicit
Lock classes instead of intrinsic locking. Where intrinsic locks wait forever if they cannot acquire the lock, explicit locks let you specify a timeout after which
tryLock returns failure.
While preventing deadlocks is mostly your problem, the JVM can help identify them when they do happen using thread dumps. A thread dump includes a stack trace for each running thread, similar to the stack trace that accompanies an exception. Thread dumps also include locking information, such as which locks are held by each thread, in which stack frame they were acquired, and which lock a blocked thread is waiting to acquire.
Starvation occurs when a thread is perpetually denied access to resources it needs in order to make progress; the most commonly starved resource is CPU cycles. Starvation in Java applications can be caused by inappropriate use of thread priorities. It can slso be caused by executing nonterminating constructs (infinite loops or resource waits that do not terminate) with a lock held, since other threads that need that lock will never be able to acquire it.
Avoid the temptation to use thread priorities, since they increase platform dependence and can cause liveness problems. Most concurrent applications can use the default priority for all threads.
One step removed from starvation is poor responsiveness, which is not uncommon in GUI applications using background threads.
Poor responsiveness can also be caused by poor lock management.
Livelock is a form of liveness failure in which a thread, while not blocked, still cannot make progress because it keeps retrying an operation that will always fail.
Livelock often occurs in transactional messaging applications, where the messaging infrastructure rolls back a transaction if a message cannot be processed successfully, and puts it back at the head of the queue. If a bug in the message handler for a partucular type of message causes it to fail, every time the message is dequeued and passed to the buggy handler, the transaction is rolled back. Since the message is not wack at the head of the queue, the handler is called over and over again. (This is sometimes called the poison message problem) This form of livelock often comes from overeager error-recovery code that mistakenly treats an unrecoverable error as a recoverable one.
Livelock can also occur when multiple cooperating threads change their state in response to the others in such a way that no thread can ever make progress.
The solution of this variety of livelock is to introduce some randomness into the retry mechanism. Retrying with random waits and backoffs can be equally effective for avoiding livelock in concurrent applications.
Liveness failures are a serious problem because there is no way to recover from them short of aborting the application.
The most common form of liveness failure is lock-ordering deadlock. Avoiding lock ordering deadlock starts at design time: ensure that when threads acquire multiple locks, they do so in a consistent order. The best way to do this is by using open calls throughout your program. This greatly reduces the number of places where multiple locks are held at once, and makes it more obvious where those places are.