pthread_mutex_lock (&mutex_b); | pthread_mutex_lock (&mutex_a); |
TABLE 3.1
Both of the threads shown in Table 3.1 may complete the first step about the same time. Even on a uniprocessor, a thread might complete the first step and then be timesliced (preempted by the system), allowing the second thread to complete its first step. Once this has happened, neither of them can ever complete the second step because each thread needs a mutex that is already locked by the other thread.
Consider these two common solutions to this type of deadlock:
• Fixed locking hierarchy: All code that needs both mutex_a and mutex_b must
• Try and back off: After locking the first mutex of some set (which can be allowed to block), use pthread_mutex_trylock to lock additional mutexes in the set. If an attempt fails, release all mutexes in the set and start again.
There are any number of ways to define a fixed locking hierarchy. Sometimes there's an obvious hierarchical order to the mutexes anyway, for example, if one mutex controls a queue header and one controls an element on the queue, you'll probably have to have the queue header locked by the time you need to lock the queue element anyway.
When there's no obvious logical hierarchy, you can create an arbitrary hierarchy; for example, you could create a generic 'lock a set of mutexes' function that sorts a list of mutexes in order of their identifier address and locks them in that order. Or you could assign them names and lock them in alphabetical order, or integer sequence numbers and lock them in numerical order.
To some extent, the order doesn't really matter as long as it is always the same. On the other hand, you will rarely need to lock 'a set of mutexes' at one time. Function A will need to lock mutex 1, and then call function B, which needs to also lock mutex 2. If the code was designed with a functional locking hierarchy, you will usually find that mutex 1 and mutex 2 are being locked in the proper order, that is, mutex 1 is locked first and then mutex 2. If the code was designed with an arbitrary locking order, especially an order not directly controlled by the code, such as sorting pointers to mutexes initialized in heap structures, you may find that mutex 2 should have been locked before mutex 1.
If the code invariants permit you to unlock mutex 1 safely at this point, you would do better to avoid owning both mutexes at the same time. That is, unlock mutex 1, and then lock mutex 2. If there is a broken invariant that requires mutex 1 to be owned, then mutex 1 cannot be released until the invariant is restored. If this situation is possible, you should consider using a backoff (or 'try and back off') algorithm.
'Backoff means that you lock the first mutex normally, but any additional mutexes in the set that are required by the thread are locked conditionally by
calling pthread_mutex_trylock. If pthread_mutex_trylock returns EBUSY, indicating that the mutex is already locked, you must unlock
The backoff solution is less efficient than a fixed hierarchy. You may waste a lot of time trying and backing off. On the other hand, you don't need to define and follow strict locking hierarchy conventions, which makes backoff more flexible. You can use the two techniques in combination to minimize the cost of backing off. Follow some fixed hierarchy for well-defined areas of code, but apply a backoff algorithm where a function needs to be more flexible.
The program below, backoff.c, demonstrates how to avoid mutex deadlocks by applying a backoff algorithm. The program creates two threads, one running function lock_forward and the other running function lock_backward. The two threads loop ITERATIONS times, each iteration attempting to lock all of three mutexes in sequence. The lock_forward thread locks mutex 0, then mutex 1, then mutex 2, while lock_backward locks the three mutexes in the opposite order. Without special precautions, this design will always deadlock quickly (except on a uniprocessor system with a sufficiently long timeslice that either thread can complete before the other has a chance to run).
15 You can see the deadlock by running the program as backoff 0. The first argument is used to set the backoff variable. If backoff is 0, the two threads will use pthread_mutex_lock to lock each mutex. Because the two threads are starting from opposite ends, they will crash in the middle, and the program will hang. When backoff is nonzero (which it is unless you specify an argument), the threads use pthread_mutex_trylock, which enables the backoff algorithm. When the mutex lock fails with EBUSY, the thread will release all mutexes it currently owns, and start over.
16 It is possible that, on some systems, you may not see any mutex collisions, because one thread is always able to lock all mutexes before the other thread has a chance to lock any. You can resolve that problem by setting the yield_flag variable, which you do by running the program with a second argument, for example, backoff 1 1. When yield_flag is 0, which it is unless you specify a second argument, each thread's mutex locking loop may run uninterrupted, preventing a deadlock (at least, on a uniprocessor). When yield_flag has a value greater than 0, however, the threads will call sched_yield after locking each mutex, ensuring that the other thread has a chance to run. And if you set yield_ flag to a value less than 0, the threads will sleep for one second after locking each mutex, to be
70-75 After locking all of the three mutexes, each thread reports success, and tells how many times it had to back off before succeeding. On a multiprocessor, or when you've set yield_flag to a nonzero value, you'll usually see a lot more nonzero backoff counts. The thread unlocks all three mutexes, in the reverse order of locking, which helps to avoid unnecessary backoffs in other threads. Calling sched_yield at the end of each iteration 'mixes things up' a little so one thread doesn't always start each iteration first. The sched_yield function is described in Section 5.5.2.
¦ backoff.c
1 #include <pthread.h>
2 #include 'errors.h' 3
4 #define ITERATIONS 10 5
6 /*
7 * Initialize a static array of 3 mutexes.
8 */
9 pthread_mutex_t mutex[3] = {
10 PTHREAD_MUTEX_INITIALIZER,
11 PTHREAD_MUTEX_INITIALIZER,
12 PTHREAD_MUTEX_INITIALIZER
13 }; 14
15 int backoff = 1; /* Whether to backoff or deadlock */
16 int yield_flag = 0; /* 0: no yield, >0: yield, <0: sleep */ 17
18 /*
19 * This is a thread start routine that locks all mutexes in
20 * order, to ensure a conflict with lock reverse, which does the
21 * opposite.
22 */
23 void *lock_forward (void *arg)
24 {
25 int i, iterate, backoffs;
26 int status;
27
28 for (iterate = 0; iterate < ITERATIONS; iterate++) {
29 backoffs = 0;
30 for (i = 0; i < 3; i++) {
31 if (i == 0) {
32 status = pthread_mutex_lock (&mutex[i]);