Deadlock

Consider the following threaded code:

Figure 1

#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

int i;

void
increment(void) {
    i++;
}

void*
thread_1(void *arg) {
    int j;
    for (j = 0; j < 100; ++j)
        increment();

    pthread_exit(NULL);
    return NULL;
}

void*
thread_2(void *arg) {
    int j;
    for (j = 0; j < 100; ++j)
        increment();

    pthread_exit(NULL);
    return NULL;
}

int
main() {
    pthread_t t1, t2;

    i = 0;

    pthread_create(&t1, NULL, thread_1, NULL);
    pthread_create(&t2, NULL, thread_2, NULL);

    // wait for both threads; otherwise main may return (ending the
    // process) before either thread has finished running
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    return 0;
}

This code is unsafe: it contains a race condition in which the two threads may try to update i at the same time (i++ is not an atomic operation), which leads to unpredictable results. As discussed in previous lectures, one way to fix a race condition is to use locking. However, locking can lead to the dangerous situation of deadlock, in which two or more threads each attempt to acquire a lock while holding another lock that some other thread needs. We will later broaden this definition to show how deadlock is possible in other circumstances.
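
For reference, the straightforward fix needs only one mutex around the shared update. Below is a minimal sketch of a drop-in replacement for increment() from Figure 1; the mutex name i_lock is introduced here and is not part of the figure:

pthread_mutex_t i_lock = PTHREAD_MUTEX_INITIALIZER;

void
increment(void) {
    // only one thread at a time may update the shared counter
    pthread_mutex_lock(&i_lock);
    i++;
    pthread_mutex_unlock(&i_lock);
}

A single lock cannot deadlock with itself in this program; the trouble starts once threads must hold more than one lock at a time.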

We can create that situation in the code above by adding two locks:

Figure 2

#include <pthread.h>
...

pthread_mutex_t lock_1, lock_2;
...

void*
thread_1(void *arg) {
    int j;
    for (j = 0; j < 100; ++j) {
        pthread_mutex_lock(&lock_1);
        pthread_mutex_lock(&lock_2);
        increment();
        pthread_mutex_unlock(&lock_2);
        pthread_mutex_unlock(&lock_1);
    }

    pthread_exit(NULL);
    return NULL;
}

void*
thread_2(void *arg) {
    int j;
    for (j = 0; j < 100; ++j) {
        pthread_mutex_lock(&lock_2);
        pthread_mutex_lock(&lock_1);
        increment();
        pthread_mutex_unlock(&lock_1);
        pthread_mutex_unlock(&lock_2);
    }

    pthread_exit(NULL);
    return NULL;
}

int
main(int argc, char *argv[]) {
    // PTHREAD_MUTEX_INITIALIZER can only be used when a mutex is defined,
    // so initialize the locks at run time instead
    pthread_mutex_init(&lock_1, NULL);
    pthread_mutex_init(&lock_2, NULL);

    ...
}

While this example is somewhat contrived, we see that thread_1 can acquire lock_1 and then, after a context switch, thread_2 can acquire lock_2. Each thread is now waiting on a lock held by the other. Since neither thread can acquire the lock it still needs, no forward progress can be made and the program is deadlocked.
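
The bad interleaving depends on scheduler timing, so the hang may not show up on every run. One way to force it reliably, purely for demonstration (this tweak is not part of Figure 2), is to pause between the two lock calls so that each thread takes its first lock before either takes its second; sleep() is declared in unistd.h, which Figure 1 already includes:

        // thread_1's loop body with a deliberate pause
        pthread_mutex_lock(&lock_1);
        sleep(1);                     // give thread_2 time to acquire lock_2
        pthread_mutex_lock(&lock_2);  // blocks forever once thread_2 holds lock_2
        increment();
        pthread_mutex_unlock(&lock_2);
        pthread_mutex_unlock(&lock_1);

With the matching sleep added to thread_2, both threads block inside their second pthread_mutex_lock call on the very first loop iteration.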

The following four conditions are necessary for any system to be in deadlock:

  1. Mutual Exclusion
  2. Hold and Wait
  3. Circular Waiting
  4. No Preemption
Each of these conditions must hold simultaneously for deadlock to occur.

For the deadlock-prone code above, we can identify each condition:

  1. Mutual Exclusion: each mutex can be held by only one thread at a time.
  2. Hold and Wait: thread_1 holds lock_1 while waiting for lock_2, and thread_2 does the reverse.
  3. Circular Waiting: thread_1 waits for a lock held by thread_2, which in turn waits for a lock held by thread_1.
  4. No Preemption: a lock cannot be taken away from the thread holding it; it must be released voluntarily.

The above conditions are by no means exclusive to threads. Processes may encounter similar circumstances when waiting for shared resources such as printers or even when using file locks.
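
For instance, two processes that take exclusive locks on the same pair of files in opposite orders can deadlock in exactly the same way as the threads above. Here is a minimal sketch of one such process using the BSD-style flock() call (the file names are placeholders and error checking is omitted):

#include <sys/file.h>
#include <fcntl.h>
#include <unistd.h>

int
main() {
    int fd_a = open("file_a", O_RDWR | O_CREAT, 0644);
    int fd_b = open("file_b", O_RDWR | O_CREAT, 0644);

    // This process locks file_a then file_b.  If a second process locks
    // file_b then file_a, each can block waiting for the file the other
    // already holds, the same circular wait as before.
    flock(fd_a, LOCK_EX);
    flock(fd_b, LOCK_EX);

    // ... use both files ...

    flock(fd_b, LOCK_UN);
    flock(fd_a, LOCK_UN);
    close(fd_b);
    close(fd_a);
    return 0;
}

(POSIX fcntl() record locks behave similarly, although some implementations detect this particular cycle and fail one of the requests with EDEADLK.)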

Returning to the threaded example, thread_1 and thread_2 become deadlocked because of the order in which they acquire the locks. A better resource ordering would let each thread lock the mutexes in an order that does not prevent the other thread from making further progress (a thread may temporarily block waiting for a lock to be released, but at least one thread can always move forward). The ordering in this case is quite simple: acquire the locks in the same order in each thread.

Figure 3

        // both threads acquire and release locks in this order
        pthread_mutex_lock(&lock_1);
        pthread_mutex_lock(&lock_2);
        increment();
        pthread_mutex_unlock(&lock_2);
        pthread_mutex_unlock(&lock_1);

For example, if thread_1 runs first and obtains lock_1, there is no way for it to block forever waiting on lock_2: for thread_2 to hold lock_2 it would also have to hold lock_1, which thread_1 already owns. While this is a conveniently small example, since there are only two threads acquiring two locks, Nutt describes on pages 335-337 the Dining Philosophers problem, in which multiple philosophers (read: threads) each attempt to acquire their left fork and right fork (read: two different locks).

With more threads we have to consider the resource ordering more carefully. For example, suppose each philosopher has the functions lock_left_fork and lock_right_fork, which lock the corresponding forks. If every philosopher calls lock_right_fork first, there is a possibility that none of them can proceed: if all of the lock_right_fork calls execute, every fork is locked and no philosopher can then lock their left fork. Nutt presents an alternate solution on page 388 in which one philosopher always calls lock_left_fork first. With this new resource ordering, at least one philosopher can always acquire both forks. I leave it as an exercise to the reader to work through the cases for the different fork-locking orderings with the left-first philosopher.
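
A minimal pthreads sketch of that asymmetric ordering follows; the fork numbering, the table size NUM_PHIL, and the function names are choices made for this sketch rather than Nutt's code:

#include <pthread.h>

#define NUM_PHIL 5

pthread_mutex_t fork_lock[NUM_PHIL];   // fork i sits to the left of philosopher i

void
init_forks(void) {
    int p;
    for (p = 0; p < NUM_PHIL; ++p)
        pthread_mutex_init(&fork_lock[p], NULL);
}

void
pick_up_forks(int id) {
    int left  = id;
    int right = (id + 1) % NUM_PHIL;

    if (id == 0) {
        // the single left-first philosopher
        pthread_mutex_lock(&fork_lock[left]);
        pthread_mutex_lock(&fork_lock[right]);
    } else {
        // everyone else grabs the right fork first
        pthread_mutex_lock(&fork_lock[right]);
        pthread_mutex_lock(&fork_lock[left]);
    }
}

void
put_down_forks(int id) {
    pthread_mutex_unlock(&fork_lock[id]);
    pthread_mutex_unlock(&fork_lock[(id + 1) % NUM_PHIL]);
}

Nothing else about the philosophers changes; only the acquisition order used by one of them does, and that is enough to break the all-right-first cycle.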

Another possibility is to let threads (or processes) test whether a lock is held rather than blocking on it. By checking the lock's status, a thread can change its behavior when the resource is already locked. We can visualize this with some pseudo-code for the philosophers' dinner:

Figure 4

    // pretend there is some philosopher code above this

    lock_right_fork();
    if (lock_left_fork() < 0) {   // the lock failed
        release_right_fork();
    }

    // some code here to loop again if both locks aren't acquired,
    // or do some eating if you have both forks

By releasing the fork in hand when we cannot acquire both, no predetermined resource ordering is needed. The left-first philosopher is still welcome at the table, but even without them dinner can continue.

We can apply this more concretely to our toy example from Figure 2, where the two threads acquired the locks in opposite orders.

Figure 5

        // thread_1's locking code (EBUSY comes from <errno.h>)

        int err;

        while (1) {
            // block until we hold lock_1
            pthread_mutex_lock(&lock_1);

            // try for lock_2 without blocking; pthread_mutex_trylock
            // returns 0 on success or an error number such as EBUSY
            // (pthread calls do not set errno)
            err = pthread_mutex_trylock(&lock_2);
            if (err == 0) {
                // we hold both locks
                break;
            }

            // someone else has lock_2: give up lock_1 so the other
            // thread can make progress, then loop and try again
            // (exercise to the reader: back off instead of spinning)
            pthread_mutex_unlock(&lock_1);

            if (err != EBUSY) {
                // the trylock call blew up for some other reason
                exit(err);
            }
        }
        increment();
        pthread_mutex_unlock(&lock_2);
        pthread_mutex_unlock(&lock_1);

The code for thread_2 would attempt to acquire the locks in the reverse order in the same way. The difference between this code and that in Figure 2 is that if thread_1 cannot get lock_2, it releases lock_1, which potentially allows thread_2 to make progress. With this approach we no longer need to identify, before run time, every ordering that might occur and then statically alter one or more threads to acquire the resources differently in order to avoid deadlock. Instead, a thread that cannot obtain all of the locks it needs voluntarily gives up the ones it already holds (before doing any work that requires them) so that other threads may try to acquire them. This effectively removes the No Preemption condition from deadlock.
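
For completeness, here is a sketch of what thread_2's version of that loop might look like, simply mirroring the order of the two locks (same assumptions as above, including <errno.h> for EBUSY):

        // thread_2's locking code: take lock_2 first, then try for lock_1

        int err;

        while (1) {
            pthread_mutex_lock(&lock_2);

            err = pthread_mutex_trylock(&lock_1);
            if (err == 0)
                break;                        // we hold both locks

            pthread_mutex_unlock(&lock_2);    // back off so thread_1 can run

            if (err != EBUSY)
                exit(err);                    // unexpected failure
        }
        increment();
        pthread_mutex_unlock(&lock_1);
        pthread_mutex_unlock(&lock_2);

In theory both threads can repeatedly back off at the same moment and retry (livelock), but unlike Figure 2 neither thread can block forever while holding a lock the other one needs.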


Further Reading

Wikipedia: Deadlock
Wikipedia: Thread
Linux pthreads Tutorial