This is not the current version of the class.

Lecture 21: RCU

Notes by Thomas Lively

Front matter

Problem set 4 is no longer stalled

See new test program, system call, etc.

RCU: Read-Copy-Update

Here's the paper

Remember the RW lock:

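struct rwlock {
    std::atomic<int> v_;   // v_ >= 0 means readers may enter

    void lock_read() {
        while (true) {
            int v = v_.load();
            // enter only if readers are allowed and our increment wins the race
            if (v >= 0 && v_.compare_exchange_weak(v, v + 1)) {
                return;
            }
        }
    }

    void unlock_read() {
        --v_;
    }

    ... see last notes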

Problems with RW lock:

Let's talk about consistent reads:

int i; // global

int tmp = i; // load `i`

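This works in practice (on x86) but is not allowed according to the rules!!! If another thread can write `i` at the same time, the plain load is a data race, which is undefined behavior. And the rules say "Violators may be prosecuted" (rules, page ii).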

Let's follow the rules this time:

std::atomic<int> i; // global
int tmp = i.load(std::memory_order_relaxed); // AFAPBNUB

(AFAPBNUB = As far as possible but no undefined behavior)
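To make the rule-following version concrete, here is a minimal user-space sketch with one writer thread and one reader thread sharing the atomic `i`. The helper names `writer`, `reader_max`, and `run_demo` are made up for illustration and are not from the lecture. Relaxed ordering promises no ordering with other memory, only that each load returns a value that some store actually wrote.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> i{0};   // global, shared between threads

// Writer: stores the values 1..n into `i`, one at a time.
void writer(int n) {
    for (int k = 1; k <= n; ++k) {
        i.store(k, std::memory_order_relaxed);
    }
}

// Reader: loads `i` n times and returns the largest value it observed.
int reader_max(int n) {
    int max_seen = 0;
    for (int k = 0; k < n; ++k) {
        int tmp = i.load(std::memory_order_relaxed);
        // Every loaded value is one that was really stored (0 or 1..n).
        assert(tmp >= 0 && tmp <= n);
        if (tmp > max_seen) {
            max_seen = tmp;
        }
    }
    return max_seen;
}

// Runs one writer and one reader concurrently; returns the reader's maximum.
int run_demo(int n) {
    i.store(0, std::memory_order_relaxed);
    int seen = -1;
    std::thread w(writer, n);
    std::thread r([&] { seen = reader_max(n); });
    w.join();
    r.join();
    return seen;
}
```

Calling `run_demo(1000)` returns some value between 0 and 1000 depending on how the threads interleave; the point is only that the program has no undefined behavior.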

This follows the rules, but only for smallish values. What do we do in the following situation?

struct largestruct {
    char c[10000];
};

spinlock l_lock;
std::atomic<largestruct*> l; // global

// read
largestruct *tmp = l.load();

// write
largestruct *n = allocate();
l_lock.lock();
largestruct *old = l.load();
l.store(n);
l_lock.unlock();
kfree(old);

Writers still take a lock, but at least reads are lock-free and consistent: a reader sees either the old snapshot or the new one, never a mix.

But we don't know when we can safely free the data!
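The pointer-swap pattern can be sketched in user space as follows. This is an illustration under assumptions, not the kernel code: `std::mutex` stands in for the spinlock, `new` stands in for `allocate()`, and the names `read_snapshot`, `write_snapshot`, `is_consistent`, and `retired` are hypothetical. Because we do not yet know when freeing is safe, the sketch parks old snapshots on a list instead of deleting them.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <vector>

struct largestruct {
    char c[10000];
};

std::mutex l_lock;                       // stand-in for the notes' spinlock
std::atomic<largestruct*> l{nullptr};    // currently published snapshot
std::vector<largestruct*> retired;       // old snapshots, protected by l_lock

// Read side: one atomic load, no lock. A snapshot is never modified
// after it has been published.
const largestruct* read_snapshot() {
    return l.load(std::memory_order_acquire);
}

// Write side: build a complete new copy, then publish it with one store.
void write_snapshot(char fill) {
    largestruct* n = new largestruct;
    std::memset(n->c, fill, sizeof n->c);
    std::lock_guard<std::mutex> guard(l_lock);
    largestruct* old = l.load(std::memory_order_relaxed);
    l.store(n, std::memory_order_release);
    if (old) {
        retired.push_back(old);   // not deleted: a reader may still hold it
    }
}

// Returns true if every byte of the snapshot equals its first byte.
bool is_consistent(const largestruct* s) {
    assert(s != nullptr);
    for (std::size_t k = 0; k < sizeof s->c; ++k) {
        if (s->c[k] != s->c[0]) {
            return false;
        }
    }
    return true;
}
```

A reader that obtained a pointer before a write keeps seeing the old contents in full, which is exactly why the old snapshot cannot be freed right away.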

Goal: Let's delay the free until readers are done (without a read lock). We can do this because freeing data:

Let's consider an example syscall

case SYSCALL_BAD: {
    // arguments: x, y, z
    if (x == 0) {
        largestruct *tmp = l.load();
        // calculate checksum ...
        return checksum;
    } else {
        // write large struct
        largestruct *n = ...;
        memcpy(&n->c, y, z);
        l.store(n);
    }
    ...
}

We can have multiple threads reading and writing simultaneously. We want to know when an old snapshot is no longer visible to any readers.

Once a reader gets back to user mode it is impossible for it to be looking at the old snapshot or ever get another reference to it. So we just need to make sure every CPU has called cpu::schedule at least once after the new pointer was stored before we free the old object.
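One way to track "every CPU has scheduled since the update" is a per-CPU counter. The sketch below is a user-space simulation under assumptions: `NCPU`, `note_schedule`, `retire`, `grace_period_over`, and `reclaim` are hypothetical names, `note_schedule` stands in for a hook inside cpu::schedule, and there is a single writer. It also ignores idle CPUs that never schedule, which a real kernel would have to handle.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t NCPU = 4;   // hypothetical CPU count

// One counter per CPU, bumped each time that CPU passes through its scheduler.
std::array<std::atomic<unsigned long>, NCPU> sched_count{};

// Stand-in for the point in cpu::schedule where no kernel references are held.
void note_schedule(std::size_t cpu) {
    assert(cpu < NCPU);
    sched_count[cpu].fetch_add(1, std::memory_order_acq_rel);
}

struct retired_object {
    int* ptr;                                // old snapshot awaiting free
    std::array<unsigned long, NCPU> seen;    // counters at retirement time
};

std::vector<retired_object> retired;         // single writer in this sketch

// Called by the writer after the new pointer has been stored.
void retire(int* ptr) {
    retired_object r{ptr, {}};
    for (std::size_t c = 0; c < NCPU; ++c) {
        r.seen[c] = sched_count[c].load(std::memory_order_acquire);
    }
    retired.push_back(r);
}

// True once every CPU has scheduled at least once since retirement.
bool grace_period_over(const retired_object& r) {
    for (std::size_t c = 0; c < NCPU; ++c) {
        if (sched_count[c].load(std::memory_order_acquire) == r.seen[c]) {
            return false;
        }
    }
    return true;
}

// Frees every retired object whose grace period has ended; returns the count.
std::size_t reclaim() {
    std::size_t freed = 0;
    for (std::size_t k = 0; k < retired.size();) {
        if (grace_period_over(retired[k])) {
            delete retired[k].ptr;
            retired[k] = retired.back();
            retired.pop_back();
            ++freed;
        } else {
            ++k;
        }
    }
    return freed;
}
```

With four simulated CPUs, an object retired now is freed only after all four have called `note_schedule`; if even one CPU has not, `reclaim()` leaves it alone.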

Interrupts can break the RCU guarantee. If an interrupt happens during an RCU critical section, and the interrupt causes the RCU-using task to be rescheduled, the RCU system can get confused and free something that is still referenced. So we have to disable interrupts or set a flag to disable preemption.
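The "set a flag to disable preemption" option might look like the following sketch. Everything here is an assumption made for illustration: `cpu_state`, `rcu_read_begin`, `rcu_read_end`, and `timer_interrupt` are hypothetical names, and `schedule` is a stub that only counts calls. The idea is that an interrupt arriving inside a read-side section records that a reschedule is wanted but does not perform it until the section ends.

```cpp
#include <cassert>

struct cpu_state {
    int preempt_disable_depth = 0;     // > 0 while inside a read-side section
    bool resched_pending = false;      // reschedule requested but deferred
    unsigned long schedule_calls = 0;  // how many times schedule() ran
};

// Stub scheduler: only counts calls and clears the pending request.
void schedule(cpu_state& cpu) {
    ++cpu.schedule_calls;
    cpu.resched_pending = false;
}

// Enter a read-side section: preemption is off until the matching end.
void rcu_read_begin(cpu_state& cpu) {
    ++cpu.preempt_disable_depth;
}

// Leave a read-side section; run any reschedule that was deferred meanwhile.
void rcu_read_end(cpu_state& cpu) {
    assert(cpu.preempt_disable_depth > 0);
    if (--cpu.preempt_disable_depth == 0 && cpu.resched_pending) {
        schedule(cpu);
    }
}

// Timer interrupt: reschedule now if allowed, otherwise defer.
void timer_interrupt(cpu_state& cpu) {
    if (cpu.preempt_disable_depth > 0) {
        cpu.resched_pending = true;
    } else {
        schedule(cpu);
    }
}
```

The depth counter (rather than a boolean) lets read-side sections nest; the deferred reschedule runs only when the outermost section ends.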