Blocking
- A thread is blocked when it takes up no CPU time
- Managing blocked threads is an important operating system task
- Threads frequently need to wait for an event or state to change
- It’s bad to waste CPU time on threads that have no actual work to do
- Wastes energy/battery life, leaves less CPU time for actual work
- Alternative to blocking: polling
- Thread remains runnable
- “Are we there yet? Are we there yet?”
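The cost difference between polling and blocking can be sketched with a toy round-robin scheduler. This is a user-level model only, not kernel code: `sim_task`, `task_state`, and `simulate` are hypothetical names introduced for illustration, and a "slice" stands in for one turn on the CPU.

```cpp
#include <cassert>
#include <vector>

enum class task_state { runnable, blocked, done };

// Hypothetical model of a task waiting for one event.
struct sim_task {
    bool polls;                               // true: poll; false: block
    task_state state = task_state::runnable;
    unsigned slices_used = 0;                 // scheduler turns consumed
    explicit sim_task(bool polls) : polls(polls) {}
};

// Run `rounds` scheduler rounds. The awaited event happens at round
// `event_round`; at that moment every blocked task is made runnable.
inline void simulate(std::vector<sim_task>& tasks, unsigned rounds,
                     unsigned event_round) {
    for (unsigned r = 0; r < rounds; ++r) {
        bool event_happened = r >= event_round;
        if (r == event_round) {
            for (auto& t : tasks) {
                if (t.state == task_state::blocked) {
                    t.state = task_state::runnable;   // wakeup
                }
            }
        }
        for (auto& t : tasks) {
            if (t.state != task_state::runnable) {
                continue;                     // blocked or done: no CPU time
            }
            ++t.slices_used;                  // the task runs and checks
            if (event_happened) {
                t.state = task_state::done;
            } else if (!t.polls) {
                t.state = task_state::blocked; // sleep until woken
            }                                  // a poller stays runnable
        }
    }
}
```

With the event at round 50, the polling task in this model uses 51 slices (one per round up to and including the event), while the blocking task uses 2 (one to block, one after the wakeup).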
Is this process blocked?
void process_main() {
sys_sleep_for_100_seconds();
}
...
uintptr_t proc::syscall(regstate* regs) {
case SYSCALL_SLEEP_FOR_100_SECONDS: {
unsigned long endticks = ticks + 100 * HZ;
while (long(endticks - ticks) > 0) {
this->yield();
}
return 0;
}
It depends on which thread you mean!
- The user context is blocked for 100 seconds
- The CPU never runs in user mode on behalf of the process
- The kernel task context is never blocked
- The proc remains runnable throughout
- Advantages and disadvantages?
Blocking goals
- Blocking should be efficient
- Cheap to block, cheap to wake up
- Blocking should be fine-grained
- Frequently a thread wants to block until some state changes
- Examples: reading from a network connection, waiting for a child process to exit
- Want threads to remain blocked until it’s very likely the state has changed
- Coarse-grained blocking means frequent spurious wakeups: a thread
runs even though the state it’s waiting for has not changed
- Blocking should be comprehensive
- When a user context has no work to do, the corresponding kernel task should also block
- Blocking must be reliable
- No lost wakeups: any thread that should be runnable must eventually run
System calls to illustrate blocking
- Kernel maintains an unsigned integer stage that only increases
- sys_set_stage(i): Set stage to at least i
- The new stage is ≥ i and ≥ the old stage
- sys_block_until(i): Block this process until stage ≥ i
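The "set stage to at least i" rule amounts to an atomic maximum. A minimal user-level sketch, assuming a compare-exchange loop is acceptable: `stage_model` and `set_stage_at_least` are hypothetical names, not the kernel's own. A separate load followed by a store is not one atomic step, so two racing callers could lower the value; the loop below avoids that.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the kernel's stage counter.
std::atomic<uint64_t> stage_model{0};

// Raise `stage_model` to at least `i`; never lower it.
// Returns the value that held right after this call took effect.
inline uint64_t set_stage_at_least(uint64_t i) {
    uint64_t cur = stage_model.load();
    while (cur < i && !stage_model.compare_exchange_weak(cur, i)) {
        // A failed compare_exchange_weak reloads `cur`;
        // retry only while the stored value is still below `i`.
    }
    return cur < i ? i : cur;
}
```

For example, `set_stage_at_least(5)` followed by `set_stage_at_least(3)` leaves the stage at 5, matching "the new stage is ≥ i and ≥ the old stage".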
Bad blocking
// helpers
void proc::block() {
this->pstate_ = ps_blocked;
this->yield();
}
void proc::unblock() {
this->pstate_ = ps_runnable;
cpus[runq_cpu_].enqueue(this);
}
...
std::atomic<uint64_t> stage;
case SYSCALL_SET_STAGE: {
stage = max(stage.load(), regs->reg_rdi);
spinlock_guard guard(ptable_lock);
for (int i = 1; i != NPROC; ++i) {
if (ptable[i])
ptable[i]->unblock();
}
return 0;
}
case SYSCALL_BLOCK_UNTIL:
while (stage < regs->reg_rdi) {
this->block();
}
return 0;
- Sleep–wakeup race
- A race condition between the code that blocks (“sleep”) and the code
that unblocks (“wakeup”) causes a lost wakeup, and a task blocks forever
Key problem: Atomic blocking and unlocking
while (stage < regs->reg_rdi) {
********** THE CRITICAL MOMENT **********
this->pstate_ = ps_blocked;
this->yield();
}
- Lost wakeup when p->unblock() happens on another CPU at the critical moment
- Despite wakeup and state change, task marks itself blocked and yields,
blocking potentially forever
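The lost wakeup can be replayed deterministically in a single-threaded model. This is only a sketch: `model`, `pstate`, and the step functions are hypothetical names, and each function stands for one step that would run on a separate CPU in the kernel.

```cpp
#include <cassert>
#include <cstdint>

enum class pstate { runnable, blocked };

// Hypothetical model of the shared state: the stage and one sleeper.
struct model {
    uint64_t stage = 0;
    pstate sleeper_state = pstate::runnable;
};

// Sleeper, step 1: evaluate the loop condition `stage < want`.
inline bool sleeper_checks(const model& m, uint64_t want) {
    return m.stage < want;
}

// Sleeper, step 2: mark itself blocked (it would yield right after).
inline void sleeper_marks_blocked(model& m) {
    m.sleeper_state = pstate::blocked;
}

// Waker: raise the stage, then unblock the sleeper.
inline void waker_runs(model& m, uint64_t i) {
    if (m.stage < i) {
        m.stage = i;
    }
    m.sleeper_state = pstate::runnable;
}

// Replay the bad interleaving. Returns true if the wakeup was lost:
// the condition holds, yet the sleeper is left blocked.
inline bool replay_lost_wakeup() {
    model m;
    const uint64_t want = 1;
    bool will_block = sleeper_checks(m, want);  // stage is 0, so true
    waker_runs(m, want);                        // the critical moment
    if (will_block) {
        sleeper_marks_blocked(m);               // acts on a stale check
    }
    return m.sleeper_state == pstate::blocked && m.stage >= want;
}
```

In this model `replay_lost_wakeup()` returns true: the waker has finished, so nothing remains to make the sleeper runnable again.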
Synchronization invariants for blocking
- A proc may be on at most one CPU’s run queue.
- A CPU’s run queue is protected by c->runq_lock_. cpustate::enqueue() acquires that lock.
- Any context may change any task’s p->pstate_ from proc::ps_blocked to proc::ps_runnable.
- Only task p may change p->pstate_ from proc::ps_runnable to proc::ps_blocked.
- The proc::yield() function must be called with no spinlocks held.
Check twice?
while (stage < regs->reg_rdi) {
this->pstate_ = ps_blocked;
std::atomic_thread_fence(std::memory_order_seq_cst);
if (stage >= regs->reg_rdi) {
this->pstate_ = ps_runnable;
}
this->yield();
}
Idea 1: Block, then check
- Mark self as blocked before checking the wakeup condition
- Update wakeup state before searching for blocked tasks
- Locks or atomics can provide ordering
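One way to gain confidence in the block-then-check ordering is to enumerate every interleaving of the sleeper's two steps with the waker's two steps. The sketch below is a single-threaded model that assumes each step is atomic and sequentially ordered (the role locks or atomics play in the kernel). All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

enum class pstate { runnable, blocked };

struct model {
    uint64_t stage = 0;
    pstate state = pstate::runnable;   // the sleeper's state
};

// Sleeper: step 0 marks itself blocked, step 1 re-checks the condition.
inline void sleeper_step(model& m, int step, uint64_t want) {
    if (step == 0) {
        m.state = pstate::blocked;
    } else if (m.stage >= want) {
        m.state = pstate::runnable;
    }
}

// Waker: step 0 updates the wakeup state, step 1 unblocks the sleeper.
inline void waker_step(model& m, int step, uint64_t want) {
    if (step == 0) {
        if (m.stage < want) {
            m.stage = want;
        }
    } else {
        m.state = pstate::runnable;
    }
}

// Try all 6 orderings of two sleeper steps and two waker steps.
// Returns how many leave the sleeper blocked after the stage was set.
inline int block_then_check_lost_wakeups() {
    const uint64_t want = 1;
    int lost = 0;
    for (unsigned mask = 0; mask < 16; ++mask) {
        int bits = 0;
        for (int i = 0; i < 4; ++i) {
            bits += (mask >> i) & 1;
        }
        if (bits != 2) {
            continue;   // exactly two of the four slots go to the sleeper
        }
        model m;
        int s = 0, w = 0;
        for (int slot = 0; slot < 4; ++slot) {
            if ((mask >> slot) & 1) {
                sleeper_step(m, s++, want);
            } else {
                waker_step(m, w++, want);
            }
        }
        if (m.state == pstate::blocked) {
            ++lost;
        }
    }
    return lost;
}
```

In this model no ordering loses the wakeup: either the sleeper's re-check sees the new stage, or the waker's unblock runs after the sleeper marked itself blocked.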
Avoid spurious wakeups
- sys_set_stage() examines every process, blocked or not
- Wakes up every blocked process, regardless of what it’s waiting for
- Prefer \(O(W)\) work, where \(W\) is the number of kernel tasks blocked in sys_block_until()
Idea 2: Associate each condition with a list of blocked tasks
Linked list of waiters
spinlock waiters_lock;
list<proc, &proc::blocking_links_> waiters;
case SYSCALL_SET_STAGE: {
stage = max(stage.load(), regs->reg_rdi);
spinlock_guard guard(waiters_lock);
while (auto p = waiters.pop_front()) {
p->unblock();
}
return 0;
}
case SYSCALL_BLOCK_UNTIL:
assert(!this->blocking_links_.is_linked());
while (stage < regs->reg_rdi) {
spinlock_guard guard(waiters_lock);
waiters.push_back(this);
this->pstate_ = ps_blocked;
this->yield();
}
Linked list of waiters, attempt 2
case SYSCALL_BLOCK_UNTIL:
assert(!this->blocking_links_.is_linked());
while (stage < regs->reg_rdi) {
spinlock_guard guard(waiters_lock);
if (stage < regs->reg_rdi) {
waiters.push_back(this);
this->pstate_ = ps_blocked;
}
this->yield();
}
Linked list of waiters, attempt 3
case SYSCALL_BLOCK_UNTIL:
assert(!this->blocking_links_.is_linked());
while (stage < regs->reg_rdi) {
spinlock_guard guard(waiters_lock);
if (stage < regs->reg_rdi) {
waiters.push_back(this);
this->pstate_ = ps_blocked;
}
guard.unlock();
this->yield();
}
Waiting on multiple conditions
- What if a task wants to block until one of several conditions occurs?
- There’s only one blocking_links_ per proc
- A proc can be a member of at most one linked list at a time
- When are the links needed?
- Only while a kernel task is blocked
- What space is available while the kernel task is blocked?
- Local variables live in kernel task stacks
- Kernel task stacks are preserved while a task is blocked
Wait queues
- The wait queue abstraction represents these ideas
- struct waiter, struct wait_queue
struct waiter {
proc* p_;
wait_queue* wq_;
list_links links_;
};
struct wait_queue {
list<waiter, &waiter::links_> q_;
spinlock lock_;
};
Using wait_queue
wait_queue stage_waiters;
case SYSCALL_SET_STAGE: {
stage = max(stage.load(), regs->reg_rdi);
stage_waiters.wake_all();
return 0;
}
case SYSCALL_BLOCK_UNTIL: {
waiter w;
while (true) {
w.prepare(stage_waiters);
// expands to `stage_waiters.push_back(&w)`
// + `this->pstate_ = ps_blocked`
if (stage >= regs->reg_rdi) {
break;
}
w.maybe_block();
// expands to `this->yield()` + other stuff
}
w.clear();
// expands to `stage_waiters.erase(&w)`
// + `this->pstate_ = ps_runnable`
return 0;
}
block_until shorthand
case SYSCALL_BLOCK_UNTIL: {
waiter().block_until(stage_waiters, [&] () {
return stage >= regs->reg_rdi;
});
return 0;
}
Links on stacks (drawing)
- Process 2 calls sys_block_until(1)
- Process 3 calls sys_block_until(1)
- Process 4 calls sys_block_until(1)
- Process 5 calls sys_set_stage(1)
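The scenario above can be modeled at user level to show how waiters held in local variables link into one queue. This is a hedged sketch, not the kernel's implementation: the `model_` types are hypothetical, `std::list` of pointers stands in for the intrusive list, and locking and yielding are left out.

```cpp
#include <cassert>
#include <list>

enum class pstate { runnable, blocked };

struct model_proc {
    int pid;
    pstate state = pstate::runnable;
    explicit model_proc(int pid) : pid(pid) {}
};

struct model_waiter;

struct model_wait_queue {
    std::list<model_waiter*> q_;   // pointers to waiters owned elsewhere
    inline void wake_all();
};

// A waiter is meant to live in a local variable, so its storage is the
// blocked task's own stack rather than a field of the proc.
struct model_waiter {
    model_proc* p_;
    model_wait_queue* wq_ = nullptr;
    explicit model_waiter(model_proc* p) : p_(p) {}

    // Enqueue on `wq` and mark the task blocked (before any re-check).
    void prepare(model_wait_queue& wq) {
        wq_ = &wq;
        wq.q_.push_back(this);
        p_->state = pstate::blocked;
    }
    // Leave the queue if still on it, and mark the task runnable.
    void clear() {
        if (wq_) {
            wq_->q_.remove(this);
            wq_ = nullptr;
        }
        p_->state = pstate::runnable;
    }
};

// Dequeue every waiter and make its task runnable: O(W) work.
inline void model_wait_queue::wake_all() {
    while (!q_.empty()) {
        model_waiter* w = q_.front();
        q_.pop_front();
        w->wq_ = nullptr;
        w->p_->state = pstate::runnable;
    }
}
```

Processes 2, 3, and 4 each create a `model_waiter` and call `prepare` on a shared queue; when process 5 triggers `wake_all()`, the queue empties and all three tasks become runnable. Because each waiter belongs to one queue, a task could in principle use several waiters to wait on several conditions.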