A synchronization bug
What potential bug was addressed by commit d12e98cdb959bb9cdb85fc8e1b0878733026388e? Describe a possible execution of the old code that could violate some kernel invariant or otherwise cause a problem.
The old code violates a Chickadee invariant: the currently running process’s yields_ and regs_ can be set only when interrupts are disabled, and interrupts must remain disabled until the next proc::yield_noreturn(). Violations of this invariant can cause lost wakeups and crashes.
A correct execution with nested resumption states
When a Chickadee context switch or exception occurs, Chickadee saves resumption state on the relevant kernel task stack. Because kernel tasks are suspendable, this resumption state can be nested. For example, this can happen:
1. A process makes a system call. The architecture disables interrupts, twiddles some registers, and jumps to syscall_entry.
2. syscall_entry saves regstate resumption state on the kernel task stack and calls proc::syscall.
3. The system call implementation enables interrupts and later calls proc::yield.
4. proc::yield saves yieldstate resumption state on the stack.
5. Before it finishes, an interrupt occurs. The architecture disables interrupts, pushes a partial regstate onto the CPU stack, and jumps to exception_entry.
6. exception_entry moves the regstate to the kernel task stack (above the yieldstate) and completes it, then calls proc::exception.
7. proc::exception calls proc::yield.
8. proc::yield saves yieldstate resumption state on the kernel task stack.
9. proc::yield stores a pointer to that yieldstate in the proc::yields_ member, disables interrupts, and switches to the cpustate stack.
Only in step 9 does the kernel change to another stack (first the CPU stack, and then, potentially, another kernel task stack). That’s why step 9 stores a pointer to the resumption state in a location independent of stack depth (proc::yields_). Until step 9, %rsp and local variables suffice to tell the kernel where to resume.
Later, when the kernel resumes the yielded process, steps 5–8 will be undone.
10. (undoes 8) proc::resume loads %rsp with the value stored in proc::yields_, erases proc::yields_, pops callee-saved registers from the on-stack yieldstate, and executes the retq instruction.
11. (undoes 7) That returns to proc::exception. Assume proc::exception then returns.
12. (undoes 6) The second half of exception_entry reloads registers from the on-stack regstate and…
13. (undoes 5) executes iretq.
At this point, the proc::yield execution that was interrupted in step 5 resumes. The process is going to sleep again! This shouldn’t be a problem, and it isn’t.
14. proc::yield stores a pointer to the yieldstate in the proc::yields_ member, disables interrupts, and switches to the cpustate stack.
Later, when the kernel resumes the re-yielded process again, steps 1–4 are undone.
15. (undoes 14 and 4) proc::resume loads %rsp with the value stored in proc::yields_, pops callee-saved registers, and executes retq.
16. (undoes 3) That returns to proc::syscall. Assume proc::syscall then returns.
17. (undoes 2) The second half of syscall_entry skips over the on-stack regstate and…
18. (undoes 1) executes iretq.
And the process resumes.
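The nesting described above can be replayed with a small model. This is only a sketch under stated assumptions, not Chickadee code: toy_task, save, publish_yield, resume, pop_regstate, and toy_replay are hypothetical names, frames are plain labels, and yields_ is modeled as a stack depth rather than a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: the kernel task stack is a vector of frame labels.
struct toy_task {
    std::vector<std::string> stack;  // back() is the most recently saved frame
    int yields_ = -1;                // -1 models a null proc::yields_

    // Save a regstate or yieldstate on the kernel task stack.
    void save(const std::string& frame) { stack.push_back(frame); }

    // Last part of proc::yield: record where the top yieldstate lives.
    void publish_yield() {
        assert(!stack.empty() && stack.back() == "yieldstate");
        yields_ = static_cast<int>(stack.size()) - 1;
    }

    // proc::resume: locate the yieldstate through yields_, erase yields_, pop it.
    void resume() {
        assert(yields_ == static_cast<int>(stack.size()) - 1);
        yields_ = -1;
        stack.pop_back();
    }

    // Second half of exception_entry or syscall_entry: discard the regstate.
    void pop_regstate() {
        assert(!stack.empty() && stack.back() == "regstate");
        stack.pop_back();
    }
};

// Replays the execution described above and returns the final stack depth.
std::size_t toy_replay() {
    toy_task t;
    t.save("regstate");    // syscall_entry saves a regstate
    t.save("yieldstate");  // first proc::yield saves a yieldstate
    t.save("regstate");    // exception_entry saves a regstate
    t.save("yieldstate");  // nested proc::yield saves a yieldstate
    t.publish_yield();     // nested yield stores yields_ and leaves the stack
    t.resume();            // proc::resume returns into proc::exception
    t.pop_regstate();      // exception_entry reloads registers, then iretq
    t.publish_yield();     // first proc::yield now stores yields_
    t.resume();            // proc::resume returns into proc::syscall
    t.pop_regstate();      // syscall_entry skips the regstate, then iretq
    return t.stack.size();
}
```

In this model toy_replay() ends with an empty stack, which matches the claim that every saved resumption state is undone exactly once.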
A problematic execution
This is the old yield code:
    // store yieldstate pointer
    movq %rsp, 16(%rdi)
    // disable interrupts, switch to cpustack
    cli
    movq %rdi, %rsi
    movq %gs:(0), %rdi
    leaq CPUSTACK_SIZE(%rdi), %rsp
    // call scheduler
    jmp _ZN8cpustate8scheduleEP4proc
The problem triggers when an interrupt occurs immediately before cli, after yields_ has already been stored. exception_entry will store a regstate, and proc::exception will execute with yields_ set to a non-null value. That’s already weird, but things really go wrong if proc::exception then calls proc::yield. The second proc::yield call overwrites the stored yields_, and the overwritten yields_ is never recovered. Eventually there will be no place for the process to resume!
The fix
In the revised, correct implementation, yields_ is set after interrupts are disabled. As a result, no exception will overwrite yields_ unexpectedly, and the process always resumes at the correct place.
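The difference between the two orderings can be checked with a small simulation. The following C++ is a toy model under stated assumptions, not Chickadee code: toy_proc, toy_nested_yield, and toy_outer_yield are hypothetical names, a yieldstate is modeled as an opaque pointer, and the interrupt is injected just before the modeled cli, which is the only point where the old code is vulnerable.

```cpp
#include <cassert>

// Toy model of the two proc::yield orderings; all names are hypothetical.
struct toy_proc {
    const void* yields_ = nullptr;   // models proc::yields_
    bool interrupts_enabled = true;  // models the CPU interrupt flag
};

// Models an interrupt whose handler calls proc::yield, followed by the later
// proc::resume of that nested yield and the return from the exception.
void toy_nested_yield(toy_proc& p) {
    bool saved_flag = p.interrupts_enabled;
    int inner_yieldstate = 0;
    p.interrupts_enabled = false;       // the architecture disables interrupts
    p.yields_ = &inner_yieldstate;      // nested yield records its own yieldstate
    // (the scheduler would run other work here)
    p.yields_ = nullptr;                // proc::resume erases yields_
    p.interrupts_enabled = saved_flag;  // iretq restores the interrupt flag
}

// Returns the yields_ value the scheduler would see after the outer yield.
const void* toy_outer_yield(toy_proc& p, bool store_before_cli) {
    static const int outer_yieldstate = 0;
    if (store_before_cli) {
        p.yields_ = &outer_yieldstate;                  // old order: store first
        if (p.interrupts_enabled) toy_nested_yield(p);  // interrupt before cli
        p.interrupts_enabled = false;                   // cli
    } else {
        if (p.interrupts_enabled) toy_nested_yield(p);  // interrupt before cli
        p.interrupts_enabled = false;                   // cli
        p.yields_ = &outer_yieldstate;                  // fixed order: store last
    }
    return p.yields_;
}
```

With store_before_cli set to true the model hands the scheduler a null yields_, which corresponds to the lost resumption point; with false the outer yieldstate survives.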
Red zone
The Chickadee kernel must be compiled with the GCC flag -mno-red-zone, which
disables the x86-64 red zone, a feature of the System V AMD64 ABI
(Application Binary Interface).
Describe what the -mno-red-zone flag does, and why the Chickadee kernel must
be compiled with that flag.
The red zone is the 128-byte area immediately below %rsp that the ABI lets a function use for temporary data without adjusting %rsp. The -mno-red-zone flag prevents the compiler from using the red zone for kernel functions, so kernel functions will never access data at negative offsets from %rsp. This is important because of interrupts. If a kernel task runs with interrupts enabled and an interrupt occurs, the processor’s interrupt mechanism stores the five critical registers immediately below the active %rsp. This would irretrievably overwrite any data the compiler stored in the red zone.
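The overwrite can be illustrated with a word-addressed model of a stack. This is a sketch under simplifying assumptions, not real kernel code: toy_stack and the functions below are hypothetical, rsp is an array index rather than an address, and the interrupt frame is modeled as five placeholder words (standing in for ss, rsp, rflags, cs, and rip) with no alignment or stack switching.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy stack: a lower index models a lower address.
struct toy_stack {
    std::array<std::uint64_t, 64> mem{};
    std::size_t rsp = 32;  // index of the word that %rsp points to
};

// Models interrupt entry on the current stack: five words pushed below rsp.
void toy_interrupt_push(toy_stack& s) {
    const std::uint64_t frame[5] = {0x10, 0x7000, 0x202, 0x8, 0x1000};
    for (std::uint64_t w : frame) {
        --s.rsp;
        s.mem[s.rsp] = w;
    }
}

// Models iretq popping the same five words.
void toy_interrupt_return(toy_stack& s) { s.rsp += 5; }

// Leaf function compiled WITH a red zone: the local lives below rsp.
std::uint64_t toy_leaf_red_zone(toy_stack& s, bool interrupt) {
    s.mem[s.rsp - 1] = 42;  // like movq $42, -8(%rsp)
    if (interrupt) {
        toy_interrupt_push(s);
        toy_interrupt_return(s);
    }
    return s.mem[s.rsp - 1];
}

// Same function WITHOUT a red zone: rsp is moved before the local is stored.
std::uint64_t toy_leaf_no_red_zone(toy_stack& s, bool interrupt) {
    s.rsp -= 2;             // like subq $16, %rsp
    s.mem[s.rsp + 1] = 42;  // like movq $42, 8(%rsp)
    if (interrupt) {
        toy_interrupt_push(s);
        toy_interrupt_return(s);
    }
    std::uint64_t v = s.mem[s.rsp + 1];
    s.rsp += 2;             // like addq $16, %rsp
    return v;
}
```

In the model, the red-zone version loses its local only when an interrupt arrives, while the version that reserves space first keeps it either way.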
Processor affinity and CPU migration
Multiprocessor operating systems support notions of processor affinity and
CPU migration, in which tasks (including process threads and kernel tasks)
switch from processor to processor as they run. This can be important for load
balancing—maybe all the threads on processor 0 happen to exit at about the
same time, leaving processor 0 idle and the other processors oversubscribed—so
a good kernel scheduler will proactively migrate tasks to mitigate this
imbalance. There are also system calls that manage CPU placement directly,
such as sched_setaffinity.
But moving a task from one CPU to another is harder than it might appear, because of synchronization invariants.
Design a system-call-initiated CPU migration for Chickadee. Specifically,
describe the implementation of a sys_sched_setcpu(pid_t p, int cpu) system
call, which should work as follows:
- If cpu < 0 || cpu >= ncpu, then return an error.
- Otherwise, if p != 0 && p != current()->id_, then return an error (you’ll fix this in the next exercise).
- Otherwise, the system call returns 0 and the calling thread (process) next executes on CPU cpu. That is, when the system call returns 0, the unprivileged process code is executing on CPU cpu.
Your implementation must obey all Chickadee invariants and
should not cause undefined behavior. You will almost certainly add one or more
members to struct proc; describe them and any new invariants.
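As a starting point, the two error checks in the specification can be written down directly. This sketch covers only argument validation and leaves the migration itself, which is the substance of the exercise, undesigned. The names toy_sched_setcpu_check and TOY_ERROR are hypothetical, the error value is a placeholder rather than a real Chickadee error code, and ncpu and the caller's pid are passed as parameters so the function stands alone.

```cpp
#include <cassert>

// Placeholder error value; an assumption, not a Chickadee error code.
constexpr int TOY_ERROR = -1;

// Validation-only sketch of sys_sched_setcpu(p, cpu).
// Returns 0 when the request is acceptable, TOY_ERROR otherwise.
int toy_sched_setcpu_check(int p, int cpu, int ncpu, int current_pid) {
    if (cpu < 0 || cpu >= ncpu) {
        return TOY_ERROR;  // no such CPU
    }
    if (p != 0 && p != current_pid) {
        return TOY_ERROR;  // other processes are not supported yet
    }
    return 0;  // a real implementation would migrate before returning
}
```

A pid of 0 is treated the same as the caller's own pid, as the specification requires.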
Migrating other processes
Extend your sys_sched_setcpu design so processes can change
other processes’ CPU placements (that is, support
p != 0 && p != current()->id_). Again, your implementation must obey all
Chickadee invariants and should not cause undefined behavior.
Our design of this feature does not add any new proc members, but it does
add new invariants, and it changes one of the invariants we added above.
Exit design
Problem Set 2, Part B asks you to implement part of a sys_exit system call.
One of the invariants mentioned says that “The kernel task
responsible for the exiting process must delegate its final freeing to some
other logical thread of execution”. Come up with an initial design for this
delegation.
We recommend having a process set its state (proc::state_) to some constant that means “in the process of exiting”, and then having the CPU scheduler, cpustate::schedule(), or perhaps an idle task complete the free.
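One way to picture the scheduler-based variant is the following single-CPU model. It is a sketch under stated assumptions, not Chickadee code: toy_state, toy_proc, toy_cpu, and toy_exit are hypothetical names, the CPU owns its tasks through a vector, and locking, stacks, and page tables are left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

enum class toy_state { runnable, exiting };

struct toy_proc {
    int id = 0;
    toy_state state = toy_state::runnable;  // models proc::state_
};

// The exiting task only marks itself; it never frees its own memory.
void toy_exit(toy_proc& p) { p.state = toy_state::exiting; }

struct toy_cpu {
    std::vector<std::unique_ptr<toy_proc>> procs;  // tasks owned by this CPU
    toy_proc* current = nullptr;
    int freed = 0;

    void add(int id) {
        procs.push_back(std::make_unique<toy_proc>());
        procs.back()->id = id;
    }

    // Models cpustate::schedule(): it runs on the CPU stack, so the previous
    // task is no longer executing and can be freed here if it is exiting.
    toy_proc* schedule() {
        if (current && current->state == toy_state::exiting) {
            for (auto it = procs.begin(); it != procs.end(); ++it) {
                if (it->get() == current) {
                    procs.erase(it);  // the delegated final free
                    ++freed;
                    break;
                }
            }
        }
        current = nullptr;
        for (auto& p : procs) {
            if (p->state == toy_state::runnable) {
                current = p.get();
                break;
            }
        }
        return current;
    }
};
```

The point of the split is that toy_exit runs in the context of the exiting task, while the free happens only in schedule(), after that task has stopped running.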