- A. Exit
- B. Sleep
- C. Parent processes
- D. Wait and exit status
- E. Halting
- F. True blocking
- G. System call interruption
Turnin
Fill out psets/pset2answers.md and psets/pset2collab.md and push to GitHub. Then submit on the grading server.
Intermediate checkin 1: Turn in parts A–B by 11:59pm Friday 2/16.
Intermediate checkin 2: Turn in parts C–E by 11:59pm Friday 2/23.
Final checkin: Turn in all parts by 11:59pm Wednesday 2/28.
Reminder about intermediate checkins: We do not grade intermediate checkins; their purpose is simply to keep you working and on track. If you want to discuss your code after an intermediate checkin, make sure it’s pushed to the grading server and see us during office hours.
Preparation
First, update your code from our handout repository. You will need a Git remote for the handout repository, if you don’t have one already:
$ git remote add handout git@github.com:CS161/chickadee-s24
$ git fetch handout
$ git pull handout main
$ git push
This should add a bunch of new test programs to your repository, including p-testwaitpid.cc.
Second, from here on, processes should start up with console access automatically enabled. Keep the sys_map_console system call you added in pset 1, in case a process wants to map the console at a different virtual address, but processes should have console physical memory mapped at virtual address 0xB8000 by default. To enable this, add this line to a sensible place in boot_process_start:

vmiter(p, CONSOLE_ADDR).map(CONSOLE_ADDR, PTE_PWU);

This should suffice, since all process page tables derive, via forked copies, from a boot_process_start page table.
You should already have handled the console in your syscall_fork logic, but you might as well check now that it works. fork should not copy the physical memory corresponding to the console, since that memory represents a shared device resource. Instead, the child process should share the parent’s console mappings. p-testforksimple will produce wrong output unless this works.
A. Exit
Implement the sys_exit system call.
Implementing exit
Your initial version of exit should do the following:
- Remove the current task from the ptable.
- Free all memory associated with the current process (which will exercise your buddy allocator from pset 1).
When you’re done, make run-allocexit
should show an endless explosion of different
colors as new processes are created and old processes exit (freeing memory).
There should be no leaks. (A leak will appear as a “free” physical page—a
period in the physical page map—that is never reused.)
You’ll need to maintain several invariants and obey some synchronization constraints.
- The process table ptable is protected by the ptable_lock.
- The current stack page must not be free. This means that a kernel task, such as the kernel half of a process, cannot fully free itself! (When the kernel is running on behalf of a process, the kernel’s stack pointer is in that process’s stack page.) The kernel task responsible for the exiting process must delegate the final kfree—the kfree of the struct proc—to some other logical thread of execution in the kernel. You could delay this final freeing until later in this pset, when you implement sys_waitpid, but until then run-allocexit will create at most 15 processes and you’ll miss some potential bugs. But there are also kernel functions that run on their own, on stacks that are never freed. Perhaps you could use such a function to kfree exited processes.
- The current page table must not be free. This means that before freeing a page table, you must ensure that no CPU has that page table installed. Think about which CPUs might have a page table installed when exit is called. The function call set_pagetable(early_pagetable) may be useful; this switches to a page table that maps only the kernel and that is never freed.
- The memviewer examines process page tables, so synchronization is required when freeing process page tables. You can accomplish this in one of three ways:
  - The memviewer holds the ptable_lock, so you may free page tables while holding the ptable_lock.
  - The memviewer only examines processes in ptable, so you may free the page table of a process that is no longer in ptable. (Of course, modifications to ptable require the ptable_lock. You would acquire that lock, remove the process from ptable, and release the lock, and only then free the corresponding page table.)
  - Our handout code has stub functions for implementing a process page table lock. The memviewer calls proc::lock_pagetable_read() before reading a process page table and proc::unlock_pagetable_read() after it’s done. If these functions provided mutual exclusion, they would suffice. However, these functions are stubs that do nothing—if you depend on them, you’ll need to actually implement them.
Strictly speaking, the memviewer might already cause undefined behavior, because it might access a process’s page table while that process is adding page table mappings on a parallel CPU via sys_page_alloc. This violates the First Law of Synchronization (“If two or more threads concurrently access a non-atomic object in memory, then all such accesses must be reads.”). We could eliminate this undefined behavior by using std::atomic in the x86_64_pagetable type. However, in practice, we do not see serious problems when the memviewer runs in parallel with a thread that adds page table entries, and we do see serious problems when the memviewer runs in parallel with a thread that frees a page table. Why might this be? Think about it—it can give you good insight.
You’ll probably need to change your code for sys_fork, and possibly other code, to avoid leaks.
B. Sleep
Implement the sys_msleep system call, which sleeps (i.e., does not return to its caller) until at least N milliseconds have passed.

For now, sys_msleep should return 0. In a future part, you will add support for system call interruption, and sys_msleep will return the error constant E_INTR if it is interrupted.
To add the system call, you’ll add a system call number to lib.hh, complete the user-level wrapper in u-lib.hh, and add an implementation case to proc::syscall. The kernel implementation will use the kernel’s ticks variable, which is a global variable incremented by the timer interrupt handler every 0.01 seconds. (So you’ll want to round the sleep argument up to the nearest multiple of 10 milliseconds.)
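For instance, the deadline computation might round up like this. This is a standalone sketch; wake_tick and ms_per_tick are illustrative names, not part of the handout.

```cpp
#include <cassert>
#include <cstdint>

// The timer ticks every 10 ms (100 Hz), so a sleep duration in
// milliseconds must be rounded up to a whole number of ticks.
constexpr unsigned long ms_per_tick = 10;

// Compute the tick count at or after which the sleeper may wake.
unsigned long wake_tick(unsigned long now, uint64_t msec) {
    return now + (msec + ms_per_tick - 1) / ms_per_tick;
}
```

In the kernel loop, compare against the deadline in a wraparound-safe way, e.g. `long(ticks - wake) >= 0`, rather than `ticks >= wake`.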
Use make run-testmsleep
to test your work. You should see 8 numbered lines
in sequential order, like this:
1 [pid 3]
2 [pid 2]
3 [pid 9]
4 [pid 6]
5 [pid 7]
6 [pid 4]
7 [pid 5]
8 [pid 8]
C. Parent processes
Implement parent processes and the sys_getppid
system call.
In Unix, every process has a parent; the getppid
system call returns the
parent’s process ID. If a parent exits before one or more of its child
processes, those children are reparented to have parent process ID 1. Process
ID 1 is special: that process, called the init process, corresponds to an
unkillable kernel task. Chickadee follows this design.
First implement the easy cases. These can be completed before you figure out performance and synchronization.
- Add a parent process ID member to struct proc.
- Add the system call: add the system call number to lib.hh, complete the user-level wrapper in u-lib.hh, and add an implementation case to proc::syscall.
- Maintain parent process IDs correctly in fork.
Use make run-testppid to test your work. You should succeed through the “tests without exit succeeded” line. (The test program depends on a working sys_msleep implementation from Part B, though the rest of the code in this part is independent of sys_msleep.)
Implementation note: As you add content to struct proc, you may need to update your pset 1 stack canary code, especially if you hard-wired offset constants into assembly. Try to keep your stack canaries working as you go through the problem sets; they have caught real bugs in prior students’ code!
Then, implement exit and process reparenting.
- Add a kernel initialization step that creates the init process. Ensure it has pid 1 and ppid 1. Its initial code can be a loop that calls proc::yield(), but this will change as you update exit.
- Change kernel_start so that the initial user process has pid 2 and ppid 1.
- Maintain parent process IDs correctly in exit. (This is the hard part.)
Write up in pset2answers.md
:
- The design of your per-process metadata (see Performance, below).
- Your synchronization invariants or other synchronization plan (see Synchronization, below).
Performance
When a process exits, the kernel must reparent its children. This should take \(O(C)\) time for a process with \(C\) children (rather than, say, \(O(P)\) time, where \(P\) is the total number of processes in the system).
Implementing \(O(C)\) exit will require tracking additional per-process metadata, which your writeup should describe.
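For example, one \(O(C)\) design keeps a per-process list of children; reparenting then fixes each child’s ppid in one pass and splices the whole list onto init in constant time. Here is a standalone model of that idea (std::list stands in for Chickadee’s intrusive list type, and all names are ours, not the handout’s):

```cpp
#include <cassert>
#include <list>

// Toy stand-in for struct proc: each process records its parent's pid
// and keeps a list of pointers to its children.
struct toy_proc {
    int pid_;
    int ppid_;
    std::list<toy_proc*> children_;
};

// Reparent p's children to init (pid 1) in O(C) time: one pass to fix
// each child's ppid_, then an O(1) splice onto init's child list.
void reparent_children(toy_proc* p, toy_proc* init) {
    for (toy_proc* c : p->children_) {
        c->ppid_ = 1;
    }
    init->children_.splice(init->children_.end(), p->children_);
}
```

In the real kernel this manipulation must happen under whatever lock protects the process hierarchy.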
Synchronization
Chickadee processes so far have been quite isolated: no action performed by one running process ever accessed another running process’s state. Here is a visualization of kernel programming under these conditions.
Starting in this portion, actions taken by one running process can affect
other running processes. And Chickadee is a multiprocessor kernel, so these
processes might be running simultaneously, in kernel mode, on different CPUs.
What happens if one process exits while one of its children is calling sys_getppid? What happens if a process and one of its children are exiting simultaneously? It is incredibly important to implement correct synchronization or the kernel could break. Here is a visualization of kernel programming under these conditions.
Correct synchronization requires planning ahead. A good approach is to write down synchronization invariants: what locks are required to examine or modify particular data items? Chickadee already has synchronization invariants; you will now start adding your own.
The simplest invariants are usually the most coarse-grained. (Early multicore operating systems used a giant lock approach, or “big kernel lock,” in which the entire kernel was protected by a single lock!) For instance, perhaps any change to the process hierarchy should require the ptable_lock.
Finer-grained invariants are also possible when you reason about state. For example, a global process_hierarchy_lock could protect access to the process hierarchy (that is, process parent and child links). This is somewhat more scalable than using ptable_lock for everything: some accesses to ptable don’t require the process_hierarchy_lock, and some accesses to the process hierarchy don’t require the ptable_lock.
Or you could have a lock per process. But this would be hard, and we strongly recommend you implement something simpler first! (For what it’s worth, Linux uses a global “task list lock”, which is like a process_hierarchy_lock, but implements it as a readers-writer lock rather than a spinlock. The super-scalable sv6 operating system uses a lock per process. Remember that correctness matters way more than performance for us.)
Describe your synchronization invariants in your writeup.
D. Wait and exit status
Implement the sys_waitpid system call and the exit status argument to sys_exit. Use the p-testwaitpid and p-testzombie programs to check your work.
Specification
Chickadee sys_waitpid implements a subset of the Unix waitpid behavior. It implements blocking and polling, exit status return, and selective waiting, but does not (yet) implement signal reporting, job control, or process groups.
pid_t sys_waitpid(pid_t pid, int* stat = nullptr, int options = 0)
- pid names the child process to wait for. If pid == 0, then the system call waits for any child (it returns the first child to exit). If pid != 0 and none of this process’s children have that pid, or if pid == 0 and this process has no children at all, then the system call returns E_CHILD.
- options determines whether the system call should block until a child exits. If it is 0, then the system call should block; if it is W_NOHANG, then the system call should return E_AGAIN rather than blocking if no relevant child has exited.
- If the system call succeeds, then its return value is the process ID of the exited child, and (unless stat == nullptr) *stat is set to the child’s exit status (i.e., the argument given to sys_exit).
- It is only possible to wait for a child’s status once. After sys_waitpid successfully reports a child’s exit status, the relevant PID becomes available for reuse. (The PID is not available for reuse before that point.)
Implementation notes
sys_waitpid will change the mechanism for freeing exited processes. Exited processes still become non-runnable, but their exit status must be preserved, and their PIDs not reused, until their parent calls sys_waitpid. You’ll need to add some stuff to struct proc.
You’ll also need to worry, again, about synchronization invariants.
waitpid
involves cross-process communication, so you must ensure that your
kernel sys_exit
handler synchronizes as appropriate with your kernel
sys_waitpid
handler, and with other code. For instance, sys_waitpid
must
not free a zombie process while another CPU is running on the corresponding
kernel task stack! Your writeup should describe your synchronization
invariants.
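To see the shape of the decision logic before synchronization enters the picture, here is a standalone model of the nonblocking (W_NOHANG) scan. The toy_proc struct, the flat procs vector, and the numeric error values are illustrative stand-ins for Chickadee’s ptable and lib.hh constants, not the real ones:

```cpp
#include <cassert>
#include <vector>

constexpr int E_CHILD = -9;    // illustrative values, not lib.hh's
constexpr int E_AGAIN = -11;

// Toy stand-in for struct proc.
struct toy_proc {
    int pid;
    int ppid;
    bool zombie;        // exited but not yet waited for
    int exit_status;
};

// Scan for a waitable child of `parent`. Returns the child's pid (and
// stores its status) if one has exited; E_CHILD if no matching child
// exists; E_AGAIN if children exist but none has exited yet.
int waitpid_poll(const std::vector<toy_proc>& procs, int parent,
                 int pid, int* stat) {
    bool have_child = false;
    for (const auto& p : procs) {
        if (p.ppid != parent || (pid != 0 && p.pid != pid)) {
            continue;
        }
        have_child = true;
        if (p.zombie) {
            if (stat) {
                *stat = p.exit_status;
            }
            return p.pid;
        }
    }
    return have_child ? E_AGAIN : E_CHILD;
}
```

The real kernel version must additionally free the zombie’s resources and release its PID after reporting the status, and the whole scan must happen under the appropriate lock.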
System calls and assembly functions can define their own calling conventions. We recommend that the kernel return both the return value and the exit status in one register (possible since both process IDs and statuses are 32-bit ints); the user-level system call wrapper in u-lib.hh should unpack that register and do the right things with its parts.
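One possible packing, sketched as standalone helpers (the helper names are ours; the real kernel and wrapper would inline this logic around the syscall boundary):

```cpp
#include <cassert>
#include <cstdint>

// Pack the exit status into the upper 32 bits and the pid (or negative
// error code) into the lower 32 bits of the single return register.
uint64_t pack_wait_result(int pid, int status) {
    return (uint64_t(uint32_t(status)) << 32) | uint32_t(pid);
}

// Unpack on the user side. Casting through uint32_t to int recovers
// negative values (error codes, negative statuses) correctly.
void unpack_wait_result(uint64_t r, int* pid, int* status) {
    *pid = int(uint32_t(r));
    *status = int(uint32_t(r >> 32));
}
```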
Use make run-testwaitpid
, make run-testzombie
, and make run-allocexit
to check your work.
The p-testwaitpid
program tests W_NOHANG
functionality before it tests
blocking functionality, so you can test a simple, nonblocking version of the
system call first. The p-testzombie
program also tests important
sys_waitpid
functionality. Incorrect exit and waiting logic will cause
p-allocexit
to grind to a halt; make sure it still runs indefinitely.
What should happen if a process exits without reaping all its zombie children?
A long run of p-allocexit
will test this case.
You'll need to update your init
process to sit in a loop and waitpid()
for its children. Due to reparenting, those children might not be the ones
that init
started with!
E. Halting
Change your init process’s code to call process_halt()
once there are no
more runnable processes in the system (that is, when all user processes exit).
This function, defined in k-hardware.cc
, will exit QEMU automatically if
HALT=1
was supplied on the command line.
Test your work by running make HALT=1 run-testhalt
. QEMU should
automatically exit a second after the success message is printed.
F. True blocking
Chickadee now has two blocking system calls, sys_msleep
and sys_waitpid
,
but (unless you were really ambitious) neither of these system calls has a truly
blocking implementation. User processes calling these system calls appear to
block, but their corresponding kernel tasks remain runnable and repeatedly
yield. This is easy to program but inefficient. It’s far better for the kernel
tasks corresponding to blocked processes to become non-runnable.
The problem can be quantified. We added a counter to each process that counts how many times that process resumes. With a proc::yield–based implementation of sys_msleep, the p-testmsleep program results in more than 100,000 total resume events!
In this problem you will add true blocking to Chickadee, while avoiding the dreaded sleep-wakeup race condition, also known as the lost wakeup problem.
Lost wakeups in sys_waitpid
To understand lost wakeups, consider broken pseudocode like this for waitpid
:
proc::syscall_waitpid(...) {
...
while (/* no child is ready */) {
this->pstate_ = proc::ps_blocked; // new value
this->yield();
}
...
}
proc::syscall_exit(...) {
...
proc* parent = ptable[this->ppid_];
parent->unblock(); // which in turn calls:
parent->pstate_.compare_exchange_strong(proc::ps_blocked, proc::ps_runnable);
cpus[parent->runq_cpu_].schedule(parent);
...
}
Looks OK, right? waitpid blocks by setting proc::pstate_ to ps_blocked; exit unblocks the parent by setting its pstate_ to ps_runnable and enqueuing it on its home CPU via proc::unblock. But this is not OK. If the parent calls waitpid on one CPU while one of its children calls exit on another:
waitpid (in parent): exit (in child):
parent->unblock();
// the parent is still runnable,
// so this does nothing
this->pstate_ = proc::ps_blocked;
this->yield();
The parent blocks indefinitely, even though it has a child with status ready to collect. There’s a race condition between sleeping and wakeup.
Correctness requires that when wakeup and sleep occur simultaneously, wakeup
must win. Therefore, the sleeping and wakeup code must synchronize somehow.
The sleeping code must assign proc::pstate_
to ps_blocked
with some lock
held, and validate that it should go to sleep before releasing that lock.
There’s usually a natural lock associated with the blocking decision; in
waitpid
, this is likely the ptable_lock
or a process hierarchy lock.
However, knowing the lock doesn’t solve all problems: there are some
complicated data structure manipulations and special cases to worry about. To
solve these, we turn to an interesting kernel abstraction called wait
queues.
Wait queues
Operating system kernels are full of different conditions that must be
awaited, so kernels provide data types that simplify the process of waiting.
You’ll implement a wait queue data type. To introduce a condition, the
kernel declares a wait_queue
. A sleeping kernel thread initializes and
manipulates a waiter
object, in the following way:
waiter w;
spinlock_guard guard(/* appropriate lock */);
while (true) {
w.prepare(waitq);
if (/* this process should wake up */) {
break;
}
guard.unlock();
w.maybe_block();
guard.lock();
}
w.clear();
(The waiter::block_until() function automates this loop.) The wait_queue object contains a spinlock for synchronization and a list of blocked waiters; much like condition variables, a wait_queue is associated with a condition for which tasks might wait. The waiter object represents the possibility that this task might block. A waiter contains pointers to the wait queue and the blocked process, and forms part of a linked list headed at the wait_queue; this allows wakeup code to find and wake up every waiting process. wait_queue objects are typically global; for instance, you might have a single wait_queue for all timers, or one wait_queue per proc for that task’s child processes. waiter objects, on the other hand, are ephemeral. They are always declared as local variables on kernel task stacks. In sum, the wait queue is a linked list whose head is stored in the kernel’s data or heap segments, and whose members are threaded through various procs’ kernel task stacks.
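The “wakeup wins” rule at the heart of this design can be modeled in isolation. The sketch below (our names, standalone, single-threaded) captures the state machine: prepare marks the task provisionally blocked before the condition is re-checked, wake flips it back with a compare-exchange, and maybe_block only sleeps if no wakeup intervened:

```cpp
#include <atomic>
#include <cassert>

// Toy model of the blocking state machine in proc::pstate_.
enum pstate { ps_runnable, ps_blocked };

struct toy_waiter {
    std::atomic<int> state_{ps_runnable};

    // Mark ourselves provisionally blocked *before* re-checking the
    // wait condition; any wakeup arriving after this point flips us
    // back to runnable and is therefore never lost.
    void prepare() {
        state_.store(ps_blocked);
    }

    // Returns true if the task would actually schedule away, i.e. no
    // wakeup arrived between prepare() and now.
    bool maybe_block() {
        return state_.load() == ps_blocked;
    }

    // Wakeup side: only a blocked task needs to be made runnable; the
    // compare-exchange makes the transition race-free.
    void wake() {
        int expected = ps_blocked;
        state_.compare_exchange_strong(expected, ps_runnable);
    }
};
```

The key property is that a wake() between prepare() and maybe_block() prevents the sleep entirely, which is exactly what the broken waitpid pseudocode above fails to guarantee.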
Implementation
- Implement a wait queue abstraction. Start from our skeleton in k-waitstruct.hh and k-wait.hh, and read our wait queue notes.
- Use your wait queue to implement correctly-synchronized true blocking for sys_msleep. Where is the sleeping code? Where is the wakeup code? As before, check that the tests still work and that the number of proc::resume calls has been reduced.
- A single wait queue for timer interrupts leads to lots of unnecessary wakeups. To reduce this expense, it’s data structure time. Try implementing a timing wheel: \(N\) wait queues, where queue \(i\) holds all waiters that should be awoken at any time \(\equiv i \pmod N\). (Timing wheels blog post; “Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility”) Or get really fancy and implement a timer heap (Go example; libev uses a 4-heap, which has better cache behavior).
- Implement correctly-synchronized true blocking for waitpid and exit. You can use wait queues for this, but you may be able to get away with calling proc::unblock directly. Check that the tests still work; use your own tests to validate that blocking is working (i.e., that there are many fewer proc::resume calls).
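The timing-wheel idea from the list above can be modeled in ordinary C++. All names here are illustrative; a real kernel version would store waiters in each slot and synchronize each slot with its wait queue’s lock:

```cpp
#include <cassert>
#include <list>
#include <vector>

// Toy timed waiter: records its deadline and whether it has been woken.
struct toy_timed_waiter {
    unsigned long wake_tick;
    bool awake = false;
};

// N-slot timing wheel: a waiter due at tick t lives in slot t % N, so
// each timer interrupt inspects only one slot instead of every waiter.
struct timing_wheel {
    static constexpr unsigned N = 8;
    std::vector<std::list<toy_timed_waiter*>> slots_;

    timing_wheel() : slots_(N) {}

    void enqueue(toy_timed_waiter* w) {
        slots_[w->wake_tick % N].push_back(w);
    }

    // Called once per tick: wake every waiter in this tick's slot whose
    // deadline has arrived. (Deadlines congruent mod N share a slot, so
    // the deadline must still be checked before waking.)
    void on_tick(unsigned long tick) {
        auto& slot = slots_[tick % N];
        for (auto it = slot.begin(); it != slot.end(); ) {
            if ((*it)->wake_tick <= tick) {
                (*it)->awake = true;
                it = slot.erase(it);
            } else {
                ++it;
            }
        }
    }
};
```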
Testing
We provide some “torture tests” for your wait queue implementation in
k-testwait.cc
. Activate them with make run-ktestwait
. A successful run
will print something like this (the “done round” numbers may differ):
ktestwait phase 1 commencing
done round 500/1600
done round 1002/1600
done round 1502/1600
ktestwait phase 2 commencing
ktestwait phase 3 commencing
done round 300/1200
done round 551/1200
done round 802/1200
done round 1054/1200
ktestwait phase 4 commencing
done round 31/100
done round 63/100
done round 95/100
ktestwait phase 5 commencing
done round 200/1000
done round 400/1000
done round 600/1000
done round 800/1000
ktestwait succeeded!
The torture tests will stress your wait queue implementation, but bugs can be surprisingly elusive and hard to trigger—especially in emulated situations, such as M1 Macs, where the emulator is not truly multi-threaded. You should both test your implementation and think about your implementation.
G. System call interruption
In the last part of the problem set, you will implement system call
interruption. Specifically, if a child exits while a parent is blocked in
sys_msleep
, then the parent’s sys_msleep
should return early with the
E_INTR
return code.
Use make run-testeintr
to check your work.
Note: This part of the problem set models the Unix signal mechanism. Unfortunately, as we discussed in CS61, the signal mechanism has inherent race conditions that are difficult to solve. It is OK for your interruption mechanism to have similar race conditions as Unix signals. For instance:
- If a child exits immediately before the parent calls sys_msleep, then sys_msleep need not be interrupted.
- If a child exits while the parent is processing sys_msleep, but before sys_msleep decides to block, then sys_msleep need not be interrupted.
However, if a child exits after sys_msleep
decides to block, then
sys_msleep
must return early with the E_INTR
return code.
Hint: You may need to add extra member variables to struct proc.
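One way to structure the mechanism, modeled standalone: a per-process flag (exactly the kind of extra struct proc member the hint alludes to) that exit sets on the parent and that the msleep wait loop checks each time it wakes. The names and E_INTR’s numeric value here are illustrative, not the handout’s:

```cpp
#include <cassert>

constexpr int E_INTR = -4;   // illustrative value, not lib.hh's

// Toy stand-in for struct proc with the extra interruption flag.
struct toy_proc {
    bool interrupted_ = false;
};

// One iteration of the modeled msleep wait loop: returns E_INTR if a
// child's exit interrupted us, 0 when the deadline has passed, or 1 to
// indicate "keep sleeping". Uses the wraparound-safe tick comparison.
int msleep_step(toy_proc* p, unsigned long ticks, unsigned long wake) {
    if (p->interrupted_) {
        p->interrupted_ = false;   // consume the interruption
        return E_INTR;
    }
    if (long(ticks - wake) >= 0) {
        return 0;
    }
    return 1;
}
```

In the real kernel, the exit path would set the flag and wake the parent’s waiter under the appropriate lock, so the interruption cannot race with the decision to block.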