This is not the current version of the class.

Problem set 2: Process hierarchies and wait queues

Preparation

In pset1, Chickadee displayed visual results via the kernel-implemented memviewer visualization. However, some of Chickadee's user-level test programs need to write console output to the screen via console_printf(). To enable user-level console output, update boot_process_start() so that the virtual address 0xB8000 is mapped to the physical address 0xB8000 (i.e., ktext2pa(console)). This mapping lets a process access console via a virtual address whose value equals the physical address of console in the kernel text. [You might wonder, "what is the origin of the magic number 0xB8000?" It's a good question! As it turns out, VGA graphics hardware learns what to display by looking at writes made to a block of memory addresses starting at 0xB8000.]

User-level test programs that need to invoke console_printf() must first invoke sys_consoletype(int type), passing in the appropriate type argument. To determine the appropriate value for type, poke around in Chickadee's kernel code and the code of preexisting user-level programs that call sys_consoletype(int type). As you work through this pset and encounter new user-mode programs that invoke console_printf(), you'll need to add the appropriate call to sys_consoletype(int type) at the beginning of process_main().

A. Exit

Implement the sys_exit system call.

Implementing exit

Your initial version of exit should do the following:

When you’re done, make run-allocexit should show an endless explosion of different colors as new processes are created and old processes exit (freeing memory). There should be no leaks. (A leak will appear as a “free” physical page—a period in the physical page map—that is never reused.)

To make exit work in Chickadee, you’ll need to obey several invariants.

And you’ll probably need to change your code for sys_fork, and possibly other code, to avoid leaks.

B. Sleep

Implement the sys_msleep system call, which sleeps (i.e., does not return to its caller) until at least N milliseconds have passed.

For now, sys_msleep should return 0. In a future part, you will add support for system call interruption, and sys_msleep will return the error constant E_INTR if it is interrupted.

To add the system call, you’ll add a system call number to lib.hh, complete the user-level wrapper in u-lib.hh, and add an implementation case to proc::syscall. The kernel implementation will use the kernel’s ticks variable, which is a global variable incremented by the timer interrupt handler every 0.01 seconds. (So you’ll want to round the sleep argument up to the nearest multiple of 10 milliseconds.)
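The rounding described above can be sketched as follows. This is a minimal sketch that assumes only the 10-millisecond tick period stated in the text; the names MS_PER_TICK, ms_to_ticks, and wakeup_tick are hypothetical and do not come from Chickadee.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the timer interrupt fires every 10 ms (0.01 s), as stated above.
constexpr uint64_t MS_PER_TICK = 10;

// Smallest number of whole ticks whose total duration is at least `ms`
// milliseconds (integer division that rounds up).
constexpr uint64_t ms_to_ticks(uint64_t ms) {
    return (ms + MS_PER_TICK - 1) / MS_PER_TICK;
}

// Hypothetical helper: the tick count at which a sleeper that starts at
// `now_ticks` may stop sleeping.
constexpr uint64_t wakeup_tick(uint64_t now_ticks, uint64_t ms) {
    return now_ticks + ms_to_ticks(ms);
}
```

For example, a 25 ms sleep that starts at tick 100 would end at tick 103. Because the current tick may already be partly elapsed when the call is made, a stricter reading of "at least N milliseconds" might add one more tick; that choice is left open here.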

Use make run-testmsleep to test your work. You should see 8 lines in sequential order, where the "sequence number" of a line corresponds to the leftmost number in the line.

C. Parent processes

Implement parent processes and the sys_getppid system call.

In Unix, every process has a parent; the getppid system call returns the parent’s process ID. If a parent exits before a child, the child is reparented to have parent process ID 1. Process ID 1 is special: that process, called the init process, corresponds to an unkillable kernel task. Chickadee follows this design.

First implement the easy cases. These can be completed before you figure out performance and synchronization.

Use make run-testppid to test your work. You should succeed through the “tests without exit succeeded” line. (The test program depends on a working sys_msleep implementation from Part B, though the rest of the code in this part is independent of sys_msleep.)

Implementation note: As you add content to struct proc, you may need to update your pset 1 stack canary code, especially if you hard-wired offset constants into assembly. Try to keep your stack canaries working as you go through the problem sets; they have caught real bugs in prior students’ code!

Then, implement exit and process reparenting.

Write up in pset2answers.md:

Performance

When a process exits, the kernel must reparent its children. This should take O(C) time for a process with C children (rather than, say, O(P) time, where P is the total number of processes in the system).

Implementing O(C) exit will require tracking additional per-process metadata, which your writeup should describe.
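One possible shape for that metadata is a per-process list of children. The sketch below is a standalone model, not Chickadee code: toy_proc and reparent_children are hypothetical names, and std::list stands in for whatever list type the kernel would actually use (an intrusive list would be the more likely choice in a kernel).

```cpp
#include <cassert>
#include <list>

// Toy stand-in for struct proc; only the fields needed for reparenting.
struct toy_proc {
    int pid;
    int ppid;
    std::list<toy_proc*> children;   // extra per-process metadata
};

// Hand all of `p`'s children to `init`.
// The loop touches each child once (O(C)); the whole-list splice is O(1).
// A real kernel would hold the relevant hierarchy lock around this.
void reparent_children(toy_proc& p, toy_proc& init) {
    for (toy_proc* c : p.children) {
        c->ppid = init.pid;
    }
    init.children.splice(init.children.end(), p.children);
}
```

After the call, p.children is empty and every former child reports init's process ID as its parent. No scan of the full process table is needed.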

Synchronization

Chickadee processes so far have been quite isolated: no action performed by one running process ever accessed another running process’s state. Here is a visualization of kernel programming under these conditions [1]:

[Image: Teletubbies]

Starting in this portion, actions taken by one running process can affect other running processes. And Chickadee is a multiprocessor kernel, so these processes might be running simultaneously, in kernel mode, on different CPUs. What happens if one process exits while one of its children is calling sys_getppid? What happens if a process and one of its children are exiting simultaneously? It is incredibly important to implement correct synchronization or the kernel could break. Here is a visualization of kernel programming under these conditions:

[Image: Hellatubbies]

Correct synchronization requires planning ahead. A good approach is to write down synchronization invariants: what locks are required to examine or modify particular data items? Chickadee already has a couple of synchronization invariants; you will now start adding your own.

The simplest invariants are usually the most coarse-grained. (Early multicore operating systems used a giant lock approach, or “big kernel lock,” in which the entire kernel was protected by a single lock!) For instance, perhaps any change to the process hierarchy should require the ptable_lock. Finer-grained invariants are also possible when you reason about state. For example, a global process_hierarchy_lock could protect access to the process hierarchy (that is, process parent and child links). This is somewhat more scalable than using ptable_lock for everything since some accesses to ptable don’t require the process_hierarchy_lock, and some accesses to the process hierarchy don’t require the ptable_lock. Or you can have a lock per process, if you can make it work. (For what it’s worth, Linux uses a global “task list lock”, which is like a process_hierarchy_lock, but implements it as a readers-writer lock rather than a spinlock. The super-scalable sv6 operating system uses a lock per process. Remember that correctness matters way more than performance for us.)

Describe your synchronization invariants in your writeup.

D. Wait and exit status

Implement the sys_waitpid system call and the exit status argument to sys_exit. Use the p-testwaitpid and p-testzombie programs to check your work.

Specification

Chickadee sys_waitpid implements a subset of the Unix waitpid behavior. It implements blocking and polling, exit status return, and selective waiting, but does not (yet) implement signal reporting, job control, or process groups.

pid_t sys_waitpid(pid_t pid, int* stat = nullptr, int options = 0)

Implementation notes

sys_waitpid will change the mechanism for freeing exited processes. Exited processes still become non-runnable, but their exit status must be preserved, and their PIDs not reused, until their parent calls sys_waitpid. You’ll need to add some stuff to struct proc.

You’ll also need to worry, again, about synchronization invariants. waitpid involves cross-process communication, so you must ensure that your kernel sys_exit handler synchronizes as appropriate with your kernel sys_waitpid handler, and with other code. (For instance, sys_waitpid must not free a zombie process while another CPU is running on the corresponding kernel task stack!) Your writeup should describe your synchronization invariants.

System calls and assembly functions can define their own calling conventions. We recommend that the kernel return both return value and exit status in one register (possible since both process IDs and statuses are 32-bit ints); the user-level system call wrapper in u-lib.hh should unpack that register and do the right things with its parts.
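The single-register convention recommended above might look like the sketch below. The layout (process ID in the low 32 bits, status in the high 32 bits) and the names pack_wait_result, unpack_pid, and unpack_status are assumptions made for illustration; any layout works as long as the kernel and the u-lib.hh wrapper agree on it.

```cpp
#include <cassert>
#include <cstdint>

// Kernel side (assumed layout): pid in the low 32 bits, exit status in
// the high 32 bits of one 64-bit return register.
constexpr uint64_t pack_wait_result(int32_t pid, int32_t status) {
    return (uint64_t(uint32_t(status)) << 32) | uint32_t(pid);
}

// User side: recover the two 32-bit values. Negative values, such as
// error codes returned in place of a pid, survive the round trip.
constexpr int32_t unpack_pid(uint64_t r) {
    return int32_t(uint32_t(r));
}
constexpr int32_t unpack_status(uint64_t r) {
    return int32_t(uint32_t(r >> 32));
}
```

The wrapper would return unpack_pid(r) to its caller and, when the stat pointer is not null, store unpack_status(r) through it.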

Use p-testwaitpid, p-testzombie, and p-allocexit to check your work. The p-testwaitpid program tests W_NOHANG functionality before it tests blocking functionality, so you can test a simple, nonblocking version of the system call first. The p-testzombie program also tests important sys_waitpid functionality. Incorrect exit and waiting logic will cause p-allocexit to grind to a halt; make sure it still runs indefinitely.

What should happen if a process exits without reaping all its zombie children? A long run of p-allocexit will test this case.

You'll need to update your init process to sit in a loop and waitpid() for its children. Due to reparenting, those children might not be the ones that init started with!

E. Halting

Change your init process’s code to call process_halt() once there are no more runnable processes in the system (that is, when all user processes exit). This function, defined in k-hardware.cc, will exit QEMU automatically if HALT=1 was supplied on the command line.

Test your work by running make HALT=1 run-testhalt. QEMU should automatically exit a second after the success message is printed.

F. True blocking

Chickadee now has two blocking system calls, sys_msleep and sys_waitpid, but (unless you were really ambitious) neither of these system calls has a truly blocking implementation. User processes calling these system calls appear to block, but their corresponding kernel tasks remain runnable and repeatedly yield. This is easy to program but inefficient. It’s far better for the kernel tasks corresponding to blocked processes to become non-runnable.

The problem can be quantified. We added a counter to each process that counts how many times that process resumes. With a proc::yield–based implementation of sys_msleep, the p-testmsleep program results in more than 100,000 total resume events!

In this problem you will add true blocking to Chickadee, while avoiding the dreaded sleep-wakeup race condition, also known as the lost wakeup problem.

Lost wakeups in sys_waitpid

To understand lost wakeups, consider broken pseudocode like this for waitpid:

waitpid(proc* parent, ...) {
    ...
    while (/* no child is ready */) {
        parent->pstate_ = proc::ps_blocked;
        parent->yield();
    }
    ...
}

exit(proc* p, ...) {
    ...
    proc* parent = ptable[p->ppid_];
    // wake up the parent; locking elided
    parent->pstate_ = proc::ps_runnable;
    cpus[parent->cpu_].enqueue(parent);
    ...
}

Looks OK, right? waitpid blocks by setting proc::pstate_ to ps_blocked; exit unblocks the parent by setting its pstate_ to ps_runnable and enqueuing it on its home CPU. It’s not OK. If the parent calls waitpid on one CPU while one of its children calls exit on another:

waitpid:                                    exit:
                                                parent->pstate_ = proc::ps_runnable;
    parent->pstate_ = proc::ps_blocked;
    parent->yield();
    CPU dequeues `parent`, doesn’t run it
                                                cpus[0].enqueue(parent);
    ...later, CPU dequeues `parent` again,
    but still doesn’t run it, since its
    `pstate_` is still `ps_blocked`!

The parent blocks indefinitely, even though it has a child with status ready to collect. There’s a race condition between sleeping (waitpid) and wakeup (exit).

Correctness requires that when wakeup and sleep occur simultaneously, wakeup must win. Therefore, the sleeping and wakeup code must synchronize using some lock. The sleeping code must assign proc::pstate_ to ps_blocked with the lock held, and validate that it should go to sleep before releasing that lock. But what lock?

Wait queues

It’s obvious in exit what process to wake: it is the unique parent process of the exiting process. Blocking in sys_msleep is different: which processes should the timer interrupt wake up? We need a queue of processes waiting for a timer interrupt.

Operating system kernels are full of different conditions that must be awaited, so kernels provide data types that simplify the process of waiting. You’ll implement a wait queue data type. To introduce a condition, the kernel declares a wait_queue. A sleeping kernel thread initializes and manipulates a waiter object, in the following way:

waiter w;
while (1) {
    w.prepare(&waitq);
    if (/* this process should wake up */) {
        break;
    }
    w.block();
}
w.clear();

The wait_queue object contains a spinlock for synchronization and a list of blocked waiters. The waiter object stores the wait queue and the blocked process, allowing wakeup code to find and wake up every waiting process.
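To make the bookkeeping concrete, here is a single-threaded model of the waiter and wait queue relationship. It is only a sketch: all toy_ names are hypothetical, the spinlock is omitted (comments mark where it would be held), prepare takes the process explicitly, and would_sleep stands in for the decision that a real block() would make before yielding.

```cpp
#include <cassert>
#include <list>

enum toy_state { ts_runnable, ts_blocked };

struct toy_proc {
    toy_state state = ts_runnable;
};

struct toy_waiter;

struct toy_wait_queue {
    // A real wait_queue also holds a spinlock; every operation on
    // `waiters` below would run with that lock held.
    std::list<toy_waiter*> waiters;
    void wake_all();
};

struct toy_waiter {
    toy_proc* p = nullptr;
    toy_wait_queue* wq = nullptr;

    // Mark the process blocked and join the queue BEFORE the caller
    // tests its wakeup condition.
    void prepare(toy_proc* proc, toy_wait_queue* q) {
        p = proc;
        p->state = ts_blocked;
        if (!wq) {                      // not already queued
            wq = q;
            wq->waiters.push_back(this);
        }
    }
    // True if no wakeup has arrived since prepare(), i.e. the process
    // would really go to sleep.
    bool would_sleep() const {
        return p->state == ts_blocked;
    }
    // Leave the queue (if still on it) and become runnable again.
    void clear() {
        if (wq) {
            wq->waiters.remove(this);
            wq = nullptr;
        }
        if (p) {
            p->state = ts_runnable;
        }
    }
};

// Wake every waiter: remove it from the queue and mark it runnable.
void toy_wait_queue::wake_all() {
    while (!waiters.empty()) {
        toy_waiter* w = waiters.front();
        waiters.pop_front();
        w->wq = nullptr;
        w->p->state = ts_runnable;
    }
}
```

In this model, a wake_all() that lands between prepare() and the would-be block() leaves the process runnable, so it does not sleep: wakeup wins. In a kernel the same ordering only holds if the queue's lock covers both the state change in prepare() and the state change in wake_all().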

Read more about wait queues

Implementation

G. System call interruption

In the last part of the problem set, you will implement system call interruption. Specifically, if a child exits while a parent is blocked in sys_msleep, then the parent’s sys_msleep should return early with the E_INTR return code.

Use make run-testeintr to check your work.

Note: This part of the problem set models the Unix signal mechanism. Unfortunately, as we discussed in CS61, the signal mechanism has inherent race conditions that are difficult to solve. It is OK for your interruption mechanism to have race conditions similar to those of Unix signals. For instance:

However, if a child exits after sys_msleep decides to block, then sys_msleep must return early with the E_INTR return code.

Hint: You may need to add extra member variables to struct proc.

Turnin

Fill out psets/pset2answers.md and psets/pset2collab.md and push to GitHub. Then submit on the grading server.

Intermediate checkin 1: Turn in parts A–B by 11:59pm Monday 2/14.
Intermediate checkin 2: Turn in parts C–D by 11:59pm Monday 2/21.
Final checkin: Turn in all parts by 11:59pm Monday 2/28.

Reminder about intermediate checkins: We do not grade intermediate checkins; their purpose is simply to keep you working and on track. If you want to discuss your code after an intermediate checkin, make sure it’s pushed to the grading server and see us during office hours.