Problem set 5: Threads, journaling, and project

Turnin

Fill out psets/pset5answers.md and psets/pset5collab.md and push to GitHub. Then submit on the grading server.

You must complete all parts by Wednesday, May 7 at 11:59pm. There will be no further extensions beyond May 7.

A. Threads

Implement multithreaded processes: processes that can have multiple independent threads of control, but that share the same virtual memory space and file descriptor table. Use kernel threads, the design where every user-visible “thread” corresponds to a struct proc.

You’ll add three system calls:

You’ll also update some existing system calls:

You may assume that sys_execv is called only by single-threaded processes.

Run make run-testthread to check your work.

Process IDs and thread IDs

Kernel threading introduces a distinction between process and thread IDs: every user task has one of each, and they may not be equal. fork must allocate one of each.

We recommend storing the thread ID in proc::id_. This may feel backwards but it’s just easier to keep ptable indexed by id_. Add a new member, such as proc::pid_, to store the true process ID. You might add a table, such as proc* pidtable[NPROC], to store processes by true PID, but it’s also possible to rely entirely on ptable.

(For what it’s worth, Linux’s task_struct::pid member is a thread ID, while the true process ID is called a “thread group ID” and stored in task_struct::tgid. Its getpid system call returns tgid and its gettid returns pid.)

sys_clone interface

Your goal is for sys_clone(function, arg, stack_top) to set up a new thread running function(arg) on the given stack, where when that function returns, the thread exits via sys_texit. This means the initial thread stack frame should contain a return address pointing to sys_texit. Multithreading packages typically divide responsibility for this between the kernel’s system call and user-level code in the system call wrapper; most typically the stack setup is left to wrapper code.

Some examples from real operating systems:

Synchronization

In psets/pset5answers.md, explain your synchronization plan for the process state components that can now be shared by multiple tasks. Pay special attention to pagetable_ and the file descriptor table.

Exiting

Process exit for multithreaded processes introduces complications. The sys_exit system call should exit all of the process’s threads, not just the calling thread. But the process’s other threads might be blocked in the kernel, or running on other CPUs! This is different from prior psets, where only a sys_exit system call could cause task deletion.

Exiting a multithreaded process is complicated because:

  1. A task cannot be deleted while its kernel task stack is in use (for reasons explained in problem set 2 and the synchronization invariants).

  2. A task cannot be deleted while its user context is running.

  3. A kernel task cannot be deleted while it’s blocked. For instance, you can’t delete a kernel task that is blocked waiting for a timer (in kernel.cc) or a disk event (in k-chkfs.cc). This is because a blocked kernel task might implicitly hold resources that should be released before exit. (For instance, a kernel task might block while holding a reference to a bcentry; deleting the task without unblocking it would leak the reference.)

Putting this together, the only safe time to delete a task is when the corresponding kernel task is no longer running and its kernel task stack is effectively empty. Your code in pset 2 obeyed this constraint—assuming you got it right—but that’s mostly because process exit was simple in pset 2. Now, with multithreading, exit isn’t as simple. (Other features, such as kill, complicate exiting too.)

So how can the kernel detect whether it’s safe to exit a task? One idea might be to exit a thread immediately before resuming its user context. At that time, (1) the kernel task stack is not in use, (2) the user context is not running, and (3) the kernel task is not blocked. You can decide how to implement this—while obeying the invariants, of course. But there’s already some code that validates process state before resuming user context, in two places in k-exception.S. Maybe you can add some new state, and some code in k-exception.S that checks that state! (See synchronization invariant PSI5.)

Our solutions also modify wait_queue::block_until to check for exiting and interrupted tasks.

A separate issue with exiting is that when a child process exits, it might need to wake up multiple threads in the parent process (because any of the parent process’s threads can wait for a child process). But this issue is simpler to fix, thank goodness.

The later parts of p-testthread.cc check interactions between sys_exit and sys_texit.

B. Journaling (Optional)

Add journal support to your file system. This not only shows you can manage a complex on-disk structure, it also gives you practice in orchestrating a multi-part, complex parallel system to achieve a goal.

Your VFS should ensure that groups of disk writes that should be atomic are committed with a single journal transaction. For example, a free-block-bitmap write that allocates a block should commit in the same transaction as an inode write that references the block.

You may implement any journaling mode. Writeback mode, which does not write data blocks to the journal, is a fine first choice, though it has worse recovery properties than the other modes.

You may assume that the journal is empty at boot, and journal replay support is not required. (Our tests will use obj/chickdeefsck -s IMAGE, which implements offline replay.) But you may implement journal replay if you like. We recommend using the chkfs::journalreplayer base class, which implements the (complicated!) replay analysis procedure.

It can be fun to design your own journal subsystem. The critical questions are:

Give some thought to these issues. What is the simplest solution you can think of that supports allocation, matching, and ordering? But after some thought, most people should look at the in-depth suggestions.

C. Project

For this part, you should design and implement a project of your choice which adds a new feature to your operating system!

Describe your project in a file psets/pset5project.md that you add to your repository. Your writeup should answer these questions:

  1. What was your goal?
  2. What’s your design?
  3. What code did you write (what files and functions)?
  4. What challenges did you encounter?
  5. How can we test your work?

The writeup need not be very long; 300 words can do it if you use the words well.

Before you start working on your project in earnest, make a private Ed post which contains answers to the first and second questions above. A member of the course staff will then respond to that post and provide feedback about whether your idea is in scope and isn't too hard or too easy.

Once you finish your greenlit project, you should update psets/pset5project.md to describe what you built!