Problem set 3 – CS 161 2020

You should first merge your code with our pset updates. The easiest way:

git pull https://github.com/cs161/chickadee master

You will then have to fix up any conflicts. Follow Git’s instructions and commit the final result.

A. Initial read and write system calls

Take a look at p-cat.cc, and then run make run-cat. Type some stuff and press Return. p-cat should echo what you typed, one line at a time! The characters you type are echoed to the console by the keyboard driver in a green-blue; the characters printed by sys_write appear in bright white.

Now read and understand the implementations of SYSCALL_READ and SYSCALL_WRITE. You may be interested in the implementation of the keyboardstate object, which supports line buffering; it is in k-devices.hh and k-devices.cc.

Your task has two parts:

The SYSCALL_READ system call waits until a line is available, but the current implementation does this using kernel polling (yielding) rather than blocking. Change it to block instead of yield. You will use a waiter and block_until, and you’ll want to add a wait_queue to keyboardstate (see k-devices.hh and keyboardstate::handle_interrupt in k-devices.cc).
The SYSCALL_READ and SYSCALL_WRITE implementations do not check their arguments for validity, making them seriously dangerous. Change them to verify that the relevant memory ranges are present, user-accessible, and (for SYSCALL_READ) writable. Also check for integer overflow. You may want to augment vmiter with another function to check a range of address permissions.

Use make run-testrwaddr to check your work.

B. File descriptors and VFS

The goal of this part is to implement a VFS skeleton and per-process file descriptor table. This includes significant design effort. By the end of this part, your Chickadee should support:

The dup2 and close system call implementations.
A per-process file descriptor table, initialized for the first process as in Unix (i.e., fds 0–2 refer to the keyboard/console) and copied across fork.
File structures.
A vnode abstraction that can support different kinds of files.
A class derived from vnode for the keyboard/console file system.

When you’re done, the p-testrwaddr.cc and p-cat.cc programs should work as they did before, but using your VFS abstraction. The p-testvfs.cc program should work too.

Design document

Write up a design document for your VFS layer abstraction. Answer the following kinds of questions:

What types of objects make up your VFS layer?
What are the names and types of their methods?
What are the relationships among the objects?
How is memory for each kind of object allocated and freed?
What synchronization invariants must VFS callers obey? For instance, which objects have locks? Is there a lock order necessary to prevent deadlock?
What future OS changes might require updates to the VFS layer? Consider especially support for multithreaded processes.
Which functions rely for safety on checks made elsewhere in the system?
Which VFS functions, if any, can block?

Please write up the design document by 5pm on Monday 3/9 and check it in to your repository. You can use Markdown (psets/pset3vfs.md), plaintext, or Google doc (post a link to the doc in your repository). We threw together an example for your reference.

Here are some references to other operating systems’ VFS layers. Your design should not be as complicated as that of FreeBSD or Linux! It should probably be more complex than that of xv6.

OS/161: kern/include/fs.h, kern/include/vnode.h
xv6: file.h

xv6 implements the object-oriented aspect of the VFS layer in a particularly simple way: each “device class” supports just two operations, read and write.
FreeBSD: sys/sys/file.h, sys/sys/vnode.h, sys/sys/filedesc.h

The object-oriented aspect is defined in file.h, starting around here.
Linux: include/linux/fs.h, include/linux/fdtable.h

Some places to concentrate: struct files_struct (a file descriptor table) is defined in fdtable.h. struct file (an open file) is defined around here in fs.h. struct inode (a “vnode” structure) is defined around here in fs.h. The object-oriented aspects of the VFS layer, namely the abstract operations for which each file system provides its own implementations, are broken into two parts, struct file_operations and struct inode_operations; they are defined around here.

Note on keyboard/console

We referenced a “keyboard/console file system” above. Though that might sound surprising, it follows Unix’s typical behavior. Try running this Unix program to see what we mean.

#include <unistd.h>
#include <stdio.h>

int main() {
    write(1, "Hello\n", 6);
    close(1);
    write(0, "Hello again\n", 12);
    dup2(0, 1);
    write(1, "Hi\n", 3);
}

(To compile and run a one-file program like this, run c++ FILENAME.cc; ./a.out. Or tell the compiler a better name for the executable: c++ -o EXECUTABLE FILENAME.cc; ./EXECUTABLE.)

Implementation

Implement your design. Keep track of any changes required to the initial design; describe those changes briefly in your pset3answers.md file. Again, use make run-testvfs to check your work.

C. Pipes

Implement the SYSCALL_PIPE system call. Test your work with make run-testpipe.

Some notes:

The sys_pipe wrapper function (provided for you) expects the kernel to return either a negative error code on error, or the (positive) 64-bit value rfd | (wfd << 32), where rfd and wfd are file descriptor numbers. The wrapper function translates this calling convention into the more usual pipe(int pfd[2]) convention.
Aside from normal behavior, the p-testpipe program tests that reading from a write end, or writing to a read end, returns E_BADF; that reading from a pipe with closed write end returns 0 (EOF); that writing to a pipe with closed read end returns E_PIPE; and various properties of file descriptor inheritance to children.
Typically pipes are implemented using a bounded buffer abstraction. We built a correctly-synchronized bounded buffer in CS 61; you may want to refer to that.

You will want to check that any dynamically-allocated memory for pipes is freed at the appropriate times.

Use make run-testpipe to check your work.

Keep track of any changes pipes require to your initial VFS design. Describe those changes briefly in your pset3answers.md file.

D. Memfs

Parts D and E are independent.

The Chickadee kernel image contains a set of binaries—things like obj/p-allocator, the allocator process, or obj/p-testpipe. These binaries comprise an in-memory file system where each file consists of contiguous kernel memory. Each file on the file system is defined by a struct memfile (see k-devices.hh). In this part, you will let user processes access that in-memory file system via the SYSCALL_OPEN system call.

SYSCALL_OPEN takes two arguments, const char* pathname and int flags. Here’s how it should work for now.

The pathname is a null-terminated string. The kernel must validate that the pathname is in valid memory and return E_FAULT if it is not. You’ll need to implement a new validation helper function, because unlike with read and write, the size of the string isn’t passed explicitly!
The kernel searches for a memfile with the given pathname in the memfile::initfs array. If the pathname isn’t found:
- If the OF_CREATE flag is present, the kernel should create a new empty file with that name, returning E_NOSPC if out of room.
- Otherwise, the kernel should return E_NOENT.
The memfile::initfs_lookup function implements much of this logic for you; see its definition in k-devices.cc.
If the OF_TRUNC flag is present, the kernel should set the file’s length to zero.
Finally, the kernel should add a new file descriptor to the process’s file descriptor table. This new file descriptor should refer to the given memfile.
- The file descriptor is open for reading and/or writing depending on whether OF_READ and/or OF_WRITE are present in the open flags. An attempt to read from a file descriptor open only for writing, or to write to a file descriptor open only for reading, should return E_BADF.
- Writing beyond the end of file should extend the file’s length (see memfile::set_length).

The current memfile::initfs is unsynchronized (because the current kernel never modifies it), so you will need to add synchronization. Write up your synchronization plan.

Use make run-testmemfs to check your work.

Keep track of any changes required to your initial VFS design. Describe those changes briefly in your pset3answers.md file.

E. Execv

Parts D and E are independent.

Implement the SYSCALL_EXECV system call, which should replace the current process image with a fresh process image running a binary found on the memfs.

The user-level sys_execv wrapper passes three arguments to the kernel: %rdi contains const char* pathname, the name of the binary; %rsi contains const char* const* argv, an argument array; and %rdx contains int argc, the number of valid arguments in the argument array. The initial kernel system call implementation should:

Validate its arguments: the pathname argument should be a valid user-readable null-terminated string. (For argv and argc, see below.)
Look up the memfile named by pathname. Return E_NOENT if not found.
Create a new page table and use a memfile_loader and proc::load to populate it. If this fails (because of out-of-memory, for example), free any allocated memory and return the appropriate error. (Note: The provided memfile::initfs_lookup and proc::load functions already implement much of the required error checking.)
Allocate and map a new stack page into the new page table, and map the console into the new page table. If this fails, free any allocated memory and return E_NOMEM.
At this point, the system call will succeed. The remaining steps will likely stomp over the regstate that was stored in the kernel stack by syscall_entry.
Install the new page table and initialize a new register set in the proc using proc::init_user. Then change regs_->reg_rip and regs_->reg_rsp to point to the new entry point and stack page (see boot_process_start). Note that the process’s file descriptor table and child processes do not change.
Use set_pagetable to install the new page table into the CPU. Then free the old page table and all the user memory it references.
Resume the process using proc::yield_noreturn. (You must use proc::yield_noreturn because you are returning to proc::regs_.)

Use make run-execallocexit to test execv without arguments. This isn’t a stress test; you might want to develop your own test to check for absence of memory leaks.

Then it’s time for argument passing. Your goal is to set up the new process so that its %rdi register contains argc and its %rsi register contains a pointer to strings equivalent to argv. For instance, make run-exececho should print

About to greet you...
hello, world

But passing arguments to a process isn’t trivial. You can’t just copy the input argv into the new %rsi register, because the memory pointed to by argv has been obliterated as part of the exec procedure! The kernel must copy the arguments to new user-accessible memory, and then map that memory into the new page table. But where? For a hint, compile and run this program:

#include <stdio.h>
#include <stdint.h>

int main(int argc, char** argv) {
    uintptr_t rsp;
    asm volatile("movq %%rsp, %0" : "=r" (rsp));
    printf("%%rsp 0x%lx\n", rsp);
    printf("argc %d\n", argc);
    printf("argv %p (%%rsp+0x%lx)\n",
           argv, reinterpret_cast<uintptr_t>(argv) - rsp);
    for (int i = 0; i < argc; ++i) {
        printf("argv[%d] @%p: %p (%%rsp+0x%lx) \"%s\"\n", i, &argv[i],
               argv[i], reinterpret_cast<uintptr_t>(argv[i]) - rsp,
               argv[i]);
    }
}

Setting up arguments is the most fiddly part of this problem set; it rivals the buddy allocator for potential for errors and confusion. Draw your argument layout carefully with pencil and paper, and think in advance about how you will test your code. For instance, could you organize argument passing into a function that could be unit tested separately from exec?

F. Shell

Run make run-sh. You now have a shell. If your work on Parts C and E is correct, you’ll be able to run command lines like echo foo | wc. If your work on Parts D and E is correct, you’ll be able to run cat emerson.txt or echo Hello > hi.txt; cat hi.txt. There’s no work to do for this part. Play around!

Turnin

Fill out psets/pset3answers.md and psets/pset3collab.md and push to GitHub. Then submit on the grading server.

Initial checkin: Your VFS design document is due by Monday 3/9 at 5pm.
Final checkin: You should be able to complete all parts by 11:59pm Monday 3/16, during spring break. (Happy break!) We won’t take off points for pset 3 until Monday, 3/23, but we will release pset 4 during break, and you will want to get started on it then.

Problem set 3: File descriptors and VFS