WeensyOS Minilab 1

Extra Credit Only

During Lecture 3, we discussed how to build an operating system that does nothing but print "111" on the console; we used it to learn about hardware interaction and bootstrapping. Well, that kind of teensy-weensy operating system isn't just a thought experiment. It is real.

`weensyos1.tar.gz`	Source code for WeensyOS 1.0, which builds these two hard disk images:
`print111os.img`	Print111OS, which prints "`111!`" to the console.
`passwdos.img`	PasswdOS, which has two cooperative processes: the first writes a password to the second, which checks that password.

In this simple problem set, you'll browse, partially understand, and change these tiny operating systems.

Handing in

You will electronically hand in code and a small writeup containing answers to the numbered exercises. The problem set code, weensyos1.tar.gz, unpacks into a directory called weensyos1. (We explain how to unpack it below.) You'll modify the code in this directory, and add a text file with your answers to the numbered exercises. When you're done, run the command gmake tarball. This should create a file named weensyos1-yourusername.tar.gz. You'll turn in this file to CourseWeb.

Answers to the numbered exercises should be in a file named answers.txt, answers.html, or answers.pdf. Text files are strongly preferred. No Microsoft Word documents (or other binary format, except for PDF) will be accepted! For coding exercises (Exercises 2, 3, 5, and 6), it's OK for answers.txt to just refer to your code (as long as you comment your code).

To review:

Download and unpack weensyos1.tar.gz.
Do your work in the weensyos1 directory.
Put your answers to the numbered exercises in an answers.txt file (or answers.html or answers.pdf) in that weensyos1 directory.
When you're done, run gmake tarball from the weensyos1 directory. This will create a file named weensyos1-yourusername.tar.gz.
Submit that weensyos1-yourusername.tar.gz file to CourseWeb.

Setting up

You could take one of the disk image files this minilab builds, write it to your laptop's hard drive, and boot up your operating system directly if you wanted! However, it's much easier to work with a virtual machine or PC emulator.

An emulator mimics, or emulates, the behavior of a full hardware platform. A PC emulator acts like a Pentium-class PC: it emulates the execution of Intel x86 instructions, and the behavior of other PC hardware. For example, it can treat a normal file in your home directory as an emulated hard disk; when the program inside the emulator reads a sector from the disk, the emulator simply reads 512 bytes from the file. PC emulators are much slower than real hardware, since they do all of the regular CPU's job in software -- not to mention the disk controller's job, the console's job, and so forth. However, debugging with an emulator is a whole lot friendlier, and you can't screw up your machine!
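
To make the emulated-disk idea concrete, here is a minimal C sketch of how an emulator might service a sector read from an ordinary file. The names SECTOR_SIZE and read_sector are hypothetical, chosen only for this illustration; real emulators such as Bochs and QEMU are far more involved.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SECTOR_SIZE 512   /* bytes per emulated disk sector */

/* Read sector number 'sector' of the disk image file 'f' into 'buf'.
   Returns 0 on success, -1 if the image is too short or a read fails. */
int read_sector(FILE *f, uint32_t sector, uint8_t buf[SECTOR_SIZE])
{
    assert(f != NULL && buf != NULL);
    /* Sector N starts at byte offset N * 512 in the image file. */
    if (fseek(f, (long) sector * SECTOR_SIZE, SEEK_SET) != 0)
        return -1;
    if (fread(buf, 1, SECTOR_SIZE, f) != SECTOR_SIZE)
        return -1;
    return 0;
}
```

With a disk image opened by fopen("print111os.img", "rb"), calling read_sector(f, 0, buf) would fetch the first sector, which is where the boot loader lives.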

We've used two PC emulators. The Bochs emulator has pretty nice debugging support. The QEMU package is fast and sleek, but it might be too fast for some of our purposes. If you work on your own machine, try QEMU. If you're interested in working from home, you can download the source for QEMU and/or Bochs and install your own copy using these instructions. Precompiled binaries for Windows and Mac OS X are available too.

You will also need a copy of GCC that compiles code for an x86 ELF target. ELF, or Executable and Linkable Format, is a particular format for storing machine language programs on disk. Recent Linux PCs have the right compiler already set up. However, if you want to work on other platforms, or on Windows, you'll need a cross-compiler: a version of GCC that runs on your machine, but generates binaries for WeensyOS.

We've set up all the required tools on the machines in the Linux lab, and the Solaris machines on SEASnet. In the Linux lab, no special setup is required. On SEASnet, you need to set your environment to use our tools.

Read the minilab tools page and set up your environment appropriately.

Now that you've got all the software set up (or you've just decided to use the Linux lab), it's time to download WeensyOS and take it out for a spin.

Download and unpack the source for weensyos1 using the following command.

% gzcat weensyos1.tar.gz | tar xf -

(On Linux, you can just say "gtar xzf weensyos1.tar.gz".) This should unpack the tarball into the weensyos1 directory.

% ls weensyos1
COPYRIGHT    elf.h          passwdos-boot.c   passwdos.h
GNUmakefile  mergedep.pl    passwdos-c.c      print111os-boot.c
bootstart.S  mkbootdisk.pl  passwdos-p.c      types.h
conf         mmu.h          passwdos-yield.S  x86.h
%

Now that you've unpacked the source, it's time to give the OSes a whirl.

Change into the weensyos1 directory and run the gmake program.

The WeensyOS Makefile (well, GNUmakefile) builds two hard disk images when it runs. The first image, print111os.img, contains the Print111OS; it is built from the bootstrapping code in bootstart.S and the actual Print-111 program, which is in print111os-boot.c. The second image, passwdos.img, contains the PasswdOS. Since PasswdOS contains two processes, it is a bit more complex than Print111OS; five files are compiled to create passwdos.img, namely bootstart.S, passwdos-boot.c (a simple boot loader), passwdos-p.c (the password printer), passwdos-c.c (the password checker), and passwdos-yield.S (some assembly language glue code for transferring control between the processes).

Gmake's output should look something like this:

% gmake
+ as bootstart.S
+ cc print111os-boot.c
+ ld obj/bootstart.o
+ mk print111os.img
+ cc passwdos-boot.c
+ ld obj/bootstart.o
+ cc passwdos-p.c
+ as passwdos-yield.S
+ ld obj/passwdos-p.o
+ cc passwdos-c.c
+ ld obj/passwdos-c.o
+ mk passwdos.img
%

Now that you've built the OS disk images, it's time to run them! We've made it very easy to boot a given disk image; just run this command:

% gmake run-print111os

This will start up Bochs, but not yet the emulated computer. (This is because Bochs is giving you a chance to set breakpoints on the emulated machine.) To start the emulated computer, type "c":

<bochs:1> c

After a moment you should see a window like this!

To quit Bochs, click the "Power" button in the upper-right corner. (Very funny, Bochs.) Then run the PasswdOS with gmake run-passwdos; it should print out "Y" instead of "111!".

QEMU Note. If you're running QEMU instead of Bochs, run the Print111OS with qemu -hda print111os.img, and the PasswdOS with qemu -hda passwdos.img. (The -hda option stands for Hard Disk A.) QEMU doesn't have a funky power button; just hit Control-C in the terminal to quit.

Print111OS

You're now ready to start learning about the OS code!

The natural place to start is the first code that gets run. That's the boot loader -- a small piece of code, residing in the hard disk's first sector, that's responsible for loading everything else. As we saw in class, each PC contains a little bit of firmware code, burned in to stable memory (either ROM or flash memory). This code is responsible for initializing the computer just enough so that other software -- namely, the OS itself -- can start. The firmware is called the BIOS, which stands for Basic Input/Output System. How does the BIOS bootstrap the operating system? Simple: The BIOS searches disks attached to a system for a valid boot sector. This is a sector ending with a two-byte magic number (see the code to find out which number). Once a boot sector is found, the BIOS reads it into memory at address 0x7C00, then jumps to address 0x7C00. This program is expected to be a boot loader, which takes whatever steps are necessary to load the rest of the kernel; and the BIOS is no longer in control.

In WeensyOS, we arrange the boot sector so that control is transferred to start in bootstart.S. Though simple, this code needs to jump through a couple of hoops, since the BIOS tries to be compatible with operating systems written for 20+-year-old 8086 processors. Once it's done, it passes control on to another routine -- namely, bootmain.

Read the comment at the head of bootstart.S. This assembly-language file starts the boot process. Don't worry about the assembly language itself!! Just read and understand the first comment.

Understand the comments and the code in print111os-boot.c.

Exercise 1. Answer the following question: Why are we lucky that the print111os-boot.c program is well under 510 bytes in length?

Exercise 2. Change the print111os-boot.c program so that it fills up the console with stars, not spaces. Run and test your operating system.

Exercise 3. Change the print111os-boot.c program so that it prints "111!" in red-on-white text. Run and test your operating system.

Hint: Check out this document on the "VGA Programming Model". In particular, search for "Figure 12" and read the text that follows.

The code you hand in should follow both Exercise 2 and Exercise 3, so it should print 111! in red-on-white against a background of stars.

Exercise 4. print111os-boot.c accesses the video hardware using both memory-mapped I/O and programmed I/O (I/O port instructions). Which hardware feature corresponds to each method?

PasswdOS

The Print111 version of WeensyOS is really small: it supports only one process! There's nothing wrong with such a simple operating system; in fact, operating systems for embedded processors, like the things that run in your toaster or your car, often support exactly one process. But of course, it is far more powerful to support multiple processes at a time. This makes more efficient use of the computer's resources; for example, while one process waits for data to be retrieved from a disk, another process can go ahead with its work.

The PasswdOS is about the simplest multi-process operating system you can imagine. It supports two processes, a password printer and a password checker. The printer writes a password to the checker. The checker compares that password to the "correct" password; if they agree, it prints "Y" to the screen, and if they don't, it prints "N".

The processes in PasswdOS use cooperative multitasking. That is, processes give up control voluntarily. If one of the processes went into an infinite loop, the machine would entirely stop. This contrasts with preemptive multitasking, in which a trusted OS component (the kernel) can force an uncooperative process to give up control. Preemptive multitasking is more robust than cooperative multitasking, meaning it's more resilient to errors. However, it is slightly more complex, and preemption generally adds some overhead, since the kernel must be able to interrupt a process and save its state at arbitrary points. All modern PC-class operating systems use preemptive multitasking, but don't forget that cooperative multitasking exists: we will see how to use it in your own programs.
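
As a rough illustration of cooperative multitasking in an ordinary C program, the sketch below runs a "printer" task and a "checker" task in turns. It is not how PasswdOS is implemented: PasswdOS has no scheduler loop (its processes transfer control through the glue code in passwdos-yield.S), and every name here (pipe_buf, printer_step, checker_step, run_tasks) is hypothetical. Each task does one small step and then returns, and returning stands in for a voluntary yield.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical shared state; PasswdOS's real data structures differ. */
static char pipe_buf[16];       /* characters written so far */
static int  pipe_len = 0;       /* number of characters in pipe_buf */
static int  pipe_closed = 0;    /* set once the printer has finished */
static char verdict = '?';      /* 'Y' or 'N' once the checker decides */

/* Each task returns 1 while it still has work to do, 0 when finished. */
static int printer_step(void)
{
    static const char password[] = "111";
    static int next = 0;
    if (password[next] == '\0') {
        pipe_closed = 1;        /* nothing left to write */
        return 0;
    }
    assert(pipe_len < (int) sizeof(pipe_buf) - 1);
    pipe_buf[pipe_len++] = password[next++];
    return 1;                   /* wrote one character; yield */
}

static int checker_step(void)
{
    if (!pipe_closed)
        return 1;               /* nothing to check yet; yield again */
    pipe_buf[pipe_len] = '\0';
    verdict = (strcmp(pipe_buf, "111") == 0) ? 'Y' : 'N';
    return 0;
}

/* Round-robin loop. It never interrupts a task, so a step that looped
   forever would hang everything. Intended to be called once. */
char run_tasks(void)
{
    int printer_live = 1, checker_live = 1;
    while (printer_live || checker_live) {
        if (printer_live)
            printer_live = printer_step();
        if (checker_live)
            checker_live = checker_step();
    }
    return verdict;
}
```

Calling run_tasks() alternates the two tasks until both finish and returns the checker's verdict. Because nothing can interrupt a step, the design only works if every task keeps its steps short and returns promptly.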

Because PasswdOS uses cooperative multitasking, it does not need an operating system kernel! A kernel is required in a system with preemptive multitasking; there has to be a trusted component that takes control and shuts down runaway processes. But not here. PasswdOS does not, in fact, have a kernel; after the OS boots, it runs just the two processes' code (the printer and the checker).

Of course, we still have to boot the OS. PasswdOS uses a separate bootloader, just like today's PC OSes. That bootloader simply loads the printer and the checker into memory, then transfers control to the printer. If you're curious, see passwdos-boot.c.

PasswdOS Memory Layout

The memory layout for PasswdOS looks like this. (Bar widths are not to scale.)

PasswdOS uses a single memory space. Its processes -- the printer and the checker -- are not isolated; they must share the same memory space. Thus, they must load at different addresses. We load the printer at address 0x100000, or 1 megabyte (the beginning of the PC's "extended memory"*), and the checker at 0x300000, or 3 megabytes. Just to keep things simple, we load different objects at megabyte boundaries; but most of the megabyte of memory between 0x100000 and 0x200000 is unused, since the printer needs only 156 bytes. A real OS would avoid wasting memory like this.

The printer and checker programs are machine code, of course, and machine code can contain memory addresses. So we must ensure that these programs use different regions of memory, or they'd clobber each other. This is the linker's job. Linking is the last stage in compiling a program.** The compiler and assembler turn source code into object files, which contain machine code. The processor can't interpret object files, though: most programs are built from multiple pieces, including source files plus some libraries, and no one piece can run on its own. So the linker combines all the important object files and libraries into an executable that the processor can run. This requires making sure that jumps from one object file to another, and references to different functions and data, use the right addresses. So the linker must rearrange the object files in a process called relocation.

The PasswdOS uses relocation to enforce the memory layout above, and thus to make sure that no two processes collide. We tell the linker explicitly to relocate the printer to address 0x100000, and the checker to address 0x300000. Check out the GNUmakefile if you want to see how.

(*The term "extended memory" is used because the original 8086 processor could not access addresses above 1 megabyte.)

(**The linker isn't the last step in modern operating systems that support shared libraries. Another relocation step happens at load time, when the process is loaded into memory. On Linux, the dynamic linker/loader is in charge of this: man ld.so for more info.)

Stacks

Of course, allocating space for the code is necessary, but not sufficient. We still need a place for data local to each process -- the process's current instruction pointer, say, or the arguments to its functions. There are several ways to implement process-local storage, but modern architectures and programming languages have special support for one particularly useful arrangement: the stack.

Stacks are good at handling nested procedure calls. Each active procedure call needs to allocate some local storage, to store local variables, parameters, and such. But because of the semantics of procedures -- because each procedure returns once -- the local storage can be thrown away (recycled) as soon as the procedure returns. This lets us allocate local storage in a very simple, cheap way. Think of it like a stack of plates, where a plate represents a function's local storage. When the processor calls a function, we put a plate onto the stack; when the topmost function returns, we pop its plate off the stack and break it into a million pieces. To allocate or free local storage, we just move a pointer forward or backward (that is, push or pop a "plate").

Here's some sample code:

int add_1(int arg) {
    return arg + 1;
}

int add_2(int x) {
    int y = add_1(x);
    return add_1(y);
}

The add_1 function just adds 1 to its argument. When we call add_2, it calls add_1; stores the result in y; then returns the result of calling add_1 again. Here's an outline of the call var = add_2(45), showing the stack at each step:

   | var = ??  |        | var = ??  |        | var = ??  |
   +-----------+        +-----------+        +-----------+
   |  x = 45   |        |  x = 45   |        |  x = 45   |
   |  y = ??   |        |  y = ??   |        |  y = 46   |
-> +-----------+        +-----------+     -> +-----------+
                        | arg = 45  |
                     -> +-----------+
  (a) on entry       (b) first call to    (c) after add_1
                         add_1                returns


   | var = ??  |        | var = ??  |        | var = 47  |
   +-----------+        +-----------+     -> +-----------+
   |  x = 45   |        |  x = 45   |
   |  y = 46   |        |  y = 46   |
   +-----------+     -> +-----------+
   | arg = 46  |
-> +-----------+
(d) second call to   (e) after add_1      (f) after add_2
    add_1                returns              returns

Notice how each call to add_1 adds a different "plate" to the stack, and how the pointer -> shifts as different functions are called. This is much cheaper than the complex garbage collection algorithms required to manage a more general memory structure, such as a heap. And note that stacks even handle recursive functions (functions that call themselves).

Most architectures have a special register called the stack pointer that points to the current function's local storage. On the x86 this register is called %esp. Stacks tend to grow downward, like the examples we've shown above. Calling a new function reduces the stack pointer, and returning from a function increases the stack pointer. This is because it is more intuitive to refer to local variables with positive offsets. (For example, in part (a) of the figure, the variable x might be located at address %esp + 4.) The x86 has many instructions that refer implicitly to the stack, including call and ret. The call instruction pushes the address of the next instruction onto the stack, then jumps to a particular address. The ret instruction undoes the effect of call: it pops an address from the stack, then jumps to that instruction.
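
The behavior of call and ret described above can be modeled with a toy machine in C. This is only a sketch: struct toy_cpu, do_call, and do_ret are hypothetical names, the stack pointer is an array index rather than a real address, and a real x86 does all of this in hardware.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* A toy machine: just a stack, a stack pointer, and an instruction pointer. */
struct toy_cpu {
    uint32_t stack[STACK_WORDS];
    uint32_t sp;    /* index of the last pushed word; STACK_WORDS when empty */
    uint32_t ip;    /* address of the next instruction to execute */
};

void cpu_init(struct toy_cpu *cpu, uint32_t start_ip)
{
    cpu->sp = STACK_WORDS;  /* the stack starts at the top and grows downward */
    cpu->ip = start_ip;
}

/* call: push the address of the next instruction, then jump to 'target'. */
void do_call(struct toy_cpu *cpu, uint32_t next_ip, uint32_t target)
{
    assert(cpu->sp > 0);            /* otherwise the stack would overflow */
    cpu->stack[--cpu->sp] = next_ip;
    cpu->ip = target;
}

/* ret: pop an address from the stack, then jump to it. */
void do_ret(struct toy_cpu *cpu)
{
    assert(cpu->sp < STACK_WORDS);  /* there must be a return address to pop */
    cpu->ip = cpu->stack[cpu->sp++];
}
```

Two nested do_call operations followed by two do_ret operations return the toy machine to the instruction after the first call, with the stack pointer back where it started.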

So we need to allocate a stack for each process. Where? Well, stacks grow downward; so we simply allocate the printer's stack at the top of its memory block (0x200000), and the checker's stack at the top of its memory block (0x400000). This is good enough.
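
The arithmetic behind those stack addresses is simple enough to write down. The sketch below assumes each process owns a one-megabyte block, as the addresses in the text suggest; BLOCK_SIZE and initial_stack_top are hypothetical names, not identifiers from the WeensyOS source.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 0x100000u    /* one megabyte per process (assumed) */

/* Initial stack pointer for a process whose memory block starts at 'base'.
   The stack grows downward, so it starts at the top of the block. */
uint32_t initial_stack_top(uint32_t base)
{
    assert(base % BLOCK_SIZE == 0);     /* blocks sit on megabyte boundaries */
    return base + BLOCK_SIZE;
}
```

For the printer, loaded at 0x100000, initial_stack_top(0x100000) gives 0x200000. The process's code sits at the bottom of the block and its stack grows down from the top, so the two collide only if the stack grows very deep.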

Note that some programming languages, such as Scheme, let functions return more than once. These languages can't be implemented with a simple stack. There are exceptions to the rule even in C -- the setjmp function, for example, can return more than once -- but those exceptions are carefully tailored to be compatible with a stack.
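
Here is a small sketch of setjmp returning twice, using the standard C setjmp and longjmp functions. The helper names jump_back and count_setjmp_returns are made up for this example. It stays compatible with the stack because longjmp only jumps back into a function that is still active.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;

/* Jumps back into count_setjmp_returns; never returns normally. */
static void jump_back(void)
{
    longjmp(env, 1);
}

/* Returns how many times the single setjmp call below returned. */
int count_setjmp_returns(void)
{
    volatile int returns = 0;   /* volatile: modified between setjmp and longjmp */

    if (setjmp(env) == 0) {
        returns++;              /* first return: directly from setjmp */
        jump_back();            /* makes setjmp return a second time */
    } else {
        returns++;              /* second return: via longjmp */
    }
    assert(returns == 2);
    return returns;
}
```

The counter is declared volatile because C leaves the values of non-volatile locals changed between setjmp and longjmp indeterminate after the jump.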

Problems

First, run the PasswdOS to verify that it works:

% gmake run-passwdos
...
<bochs:1> c

Exercise 5. Change both the printer and the checker to use password "111!" instead of the current "111". Verify that the checker still prints "Y".

Despite all this, there is one serious problem with our "password checking OS": the password printer and the password checker share the same memory space, without isolation. This means that the printer can cheat -- either by snooping into the checker's memory space, or more actively, by messing around with the checker's memory space. Modern operating systems solve this issue by providing virtual memory protection -- processes can't snoop on each other's memory without permission. But in the next problem, you'll think naughty and cheat.

Exercise 6. Make the printer cheat. The printer should cause the checker to output "Y", even though the printer "knows" the wrong password. To test this, change the printer's code to use password "666", but do NOT change the checker's code or its good_password (which should be "111!"). Despite this, the checker should print "Y" when you run the PasswdOS.

There are many ways for the printer to cheat. Here are just a few:

Read the good password out of the checker's memory space, and use that password.
Modify the checker's good password.
Modify the checker's code to refer to a different "good" password. The existing good_password will remain unchanged; it just won't be used.
Modify the checker's code so that it prints "Y" no matter what password it gets.
Modify the checker's stack and/or process table entry so that it skips reading the password at all, and instead jumps straight to printing "Y".

And, of course, there are others (some simpler than any of these). Choose whichever one you like and implement it. Be creative!

Note that the printer doesn't necessarily have to print the incorrect password to the checker. But you must not hard-code the correct password into the printer in any way. If you change the checker's good_password to another value, the printer should still cheat its way to a "Y" answer.

Many cheating mechanisms require that the printer know where the checker's code and data are stored in memory. If you wanted to blow my (and your) mind, the printer could reverse-engineer this information by tracing the checker's executable instructions! But it's much easier to just look this information up. Take a look at obj/passwdos-c.sym and obj/passwdos-c.asm. The .sym file tells you exactly which memory addresses are used for each program object. For example:

00200000 T getch
00200034 T strcmp
00200064 T write_char
002000bc T start
00200118 T _yield
00200128 t _yield2
00210000 D good_password
00210004 A __bss_start
00210004 A _edata
00210004 A _end

This says that the "good_password" symbol is located at address 0x210000. The .asm file shows you the process's assembly language code, along with memory addresses, interleaved with the C code that corresponds. For example:

        *PIPEBUF = 0;
  200028:       c6 05 00 00 30 00 00    movb   $0x0,0x300000

This says exactly how the C line ("*PIPEBUF = 0;") is implemented in machine language. You might use this file, combined with experimental changes to passwdos-c.c itself, to track down exactly which instructions the printer should change, and how.
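
When reading the .asm file, it helps to know that x86 stores multi-byte values least significant byte first. In the instruction bytes above (c6 05 00 00 30 00 00), the first two bytes select the operation and addressing mode, the next four are the address 0x300000 in little-endian order, and the last is the value 0 being stored. The helper below is a hypothetical sketch for decoding such an embedded address; decode_le32 is not part of WeensyOS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value, such as an address embedded in an
   x86 instruction, from the four bytes starting at 'p'. */
uint32_t decode_le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t) p[0]
         | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[3] << 24);
}
```

Applied to the four bytes 00 00 30 00 that start at offset 2 of the instruction, decode_le32 yields 0x300000, matching the operand shown in the disassembly.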

The following functions might be useful, if you haven't done much of this kind of C programming before. You could change them in the obvious ways to read and write integers and even bigger structures.

#include "types.h"      // for the definition of 'uint32_t'

unsigned char
read_byte(uint32_t address)
    // Returns the byte stored in memory at the given 'address'.
{
    return *((unsigned char *) address);
}

void
write_byte(uint32_t address, unsigned char byte)
    // Writes 'byte' into memory at the given 'address'.
{
    *((unsigned char *) address) = byte;
}

// Example (call this from inside a function body):
write_byte(0x7C00, 255);  // will write 255 into address 0x7C00
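
One way the byte helpers above might be extended to 32-bit integers is sketched below. The names read_u32 and write_u32 are hypothetical. The sketch takes a uintptr_t so it also compiles on a 64-bit development machine; inside WeensyOS you would likely keep the uint32_t address type from types.h and drop the assert. It requires 4-byte alignment to stay within portable C, even though the x86 itself tolerates unaligned accesses.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the 32-bit integer stored in memory at the given 'address'. */
uint32_t read_u32(uintptr_t address)
{
    assert(address % sizeof(uint32_t) == 0);    /* keep accesses aligned */
    return *((volatile uint32_t *) address);
}

/* Writes the 32-bit integer 'value' into memory at the given 'address'. */
void write_u32(uintptr_t address, uint32_t value)
{
    assert(address % sizeof(uint32_t) == 0);    /* keep accesses aligned */
    *((volatile uint32_t *) address) = value;
}
```

The volatile qualifier asks the compiler to perform each access as written, which matters when the memory belongs to another process or to hardware.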

Exercise 6 will make Exercise 5 somewhat redundant (except that the checker will nominally check for "111!"); don't worry about it.

This completes the minilab.