During Lecture 3, we discussed how to build an operating system that does nothing but print "111" on the console; we used it to learn about hardware interaction and bootstrapping. Well, that kind of teensy-weensy operating system isn't just a thought experiment. It is real.
weensyos1.tar.gz: Source code for WeensyOS 1.0, which builds these two hard disk images:
print111os.img: Print111OS, which prints "111!" to the console.
passwdos.img: PasswdOS, which has two cooperative processes: the first writes a password to the second, which checks that password.
In this simple problem set, you'll browse, partially understand, and change these tiny operating systems.
You will electronically hand in code and a small writeup containing answers to the numbered exercises.
The problem set code, weensyos1.tar.gz, unpacks into a directory called weensyos1. (We explain how to unpack it below.) You'll modify the code in this directory, and add a text file with your answers to the numbered exercises. When you're done, run the command gmake tarball. This should create a file named weensyos1-yourusername.tar.gz. You'll turn in this file to CourseWeb.
Answers to the numbered exercises should be in a file named answers.txt, answers.html, or answers.pdf. Text files are strongly preferred. No Microsoft Word documents (or other binary format, except for PDF) will be accepted! For coding exercises (Exercises 2, 3, 5, and 6), it's OK for answers.txt to just refer to your code (as long as you comment your code).
To review:
1. Download and unpack weensyos1.tar.gz.
2. Modify the code in the weensyos1 directory.
3. Add an answers.txt file (or answers.html or answers.pdf) in that weensyos1 directory.
4. Run gmake tarball from the weensyos1 directory. This will create a file named weensyos1-yourusername.tar.gz.
5. Turn in the weensyos1-yourusername.tar.gz file to CourseWeb.
You could take one of the disk image files this minilab builds, write it to your laptop's hard drive, and boot up your operating system directly if you wanted! However, it's much easier to work with a virtual machine or PC emulator.
An emulator mimics, or emulates, the behavior of a full hardware platform. A PC emulator acts like a Pentium-class PC: it emulates the execution of Intel x86 instructions, and the behavior of other PC hardware. For example, it can treat a normal file in your home directory as an emulated hard disk; when the program inside the emulator reads a sector from the disk, the emulator simply reads 512 bytes from the file. PC emulators are much slower than real hardware, since they do all of the regular CPU's job in software -- not to mention the disk controller's job, the console's job, and so forth. However, debugging with an emulator is a whole lot friendlier, and you can't screw up your machine!
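To make the emulated hard disk concrete, here is a minimal sketch of how an emulator might service a sector read from an ordinary file. The name read_sector and the SECTOR_SIZE constant are our own illustrative choices; they are not taken from Bochs or QEMU, whose real disk code is far more involved.

```c
#include <stdio.h>
#include <stdint.h>

#define SECTOR_SIZE 512  /* bytes per emulated disk sector */

/* Hypothetical helper: copy sector number 'sector' of the disk image
 * 'img' into 'buf', which must hold SECTOR_SIZE bytes.
 * Returns 0 on success, -1 on a seek error or a short read. */
int read_sector(FILE *img, uint32_t sector, unsigned char *buf)
{
    /* Sector N starts N * 512 bytes into the file. */
    long offset = (long) sector * SECTOR_SIZE;

    if (fseek(img, offset, SEEK_SET) != 0)
        return -1;
    if (fread(buf, 1, SECTOR_SIZE, img) != SECTOR_SIZE)
        return -1;
    return 0;
}
```

With an image opened by fopen("print111os.img", "rb"), calling read_sector(img, 0, buf) would fetch the first 512 bytes of the file, which is where the boot sector lives.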
We've used two PC emulators. The Bochs emulator has pretty nice debugging support. The QEMU package is fast and sleek, but it might be too fast for some of our purposes. If you work on your own machine, try QEMU. If you're interested in working from home, you can download the source for QEMU and/or Bochs and install your own copy using these instructions. Precompiled binaries for Windows and Mac OS X are available too.
You will also need a copy of GCC that compiles code for an x86 ELF target. ELF, or Executable and Linkable Format, is a particular format for storing machine language programs on disk. Recent Linux PCs have the right compiler already set up. However, if you want to work on other platforms, or on Windows, you'll need a cross-compiler: a version of GCC that runs on your machine, but generates binaries for WeensyOS.
We've set up all the required tools on the machines in the Linux lab, and the Solaris machines on SEASnet. In the Linux lab, no special setup is required. On SEASnet, you need to set your environment to use our tools.
Read the minilab tools page and set up your environment appropriately.
Now that you've got all the software set up (or you've just decided to use the Linux lab), it's time to download WeensyOS and take it out for a spin.
Download and unpack the source for weensyos1 using the following command.
% gzcat weensyos1.tar.gz | tar xf -
(On Linux, you can just say "gtar xzf weensyos1.tar.gz".) This should unpack the tarball into the weensyos1 directory.
% ls weensyos1
COPYRIGHT elf.h passwdos-boot.c passwdos.h
GNUmakefile mergedep.pl passwdos-c.c print111os-boot.c
bootstart.S mkbootdisk.pl passwdos-p.c types.h
conf mmu.h passwdos-yield.S x86.h
%
Now that you've unpacked the source, it's time to give the OSes a whirl.
Change into the weensyos1 directory and run the gmake program.
The WeensyOS Makefile (well, GNUmakefile) builds two hard disk images when it runs. The first image, print111os.img, contains the Print111OS; it is built from the bootstrapping code in bootstart.S and the actual Print-111 program, which is in print111os-boot.c. The second image, passwdos.img, contains the PasswdOS. Since PasswdOS contains two processes, it is a bit more complex than Print111OS; five files are compiled to create passwdos.img, namely bootstart.S, passwdos-boot.c (a simple boot loader), passwdos-p.c (the password printer), passwdos-c.c (the password checker), and passwdos-yield.S (some assembly language glue code for transferring control between the processes).
Gmake's output should look something like this:
% gmake
+ as bootstart.S
+ cc print111os-boot.c
+ ld obj/bootstart.o
+ mk print111os.img
+ cc passwdos-boot.c
+ ld obj/bootstart.o
+ cc passwdos-p.c
+ as passwdos-yield.S
+ ld obj/passwdos-p.o
+ cc passwdos-c.c
+ ld obj/passwdos-c.o
+ mk passwdos.img
%
Now that you've built the OS disk images, it's time to run them! We've made it very easy to boot a given disk image; just run this command:
% gmake run-print111os
This will start up Bochs, but not yet the emulated computer. (This is because Bochs is giving you a chance to set breakpoints on the emulated machine.) To start the emulated computer, type "c":
<bochs:1> c
After a moment you should see a window like this!
To quit Bochs, click the "Power" button in the upper-right corner. (Very funny, Bochs.) Then run the PasswdOS with gmake run-passwdos; it should print out "Y" instead of "111!".
QEMU Note. If you're running QEMU instead of Bochs, run the Print111OS with qemu -hda print111os.img, and the PasswdOS with qemu -hda passwdos.img. (The -hda option stands for Hard Disk A.) QEMU doesn't have a funky power button; just hit Control-C in the terminal to quit.
You're now ready to start learning about the OS code!
The natural place to start is the first code that gets run. That's the boot loader -- a small piece of code, residing in the hard disk's first sector, that's responsible for loading everything else. As we saw in class, each PC contains a little bit of firmware code, burned in to stable memory (either ROM or flash memory). This code is responsible for initializing the computer just enough so that other software -- namely, the OS itself -- can start. The firmware is called the BIOS, which stands for Basic Input/Output System. How does the BIOS bootstrap the operating system? Simple: The BIOS searches disks attached to a system for a valid boot sector. This is a sector ending with a two-byte magic number (see the code to find out which number). Once a boot sector is found, the BIOS reads it into memory at address 0x7C00, then jumps to address 0x7C00. This program is expected to be a boot loader, which takes whatever steps are necessary to load the rest of the kernel; and the BIOS is no longer in control.
In WeensyOS, we arrange the boot sector so that control is transferred to start in bootstart.S. Though simple, this code needs to jump through a couple hoops, since the BIOS tries to be compatible with operating systems written for 20+-year-old 8086 processors. Once it's done, it passes control on to another routine -- namely, bootmain.
Read the comment at the head of bootstart.S. This assembly-language file starts the boot process. Don't worry about the assembly language itself!! Just read and understand the first comment.
Understand the comments and the code in print111os-boot.c.
Exercise 1. Answer the following question: Why are we lucky that the print111os-boot.c program is well under 510 bytes in length?
Exercise 2. Change the print111os-boot.c program so that it fills up the console with stars, not spaces. Run and test your operating system.
Exercise 3. Change the print111os-boot.c program so that it prints "111!" in red-on-white text. Run and test your operating system.
Hint: Check out this document on the "VGA Programming Model". In particular, search for "Figure 12" and read the text that follows.
The code you hand in should follow both Exercise 2 and Exercise 3, so it should print 111! in red-on-white against a background of stars.
Exercise 4. print111os-boot.c accesses the video hardware using both memory-mapped I/O and programmed I/O (I/O port instructions). Which hardware feature corresponds to each method?
The Print111 version of WeensyOS is really small: it supports only one process! There's nothing wrong with such a simple operating system; in fact, operating systems for embedded processors, like the things that run in your toaster or your car, often support exactly one process. But of course, it is far more powerful to support multiple processes at a time. This makes more efficient use of the computer's resources; for example, while one process waits for data to be retrieved from a disk, another process can go ahead with its work.
The PasswdOS is about the simplest multi-process operating system you can imagine. It supports two processes, a password printer and a password checker. The printer writes a password to the checker. The checker compares that password to the "correct" password; if they agree, it prints "Y" to the screen, and if they don't, it prints "N".
The processes in PasswdOS use cooperative multitasking. That is, processes give up control voluntarily. If one of the processes went into an infinite loop, the other would never run again, and the machine would effectively stop. This contrasts with preemptive multitasking, in which a trusted OS component (the kernel) can force an uncooperative process to give up control. Preemptive multitasking is more robust than cooperative multitasking, meaning it's more resilient to errors. However, it is slightly more complex, and forced switches between processes add overhead that cooperative multitasking avoids. All modern PC-class operating systems use preemptive multitasking, but don't forget that cooperative multitasking exists: we will see how to use it in your own programs.
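To see what "giving up control voluntarily" can look like in code, here is a minimal sketch of a cooperative scheduler in plain C. It is not how PasswdOS does it: PasswdOS transfers control between its two processes with the assembly glue in passwdos-yield.S, while this sketch models a yield as an ordinary function return. The names task, run_cooperative, and countdown_step are made up for illustration.

```c
#include <stddef.h>

/* A task does one small unit of work each time it is called, then
 * returns; returning is how it yields the CPU voluntarily.
 * It returns 1 while it still has work to do and 0 once finished. */
typedef int (*task_fn)(void *state);

struct task {
    task_fn step;   /* runs until the task's next voluntary yield */
    void *state;    /* the task's private data */
    int done;       /* nonzero once step() has returned 0 */
};

/* Hypothetical round-robin scheduler: keep giving each unfinished task
 * a turn until all have finished.  Nothing here can interrupt step();
 * if a task loops forever inside step(), no other task runs again. */
void run_cooperative(struct task *tasks, size_t ntasks)
{
    size_t remaining = 0;
    for (size_t i = 0; i < ntasks; i++)
        if (!tasks[i].done)
            remaining++;

    while (remaining > 0) {
        for (size_t i = 0; i < ntasks; i++) {
            if (tasks[i].done)
                continue;
            if (!tasks[i].step(tasks[i].state)) {
                tasks[i].done = 1;
                remaining--;
            }
        }
    }
}

/* Example task: count *state down toward zero, one step per turn. */
int countdown_step(void *state)
{
    int *n = state;
    if (*n > 0)
        (*n)--;
    return *n > 0;
}
```

Two countdown tasks started at 3 and 1 take turns until both reach zero. Replace either step function with one that never returns and the scheduler hangs, which is exactly the weakness of cooperative multitasking described above.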
Because PasswdOS uses cooperative multitasking, it does not need an operating system kernel! A kernel is required in a system with preemptive multitasking; there has to be a trusted component that takes control and shuts down runaway processes. But not here. PasswdOS does not, in fact, have a kernel; after the OS boots, it runs just the two processes' code (the printer and the checker).
Of course, we still have to boot the OS. PasswdOS uses a separate bootloader, just like today's PC OSes. That bootloader simply loads the printer and the checker into memory, then transfers control to the printer. If you're curious, see passwdos-boot.c.
The memory layout for PasswdOS looks like this. (Bar widths are not to scale.)
PasswdOS uses a single memory space. Its processes -- the printer and the checker -- are not isolated; they must share the same memory space. Thus, they must load at different addresses. We load the printer at address 0x100000, or 1 megabyte (the beginning of the PC's "extended memory"*), and the checker at 0x300000, or 3 megabytes. Just to keep things simple, we load different objects at megabyte boundaries; but most of the megabyte of memory between 0x100000 and 0x200000 is unused, since the printer needs only 156 bytes. A real OS would avoid wasting memory like this.
The printer and checker programs are machine code, of course, and machine code can contain memory addresses. So we must ensure that these programs use different regions of memory, or they'd clobber each other. This is the linker's job. Linking is the last stage in compiling a program.** The compiler and assembler turn source code into object files, which contain machine code. The processor can't interpret object files, though: most programs are built from multiple pieces, including source files plus some libraries, and no one piece can run on its own. So the linker combines all the important object files and libraries into an executable that the processor can run. This requires making sure that jumps from one object file to another, and references to different functions and data, use the right addresses. So the linker must rearrange the object files in a process called relocation.
The PasswdOS uses relocation to enforce the memory layout above, and thus to make sure that no two processes collide. We tell the linker explicitly to relocate the printer to address 0x100000, and the checker to address 0x300000. Check out the GNUmakefile if you want to see how.
(*The term "extended memory" is used because the original 8086 processor could not access addresses above 1 megabyte.)
(**The linker isn't the last step in modern operating systems that support shared libraries. Another relocation step happens at load time, when the process is loaded into memory. On Linux, the dynamic linker/loader is in charge of this: man ld.so for more info.)
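The core idea of relocation can be shown with a toy model. This is only a sketch under simplifying assumptions: a real linker works on ELF object files with symbol tables and several relocation types, while here a "program" is just an array of 32-bit words plus a list saying which words hold addresses. The names toy_object and toy_relocate are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Toy model of an object file (not the format a real linker uses).
 * Some words in the image are addresses, stored as offsets from the
 * start of the program until the load address is known. */
struct toy_object {
    uint32_t *words;       /* program image, one 32-bit word per slot */
    size_t nwords;
    const size_t *reloc;   /* indexes of the words that hold addresses */
    size_t nreloc;
};

/* Relocate 'obj' so that it is correct when loaded at address 'base':
 * each recorded address word changes from an offset into an absolute
 * address by adding the load address.  Other words are left alone. */
void toy_relocate(struct toy_object *obj, uint32_t base)
{
    for (size_t i = 0; i < obj->nreloc; i++) {
        size_t slot = obj->reloc[i];
        if (slot < obj->nwords)        /* ignore malformed entries */
            obj->words[slot] += base;
    }
}
```

Relocating one copy of an image to 0x100000 and another to a different base gives two programs whose internal addresses never overlap, which is the property PasswdOS relies on.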
Of course, allocating space for the code is necessary, but not sufficient. We still need a place for data local to each process -- the process's current instruction pointer, say, or the arguments to its functions. There are several ways to implement process-local storage, but modern architectures and programming languages have special support for one particularly useful arrangement: the stack.
Stacks are good at handling nested procedure calls. Each active procedure call needs to allocate some local storage, to store local variables, parameters, and such. But because of the semantics of procedures -- because each procedure returns once -- the local storage can be thrown away (recycled) as soon as the procedure returns. This lets us allocate local storage in a very simple, cheap way. Think of it like a stack of plates, where a plate represents a function's local storage. When the processor calls a function, we put a plate onto the stack; when the topmost function returns, we pop its plate off the stack and break it into a million pieces. To allocate or free local storage, we just move a pointer forward or backward (that is, push or pop a "plate").
Here's some sample code:
int add_1(int arg)
{
    return arg + 1;
}

int add_2(int x)
{
    int y = add_1(x);
    return add_1(y);
}
The add_1 function just adds 1 to its argument. When we call add_2, it calls add_1; stores the result in y; then returns the result of calling add_1 again. Here's an outline of the call var = add_2(45), showing the stack at each step:
   | var = ??  |         | var = ??  |         | var = ??  |
   +-----------+         +-----------+         +-----------+
   | x = 45    |         | x = 45    |         | x = 45    |
   | y = ??    |         | y = ??    |         | y = 46    |
-> +-----------+         +-----------+      -> +-----------+
                         | arg = 45  |
                      -> +-----------+
   (a) on entry          (b) first call to     (c) after add_1
                             add_1                 returns

   | var = ??  |         | var = ??  |         | var = 47  |
   +-----------+         +-----------+      -> +-----------+
   | x = 45    |         | x = 45    |
   | y = 46    |         | y = 46    |
   +-----------+      -> +-----------+
   | arg = 46  |
-> +-----------+
   (d) second call to    (e) after add_1       (f) after add_2
       add_1                 returns               returns
Notice how each call to add_1 adds a different "plate" to the stack, and how the pointer -> shifts as different functions are called. This is much cheaper than the complex garbage collection algorithms required to manage a more general memory structure, such as a heap. And note that stacks even handle recursive functions (functions that call themselves).
Most architectures have a special register called the stack pointer that points to the current function's local storage. On the x86 this register is called %esp. Stacks tend to grow downward, like the examples we've shown above. Calling a new function reduces the stack pointer, and returning from a function increases the stack pointer. This is because it is more intuitive to refer to local variables with positive offsets. (For example, in part (a) of the figure, the variable x might be located at address %esp + 4.)
The x86 has many instructions that refer implicitly to the stack, including call and ret. The call instruction pushes the address of the next instruction onto the stack, then jumps to a particular address. The ret instruction undoes the effect of call: it pops an address from the stack, then jumps to that address.
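Here is a toy model of those stack operations, written in C so you can step through it. It is a sketch only: a real x86 keeps the stack pointer in %esp and the stack in ordinary memory, while this model uses an array index and a small fixed array with no overflow checks. The toy_ names are invented for this example.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Toy machine state (illustrative only). */
struct toy_cpu {
    uint32_t stack[STACK_WORDS];
    uint32_t sp;   /* index of the topmost used slot; STACK_WORDS if empty */
    uint32_t ip;   /* address of the next instruction to execute */
};

void toy_init(struct toy_cpu *cpu, uint32_t entry)
{
    cpu->sp = STACK_WORDS;   /* empty stack: the pointer starts at the top */
    cpu->ip = entry;
}

/* Push moves the stack pointer DOWN, then stores the value. */
void toy_push(struct toy_cpu *cpu, uint32_t value)
{
    cpu->stack[--cpu->sp] = value;
}

/* Pop reads the value, then moves the stack pointer back UP. */
uint32_t toy_pop(struct toy_cpu *cpu)
{
    return cpu->stack[cpu->sp++];
}

/* call: push the address of the next instruction, then jump. */
void toy_call(struct toy_cpu *cpu, uint32_t target, uint32_t next_instr)
{
    toy_push(cpu, next_instr);
    cpu->ip = target;
}

/* ret: pop the saved address and jump back to it. */
void toy_ret(struct toy_cpu *cpu)
{
    cpu->ip = toy_pop(cpu);
}
```

Nested calls push return addresses at successively lower indexes, and each toy_ret unwinds exactly one of them, most recent first.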
So we need to allocate a stack for each process. Where? Well, stacks grow downward; so we simply allocate the printer's stack at the top of its memory block (0x200000), and the checker's stack at the top of its memory block (0x400000). This is good enough.
Note that some programming languages, such as Scheme, let functions return more than once. These languages can't be implemented with a simple stack. There are exceptions to the rule even in C -- the setjmp function, for example, can return more than once -- but those exceptions are carefully tailored to be compatible with a stack.
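If you have never seen setjmp return twice, this small standard C sketch shows it. The function names count_returns and jump_back are our own. Note the restriction that keeps this stack-friendly: longjmp may only jump back into a function that has not yet returned, so the target's stack frame is still in place.

```c
#include <setjmp.h>

static jmp_buf checkpoint;

/* Jumps back to the setjmp() call in count_returns(); never returns. */
static void jump_back(void)
{
    longjmp(checkpoint, 1);
}

/* Returns how many times the single setjmp() call below returned. */
int count_returns(void)
{
    /* volatile: the variable is modified between setjmp and longjmp,
     * and we read it afterwards. */
    volatile int returns = 0;

    if (setjmp(checkpoint) == 0) {
        /* First return: setjmp() itself returned 0. */
        returns++;
        jump_back();   /* control reappears at the setjmp() above */
    } else {
        /* Second return: caused by longjmp(), with a nonzero value. */
        returns++;
    }
    return returns;
}
```

Because jump_back is called from inside count_returns, the frame that called setjmp is still live when longjmp runs; the jump simply discards the frames above it.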
First, run the PasswdOS to verify that it works:
% gmake run-passwdos
...
<bochs:1> c
Exercise 5. Change both the printer and the checker to use password "111!" instead of the current "111". Verify that the checker still prints "Y".
Despite all this, there is one serious problem with our "password checking OS": the password printer and the password checker share the same memory space, without isolation. This means that the printer can cheat -- either by snooping into the checker's memory space, or more actively, by messing around with the checker's memory space. Modern operating systems solve this issue by providing virtual memory protection -- processes can't snoop on each other's memory without permission. But in the next problem, you'll think naughty and cheat.
Exercise 6. Change the printer's code so that the checker prints "Y", even though the printer "knows" the wrong password. To test this, change the printer's code to use password "666", but do NOT change the checker's code or its good_password (which should be "111!"). Despite this, the checker should print "Y" when you run the PasswdOS.
There are many ways for the printer to cheat. Here are just a few:
[...] good_password will remain unchanged; it just won't be used.
[...] "Y" no matter what password it gets.
[...] "Y".
And, of course, there are others (some simpler than any of these). Choose whichever one you like and implement it. Be creative!
Note that the printer doesn't necessarily have to print the incorrect password to the checker. But you must not hard code the correct password into the printer in any way. If you change the checker's good_password to another value, the printer should still cheat its way to a "Y" answer.
Many cheating mechanisms require that the printer know where the checker's code and data are stored in memory. If you wanted to blow my (and your) mind, the printer could reverse-engineer this information by tracing the checker's executable instructions! But it's much easier to just look this information up. Take a look at obj/passwdos-c.sym and obj/passwdos-c.asm. The .sym file tells you exactly which memory addresses are used for each program object. For example:
00200000 T getch
00200034 T strcmp
00200064 T write_char
002000bc T start
00200118 T _yield
00200128 t _yield2
00210000 D good_password
00210004 A __bss_start
00210004 A _edata
00210004 A _end
This says that the "good_password" symbol is located at address 0x210000. The .asm file shows you the process's assembly language code, along with memory addresses, interleaved with the C code that corresponds. For example:
*PIPEBUF = 0;
  200028:   c6 05 00 00 30 00 00    movb   $0x0,0x300000
This says exactly how the C line ("*PIPEBUF = 0;") is implemented in machine language. You might use this file, combined with experimental changes to passwdos-c.c itself, to track down exactly which instructions the printer should change, and how.
The following functions might be useful, if you haven't done much of this kind of C programming before. You could change them in the obvious ways to read and write integers and even bigger structures.
#include "types.h"   // for the definition of 'uint32_t'

unsigned char read_byte(uint32_t address)
// Returns the byte stored in memory at the given 'address'.
{
    return *((unsigned char *) address);
}

void write_byte(uint32_t address, unsigned char byte)
// Writes 'byte' into memory at the given 'address'.
{
    *((unsigned char *) address) = byte;
}

// Example:
//     write_byte(0x7C00, 255);   // will write 255 into address 0x7C00
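As one example of such a change, here is a sketch of 32-bit versions. The names read_word and write_word are ours, not part of the WeensyOS source. The sketch takes the address as a uintptr_t so that it also compiles on a 64-bit machine for testing; on WeensyOS's 32-bit x86 target an address fits in a uint32_t, as in the functions above.

```c
#include <stdint.h>

/* Hypothetical 32-bit counterpart of read_byte: returns the four bytes
 * stored in memory starting at 'address', as one integer. */
uint32_t read_word(uintptr_t address)
{
    return *((volatile uint32_t *) address);
}

/* Hypothetical 32-bit counterpart of write_byte: stores 'word' into the
 * four bytes of memory starting at 'address'. */
void write_word(uintptr_t address, uint32_t word)
{
    *((volatile uint32_t *) address) = word;
}
```

Remember that the x86 is little-endian: after write_word(a, 0x12345678), the byte at address a is 0x78 and the byte at a + 3 is 0x12.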
Exercise 6 will make Exercise 5 somewhat redundant (except that the checker will nominally check for "111!"); don't worry about it.
This completes the minilab.