- schedule for the rest of the term is up, comments welcome
- one challenge problem on next lab, turn in answers to questions

Plan 9 revisited

- Three principles
  -- Resources are named and accessed like files in a hierarchical FS
  -- Standard protocol for accessing files
  -- *Single private hierarchical file namespace*
- Goal: Seamless interoperation
- 8 1/2 window system
  -- X windows: network connections are "displays"; windows are integers
     ... You can read the whole window tree from the root!
     ... "Mechanism, not policy": considered a flexibility advantage, led to human interface mess
  -- 8 1/2: serves, to each window, a set of files
     ... /dev/cons for text files and to read from the keyboard
     ... /dev/mouse to read the mouse
     ... /dev/bitblt for bitmapped graphics
     ... Advantages/disadvantages? Private windows (more secure); flexibility (needed for e.g. xwrits) breaks the clean model
  -- Hold key
     ... Toggling once suspends client reads from the window, toggling again resumes normal reads
     ... Press the key, edit mail messages, press the key again
     ... No need for a separate editor
  -- exportfs
     ... 9P <=> system calls
     ... "Mount" exportfs onto a part of the namespace ==> remote file service
     ... int mount(int fd, int authfd, char* dst, int flag, char* attachname)
     ... int bind(char* src, char* dst, int flag)
     ... int unmount(char* src, char* dst)
     ... MREPL MBEFORE MAFTER MCREATE
  -- The import command
     ... Connect to a [new] exportfs on a remote machine, mount it
  -- The cpu command
     ... Start an exportfs HERE, and log in to a remote machine
     ... Export /dev to the remote machine
     ... All local device files visible remotely
  -- Import and cpu are duals
- Microbenchmarks (100MHz MIPS R4400, IRIX 5.3, 1Mb L2 cache)

                      Plan 9     IRIX
    Context switch    39 us      150 us
    System call       6 us       36 us
    Light fork        1300 us    2200 us
    Pipe latency      110 us     200 us
    Pipe BW (KB/s)    11678      14545

Exokernel

- Surprise!
  You're writing an exokernel
- One principle
  -- Separate resource protection from management
- Or, four principles
  -- Securely expose hardware
     ... Or, Avoid resource management: only manage resources to the extent required by protection
  -- Expose allocation
  -- Expose physical names
     ... Efficient (remove layer of indirection), encode important attributes
  -- Expose revocation
- Goal: Maximum application flexibility and performance
- Compare those principles & goals to those of Plan 9!
- Central challenge: flexibility + fault isolation
  -- Tracking resource ownership
  -- Revoking access to resources
  -- Protecting apps from each other
- Secure binding
  -- Fancy name for a simple idea: check once, use many
- Example: memory system
  -- The MIPS architecture they're using has a TLB, not a hardware-defined page table like x86
     ... Page tables defined in software
  -- Q: What would a secure binding be in this context?
     ... TLB entries
     ... Check on insertion, hardware lets you use indefinitely
  -- What would be the "exokernel" way of implementing secure bindings here?
     ... Answer: system calls that allow access to the TLB
     ... Supply capability for a physpage & you get to map it
     ... Why are capabilities important? Why not user ID/process ID? ==> Avoid encoding policy
     ... How to avoid kernel crossings? ==> Large software TLB cache
  -- How do you allocate a page?
     ... Allocate a physical page, get R/W capabilities
     ... Kernel records owner (capability [?]) and R/W capabilities
     ... Owner can change capabilities or deallocate page
  -- Why deallocate a page?
     ... Overuse/contention
     ... Multi-stage revocation protocol
     ... 1. Revoke something
     ... 2. Revoke something in < T seconds or face consequences
     ... 3. I have revoked something for you, I'll record it so you can see later what it was
     ... Visible revocation, not the usual invisible revocation
     ... Can 3. revoke anything? ==> essentially like killing the app! Each app gets a couple pages that are pinned.
     ... What if you need to revoke even that?
         ==> 4. Process must submit self to "Swap server" or be killed
  -- How to handle a TLB miss? [SKIPPABLE]
     ... If in standard user segment, go to application
     ... If in potentially-pinned segment, check for pinned mapping; if so, install and return; otherwise, go to application
     ... In application, look up virtaddr; if not allowed, segfault; otherwise, syscall(install TLB entry)
     ... Aegis checks args for rights; if OK, install
     ... application continues.
  -- How does this model change on x86?
     ... x86 defines the page table structure, TLB refills handled in hardware
     ... Applications must call the kernel to modify the page table
     ... But system calls can expose hardware capabilities (MMU protection bits e.g.) and kernel data structures (free lists, inverse mappings)
     ... Paging handled by user-level applications
     ... Simpler than Aegis design, which required careful coding, since the TLB refill handler might run in unusual situations, for example when none of the process's memory was mapped!
- How is the CPU multiplexed?
  -- Time slices
     ... Partitioned at clock granularity, scheduled round-robin
     ... You're asked to give up your scheduling at the timeslice boundary via a context switch handler
     ... So you are responsible for giving up your time
     ... But keep the CPU for too long and you're killed
     ... LibOSes responsible for general context switching: register saving, releasing locks, etc.
         .... Sort of a micro-optimization
  -- What can a context switch handler do?
     ... Return control to the kernel ==> next app runs, round-robin order
     ... Return control to a specific app via yield
     ... yield can be used to implement complex scheduling policies (stride scheduling in paper)
     ... note that the scheduler is shared among many apps
- Network system
  -- How to demultiplex a message sent in from outside?
     ... Look at packet header?
     ... Pass to each application?
     ... Pass to a trusted application?
  -- Solution: downloaded code
     ... In a little language that can be checked
     ... Compile to machine code for speed
     ... Check that filters don't match existing filters
     ... Is there a problem here? ==> IP fragmentation
- Exceptions
  -- Aegis
     ... saves 3 scratch registers to a fixed physmem save area
     ... Loads exception PC, bad virtaddr, cause
     ... jumps to application exception-handler PC
     ... Application just keeps running!
     ... 18 instructions
  -- x86
     ... Each environment gets a separate exception stack; exceptions are handled on that stack
- Performance
  -- Null procedure calls: (DEC5000, 25MHz) 21.3us Ultrix, 1.6/2.3us Exo: 10x
  -- Exception dispatch: (protection) 154.0us Ultrix, 1.5us Exo
  -- Protected control transfer (yield): 1.4us Aegis, 9.3 normalized us L3
  -- IPC: pipe: 199us Ultrix, 14us Aegis
  -- BUT: "Fast applications do not require good microbenchmark performance. The main benefit of an exokernel is not that it makes primitive operations efficient, but that it gives applications control over expensive operations such as I/O. It is this control that gives order of magnitude performance improvements to applications, not fast system calls. We heavily tuned Aegis to achieve excellent microbenchmark performance. Xok, on the other hand, is completely untuned. Nevertheless, applications perform well."
  -- AND: "Valuable information can be lost by implementing OS abstractions at application level." If everything uses the same LibOS, no problem; otherwise, hard to distinguish the block cache from virtual memory.
- Extensibility
  -- Follow-on paper: Application Performance and Flexibility on Exokernel Systems
  -- Particular extensions made easy by exokernels
     ... Plus XN, a mindwarper
  -- But can implement the same extensions in a conventional OS
     ... Heavier-weight, slower usually
  -- Extensions in this paper? ==> RPC in which server saves/restores regs, new page table, stride scheduler
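
Aside: a minimal sketch of stride scheduling, the policy the notes mention as something an application-level scheduler can build on top of yield. This is not code from the paper or from Aegis. The names stride_schedule, STRIDE1, and tickets are made up for illustration, and the "run" step is only recorded in a list where a real scheduler would do a directed yield. Each client gets a stride inversely proportional to its tickets, and the client with the smallest pass value runs next.

```python
import heapq

# Hypothetical constant: large so that integer strides keep enough precision.
STRIDE1 = 1 << 20


def stride_schedule(tickets, quanta):
    """Return the client chosen for each of `quanta` time slices.

    tickets: dict mapping client name -> positive integer ticket count.
    More tickets means a smaller stride, so the client runs more often.
    """
    # Heap entries are (pass, name, stride). Ties on pass break on name,
    # which keeps the schedule deterministic.
    heap = []
    for name, count in tickets.items():
        stride = STRIDE1 // count
        heap.append((stride, name, stride))  # initial pass = one stride
    heapq.heapify(heap)

    order = []
    for _ in range(quanta):
        pass_value, name, stride = heapq.heappop(heap)  # smallest pass runs
        order.append(name)  # stand-in for "yield the time slice to `name`"
        heapq.heappush(heap, (pass_value + stride, name, stride))
    return order


if __name__ == "__main__":
    print(stride_schedule({"A": 3, "B": 2, "C": 1}, 6))
```

With tickets A=3, B=2, C=1, six time slices are split 3/2/1 among A, B, C, which is the proportional share that stride scheduling is meant to provide.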