Speculative OS

Your first full assignment, due March 1, is to write a brief, speculative paper on a new “architecture” for an operating system or OS subsystem.

Your work should live somewhere in the space between research and science fiction. Aim to follow the basic outline of a research paper, minus implementation and evaluation. The architecture need not be brand new to the world, but it should be new to you.

Alternatively, you may write a brief, speculative paper that explores a problem with today’s computer systems that might benefit from a better OS or subsystem. This class focuses on operating systems, but any large software system is a valid subject: Web servers, databases, browsers, etc.

Aim for at least 2,500 words, which is roughly 3 pages of 10-point two-column text. If using LaTeX, try the acmart package; use \documentclass[sigplan]{acmart}.

The ideal paper will demonstrate grounded wild speculation, where “wild” means you brainstormed enough to come up with a biggish idea, and “grounded” means you thought through some of the consequences of that idea. It would be difficult to complete this assignment in a day: you want to give your mind time to do both the brainstorming and the grounding.

Collaboration on the brainstorming piece is welcome, but everyone writes their own paper.

Potential outline

Consider using this high-level outline as a guide for your work.

Title, possibly including the name of your system. Go broad—puns, jokes, obscure references to important books from your youth.
Abstract (optional). A couple of paragraphs that lay out the main argument for your operating system. First, give a brief motivation: a few sentences about what’s wrong with current systems, or future developments that will cause problems for current systems. Then, describe in brief the core ideas of your system.
Introduction. The first several paragraphs of a speculative operating system paper often describe the degraded dystopia into which today’s systems have sunk. Identify a problem, pain point, or issue with current systems and explain its consequences. Then describe how these issues could be solved by a different system architecture. Usually a figure will help.
Design. Explain at a high level how your system architecture or subsystem architecture might work. Is there an important missing abstraction around which you believe operating systems should be built?
Discussion. No new system architecture will solve all problems. What might hold back the development and deployment of your proposal? How might current system architectures be better?
References. You should do a literature review and cite several papers throughout, though a full related work section is not required.

I would try to write the introduction first; titles and abstracts often come last.

Potential directions

As you consider this assignment, it may be useful to consider how previous researchers have come up with new operating system architectures. Here are some examples.

New hardware. New hardware features often inspire new system designs because they reveal unexpected consequences of the old designs. For example, as CPUs became faster relative to stable storage, OS virtual memory internals focused less on swapping (moving process images to and from disk) and more on memory management and performance.
- Embedded devices, like Internet-enabled light bulbs, run operating systems. Are conventional, monolithic system architectures too big and too inherently insecure for these devices? Could a system redesign make problems with embedded devices less common and less frustrating?
- Fast, byte-addressable persistent memory, including 3D XPoint and other, even cooler technologies, is a durable storage technology (meaning data survives power loss) that offers much lower latencies and much higher performance than SSDs. Does this kind of memory change the tradeoffs between volatile storage and stable storage around which OSes are built? For example, if stable memory is fast enough, swapping becomes very inexpensive.
- GPUs are super important for many applications, including graphics and machine learning, but their features around multiplexing and fair scheduling aren’t well integrated with the operating system. Could a redesign help?
- Fast networking, especially fast interconnects in data centers, can build enormous computing capacity from the combination of many servers. But these supercomputers run best when their components are all running smoothly. A scheduling mishap in a single OS can slow down the whole computation. Might we need a new operating system that’s built to minimize latency?
- What does quantum computing ask of the operating system?
New workloads. As new kinds of applications rise, they often expose issues with system APIs that sometimes lead to wholesale redesigns. For example, client-server architectures and the Web inspired new, scalable system calls for processing I/O.
- Bitcoin! Lots of problems to think about here: Could systems make it harder to lose a Bitcoin wallet? Could systems make it easier to process Bitcoin transactions?
- Machine learning frameworks! This relates to GPUs, above, but machine learning workloads overall have very different performance characteristics from conventional workloads. Could a new architecture take advantage of the predictability of these workloads to improve performance?
- Serverless computing and cloud computing!
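To make the “scalable system calls for processing I/O” example above concrete, here is a minimal sketch of a readiness-based event loop using Python’s standard selectors module, which wraps interfaces such as epoll and kqueue. The function names (serve_once, demo) and the uppercase-echo behavior are invented for illustration, and local socket pairs stand in for real client connections.

```python
import selectors
import socket


def serve_once(server_sockets):
    """Echo one uppercased message per socket from a single event loop."""
    # DefaultSelector picks the most efficient mechanism the platform offers
    # (for example epoll on Linux or kqueue on BSD and macOS).
    sel = selectors.DefaultSelector()
    for sock in server_sockets:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)

    handled = 0
    while handled < len(server_sockets):
        # One call reports every socket that is ready to read, so the cost
        # tracks the number of active sockets, not the number registered.
        ready = sel.select(timeout=1.0)
        if not ready:
            break  # nothing arrived in time; give up rather than spin
        for key, _events in ready:
            conn = key.fileobj
            data = conn.recv(1024)
            conn.sendall(data.upper())
            sel.unregister(conn)
            handled += 1
    sel.close()
    return handled


def demo(n=3):
    """Hypothetical driver: n local socket pairs play the role of clients."""
    pairs = [socket.socketpair() for _ in range(n)]
    for i, (_server, client) in enumerate(pairs):
        client.sendall(f"hello {i}".encode())
    serve_once([server for server, _client in pairs])
    replies = [client.recv(1024) for _server, client in pairs]
    for server, client in pairs:
        server.close()
        client.close()
    return replies


if __name__ == "__main__":
    print(demo())
```

One thread serves every connection here, which is the property that made this style of interface attractive for Web servers.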
New concerns. Just as new workloads sometimes arise, other aspects of the world sometimes change what matters most for existing workloads, making new cross-cutting concerns more important. New system designs can help.
- Security! Ubiquitous network connectivity and the (in some ways exaggerated) possibility of global-scale attacks make the security of today’s systems a much bigger concern. How would you build a system centered on security? Would you prove the system correct? Limit the damage that one system could do?
- Energy! Server farms are not the worst contributors to climate change (they are estimated to consume a couple percent of global electricity output per year, and electricity generation itself accounts for about 25% of greenhouse gas emissions), but waste not, want not. Energy savings have been a major focus of consumer-grade processor design, and system changes were required to take advantage of energy-friendly processor features, such as dynamic voltage scaling.
- Inclusion! Systems like Linux are steeped in their own cultures and can sometimes reject outsiders. Some programming languages (such as Ruby and Rust) were developed with the intention of building a broadly inclusive community. Are systems ready for a similar transition?
Attacks. Some of the biggest recent reworks of Linux have been motivated by security concerns, such as the Spectre and Meltdown attacks that leverage processor speculation to violate kernel and process isolation. As attacks change, OS architectures change to adapt. Are there new attacks, or old attacks of renewed concern, that might recommend restructuring?
- Browsers are incredibly important, but extremely complex. Could a different system design help browsers provide better security guarantees, for example by making it easier to separate JavaScript code from core browser code?
- The speed of today’s computers makes covert channels far more dangerous; it’s possible to steal surprising amounts of data using a covert channel attack. What would a system design look like that tried to avoid as many covert channels as possible?
Dissatisfaction. The Unix operating system was inspired by dissatisfaction with the complexity of MULTICS. The exokernel was inspired by dissatisfaction with monolithic kernels. Is there anything in today’s systems that annoys you? Don’t underestimate your annoyance: lean in!
Satisfaction. Maybe you like one aspect of a system and would like to redo the design of a system around that feature. For instance, conventional Unix I/O system calls, like read and write, are awkward and slow to use in event-driven programs, compared to other designs that resemble device communication rings. Maybe the entire kernel deserves reconstruction around one of those designs.
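As a rough illustration of what a ring-style interface might feel like, here is a toy sketch in Python. It is not any real kernel API: ToyRing, Request, and demo are hypothetical names, and the enter method only simulates the single kernel crossing that would process a whole batch of queued requests. It assumes a Unix-like platform, since it uses os.pread and os.pwrite.

```python
import os
import tempfile
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    tag: int            # caller-chosen id, echoed back with the result
    op: str             # "read" or "write"
    fd: int
    offset: int
    data: bytes = b""   # payload, used by writes
    length: int = 0     # byte count, used by reads


class ToyRing:
    """Hypothetical model of paired submission and completion queues."""

    def __init__(self):
        self.submissions = deque()
        self.completions = deque()

    def submit(self, request):
        # Queuing a request is just a memory write; nothing executes yet.
        self.submissions.append(request)

    def enter(self):
        """Stand-in for one kernel crossing that drains every queued request."""
        count = 0
        while self.submissions:
            req = self.submissions.popleft()
            if req.op == "write":
                result = os.pwrite(req.fd, req.data, req.offset)
            else:
                result = os.pread(req.fd, req.length, req.offset)
            self.completions.append((req.tag, result))
            count += 1
        return count

    def reap(self):
        """Collect finished results, each matched to its request by tag."""
        done = list(self.completions)
        self.completions.clear()
        return done


def demo():
    ring = ToyRing()
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        ring.submit(Request(tag=1, op="write", fd=fd, offset=0, data=b"hello "))
        ring.submit(Request(tag=2, op="write", fd=fd, offset=6, data=b"rings"))
        ring.submit(Request(tag=3, op="read", fd=fd, offset=0, length=11))
        ring.enter()  # three operations, one simulated crossing
        return ring.reap()


if __name__ == "__main__":
    print(demo())
```

With read and write, the three operations in demo would cost three system calls; in the ring model they share one crossing, and the program picks up results whenever it is ready. This toy runs requests in order, so the read sees both writes; a real ring might not guarantee that.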