Section 4: Scalability and OS design

Multicore hardware has motivated lots of changes in operating systems, ranging from new, relatively simple system calls (like sched_setaffinity) to entire redesigns of operating system architecture.

In this section, we’ll focus on one important new OS design, the multikernel. The multikernel designers argue that monolithic kernel designs are ill-suited for modern hardware, and that a new kernel architecture, inspired by distributed systems, will scale better as core counts grow.

Inspired in part by the multikernel design, another group of researchers took a closer look at whether monolithic kernel designs are really limited in scalability. Their answer: maybe not!

Preparation

Before section, read the Barrelfish multikernel paper. Make sure you really digest some of the specific design ideas in Section 4, and understand at least one experiment in Section 5. (See the paper-reading advice from Section 2.)

The response paper is called “An Analysis of Linux Scalability to Many Cores.” Read its introduction, then get a high-level overview by reading this blog post about the paper. You may read the whole paper if you’re interested.

Come to section prepared to discuss these questions:

  1. What is one specific way that the Barrelfish multikernel design differs from previous multicore OSes? (Refer to a specific section of the paper.)
  2. How is that difference motivated by multicore hardware? (Why does the Barrelfish paper claim the multikernel architecture is better suited for multicore hardware, in the specific case of the difference you have chosen?)
  3. Does the Scalability to Many Cores paper invalidate the arguments in the multikernel paper? Why or why not?

Post your answers to these questions on the class discussion board at least two hours before class.

The Barrelfish multikernel

“The Multikernel: A new OS architecture for scalable multicore systems.” Andrew Baumann, Paul Barham, Pierre-Evariste Dagand, Tim Harris, Rebecca Isaacs, Simon Peter, Timothy Roscoe, Adrian Schüpbach, and Akhilesh Singhania. In Proc. SOSP 2009.

Commodity computer systems contain more and more processor cores and exhibit increasingly diverse architectural tradeoffs, including memory hierarchies, interconnects, instruction sets and variants, and IO configurations. Previous high-performance computing systems have scaled in specific cases, but the dynamic nature of modern client and server workloads, coupled with the impossibility of statically optimizing an OS for all workloads and hardware variants pose serious challenges for operating system structures.

We argue that the challenge of future multicore hardware is best met by embracing the networked nature of the machine, rethinking OS architecture using ideas from distributed systems. We investigate a new OS structure, the multikernel, that treats the machine as a network of independent cores, assumes no inter-core sharing at the lowest level, and moves traditional OS functionality to a distributed system of processes that communicate via message-passing.

We have implemented a multikernel OS to show that the approach is promising, and we describe how traditional scalability problems for operating systems (such as memory management) can be effectively recast using messages and can exploit insights from distributed systems and networking. An evaluation of our prototype on multicore systems shows that, even on present-day machines, the performance of a multikernel is comparable with a conventional OS, and can scale better to support future hardware.
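
To make the abstract's central idea concrete, here is a toy Python simulation, not Barrelfish code. Each simulated core keeps a private copy of some OS state and learns about changes only through messages placed in its inbox. The names Core and broadcast_update and the page-mapping key are invented for illustration, and the single-threaded loop stands in for real cores running in parallel.

```python
from collections import deque


class Core:
    """One simulated core: a private replica of OS state plus a message inbox."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.replica = {}     # private copy; no other core reads or writes it
        self.inbox = deque()  # the only channel through which other cores reach us

    def handle_messages(self):
        """Apply every pending update message to the local replica."""
        while self.inbox:
            key, value = self.inbox.popleft()
            self.replica[key] = value


def broadcast_update(cores, key, value):
    """Send an update message to every core instead of writing shared memory."""
    for core in cores:
        core.inbox.append((key, value))


# Hypothetical example: tell four cores that a page has been unmapped.
cores = [Core(i) for i in range(4)]
broadcast_update(cores, "page_0x1000", "unmapped")

# Until a core processes its inbox, its replica is stale.
stale = [dict(core.replica) for core in cores]

for core in cores:
    core.handle_messages()

for core in cores:
    print(core.core_id, core.replica)
```

The point of the sketch is the shape of the communication: a core never reaches into another core's data, so the cost of keeping state consistent shows up as explicit messages rather than as hidden contention on shared memory.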

Linux scalability

“An Analysis of Linux Scalability to Many Cores.” Silas Boyd-Wickizer, Austin T. Clements, Yandong Mao, Aleksey Pesterev, M. Frans Kaashoek, Robert Morris, and Nickolai Zeldovich. In Proc. OSDI 2010.

This paper analyzes the scalability of seven system applications (Exim, memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce) running on Linux on a 48-core computer. Except for gmake, all applications trigger scalability bottlenecks inside a recent Linux kernel. Using mostly standard parallel programming techniques—this paper introduces one new technique, sloppy counters—these bottlenecks can be removed from the kernel or avoided by changing the applications slightly. Modifying the kernel required in total 3002 lines of code changes. A speculative conclusion from this analysis is that there is no scalability reason to give up on traditional operating system organizations just yet.
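
The abstract names sloppy counters without describing them. As a rough sketch based on the paper's description, a sloppy counter splits one logical reference count into a shared central counter plus per-core counts of spare references, so that most operations touch only core-local state. The single-threaded Python model below illustrates the bookkeeping only. The class name, the threshold value, the choice to hand back all spares at once, and the central_ops statistic are assumptions made for this example; a real kernel implementation would need true per-core data and atomic operations.

```python
class SloppyCounter:
    """Toy model of a sloppy counter (illustrative, not the Linux patch)."""

    def __init__(self, num_cores, threshold):
        self.central = 0              # shared counter; touching it is the costly step
        self.spare = [0] * num_cores  # per-core spare references, local to each core
        self.threshold = threshold    # max spares a core keeps (assumed policy)
        self.central_ops = 0          # how often the shared counter was touched

    def acquire(self, core):
        """Take a reference, preferring a spare already held by this core."""
        if self.spare[core] > 0:
            self.spare[core] -= 1
        else:
            self.central += 1
            self.central_ops += 1

    def release(self, core):
        """Drop a reference by keeping it locally as a spare."""
        self.spare[core] += 1
        if self.spare[core] > self.threshold:
            # Too many spares: hand them all back to the central counter.
            self.central -= self.spare[core]
            self.spare[core] = 0
            self.central_ops += 1

    def in_use(self):
        """References actually held: central count minus all spares."""
        return self.central - sum(self.spare)


# One core repeatedly takes and drops a reference.
counter = SloppyCounter(num_cores=4, threshold=2)
for _ in range(1000):
    counter.acquire(0)
    counter.release(0)

print("shared counter touched", counter.central_ops, "time(s)")
print("references in use:", counter.in_use())
```

In this model the shared counter is touched once across 1000 acquire/release pairs, because after the first acquire the core keeps reusing its own spare. The trade-off is that the central value alone overstates the true count, so reading the exact number of references in use requires summing the per-core spares.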