Lecture 3: Virtual memory – CS 161 lectures

A la Mickens

Slides from the inestimable James Mickens

Implementing protection

If the CPU is asked to execute an invalid instruction, it causes an exception
When the timer interrupt fires (periodically, so an infinite loop can’t monopolize the CPU), the CPU raises an exception
Somewhat configurable (e.g., how often the timer interrupt goes off), but little variation among OSes
Memory protection is another story!
- Critical resource
- Layout hugely variable across OSes
- Time has revealed lots of use cases for interesting sharing
- Needs fine-grained configuration
- Paged virtual memory

Physical memory and virtual memory on x86-64

x86-64 computers support:
- Up to \(2^{52}\) bytes of physical memory (though most processors have much less)
- Up to \(2^{57}\) virtual addresses (though most processors support \(2^{48}\))
Physical addresses are contiguous in the range \([0, 2^{\texttt{MAXPHYADDR}})\)
Virtual addresses are not contiguous
- For a processor with \(2^{48}\) virtual addresses:
  - \([0, 2^{47})\) are valid low canonical addresses
    - 0x0000'0000'0000'0000–0x0000'7FFF'FFFF'FFFF
  - \([2^{64}-2^{47}, 2^{64})\) are valid high canonical addresses
    - 0xFFFF'8000'0000'0000–0xFFFF'FFFF'FFFF'FFFF
  - All other addresses are invalid!
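
The canonical-address rule above can be checked mechanically: an address is valid when every bit above the implemented width equals the top implemented bit. A minimal C++ sketch, assuming a helper named is_canonical (a hypothetical name, not taken from the lecture or from Chickadee):

```cpp
#include <cassert>
#include <cstdint>

// Returns true if `addr` is canonical on a processor that implements `bits`
// virtual address bits (48 or 57 on x86-64). Bits [bits-1, 63] must all be
// equal: all zero (low canonical) or all one (high canonical).
constexpr bool is_canonical(uint64_t addr, int bits = 48) {
    uint64_t upper = addr >> (bits - 1);
    uint64_t all_ones = (uint64_t{1} << (64 - bits + 1)) - 1;
    return upper == 0 || upper == all_ones;
}

// Spot checks against the ranges listed above.
inline void check_canonical_examples() {
    assert(is_canonical(0x0000'7FFF'FFFF'FFFF));       // top of the low half
    assert(!is_canonical(0x0000'8000'0000'0000));      // first invalid address
    assert(is_canonical(0xFFFF'8000'0000'0000));       // bottom of the high half
    assert(is_canonical(0x0000'8000'0000'0000, 57));   // valid with 57 bits
}
```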

Picturing the valid virtual address space

\(2^{48}\) valid virtual addresses, not to scale (valid addresses in red):

To scale (there really are red bars, you just can’t see them):
- \(2^{64} \gg 2^{48}!\)

A newer machine with \(2^{57}\) valid virtual addresses, to scale:

Discussion exercise: Why might designers choose this layout?

Contiguity

Ideally, unprivileged processes would have access to a large, contiguous range of virtual addresses (not a patchy range where some addresses were reserved)
But for technical and convenience reasons, some of the virtual address space must be reserved for the kernel
For compatibility reasons, upgrading the kernel or getting new hardware shouldn’t change this reservation
So where should the reservation go?
- Need room to grow the kernel’s reservation and room to grow the process’s reservation
- Ideally both reservations would be contiguous

WeensyOS interlude

WeensyOS has a simple and limited virtual memory design
Kernel instructions and data occupy the range 0x4'0000–0x8'0000
Unprivileged processes occupy the range 0x10'0000 and up
But this is too limited for a real kernel, which needs much more than 1/4 MB of code and data
- For example, a real kernel stores information about the allocation state of each physical page
- Linux stores this information in a structure called struct page
- struct page is ~40 bytes (more or less depending on configuration)
- A machine with 32 GB of memory (not unusual) has \(2^{23}\) pages
- At ~40 bytes per page, that requires 320 MB to store!
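
The arithmetic behind that estimate, as a small C++ sketch. It assumes 4 KiB pages and uses the slide's approximate 40-byte figure; the constant and function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kPageSize = 4096;        // assumed 4 KiB pages
constexpr uint64_t kStructPageSize = 40;    // approximate size from the slide
constexpr uint64_t kMiB = uint64_t{1} << 20;
constexpr uint64_t kGiB = uint64_t{1} << 30;

// Number of physical pages in `mem_bytes` of memory.
constexpr uint64_t page_count(uint64_t mem_bytes) {
    return mem_bytes / kPageSize;
}

// Bytes of per-page metadata needed to describe `mem_bytes` of memory.
constexpr uint64_t page_metadata_bytes(uint64_t mem_bytes) {
    return page_count(mem_bytes) * kStructPageSize;
}

inline void check_struct_page_overhead() {
    assert(page_count(32 * kGiB) == (uint64_t{1} << 23));   // 2^23 pages
    assert(page_metadata_bytes(32 * kGiB) == 320 * kMiB);   // 320 MiB
}
```

Either way, the metadata alone is over a thousand times larger than the 1/4 MB WeensyOS sets aside for its kernel.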

Sign extension

Idea: split the address space into two halves that grow toward one another!
Most OSes give the kernel the upper half (so unprivileged processes aren’t saddled with addresses like 0xFFFFFFFF80000000)
Sign extension on x86-64 lets us represent addresses up to 0x7FFF'FFFF or above 0xFFFF'FFFF'8000'0000 especially efficiently!
- Sign extension is how twos-complement signed numbers of one bit width are translated to a larger bit width
- Copy the sign bit out to the desired width!
- So signed 32-bit 0x8000'0000 is extended to signed 64-bit 0xFFFF'FFFF'8000'0000
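
As a rough illustration of the rule, the C++ sketch below copies bit 31 into the upper 32 bits and then tests whether a 64-bit address survives a round trip through 32 bits. The helper names sign_extend32 and fits_in_imm32 are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend a 32-bit value to 64 bits: copy bit 31 into bits 32-63.
constexpr uint64_t sign_extend32(uint32_t x) {
    uint64_t v = x;
    if (x & 0x8000'0000u) {
        v |= 0xFFFF'FFFF'0000'0000;
    }
    return v;
}

// True if `addr` can be recovered from its low 32 bits by sign extension,
// i.e. it lies in the lowest or the highest 2 GB of the 64-bit space.
constexpr bool fits_in_imm32(uint64_t addr) {
    return sign_extend32(static_cast<uint32_t>(addr)) == addr;
}

inline void check_sign_extension_examples() {
    assert(sign_extend32(0x8000'0000u) == 0xFFFF'FFFF'8000'0000);
    assert(sign_extend32(0x7FFF'FFFFu) == 0x0000'0000'7FFF'FFFF);
    assert(fits_in_imm32(0xFFFF'FFFF'8010'0000));    // uppermost 2 GB
    assert(!fits_in_imm32(0xFFFF'8000'0010'0000));   // canonical, but too far down
}
```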

Examples from Chickadee:

# kernel.asm:
ffffffff80100142:       48 c7 c6 98 9b 10 80    mov    $0xffffffff80109b98,%rsi
    # 0xffff'ffff'8010'9b98 is the address of a string constant for panic();
    # the instruction bytestream only contains the least-significant 4 bytes

# p-allocator.asm:
100d62:                 bf e7 0d 10 00          mov    $0x100de7,%edi
    # 0x0000'0000'0010'0de7 is the address of a string constant for error_printf()

Each instruction encodes only the lower 4 bytes of its 8-byte address!
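
To see how the processor recovers the full addresses, the C++ sketch below decodes the immediates from the two byte sequences in the listings. The byte values come from the listings; the function names are illustrative. One detail the listings do not spell out: the kernel instruction (REX.W prefix, 64-bit destination) sign-extends its immediate, while the process instruction writes a 32-bit register, which zero-extends into the full 64-bit register.

```cpp
#include <cassert>
#include <cstdint>

// Instruction bytes copied from the listings above.
constexpr uint8_t kernel_mov[] = {0x48, 0xc7, 0xc6, 0x98, 0x9b, 0x10, 0x80};
constexpr uint8_t process_mov[] = {0xbf, 0xe7, 0x0d, 0x10, 0x00};

// Read a little-endian 32-bit immediate starting at `p`.
constexpr uint32_t read_le32(const uint8_t* p) {
    return uint32_t{p[0]} | (uint32_t{p[1]} << 8)
         | (uint32_t{p[2]} << 16) | (uint32_t{p[3]} << 24);
}

// `mov $imm32, %rsi`: the immediate is sign-extended to 64 bits.
constexpr uint64_t kernel_mov_value() {
    uint64_t v = read_le32(kernel_mov + 3);   // skip REX prefix, opcode, ModRM
    if (v & 0x8000'0000) {
        v |= 0xFFFF'FFFF'0000'0000;           // copy bit 31 into bits 32-63
    }
    return v;
}

// `mov $imm32, %edi`: writing a 32-bit register zero-extends into %rdi.
constexpr uint64_t process_mov_value() {
    return read_le32(process_mov + 1);        // skip the one-byte opcode
}

inline void check_decoded_immediates() {
    assert(kernel_mov_value() == 0xFFFF'FFFF'8010'9B98);
    assert(process_mov_value() == 0x0000'0000'0010'0DE7);
}
```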

Putting it together

x86-64 hardware divides usable virtual address space in two
- Low canonical memory (0x0000'0000'0000'0000–0x0000'7FFF'FFFF'FFFF)
- High canonical memory (0xFFFF'8000'0000'0000–0xFFFF'FFFF'FFFF'FFFF)
Sign extension and x86-64 instruction encodings make the lowest and highest 2GB of the 64-bit space efficient locations for program instructions and objects
- 0x0000'0000'0000'0000–0x0000'0000'7FFF'FFFF
- 0xFFFF'FFFF'8000'0000–0xFFFF'FFFF'FFFF'FFFF
So on Chickadee,
- Process code is linked in the low 2GB, but can use all of low-canonical memory
- Kernel code is linked in the uppermost 2GB, which we call kernel text

Physical memory map

Kernels need an easy way to access any page of physical memory
- On 32-bit machines this was difficult: there were more physical addresses than virtual addresses, so kernels changed mappings to access some pages
- On 64-bit machines, virtual address space is abundant
- Reserve some of virtual address space for a physical memory map
- The page with physical address \(P\) can be accessed by the kernel at virtual address \(\texttt{MAP} + P\)
On Chickadee, the physical memory map is located in high-canonical addresses
- Page with physical address \(P\) is accessed at address \(\texttt{0xFFFF'8000'0000'0000} + P\)
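
Translating through the physical memory map is a single addition or subtraction. A minimal C++ sketch, using hypothetical names (kPhysMapBase, pa_to_map_va, map_va_to_pa) rather than Chickadee's own helpers:

```cpp
#include <cassert>
#include <cstdint>

// Base of the physical memory map on Chickadee, per the slide.
constexpr uint64_t kPhysMapBase = 0xFFFF'8000'0000'0000;

// Kernel virtual address at which physical address `pa` is visible.
constexpr uint64_t pa_to_map_va(uint64_t pa) {
    return kPhysMapBase + pa;
}

// Inverse: physical address behind a physical-memory-map virtual address.
constexpr uint64_t map_va_to_pa(uint64_t va) {
    return va - kPhysMapBase;
}

inline void check_physical_map() {
    assert(pa_to_map_va(0x1000) == 0xFFFF'8000'0000'1000);
    assert(map_va_to_pa(pa_to_map_va(0x1234'5000)) == 0x1234'5000);
}
```

Because the map is one fixed offset, the kernel can reach any physical page without editing page tables, which is the convenience the 32-bit designs lacked.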

Kernel text plus physical memory map

Kernel code is linked (and accessible) in the uppermost kernel text address range
Kernel code is also loaded into physical memory!
Therefore, kernel code is accessible via multiple virtual addresses
Example: kernel_start
- Accessible at kernel-text address 0xFFFF'FFFF'8010'0000
- Loaded at physical address 0x10'0000
- Accessible via physical memory map at 0xFFFF'8000'0010'0000
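
The two aliases of kernel_start can be derived from each other. The C++ sketch below assumes kernel text is mapped at a fixed offset from physical memory, with base 0xFFFF'FFFF'8000'0000; this is consistent with the addresses above, but the constant and function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kKernelTextBase = 0xFFFF'FFFF'8000'0000;  // assumed linear mapping
constexpr uint64_t kPhysMapBase = 0xFFFF'8000'0000'0000;     // physical memory map

// Physical address behind a kernel-text virtual address.
constexpr uint64_t ktext_va_to_pa(uint64_t va) {
    return va - kKernelTextBase;
}

// The same byte, seen through the physical memory map.
constexpr uint64_t ktext_va_to_map_va(uint64_t va) {
    return kPhysMapBase + ktext_va_to_pa(va);
}

inline void check_kernel_start_aliases() {
    constexpr uint64_t kernel_start = 0xFFFF'FFFF'8010'0000;
    assert(ktext_va_to_pa(kernel_start) == 0x10'0000);
    assert(ktext_va_to_map_va(kernel_start) == 0xFFFF'8000'0010'0000);
}
```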

The boot process

x86-64 processors boot into an identity mapping
- Every virtual address maps to the numerically identical physical address
- The boot loader begins executing at address 0x7C00
The boot loader loads the Chickadee kernel into virtual memory
- Specifically, kernel text addresses!
Before loading, it must install a page table that maps kernel-text addresses to the appropriate physical memory

Boot-time page table constraints

A movq REG, %cr3 instruction installs a new page table
That instruction itself is located in memory…
The address of the instruction after it must work in both the old page table (identity-mapped) and the new page table!

The boot-time page table

Maps physical addresses [0, 0x4000'0000) (i.e., the first 1 GB) at…
- Low-canonical virtual addresses [0, 0x4000'0000)
  - So the machine doesn’t crash with page faults after installing the page table: boot loader instruction pointer, stack pointer, etc. are in low-canonical space
- High-canonical virtual addresses [0xFFFF'8000'0000'0000, 0xFFFF'8000'4000'0000)
  - A portion of the physical memory map
- Kernel text virtual addresses [0xFFFF'FFFF'8000'0000, 0xFFFF'FFFF'C000'0000)
  - This is so the boot loader can load the kernel at its linked virtual address, and execute kernel instructions where they expect

Boot sequence 1: Firmware loads boot loader with identity mapping (green)

Boot sequence 2: Boot loader installs boot page table

Boot sequence 3: Boot loader loads kernel

Instruction pointer is still near 0x7C00
The boot loader tells the disk to load the kernel into kernel text addresses
- Most instructions starting at 0xFFFF'FFFF'8010'0000 (a few at 0xFFFF'FFFF'8000'4000)

How is the boot page table represented? (Discussion question)

Check out bootentry.S!

Huge page table mappings!

In x86-64, a single 3rd-level page table entry can map 1 GB of physical memory
So the boot page table can be represented in only 2 physical pages!
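
One way to see why two pages can suffice is to compute the page table indexes for the start of each 1 GB region the boot page table maps. The C++ sketch below assumes 4-level paging (48-bit virtual addresses, 9 index bits per level); the function names are illustrative, and bootentry.S remains the authority on the actual layout.

```cpp
#include <cassert>
#include <cstdint>

// Index into the top-level (4th-level) page table: virtual address bits 39-47.
constexpr unsigned l4_index(uint64_t va) {
    return static_cast<unsigned>((va >> 39) & 0x1FF);
}

// Index into a 3rd-level page table: virtual address bits 30-38.
constexpr unsigned l3_index(uint64_t va) {
    return static_cast<unsigned>((va >> 30) & 0x1FF);
}

inline void check_boot_mapping_indexes() {
    // Low-canonical identity mapping
    assert(l4_index(0) == 0 && l3_index(0) == 0);
    // Physical memory map
    assert(l4_index(0xFFFF'8000'0000'0000) == 256);
    assert(l3_index(0xFFFF'8000'0000'0000) == 0);
    // Kernel text
    assert(l4_index(0xFFFF'FFFF'8000'0000) == 511);
    assert(l3_index(0xFFFF'FFFF'8000'0000) == 510);
}
```

Under these assumptions the three regions use top-level entries 0, 256, and 511 and only two distinct 3rd-level indexes, 0 and 510. A single 3rd-level page whose entries 0 and 510 are both 1 GB mappings of physical address 0 could therefore be shared by all three top-level entries, giving one top-level page plus one 3rd-level page.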

Kernel’s early page table

The kernel constructs its own version of the boot page table, called the early page table
This early page table is a superset of the boot page table
- How does it differ?