A la Mickens
Slides from the inestimable James Mickens
Implementing protection
- If the CPU is asked to execute an invalid instruction, it causes an exception
- When the timer interrupt goes off (periodically, to avoid infinite loops), it raises an exception
- Somewhat configurable (e.g., how often the timer interrupt goes off), but little variation among OSes
- Memory protection is another story!
- Critical resource
- Layout hugely variable across OSes
- Time has revealed lots of use cases for interesting sharing
- Needs fine-grained configuration
- Paged virtual memory
Physical memory and virtual memory on x86-64
- x86-64 computers support:
- Up to \(2^{52}\) bytes of physical memory (though most processors have much less)
- Up to \(2^{57}\) virtual addresses (though most processors support \(2^{48}\))
- Physical addresses are contiguous in the range \([0, 2^{\texttt{MAXPHYADDR}})\)
- Virtual addresses are not contiguous
- For a processor with \(2^{48}\) virtual addresses:
- \([0, 2^{47})\) are valid low canonical addresses
- 0x0000'0000'0000'0000–0x0000'7FFF'FFFF'FFFF
- \([2^{64}-2^{47}, 2^{64})\) are valid high canonical addresses
- 0xFFFF'8000'0000'0000–0xFFFF'FFFF'FFFF'FFFF
- All other addresses are invalid!
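The canonical-address rule above can be sketched as a small check. This is a minimal illustration, not code from any kernel; the function name `is_canonical` and the `va_bits` parameter are assumptions made for the example.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if a 64-bit address is canonical for va_bits virtual address bits."""
    assert 0 <= addr < 2**64
    half = 1 << (va_bits - 1)          # size of each canonical half (2^47 when va_bits is 48)
    # Low canonical: [0, half). High canonical: [2^64 - half, 2^64).
    return addr < half or addr >= 2**64 - half

print(is_canonical(0x0000_7FFF_FFFF_FFFF))  # True: last low-canonical address
print(is_canonical(0x0000_8000_0000_0000))  # False: first invalid address
print(is_canonical(0xFFFF_8000_0000_0000))  # True: first high-canonical address
```

Passing `va_bits=57` models the newer machines mentioned above, where each canonical half holds \(2^{56}\) addresses.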
Picturing the valid virtual address space
- \(2^{48}\) valid virtual addresses, not to scale (valid addresses in red):
- To scale (there really are red bars, you just can’t see them):
- \(2^{64} \gg 2^{48}!\)
- A newer machine with \(2^{57}\) valid virtual addresses, to scale:
Discussion exercise: Why might designers choose this layout?
Contiguity
- Ideally, unprivileged processes would have access to a large, contiguous range of virtual addresses (not a patchy range where some addresses were reserved)
- But for technical and convenience reasons, some of the virtual address space must be reserved for the kernel
- For compatibility reasons, upgrading the kernel or getting new hardware shouldn’t change this reservation
- So where should the reservation go?
- Need room to grow the kernel’s reservation and room to grow the process’s reservation
- Ideally both reservations would be contiguous
WeensyOS interlude
- WeensyOS has a simple and limited virtual memory design
- Kernel instructions and data occupy the range 0x4'0000–0x8'0000
- Unprivileged processes occupy the range 0x10'0000 and up
- But this is too limited for a real kernel, which needs much more than 1/4 MB of code and data
- For example, a real kernel stores information about the allocation state of each physical page
- Linux stores this information in a structure called struct page
- struct page is ~40 bytes (more or less depending on configuration)
- A machine with 32 GB of memory (not unusual) has \(2^{23}\) pages
- Requiring 320 MB to store!
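The arithmetic behind that figure can be checked directly. The sketch below assumes 4 KiB pages and takes the ~40-byte size from the text; the constant names are made up for the example.

```python
PAGE_SIZE = 4096                 # assumed 4 KiB base page size
STRUCT_PAGE_BYTES = 40           # approximate per-page metadata size from the text
MEMORY_BYTES = 32 * 2**30        # 32 GB of physical memory

npages = MEMORY_BYTES // PAGE_SIZE
metadata_bytes = npages * STRUCT_PAGE_BYTES

# WeensyOS kernel range 0x4'0000-0x8'0000, for comparison
weensy_kernel_bytes = 0x8_0000 - 0x4_0000

print(npages == 2**23)                  # True
print(metadata_bytes // 2**20)          # 320 (MB of page metadata)
print(weensy_kernel_bytes // 2**10)     # 256 (KB available to the WeensyOS kernel)
```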
Sign extension
- Idea: split the address space into two halves that grow toward one another!
- Most OSes reserve the upper half for the kernel (so unprivileged processes aren’t saddled with addresses like 0xFFFF'FFFF'8000'0000)
- Sign extension on x86-64 lets us represent addresses up to 0x7FFF'FFFF, or at and above 0xFFFF'FFFF'8000'0000, especially efficiently!
- Sign extension is how two’s-complement signed numbers of one bit width are translated to a larger bit width
- Copy the sign bit out to the desired width!
- So signed 32-bit 0x8000'0000 is extended to signed 64-bit 0xFFFF'FFFF'8000'0000
- Examples from Chickadee: 4-byte encodings of the lower 4 bytes of each address!

    # kernel.asm:
    ffffffff80100142: 48 c7 c6 98 9b 10 80    mov $0xffffffff80109b98,%rsi
    # 0xffff'ffff'8010'9b98 is the address of a string constant for panic();
    # the instruction bytestream only contains the least-significant 4 bytes

    # p-allocator.asm:
    100d62: bf e7 0d 10 00    mov $0x100de7,%edi
    # 0x0000'0000'0010'0de7 is the address of a string constant for error_printf()
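Sign extension itself is easy to model. The following sketch, with the hypothetical helper name `sign_extend`, widens a 32-bit value to 64 bits by copying the sign bit, and applies it to the four immediate bytes from the kernel.asm example above.

```python
def sign_extend(value: int, from_bits: int = 32, to_bits: int = 64) -> int:
    """Widen a two's-complement value by copying its sign bit into the new high bits."""
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if value & (1 << (from_bits - 1)):                  # sign bit set?
        value |= ((1 << to_bits) - 1) & ~low_mask       # fill the upper bits with ones
    return value

# The immediate bytes 98 9b 10 80 are stored little-endian in the instruction.
imm32 = int.from_bytes(bytes.fromhex("989b1080"), "little")

print(hex(imm32))                     # 0x80109b98
print(hex(sign_extend(imm32)))        # 0xffffffff80109b98
print(hex(sign_extend(0x8000_0000)))  # 0xffffffff80000000
print(hex(sign_extend(0x7FFF_FFFF)))  # 0x7fffffff
```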
Putting it together
- x86-64 hardware divides usable virtual address space in two
- Low canonical memory (0x0000'0000'0000'0000–0x0000'7FFF'FFFF'FFFF)
- High canonical memory (0xFFFF'8000'0000'0000–0xFFFF'FFFF'FFFF'FFFF)
- Sign extension and x86-64 instruction encodings make the lowest and highest 2GB of the 64-bit space efficient locations for program instructions and objects
- 0x0000'0000'0000'0000–0x0000'0000'7FFF'FFFF
- 0xFFFF'FFFF'8000'0000–0xFFFF'FFFF'FFFF'FFFF
- So on Chickadee,
- Process code is linked in the low 2GB, but can use all of low-canonical memory
- Kernel code is linked in the uppermost 2GB, which we call kernel text
Physical memory map
- Kernels need an easy way to access any page of physical memory
- On 32-bit machines this was difficult: there were more physical addresses than virtual addresses, so kernels changed mappings to access some pages
- On 64-bit machines, virtual address space is abundant
- Reserve some of virtual address space for a physical memory map
- The page with physical address \(P\) can be accessed by the kernel at virtual address \(\texttt{MAP} + P\)
- On Chickadee, the physical memory map is located in high-canonical addresses
- Page with physical address \(P\) is accessed at address \(\texttt{0xFFFF'8000'0000'0000} + P\)
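The physical memory map is a fixed-offset translation in both directions. A minimal sketch, assuming the base address given above; the helper names `pa_to_ka` and `ka_to_pa` are illustrative and not taken from the Chickadee source.

```python
PHYSMAP_BASE = 0xFFFF_8000_0000_0000   # MAP: start of high-canonical memory

def pa_to_ka(pa: int) -> int:
    """Kernel virtual address at which physical address pa is visible."""
    assert pa >= 0
    ka = PHYSMAP_BASE + pa
    assert ka < 2**64, "physical address too large for the map"
    return ka

def ka_to_pa(ka: int) -> int:
    """Inverse translation: recover the physical address from a physmap address."""
    assert ka >= PHYSMAP_BASE
    return ka - PHYSMAP_BASE

print(hex(pa_to_ka(0x1000)))                   # 0xffff800000001000
print(hex(ka_to_pa(0xFFFF_8000_0000_1000)))    # 0x1000
```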
Kernel text plus physical memory map
- Kernel code is linked (and accessible) in the uppermost kernel text address range
- Kernel code is also loaded into physical memory!
- Therefore, kernel code is accessible via multiple virtual addresses
- Example: kernel_start
- Accessible at kernel-text address 0xFFFF'FFFF'8010'0000
- Loaded at physical address 0x10'0000
- Accessible via physical memory map at 0xFFFF'8000'0010'0000
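The aliasing in this example can be worked through numerically. The sketch assumes kernel text is mapped linearly, with 0xFFFF'FFFF'8000'0000 corresponding to physical address 0; that offset is inferred from the addresses in the example, and the function names are hypothetical.

```python
KTEXT_BASE = 0xFFFF_FFFF_8000_0000     # assumed: kernel text base maps to physical 0
PHYSMAP_BASE = 0xFFFF_8000_0000_0000   # start of the physical memory map

def ktext_to_pa(va: int) -> int:
    """Physical address backing a kernel-text virtual address."""
    assert va >= KTEXT_BASE
    return va - KTEXT_BASE

def pa_to_physmap(pa: int) -> int:
    """Physical-memory-map alias for a physical address."""
    return PHYSMAP_BASE + pa

kernel_start = 0xFFFF_FFFF_8010_0000
pa = ktext_to_pa(kernel_start)
alias = pa_to_physmap(pa)

print(hex(pa))      # 0x100000
print(hex(alias))   # 0xffff800000100000
```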
The boot process
- x86-64 processors boot into an identity mapping
- Every virtual address maps to the numerically identical physical address
- The boot loader begins executing at address 0x7C00
- The boot loader loads the Chickadee kernel into virtual memory
- Specifically, kernel text addresses!
- Before loading, it must install a page table that maps kernel-text addresses to the appropriate physical memory
Boot-time page table constraints
- A movq REG, %cr3 instruction installs a new page table
- That instruction itself is located in memory…
- The address of the instruction after it must work in both the old page table (identity-mapped) and the new page table!
The boot-time page table
- Maps physical addresses [0, 0x4000'0000) (i.e., first 1 GB) at…
- Low-canonical virtual addresses [0, 0x4000'0000)
- So the machine doesn’t crash with page faults after installing the page table: boot loader instruction pointer, stack pointer, etc. are in low-canonical space
- High-canonical virtual addresses [0xFFFF'8000'0000'0000, 0xFFFF'8000'4000'0000)
- A portion of the physical memory map
- Kernel text virtual addresses [0xFFFF'FFFF'8000'0000, 0xFFFF'FFFF'C000'0000)
- This is so the boot loader can load the kernel at its linked virtual address, and execute kernel instructions where they expect
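The three mappings can be modeled as a tiny translation function. This is only a sketch of the ranges listed above, not of the hardware page walk; `boot_translate` and `REGION_BASES` are names invented for the example, and any address outside the three listed ranges is treated as unmapped.

```python
GB = 1 << 30

# Virtual base addresses that each map the first 1 GB of physical memory
REGION_BASES = [
    0x0000_0000_0000_0000,   # low-canonical identity mapping
    0xFFFF_8000_0000_0000,   # start of the physical memory map
    0xFFFF_FFFF_8000_0000,   # kernel text
]

def boot_translate(va: int):
    """Return the physical address for va, or None if va would page-fault."""
    for base in REGION_BASES:
        if base <= va < base + GB:
            return va - base
    return None

print(hex(boot_translate(0x7C00)))                 # 0x7c00: boot loader code still works
print(hex(boot_translate(0xFFFF_FFFF_8010_0000)))  # 0x100000: kernel text
print(boot_translate(0x4000_0000))                 # None: just past the first 1 GB
```

The first call shows why the identity mapping must be kept: the boot loader's instruction pointer near 0x7C00 translates to the same physical address before and after the page table is installed.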
Boot sequence 1: Firmware loads boot loader with identity mapping (green)
Boot sequence 2: Boot loader installs boot page table
Boot sequence 3: Boot loader loads kernel
- Instruction pointer is still near 0x7C00
- Disk told to load into kernel text addresses
- Most instructions starting at 0xFFFF'FFFF'8010'0000 (a few at 0xFFFF'FFFF'8000'4000)
How is the boot page table represented? (Discussion question)
- Check out bootentry.S!
Huge page table mappings!
- In x86-64, a single 3rd-level page table entry can map 1 GB of physical memory
- So the boot page table can be represented in only 2 physical pages!
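To see why so few pages are needed, it helps to compute which page table entries the three boot-time ranges use. The sketch below applies the standard x86-64 4-level index split (9 bits per level, above a 30-bit offset for 1 GB pages); the function name `pt_indices` is hypothetical. How the entries are shared between pages is one possible arrangement, stated as an assumption rather than a description of bootentry.S.

```python
def pt_indices(va: int):
    """Return (level-4 index, level-3 index) for a virtual address."""
    l4 = (va >> 39) & 0x1FF   # bits 39-47 select the level-4 entry
    l3 = (va >> 30) & 0x1FF   # bits 30-38 select the level-3 entry (1 GB granularity)
    return l4, l3

bases = {
    "low canonical": 0x0000_0000_0000_0000,
    "physical memory map": 0xFFFF_8000_0000_0000,
    "kernel text": 0xFFFF_FFFF_8000_0000,
}
for name, va in bases.items():
    print(name, pt_indices(va))
# low canonical (0, 0)
# physical memory map (256, 0)
# kernel text (511, 510)
```

All three level-4 entries live in the single level-4 page. If they all point at one shared level-3 page whose entries 0 and 510 are 1 GB mappings of physical address 0 (an assumption), the whole table fits in two physical pages.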
Kernel’s early page table
- The kernel constructs its own version of the boot page table, called the early page table
- This early page table is a superset of the boot page table
- How does it differ?