Types of address space
- Hardware physical memory (sometimes called machine memory)
- Actual chips
- Managed by the VMM
- Guest physical memory
- That is, memory the guest OS thinks is physical
- Allocated to a guest by the VMM
- Managed by the guest
- Virtual memory
- Allocated by a guest OS to a guest application
Types of page table
- Guest maintains virtual to guest-physical page table in x86-64 page table format (V→GP)
- Processor needs virtual to hardware-physical page table in x86-64 page table format (V→HP)
- Sometimes called “shadow page table”
- How to compute it?
- Virtual machine monitor maintains a guest-physical to hardware-physical table (GP→HP)
- Can have any format
- VMM computes V→HP = GP→HP ∘ V→GP
- But when?
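The composition V→HP = GP→HP ∘ V→GP can be sketched with toy tables. This is a minimal illustration, not how a real VMM stores its tables: it assumes single-level tables represented as Python dicts from page number to page number, and the name `compose` and all page numbers are hypothetical.

```python
# Toy single-level page tables: dicts from page number to page number.
# Real x86-64 tables are 4-level radix trees with permission bits; this
# sketch keeps only the mapping itself.

def compose(v_to_gp, gp_to_hp):
    """Return V->HP = GP->HP o V->GP, skipping pages the VMM has not granted."""
    v_to_hp = {}
    for vpage, gpage in v_to_gp.items():
        if gpage in gp_to_hp:  # guest may only reach pages the VMM gave it
            v_to_hp[vpage] = gp_to_hp[gpage]
    return v_to_hp

# Hypothetical example values.
v_to_gp = {0x10: 0x2, 0x11: 0x3, 0x12: 0x99}  # maintained by the guest
gp_to_hp = {0x2: 0x7A, 0x3: 0x41}             # maintained by the VMM

shadow = compose(v_to_gp, gp_to_hp)
print(shadow)
```

Virtual page 0x12 points at guest-physical page 0x99, which the VMM never allocated to this guest, so it gets no entry in the composed table.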
Events that should trigger V→HP change
- Guest executes movq PT, %cr3 (PT is a GP address)
- This instruction is privileged and will trap to the VMM
- The VMM can examine the memory starting at PT, check it for validity, compose it with GP→HP, create V→HP in VMM-only memory, and install it
- Guest modifies any memory that’s part of a V→GP page table
- Modifications must be transferred immediately to the corresponding V→HP table
- If modifications aren’t transferred, what can go wrong?
- Solutions?
Implementing V→HP
- Dynamic translation: Translate the whole kernel
- Detect memory writes to V→GP page table pages
- Update V→HP tables accordingly
- Paravirtualization (Xen): Guest OS constructs V→HP, VMM validates it
- VMM informs guest which physical pages it may access
- Guest OS constructs V→HP table, installs it
- Xen validates V→HP table (guest cannot access other memory)
- Xen marks V→HP pages read-only
- Any updates to V→HP table require hypercalls (batched for efficiency)
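The paravirtualized validation step might look roughly like the following. This is a sketch under stated assumptions, not Xen's actual code: the table is a flat dict, each entry carries only a writable flag, and the names `validate`, `allowed_pages`, and `table_pages` are hypothetical.

```python
def validate(v_to_hp, allowed_pages, table_pages):
    """Decide whether a guest-built V->HP table is safe to install.

    v_to_hp:       dict vpage -> (hpage, writable)
    allowed_pages: set of hardware pages the VMM granted to this guest
    table_pages:   set of hardware pages that hold the page table itself
    """
    for vpage, (hpage, writable) in v_to_hp.items():
        if hpage not in allowed_pages:
            return False  # maps memory that belongs to someone else
        if hpage in table_pages and writable:
            return False  # table pages must be read-only to the guest
    return True

# Hypothetical example values.
allowed = {0x40, 0x41, 0x42}
tables = {0x42}

good = {0x10: (0x40, True), 0x11: (0x42, False)}
escapes = {0x10: (0x99, True)}        # page outside the guest's allocation
self_write = {0x10: (0x42, True)}     # writable mapping of a table page

print(validate(good, allowed, tables),
      validate(escapes, allowed, tables),
      validate(self_write, allowed, tables))
```

Keeping table pages read-only is what forces later updates through hypercalls, where the same check can be repeated on each changed entry.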
V→HP and Intel VT-x
- When VT-x was new, it had worse performance than dynamic binary translation!
- Initially lacked support for MMU virtualization
- Meant VMM had to use Xen-style paravirtualization, or write-protect all V→GP page table pages
- Either way, lots of #vmexit events
- Needed: Some way for the hardware to allow guests to change page tables—without compromising VMM safety
Extended Page Tables and MMU virtualization
- Intel EPT (AMD had a prior version) introduces hardware support for the GP→HP table
- A per-guest EPTP register (Extended Page Table Pointer) holds the top-level page for this table
- The processor’s V→HP lookup accesses both the V→GP table and the GP→HP table
- How many memory accesses are required to compute one V→HP mapping?
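One way to count the accesses is shown below. It assumes a 4-level V→GP table, a 4-level GP→HP table, and no help from the TLB or paging-structure caches. The function name is hypothetical. Every pointer in the guest's table, including %cr3, is a GP address, so each one needs its own GP→HP walk before the processor can read the guest entry.

```python
def nested_walk_accesses(guest_levels=4, ept_levels=4):
    """Worst-case memory reads to resolve one V->HP mapping."""
    # Per guest level: walk GP->HP to find the guest table page
    # (ept_levels reads), then read the guest entry itself (1 read).
    per_guest_level = ept_levels + 1
    # The final guest-physical address of the data page needs one more
    # GP->HP walk.
    return guest_levels * per_guest_level + ept_levels

print(nested_walk_accesses())      # nested paging, 4 levels each
print(nested_walk_accesses(4, 0))  # no GP->HP layer: plain 4-level walk
```

Under these assumptions the count is 4 × 5 + 4 = 24 reads, compared with 4 for a native walk, which is why caching of translations matters so much with EPT.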
Kernel virtual machines
- Initial VMMs (VMware, Xen) were separate software code bases
- Can we use a normal OS kernel as a VMM to simplify management?
- KVM is a Linux kernel feature that allows a Linux kernel to behave as a VMM
- QEMU provides device support, translating host I/O (system calls) to guest I/O (fake devices)
- QEMU has multiple millions of lines of code
- Can we do better?
- Amazon Firecracker
Memory virtualization for devices
- VMM must check all uses of physical addresses
- Does this prevent guests from accessing memory-mapped I/O?
- Example: AHCI requests include physical memory addresses
- VMM must validate or abstract
- Performance costs
- Hardware solution: IOMMU (e.g., Intel VT-d)
- Introduce a page-table structure for devices
- Facilitates direct device-to-user communication too!
- DPDK
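The IOMMU idea can be sketched as a per-device page table consulted on every DMA access. This is a toy model, not the VT-d table format: it assumes 4 KiB pages and a flat dict per device, and `dma_translate` and `IommuFault` are hypothetical names.

```python
PAGE_SIZE = 4096

class IommuFault(Exception):
    """Raised when a device touches an address it was never granted."""

def dma_translate(device_table, dma_addr):
    """Translate a device-visible address to a hardware-physical address."""
    page, offset = divmod(dma_addr, PAGE_SIZE)
    if page not in device_table:
        raise IommuFault(f"unmapped DMA address {dma_addr:#x}")
    return device_table[page] * PAGE_SIZE + offset

# Hypothetical example: the device may reach exactly one page.
table = {5: 9}
hp_addr = dma_translate(table, 5 * PAGE_SIZE + 12)
print(hex(hp_addr))

try:
    dma_translate(table, 6 * PAGE_SIZE)
    blocked = False
except IommuFault:
    blocked = True
print(blocked)
```

Because the check happens in hardware on each access, the VMM no longer has to inspect every physical address in every device request. The same mechanism lets a user-level process be handed a device safely.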