This lecture had some lecture material (notes by Alisha Ukani) and some questions we worked through in groups.
Notes
Pset 2 preview
- General purpose allocator that can allocate multiples of a page (buddy allocators)
- Scheduling and waiting, which allows processes to exit, free memory, and wait on a condition (which involves context switching)
- C++ nonsense (we'll get a linked list library, so need to get used to using libraries)
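A buddy allocator rounds each request up to a power-of-two block size (its "order") and finds a block's buddy by flipping a single address bit. The sketch below illustrates just those two calculations. The names `SKETCH_PAGE_SIZE`, `MIN_ORDER`, `order_for_size`, and `buddy_of` are hypothetical and the 4096-byte page size is an assumption; this is not the pset's required interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed page size: 4096 bytes (2^12), typical on x86-64.
constexpr std::size_t SKETCH_PAGE_SIZE = 4096;
constexpr int MIN_ORDER = 12;   // 2^MIN_ORDER == SKETCH_PAGE_SIZE

// Smallest order `o` with (1 << o) >= sz, never smaller than one page.
int order_for_size(std::size_t sz) {
    int order = MIN_ORDER;
    while ((std::size_t(1) << order) < sz) {
        ++order;
    }
    return order;
}

// Address of the buddy of the order-`order` block starting at `addr`.
// A block of order `o` is aligned on 2^o bytes, so a block and its
// buddy differ only in bit `o` of their addresses.
std::uintptr_t buddy_of(std::uintptr_t addr, int order) {
    std::uintptr_t size = std::uintptr_t(1) << order;
    assert((addr & (size - 1)) == 0);   // block must be aligned to its size
    return addr ^ size;
}
```

For example, a request for three pages gets order 14 (a 16 KiB block), and the order-12 blocks at 0x0000 and 0x1000 are each other's buddies.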
Entry points (pset 1)
- Entry point = a place where C++ code gets control from assembly for the first time; the C++ code is basically starting from scratch (an empty stack)
- If the C++ code were resuming, its stack would already be aligned from before, BUT here we are starting fresh
- k-exception.S is handwritten assembly code, but it can still call C++ code to accomplish the main body of what needs to be done
- %gs:0 points to a cpustate object
- movq %gs:(0), %rdi stores a pointer to the current cpustate in %rdi (because the first thing in a cpustate is a pointer to itself)
- leaq CPUSTACK_SIZE(%rdi), %rsp sets %rsp to be the end of the CPU stack
- Any code that runs on the CPU stack doesn't return, and it runs with interrupts turned off (the only function that really runs here is schedule())
- We use a CPU stack instead of a process stack because a process may be destroyed, but the CPU won't be
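The two instructions above can be mimicked in ordinary C++. This is a minimal user-space sketch, assuming (as the leaq suggests) that the CPU stack lives in the same CPUSTACK_SIZE-byte block as the cpustate and grows down from its end. `cpustate_sketch`, `CPUSTACK_SIZE_SKETCH`, `load_self`, and `initial_stack_pointer` are hypothetical names, and 4096 is an assumed size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t CPUSTACK_SIZE_SKETCH = 4096;   // assumed size

// Hypothetical stand-in for cpustate: the first member points at the
// object itself, and the rest of the block is available as stack space.
struct alignas(4096) cpustate_sketch {
    cpustate_sketch* self_;
    unsigned char stack_space_[CPUSTACK_SIZE_SKETCH - sizeof(cpustate_sketch*)];
    cpustate_sketch() : self_(this) {}
};
static_assert(offsetof(cpustate_sketch, self_) == 0, "self pointer must come first");
static_assert(sizeof(cpustate_sketch) == CPUSTACK_SIZE_SKETCH, "one block per CPU");

// Mirrors `movq %gs:(0), %rdi`: read the 8 bytes at offset 0 of the
// block that %gs refers to, which yields the block's own address.
cpustate_sketch* load_self(const void* gs_base) {
    cpustate_sketch* p;
    std::memcpy(&p, gs_base, sizeof(p));
    return p;
}

// Mirrors `leaq CPUSTACK_SIZE(%rdi), %rsp`: the stack starts at the end
// of the block and grows down toward the cpustate fields.
std::uintptr_t initial_stack_pointer(cpustate_sketch* c) {
    return reinterpret_cast<std::uintptr_t>(c) + CPUSTACK_SIZE_SKETCH;
}
```

The self pointer matters because %gs-relative addressing can load a value stored at %gs:0 but cannot directly produce the address that %gs refers to.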
Context switches
- Interrupts are involuntary context switches, so we need to save ALL the registers. System calls are voluntary context switches, so we can ask the user process to save some registers according to a calling convention
- For system calls, we don't need to restore caller-saved registers* before returning to the user process (*there is an exception we will talk about later in the course)
- We also don't need to restore the callee-saved registers: the entry code never changes their values, and it then calls kernel C++ code, which obeys the calling convention, so the C++ compiler guarantees that the callee-saved registers will be preserved
- this->yield() must save enough state so that the process can be resumed later
- It must save callee-saved registers and rflags
- Also save %rsp to the yieldstate so that when we resume, we know where the registers are and we can restore them
- Then it disables interrupts and switches to the CPU stack
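The bullets above can be modeled with a small data-structure sketch. It assumes the saved state is exactly the six x86-64 callee-saved general-purpose registers plus rflags, pushed on the process's kernel stack, with the process remembering the address of that saved block (the saved %rsp). `yieldstate_sketch`, `proc_sketch`, `save_for_yield`, and `restore_after_yield` are hypothetical names, and the field order is an assumption. A real yield does this in assembly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout of what a voluntary yield must preserve: the
// callee-saved registers of the x86-64 calling convention, plus rflags.
struct yieldstate_sketch {
    std::uint64_t reg_rbp;
    std::uint64_t reg_rbx;
    std::uint64_t reg_r12;
    std::uint64_t reg_r13;
    std::uint64_t reg_r14;
    std::uint64_t reg_r15;
    std::uint64_t reg_rflags;
};

struct proc_sketch {
    // Plays the role of the saved %rsp: where the saved registers live.
    yieldstate_sketch* yields_ = nullptr;
};

// "Push" the live register values into a slot on the process's stack and
// remember where they went, so a later resume can find them.
void save_for_yield(proc_sketch& p, yieldstate_sketch* stack_slot,
                    const yieldstate_sketch& live) {
    *stack_slot = live;
    p.yields_ = stack_slot;
}

// Resume: reload the registers from the remembered location.
yieldstate_sketch restore_after_yield(proc_sketch& p) {
    assert(p.yields_ != nullptr);   // only a yielded process can be resumed
    yieldstate_sketch restored = *p.yields_;
    p.yields_ = nullptr;
    return restored;
}
```

Caller-saved registers are absent on purpose: the compiler already assumes a function call such as yield() may clobber them.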
Yield mechanisms
The timer interrupt case (INT_IRQ + INT_TIMER in kernel.cc) calls proc::yield_noreturn(). Change it to call proc::yield() instead.

The SYSCALL_YIELD case (in kernel.cc) calls proc::yield(). Change it to call proc::yield_noreturn() instead.
Changing the timer interrupt to call proc::yield() is simple:

    // this->regs_ = regs;
    // this->yield_noreturn();
    this->yield();
    break;

There’s no need to store an explicit resume point. When proc::exception returns (after this->yield() resumes and returns), the interrupted process will resume.

To use proc::yield_noreturn() in SYSCALL_YIELD, we must set the system call’s return value by modifying regs explicitly. System calls, unlike general exceptions, have return values, and it’s important to get them right.

    // this->yield();
    // return 0; -- sets reg_rax
    this->regs_ = regs;
    regs->reg_rax = 0;
    this->yield_noreturn(); // NB does not return
    break;
System calls and information leaks
The syscall_entry implementation can leak information from the kernel. Explain how, and explain whether and why this is a problem. Find a good reference online to a similar issue in Linux or another kernel.
The return sequence in syscall_entry does not restore registers from the pushed versions on the stack. Instead, it skips over them with an addq to %rsp:

    addq $(8 * 19), %rsp
    swapgs
    iretq
Thus, when the user-level process resumes, it will observe the register values left by the kernel.
For callee-saved registers like %rbx, there’s no problem. The C++ compiler will save and restore those registers, so when the calling process regains control, it sees no changes. %rax isn’t a problem either; it contains the system call return value. But the other registers could contain information this process shouldn’t be allowed to see, such as system call arguments from other processes! For example, the following sequence could happen:
- Process 1 makes a system call with 5 arguments, which are passed in registers %rdi, %rsi, %rdx, %r10, and %r8 (the syscall instruction itself overwrites %rcx, so the fourth argument travels in %r10).
- The kernel task for process 1 yields without changing %r8.
- The kernel task for process 2 resumes, also without changing %r8.
- Process 2 resumes. Its value for %r8 equals Process 1’s system call argument!

This could be a disaster if that argument was a secret of some kind.
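The four-step sequence above can be replayed in a toy model. It treats %r8 as a single CPU-wide slot that the kernel never touches, and compares a return path that skips the pushed registers with one that reloads process 2's own pushed value. `cpu_regs_sketch` and `r8_seen_by_process2` are hypothetical names; restoring the saved value is one possible fix, and zeroing the register would be another.

```cpp
#include <cassert>
#include <cstdint>

// One physical register, shared by whichever task is running on the CPU.
struct cpu_regs_sketch {
    std::uint64_t r8 = 0;
};

// Returns the %r8 value process 2 observes when its system call returns.
std::uint64_t r8_seen_by_process2(std::uint64_t p1_argument,
                                  std::uint64_t p2_pushed_r8,
                                  bool restore_on_return) {
    cpu_regs_sketch cpu;

    // 1. Process 1 makes a system call with an argument in %r8.
    cpu.r8 = p1_argument;
    // 2. The kernel task for process 1 yields without changing %r8.
    // 3. The kernel task for process 2 resumes, also without changing %r8.
    // 4. Process 2 returns to user mode.
    if (restore_on_return) {
        cpu.r8 = p2_pushed_r8;   // reload the value pushed at process 2's entry
    }
    // Without the restore, the return path just skips the pushed registers,
    // so process 1's argument is still sitting in the register.
    return cpu.r8;
}
```

With the restore disabled, the function returns process 1's argument, which is exactly the leak described above.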
Here’s a reference to an issue almost exactly like this that affected an older version of Linux. The leak allowed a process running in 32-bit mode to view 64-bit registers from other processes.
https://jon.oberheide.org/blog/2009/10/04/linux-kernel-x86-64-register-leak/