Notes by Alisha Ukani
Segments
%gs and %fs are segment prefixes. They are system registers that contain a number that is added to the address.
- %gs = add the contents of the current "GS_BASE" (a 64-bit system register) to the address
  - We can pretend that there’s a global uintptr_t gs_base for illustrative purposes
  - The instruction movq %gs:(8), %rsp has the meaning uintptr_t rsp = *(uintptr_t *) (gs_base + 8) in “fake C”
    - We have to cast the address gs_base + 8 to uintptr_t* or uint64_t* to model the fact that movq moves 8 bytes
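The "fake C" reading of movq %gs:(8), %rsp can be tried out in a small simulation. This is only a sketch: gs_base is an ordinary global standing in for the real GS_BASE system register, and fake_memory, load_gs_relative, and demo_load are hypothetical names that do not come from the notes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the GS_BASE system register. */
static uintptr_t gs_base;

/* Models `movq %gs:(offset), %reg`: an 8-byte load from gs_base + offset. */
uint64_t load_gs_relative(uintptr_t offset) {
    uint64_t value;
    /* memcpy copies exactly sizeof(uint64_t) == 8 bytes, like movq. */
    memcpy(&value, (const void *) (gs_base + offset), sizeof value);
    return value;
}

/* Points gs_base at a small buffer and loads the quadword at offset 8. */
uint64_t demo_load(void) {
    static uint64_t fake_memory[2] = { 0x1111, 0x2222 };
    gs_base = (uintptr_t) fake_memory;
    /* Same meaning as: uintptr_t rsp = *(uintptr_t *) (gs_base + 8) */
    uint64_t rsp = load_gs_relative(8);
    /* Offset 8 skips the first 8-byte element and lands on the second. */
    assert(rsp == fake_memory[1]);
    return rsp;
}
```

Calling demo_load() returns the second element of the buffer, because offset 8 is one quadword past gs_base.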
- Why have a segment prefix?
  - Threads have their own registers but share the same view of virtual memory
  - In a multi-threaded program, a variable might have a different value per thread (thread-specific variables, e.g. thread ID)
    - errno is a thread-specific variable
  - The linker changes variable names to addresses, so without segment prefixes, each thread would have a reference to the same address! Since threads have the same view of memory, we wouldn't be able to access the different values we need
  - Instead, we could have a region of storage for all threads' variables, where each variable sits at the same offset within a different per-thread slot (e.g. return per_thread.storage[curthread]->threadid), but that requires us to already know what the current thread is
  - So, we have a register (%gs) to refer to the storage for the current thread
    v gs base for T1 = 0x1000
|------------------------------------------------------------ ...
|   | thread 1's TLS |      | thread 2's TLS |      (virtual memory space)
|------------------------------------------------------------ ...
                            ^ gs base for T2 = 0x3000
The compiler and the linker create a struct for every thread's storage. Example: struct thread_storage { int threadid; void* spare_memory; ... }
- Then, getting the current thread ID would be movl %gs:(0), %eax
- BUT every thread's storage lives in the same shared address space, so threads can access each other's memory
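A minimal sketch of the per-thread struct idea, under stated assumptions: gs_base is again a plain global standing in for GS_BASE, switch_to simulates the kernel reloading it on a thread switch, and tls_t1, tls_t2, current_threadid, and peek_threadid are hypothetical names. It assumes a 4-byte int, matching movl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-thread storage layout, following the example struct in the notes. */
struct thread_storage {
    int threadid;
    void *spare_memory;
};

/* Two threads' storage blocks live in the same (shared) address space. */
static struct thread_storage tls_t1 = { 1, NULL };
static struct thread_storage tls_t2 = { 2, NULL };

/* Hypothetical stand-in for GS_BASE. */
static uintptr_t gs_base;

/* Simulates a thread switch by pointing gs_base at that thread's block. */
void switch_to(struct thread_storage *tls) {
    gs_base = (uintptr_t) tls;
}

/* Models `movl %gs:(0), %eax`: load the int at offset 0 of the current block. */
int current_threadid(void) {
    assert(offsetof(struct thread_storage, threadid) == 0);
    return *(int *) (gs_base + 0);
}

/* Nothing stops the running thread from reading another thread's block. */
int peek_threadid(const struct thread_storage *other) {
    return other->threadid;
}
```

The same code in current_threadid yields a different answer depending only on gs_base, which is the point of the prefix. peek_threadid shows the caveat: the blocks are ordinary shared memory.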
Context switching
- Kernel sets up some system registers:
  - Entry %rip (first code that kernel will run)
  - Entry privilege level (to say that kernel will be running)
  - Entry %rsp
  - Entry %rflags (contains info like are interrupts enabled? Is I/O access enabled?)
  - Entry segments (just for backwards compatibility)
  - Each CPU has these registers, and those can have different values. Still, everything but %rsp is the same for each CPU
    - The stack pointer MUST be different for each core because we don't want a race condition if multiple cores received an interrupt and tried to save state at the exact same time
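The per-CPU setup can be sketched as follows. Everything here is an illustrative assumption, not the kernel's real data structure: struct entry_state, NCPU, ENTRY_STACK_SIZE, and init_entry_state are hypothetical, and the entry %rflags value of 0 just stands for "interrupts disabled" in this model.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU 4                 /* assumed number of CPUs */
#define ENTRY_STACK_SIZE 4096  /* assumed size of each CPU's entry stack */

/* Hypothetical grouping of the entry registers for one CPU. */
struct entry_state {
    uint64_t entry_rip;     /* first kernel instruction to run */
    uint64_t entry_rsp;     /* top of this CPU's own entry stack */
    uint64_t entry_rflags;  /* flags in effect on entry */
};

static unsigned char entry_stacks[NCPU][ENTRY_STACK_SIZE];
static struct entry_state cpus[NCPU];

void init_entry_state(uint64_t kernel_entry_rip) {
    for (int i = 0; i < NCPU; ++i) {
        cpus[i].entry_rip = kernel_entry_rip;  /* same on every CPU */
        cpus[i].entry_rflags = 0;              /* same on every CPU */
        /* Stacks grow down, so entry %rsp is the END of CPU i's own stack. */
        cpus[i].entry_rsp =
            (uint64_t) (uintptr_t) (entry_stacks[i] + ENTRY_STACK_SIZE);
    }
    /* No two CPUs may share an entry stack. */
    for (int i = 1; i < NCPU; ++i) {
        assert(cpus[i].entry_rsp != cpus[i - 1].entry_rsp);
    }
}
```

With separate stacks, two CPUs that take an interrupt at the same moment write their saved state to disjoint memory.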
- Processor pushes the following onto the entry stack so the kernel can restore them later: intr_time %rip, intr_time %cs, intr_time %rsp, intr_time %rflags, intr_time %ss
- Weensy: After the processor pushes these registers onto the stack, the kernel saves them into that process's descriptor because we may need to switch to another process
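The push-then-save sequence above can be sketched in a simulation. The names struct intr_frame, struct proc, processor_push, and save_to_descriptor are hypothetical, and the entry stack is modeled as an array of 64-bit words. The push order used (%ss, %rsp, %rflags, %cs, %rip) is the x86-64 hardware order, so %rip ends up at the lowest address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Interrupt-time values as they sit in memory, lowest address first. */
struct intr_frame {
    uint64_t rip;
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

/* Hypothetical process descriptor holding a saved copy of the frame. */
struct proc {
    int pid;
    struct intr_frame saved;
};

/* Simulated entry stack; entry_rsp is the index of the last pushed word. */
#define ENTRY_STACK_WORDS 64
static uint64_t entry_stack[ENTRY_STACK_WORDS];
static size_t entry_rsp = ENTRY_STACK_WORDS;

static void push(uint64_t value) {
    assert(entry_rsp > 0);              /* stack grows down toward index 0 */
    entry_stack[--entry_rsp] = value;
}

/* What the processor does: push the interrupt-time registers. */
const void *processor_push(const struct intr_frame *at_interrupt) {
    push(at_interrupt->ss);
    push(at_interrupt->rsp);
    push(at_interrupt->rflags);
    push(at_interrupt->cs);
    push(at_interrupt->rip);
    return &entry_stack[entry_rsp];     /* address of the pushed frame */
}

/* What the kernel does next: copy the frame into the process descriptor. */
void save_to_descriptor(struct proc *p, const void *frame) {
    memcpy(&p->saved, frame, sizeof p->saved);
}
```

Once the copy is in the descriptor, the entry stack can be reused for the next interrupt, and the kernel can resume a different process later.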
- When a process makes a system call (voluntarily gives control to the kernel), we don't need to save all the registers like we do with interrupts.
  - Intel syscall doesn't save anything to memory and modifies as little as possible
    - Has to change %rip, the privilege level, %rflags (need to disable interrupts)
      - We save %rsp to a per-CPU region of memory using %gs. The kernel first restores %gs to a known, safe value (because the user can change it), which is per-CPU and defined by the kernel. It points to a cpustate object for that CPU. We put %rsp in a scratchspace pointer in the cpustate object. We then load the pointer to the current process from the cpustate object into %rsp and add the size of the kernel stack, because the stack starts at the top and grows down. Then, we can push the current process state onto the stack
    - We need to save the old values, so we put them in registers. %rip goes in %rcx, %rflags goes in %r11
    - We don't need to save the privilege level because we're going from unprivileged code to privileged and back to unprivileged
  - Compiler will save caller-saved registers + %rcx and %r11 (which syscall overwrites)
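The system call entry path above can be sketched as a register-level model. All names here (struct regs, struct cpustate and its fields, KSTACK_SIZE, syscall_instruction, kernel_entry) are hypothetical, and the model assumes the current-process pointer doubles as the base address of that process's kernel stack region.

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_SIZE 4096                   /* assumed kernel stack size */
#define RFLAGS_IF   (UINT64_C(1) << 9)     /* interrupt-enable bit in %rflags */

/* Hypothetical model of the registers involved in a system call. */
struct regs {
    uint64_t rip, rsp, rflags, rcx, r11;
    int privileged;
};

/* Hypothetical per-CPU state: the object the kernel's %gs points to. */
struct cpustate {
    uint64_t scratchspace;  /* holds the user %rsp during entry */
    uint64_t current;       /* base of the current process's kernel stack region */
};

/* What the syscall instruction itself does: registers only, no memory. */
void syscall_instruction(struct regs *r, uint64_t entry_rip) {
    r->rcx = r->rip;           /* old %rip goes in %rcx */
    r->r11 = r->rflags;        /* old %rflags goes in %r11 */
    r->rip = entry_rip;
    r->rflags &= ~RFLAGS_IF;   /* interrupts disabled */
    r->privileged = 1;
    /* %rsp is untouched, so it still points at the user stack. */
}

/* What the kernel's entry code then does with %rsp. */
void kernel_entry(struct regs *r, struct cpustate *cpu) {
    assert(r->privileged);
    cpu->scratchspace = r->rsp;            /* stash the user %rsp */
    r->rsp = cpu->current + KSTACK_SIZE;   /* top of kernel stack; grows down */
}
```

After both steps, the user's %rip, %rflags, and %rsp are all recoverable (from %rcx, %r11, and scratchspace), and %rsp points at a stack the kernel controls.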