A web server process runs for 10ms on Core 2, gets preempted, and the scheduler migrates it to Core 5. What is the primary performance cost of this migration?
A. Core 5 runs at a lower clock speed than Core 2
B. The process must rebuild its working set in Core 5's cache, fetching data from slower shared cache or main memory
C. Migrating a process requires copying its memory to Core 5's local memory bank
D. The scheduler takes longer to dispatch on Core 5 because it must load new CPU state
Answer: B
While the process runs on Core 2, Core 2's private L1 and L2 caches fill with its recently accessed data: the cache is 'warm.' Core 5's caches hold none of that data, so after the migration they are 'cold,' and the process must re-fetch its working set from the shared L3 cache or from DRAM. An L3 hit typically costs several times more than an L1 or L2 hit, and a DRAM access can cost one to two orders of magnitude more. The migration itself (saving and restoring register state) is fast; the cache cold-start penalty is the real cost.
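To get a feel for the size of the cold-start penalty, here is a rough back-of-the-envelope sketch. The working-set size, the 64-byte line size, and both latencies are illustrative assumptions rather than measurements, and `refill_cost_us` is a hypothetical helper name.

```python
# Rough cost model for refilling a cold cache after a migration.
# Every number below is an illustrative assumption, not a measurement.

CACHE_LINE_BYTES = 64  # assumed line size (common on x86-64)


def refill_cost_us(working_set_bytes, miss_latency_ns, line_bytes=CACHE_LINE_BYTES):
    """Time to re-fetch a working set one cache line at a time, in microseconds.

    Treats every miss as fully serialized (no prefetching, no overlapping
    misses), so the result is a pessimistic estimate.
    """
    lines = -(-working_set_bytes // line_bytes)  # ceiling division
    return lines * miss_latency_ns / 1000.0


working_set = 256 * 1024  # hypothetical 256 KiB working set
from_l3 = refill_cost_us(working_set, miss_latency_ns=15)    # assumed L3 latency
from_dram = refill_cost_us(working_set, miss_latency_ns=90)  # assumed DRAM latency

print(f"refill from L3:   {from_l3:.0f} us")
print(f"refill from DRAM: {from_dram:.0f} us")
```

Under these assumptions the refill costs tens to hundreds of microseconds, which is far more than the register save and restore, and a visible slice of a 10 ms run.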
Question 2 Multiple Choice
On a 2-socket NUMA server, a database process is pinned to cores on socket 0 but its data buffers were allocated in socket 1's memory. What is the consequence?
A. The process cannot access socket 1's memory and will crash
B. Every memory access must cross the inter-socket interconnect, incurring 2–3x the latency of local memory access
C. The OS will automatically migrate the data to socket 0's memory over time
D. Performance is identical because modern NUMA systems use cache coherence to hide the difference
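Answer: B
On NUMA systems, each socket has its own local memory. Accesses to local memory are fast; accesses to a remote socket's memory must cross the interconnect (QPI, Infinity Fabric, etc.), which adds significant latency, commonly cited as 2–3x that of local access, though the exact ratio depends on the platform. Cache coherence ensures *correctness* across sockets but does not eliminate the latency penalty. Some operating systems can migrate pages toward the socket that uses them (Linux's automatic NUMA balancing is one example), but this is best-effort and should not be relied on to repair a bad placement. This is why NUMA-aware memory allocation (ensuring threads and their data live on the same socket) is as important as CPU binding.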
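The effect of data placement can be sketched as a weighted average of local and remote latency. The 80 ns local figure and the 2x remote ratio are assumptions chosen for illustration, and `average_latency_ns` is a hypothetical helper.

```python
# Average memory latency as a function of how much data lives on the
# remote socket. Latency figures are illustrative assumptions.

def average_latency_ns(remote_fraction, local_ns, remote_ns):
    """Weighted average latency for accesses that reach main memory."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


LOCAL_NS = 80.0             # assumed local DRAM latency
REMOTE_NS = 2.0 * LOCAL_NS  # assumed 2x penalty for crossing the interconnect

for fraction in (0.0, 0.5, 1.0):
    latency = average_latency_ns(fraction, LOCAL_NS, REMOTE_NS)
    print(f"{fraction:.0%} remote -> {latency:.0f} ns average")
```

The scenario in the question corresponds to a remote fraction of 1.0: every access that misses the caches pays the full remote latency.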
Question 3 True / False
Processor affinity improves performance by preventing the OS scheduler from migrating a process to a CPU whose cache does not contain the process's working set.
True
False
Answer: True
This is precisely the mechanism: the CPU's hardware cache builds up a warm working set for a process over time. If the scheduler migrates the process to a different core, that cache warmth is lost and must be rebuilt from scratch. Processor affinity — whether soft (preference) or hard (restriction) — keeps the process on the core whose cache is already warm, reducing expensive cache misses. The hardware built the locality; affinity prevents the scheduler from discarding it.
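As a concrete illustration of hard affinity, the sketch below pins the calling process to a single CPU using Python's `os.sched_getaffinity` and `os.sched_setaffinity`. These calls are only available on some Unix platforms (notably Linux), so the example checks for them first; `pin_to_cpu` is a hypothetical helper name.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict `pid` (0 means the calling process) to a single CPU.

    Returns the previous affinity mask so the caller can restore it.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("CPU affinity calls are not available on this platform")
    previous = os.sched_getaffinity(pid)
    if cpu not in previous:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(previous)}")
    os.sched_setaffinity(pid, {cpu})
    return previous


if hasattr(os, "sched_setaffinity"):
    original = pin_to_cpu(min(os.sched_getaffinity(0)))
    print("pinned to:", sorted(os.sched_getaffinity(0)))
    os.sched_setaffinity(0, original)  # restore the original mask
    print("restored: ", sorted(os.sched_getaffinity(0)))
```

Soft affinity, by contrast, is a scheduler preference rather than a mask, so there is nothing for the application to set.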
Question 4 True / False
Hard affinity is generally preferable to soft affinity because it guarantees the process always runs on a warm cache.
True
False
Answer: False
Hard affinity trades scheduling flexibility for cache locality. If pinned cores are busy and other cores sit idle, the scheduler cannot use those idle cores even when the pinned threads are waiting. This can cause load imbalance and hurt overall throughput. Soft affinity achieves most of the cache benefit — it tries to keep processes on their home core — while preserving the freedom to migrate when load balancing requires it. Hard affinity is the right choice for latency-sensitive applications (real-time audio, HFT) but often the wrong default for general workloads.
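The load-imbalance trade-off can be seen in a toy model. This is a deliberately simplified sketch, not how any real scheduler works: tasks run to completion in order, each has a home core, and `migration_penalty` is an assumed fixed cost standing in for the cold-cache restart.

```python
# Toy comparison of hard and soft affinity. Not a real scheduler:
# tasks are placed greedily, in order, and run to completion.

def makespan(tasks, n_cores, migration_penalty=None):
    """Return the time at which the last core finishes.

    tasks: list of (home_core, duration) pairs.
    migration_penalty=None models hard affinity (tasks never leave home).
    Otherwise a task moves to another core only if it would finish
    strictly earlier there, even after paying the penalty.
    """
    busy_until = [0.0] * n_cores
    for home, duration in tasks:
        best_core = home
        best_finish = busy_until[home] + duration
        if migration_penalty is not None:
            for core in range(n_cores):
                if core == home:
                    continue
                finish = busy_until[core] + duration + migration_penalty
                if finish < best_finish:
                    best_core, best_finish = core, finish
        busy_until[best_core] = best_finish
    return max(busy_until)


# Four 10 ms tasks, all homed on core 0 of a 2-core machine.
tasks = [(0, 10.0)] * 4
print("hard affinity:", makespan(tasks, n_cores=2))
print("soft affinity:", makespan(tasks, n_cores=2, migration_penalty=1.0))
```

With hard affinity, core 1 sits idle while core 0 works through all four tasks. With soft affinity, two of the tasks accept the assumed 1 ms penalty and the batch finishes in roughly half the time.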
Question 5 Short Answer
Why is processor affinity described as preventing the scheduler from 'undoing' something the hardware has already built up? What has the hardware built up, and how does migration undo it?
Model answer: The hardware — specifically the CPU's L1 and L2 caches — builds up a warm working set for a running process over time. As the process accesses memory, the cache hierarchy loads frequently used data into fast local cache. When the OS migrates the process to a different core, that core's cache contains no data relevant to the process; the process must re-fetch everything from the slower shared L3 cache or DRAM. The work the cache hierarchy did — anticipating the process's memory needs — is discarded. Processor affinity prevents this by keeping the process on the same core whose cache already contains its working set.
This framing clarifies that processor affinity is a cache management strategy, not a CPU speed-up or priority mechanism. The processor itself isn't faster; you're simply avoiding the penalty of cold-cache restarts. It also explains why soft affinity captures most of the benefit: occasional migrations leave most of the advantage intact; only frequent ones erase it.