Thread Models: User-Level and Kernel Threads

Core Idea

Threads are lightweight execution units that share an address space within a process. User-level threads are scheduled by a user-space library, which avoids kernel overhead but, in the pure many-to-one model, means only one thread in the process can run at any instant. Kernel threads are scheduled by the OS, enabling true parallelism across cores. Hybrid models (M:N) attempt to balance overhead and parallelism.

Explainer

You already know that a process has its own address space, registers, and resources. A thread is a way to have multiple streams of execution within a single process, all sharing the same memory and open files but each with its own program counter, register set, and stack. If a process is like a house, threads are the people living in it — they share the kitchen and living room (memory, file handles) but each has their own to-do list (instruction pointer) and personal workspace (stack). The critical question is: who manages these threads?
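The sharing described above can be illustrated with a minimal Python sketch. The names `shared_counter`, `add`, and `local_totals` are hypothetical, chosen only for this example. Every thread updates the same dictionary (shared process memory), while each thread's `local` variable lives on its own stack. The lock is there because unsynchronized updates to shared memory can race.

```python
import threading

# Shared state: lives in the process's memory, visible to every thread.
shared_counter = {"value": 0}
lock = threading.Lock()

def add(n, local_totals, idx):
    local = 0  # local variable: private to this thread's own stack
    for _ in range(n):
        local += 1
        with lock:  # serialize updates to the shared memory
            shared_counter["value"] += 1
    local_totals[idx] = local

local_totals = [0, 0, 0]
threads = [
    threading.Thread(target=add, args=(1000, local_totals, i))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_counter["value"])  # total across all threads
print(local_totals)             # each thread's private count
```

Each thread counts its own 1000 iterations privately, yet all three contribute to the single shared total.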

User-level threads are managed entirely by a library in user space; the operating system doesn't even know they exist. The thread library handles creation, scheduling, and switching between threads, all without entering the kernel. This makes thread operations extremely fast: a switch between user-level threads is little more than saving and restoring a few registers, and typically costs on the order of tens to hundreds of nanoseconds, whereas a switch that goes through the kernel typically costs closer to a microsecond or more. The tradeoff is severe, though. Because the OS sees only one schedulable entity, it runs that process on one CPU core at a time. If you have four user-level threads, they take turns running on that one core: you get concurrency (interleaved execution) but not parallelism (simultaneous execution). Even worse, if one user-level thread makes a blocking system call (like a disk read), the entire process blocks, freezing all threads, because the kernel doesn't know there are other threads that could keep running.
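As a rough illustration of user-space scheduling, the sketch below treats Python generators as stand-ins for user-level threads. The `worker` function and the `run_round_robin` scheduler are hypothetical names, not part of any real thread library. Each `yield` is a cooperative switch point handled entirely in user space, and the recorded native thread IDs show that everything runs on a single kernel thread.

```python
from collections import deque
import threading

def worker(name, steps, log):
    """A stand-in for a user-level thread: yields to give up the CPU."""
    for i in range(steps):
        log.append((name, i, threading.get_native_id()))
        yield  # cooperative switch point; no system call is involved

def run_round_robin(tasks):
    """Hypothetical user-space scheduler: run each task to its next yield."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # "context switch" into the task
            ready.append(task)  # still runnable: back of the queue
        except StopIteration:
            pass                # task finished: drop it

log = []
run_round_robin([worker("A", 2, log), worker("B", 3, log)])

order = [(name, i) for name, i, _ in log]
kernel_ids = {tid for _, _, tid in log}
print(order)            # interleaved execution of A and B
print(len(kernel_ids))  # number of kernel threads that did the work
```

The tasks interleave (concurrency), but only one kernel thread ever executes them. If any task called something that blocks in the kernel, the scheduler loop itself would stall and no other task could run.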

Kernel-level threads solve these problems by making the OS aware of each thread. The kernel schedules threads individually, so different threads in the same process can run on different CPU cores simultaneously, which is true parallelism. If one thread blocks on I/O, the kernel simply schedules another runnable thread, possibly from the same process. The cost is overhead: thread creation and destruction require system calls, every context switch passes through the kernel, and each thread consumes kernel data structures. On modern systems, this overhead is small enough that kernel threads are the dominant model. Linux, for example, implements threads as lightweight processes (created via `clone()`) that share an address space, making kernel thread operations fast enough for most applications.
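The blocking behavior can be sketched in Python, where CPython's `threading.Thread` is backed by a kernel thread. The function names and the `results` dictionary are hypothetical, and `time.sleep` merely stands in for a blocking system call such as a disk read. One caveat: CPython's global interpreter lock limits parallel execution of Python bytecode, so this sketch demonstrates independent blocking rather than CPU parallelism.

```python
import threading
import time

def blocking_task(results):
    results["blocker_id"] = threading.get_native_id()
    time.sleep(0.2)  # stand-in for a blocking system call
    results["blocker_done"] = time.monotonic()

def quick_task(results):
    results["quick_id"] = threading.get_native_id()
    results["quick_done"] = time.monotonic()

results = {}
blocker = threading.Thread(target=blocking_task, args=(results,))
quick = threading.Thread(target=quick_task, args=(results,))

blocker.start()  # this thread will block in the kernel
quick.start()    # the kernel can still schedule this one
blocker.join()
quick.join()

# Each thread has its own kernel-visible ID, and the quick thread
# finishes while the other is still blocked.
print(results["blocker_id"] != results["quick_id"])
print(results["quick_done"] < results["blocker_done"])
```

Because the kernel tracks each thread separately, the sleeping thread does not hold up the other one, in contrast to the pure user-level model.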

The hybrid M:N model maps M user-level threads onto N kernel threads, trying to get the best of both worlds: fast user-space switching for threads that don't need parallelism, and kernel threads to provide actual parallel execution across cores. The user-space scheduler multiplexes many user threads onto fewer kernel threads. When a user thread blocks on a user-level operation (such as waiting for a lock managed by the library), the scheduler can swap in another user thread on the same kernel thread; a blocking system call is harder, because it blocks the underlying kernel thread and the runtime needs some cooperation from the kernel or a spare kernel thread to keep the others running. In theory, this is ideal. In practice, M:N threading is complex to implement correctly: the user scheduler and kernel scheduler can make conflicting decisions, leading to priority inversion and subtle bugs. Most modern systems have abandoned M:N models in favor of 1:1 (one user thread per kernel thread), accepting the modest kernel overhead in exchange for simplicity and predictability. Go's goroutine scheduler is a notable modern exception that successfully implements an M:N-like approach.
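A toy version of the M:N mapping can be sketched in Python by multiplexing M generator-based tasks onto N kernel threads through a shared ready queue. The names `user_task` and `carrier` are hypothetical, and the sketch deliberately ignores the hard parts (preemption, blocking system calls, priorities) that make real M:N runtimes difficult.

```python
import queue
import threading

M, N, STEPS = 6, 2, 3  # 6 user tasks multiplexed onto 2 kernel threads

def user_task(task_id, steps, log):
    """A user-level task: runs one step, then yields to the scheduler."""
    for step in range(steps):
        log.append((task_id, step, threading.get_native_id()))
        yield

def carrier(ready):
    """One of the N kernel threads: repeatedly runs one step of any ready task."""
    while True:
        task = ready.get()
        if task is None:  # sentinel: no more work
            break
        try:
            next(task)
            ready.put(task)  # still runnable: requeue before task_done
        except StopIteration:
            pass             # task finished
        finally:
            ready.task_done()

log = []
ready = queue.Queue()
for task_id in range(M):
    ready.put(user_task(task_id, STEPS, log))

carriers = [threading.Thread(target=carrier, args=(ready,)) for _ in range(N)]
for c in carriers:
    c.start()
ready.join()            # returns once every user task has run to completion
for _ in carriers:
    ready.put(None)     # tell each kernel thread to exit
for c in carriers:
    c.join()

kernel_ids = {tid for _, _, tid in log}
print(len(log), "steps ran on", len(kernel_ids), "kernel thread(s)")
```

A task may resume on a different kernel thread than the one that last ran it, which is exactly the freedom (and the source of complexity) that an M:N scheduler has.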
