Garbage Collection Algorithms

Graduate Depth 65 in the knowledge graph I know this Set as goal
Unlocks 2 downstream topics
garbage-collection memory-management runtime-system

Core Idea

Garbage collection automatically reclaims memory of unreachable objects, freeing programmers from manual deallocation. Reachability is determined from root references (stack, globals). Common algorithms include mark-and-sweep (mark reachable objects, sweep unreachable), generational (younger objects collected more often), and copying (move live objects to a new space). GC adds overhead but prevents memory leaks.

Explainer

From your study of memory management and runtime systems, you know that dynamically allocated objects live on the heap and that someone must decide when to free them. Manual memory management (as in C) puts this burden on the programmer, leading to use-after-free bugs and memory leaks. Garbage collection automates this decision using a simple principle: an object is garbage if no chain of references from any live variable (the root set) can reach it. The root set includes local variables on the call stack, global variables, and CPU registers — anything the running program can directly access. If no path of pointers leads from any root to an object, the program cannot possibly use that object again, so its memory can be safely reclaimed.

The most intuitive GC algorithm is mark-and-sweep. In the mark phase, the collector starts from every root and follows all pointer chains, marking each reachable object. In the sweep phase, it scans the entire heap and frees every unmarked object. This is conceptually simple — it is just a graph traversal from the root set — but it has two costs: it must pause the program during collection (a "stop-the-world" pause), and it leaves the heap fragmented because freed objects leave gaps of various sizes. Mark-and-compact extends this by sliding surviving objects together after sweeping, eliminating fragmentation but adding the cost of updating all pointers to moved objects.

Copying collection takes a different approach: divide the heap into two equal halves (fromspace and tospace). Allocate objects in fromspace until it fills up, then copy all reachable objects into tospace, compacting them in the process, and swap the roles of the two spaces. Allocation becomes trivially fast — just increment a pointer — and fragmentation is eliminated. The trade-off is that half the heap is always wasted as reserve space. Copying collection shines when most objects are short-lived, because copying only live objects means the cost is proportional to what survives, not what dies.

This observation — that most objects die young — motivates generational collection, the strategy used by nearly every modern runtime (JVM, .NET, V8, Python's CPython). The heap is divided into generations: a small nursery for new objects and one or more older generations. The nursery is collected frequently and cheaply (using copying collection), and only objects that survive several nursery collections are promoted to the older generation, which is collected less often. Since most objects are allocated and discarded quickly (temporary strings, loop variables, intermediate results), generational GC concentrates effort where it pays off most. The key engineering challenge is tracking pointers from old objects to young objects (via write barriers), so that nursery collections do not need to scan the entire old generation to find roots.

Practice Questions 5 questions

Prerequisite Chain

Counting to 10Counting to 20Understanding ZeroThe Number ZeroCounting to FiveOne-to-One CorrespondenceCombining Small Groups Within 5Addition Within 10Addition Within 20Two-Digit Addition Without RegroupingTwo-Digit Addition with RegroupingAddition Within 100Repeated Addition as MultiplicationMultiplication Facts Within 100Division as Equal SharingDivision as Grouping (Measurement Division)Division: Grouping (Repeated Subtraction) ModelDivision: Fair Sharing ModelDivision as Equal SharingDivision as GroupingBasic Division FactsDivision Facts Within 100Two-Digit by One-Digit DivisionDivision with RemaindersRemainders and Quotients in DivisionDivision Word ProblemsIntroduction to Long DivisionFactors and MultiplesPrime and Composite NumbersEquivalent FractionsRelating Fractions and DecimalsDecimal Place ValueReading and Writing DecimalsComparing and Ordering DecimalsAdding and Subtracting DecimalsMultiplying DecimalsDividing DecimalsDividing FractionsMixed Number ArithmeticOrder of OperationsOperators and ExpressionsArithmetic Operators and Operator PrecedenceComparison Operators and Boolean TestsLogical Operators and Boolean AlgebraBoolean Algebra and Fundamental LawsCombinational Circuit DesignFlip-Flops and LatchesBinary Counters: Design and AnalysisBinary ArithmeticFixed-Point Number RepresentationTwo's Complement RepresentationOverflow and Underflow DetectionBinary Adders: Half-Adders and Full-AddersFull Adder and Carry PropagationCarry Lookahead Adder DesignHalf Adder Circuit DesignMultiplication Circuit DesignSequential Circuit DesignRegisters and Register FilesInstruction Set Architecture (ISA)Assembly Language BasicsMemory Organization and AddressingMemory HierarchyMemory Management FundamentalsActivation Records and Stack FramesGarbage Collection Algorithms

Longest path: 66 steps · 239 total prerequisite topics

Prerequisites (2)

Leads To (1)