A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Atomic Operations and Compare-and-Swap

College · Depth 99 in the knowledge graph

2 topics build on this

364 prerequisites beneath it

Core Idea

Atomic operations execute indivisibly: no other thread can observe them partway through, which makes lock-free synchronization primitives possible. Compare-and-swap (CAS) atomically compares a memory location's value with an expected value and, only if they match, replaces it with a new value, all in a single operation. Lock-free algorithms built on CAS can improve concurrency and reduce context-switch overhead, but they are notoriously difficult to implement and reason about correctly.

Explainer

From the synchronization problem, you know that concurrent threads sharing memory can produce incorrect results when their operations interleave. The root cause is that ordinary operations like "read a variable, add one, write it back" are not indivisible — another thread can sneak in between the read and the write. Atomic operations solve this by making certain operations execute as a single, uninterruptible step from the perspective of all other threads. No thread can ever observe an atomic operation "half-done."
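As a minimal sketch of the idea above, the Rust snippet below has several threads increment one shared counter with an atomic read-modify-write. The function name `count_atomically` and the thread and iteration counts are illustrative assumptions, not part of the original text; `fetch_add` is the standard-library atomic add.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawns `threads` workers that each add 1 to a shared counter
/// `per_thread` times, then returns the final value.
/// (Hypothetical helper, named for this illustration only.)
fn count_atomically(threads: u64, per_thread: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // One indivisible read-add-write: no other thread can
                    // slip in between the read and the write.
                    // Relaxed ordering is enough here because only the
                    // count itself matters.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    // join() makes every worker's increments visible to this thread.
    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}
```

Calling `count_atomically(4, 10_000)` should always return 40,000. With a plain, non-atomic "read, add one, write back" sequence, some increments could be lost whenever two threads interleave.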

The most important atomic operation is compare-and-swap (CAS). It takes three arguments: a memory address, an expected value, and a new value. In a single atomic step, it checks whether the memory location currently holds the expected value. If it does, CAS replaces it with the new value and reports success. If it does not (because another thread changed it), CAS does nothing and reports failure. The calling thread can then re-read the current value, recompute its desired update, and try again. This "read-compute-CAS-retry" loop is the fundamental pattern of lock-free programming. For example, to atomically increment a counter, a thread reads the current value (say, 5), computes 6, then executes CAS(address, 5, 6). If another thread incremented it to 6 in the meantime, the CAS fails, the thread re-reads 6, computes 7, and retries.
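The read-compute-CAS-retry loop described above might look like the following in Rust. This is a sketch under stated assumptions: `cas_increment` and `stale_cas_demo` are hypothetical names, and sequentially consistent ordering is chosen for simplicity rather than performance. `compare_exchange` is the standard-library CAS; it returns `Ok(previous)` on success and `Err(actual)` on failure.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Increments `counter` with a read-compute-CAS-retry loop and
/// returns the value it held just before this increment.
fn cas_increment(counter: &AtomicU64) -> u64 {
    // Read the current value once up front.
    let mut current = counter.load(Ordering::SeqCst);
    loop {
        // Compute the desired update from the value we believe is there.
        let new = current + 1;
        match counter.compare_exchange(current, new, Ordering::SeqCst, Ordering::SeqCst) {
            // The location still held `current`, so the swap happened.
            Ok(previous) => return previous,
            // Someone else changed it first; retry from the value CAS saw.
            Err(actual) => current = actual,
        }
    }
}

/// Replays the scenario from the text on a single thread: a thread reads 5,
/// "another thread" moves the counter to 6, and the stale CAS(5, 6) fails.
fn stale_cas_demo() -> (Result<u64, u64>, u64) {
    let counter = AtomicU64::new(5);
    let seen = counter.load(Ordering::SeqCst); // this thread reads 5

    cas_increment(&counter); // stands in for another thread: 5 -> 6

    // The stale attempt expects 5 but finds 6, so it reports failure.
    let attempt = counter.compare_exchange(seen, seen + 1, Ordering::SeqCst, Ordering::SeqCst);

    cas_increment(&counter); // the retry re-reads 6 and writes 7
    (attempt, counter.load(Ordering::SeqCst))
}
```

Here `stale_cas_demo()` returns `(Err(6), 7)`: the failed CAS reports the value it actually found, and the retry completes the increment from there.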

CAS is implemented in hardware. On x86 the instruction is `CMPXCHG` (used with the `LOCK` prefix so that it is atomic with respect to all cores); on ARM, CAS is typically built from the load-exclusive/store-exclusive pair `LDREX`/`STREX`, where the store fails, and the code retries, if another core wrote to the location in between. This hardware support is essential in practice: ordinary loads and stores alone are not a workable substitute, because the compiler, CPU, and memory system can reorder or interleave them in ways that break naive software-only protocols. The kernel-mode privilege levels you studied earlier are relevant here because these instructions are unprivileged: unlike many OS features, they are available in user space, so threads do not need to trap into the kernel to use CAS. That is one reason lock-free code can be faster than mutex-based code for short critical sections.

The appeal of CAS-based lock-free algorithms is that no thread ever waits for a lock holder: if a CAS fails, the thread simply retries rather than sleeping, and in a typical retry loop a CAS fails only because some other thread's update succeeded, so the system as a whole keeps making progress. This eliminates problems like priority inversion (a high-priority thread waiting for a low-priority lock holder) and reduces context-switch overhead. However, lock-free programming is notoriously difficult. The ABA problem is a classic pitfall: a value changes from A to B and back to A between a thread's read and its CAS, so the CAS succeeds even though the underlying state changed in ways the thread did not account for. Solutions include tagged pointers (pairing the value with a version counter that changes on every update) and hazard pointers, which make memory reclamation safe. For most application code, mutexes remain the right choice; CAS-based lock-free structures are reserved for performance-critical infrastructure like concurrent queues, memory allocators, and reference counters, where the complexity is justified.
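To make the ABA problem and the version-counter fix concrete, here is a single-threaded Rust sketch. Everything in it is an illustrative assumption: the helpers `pack`, `unpack`, `tagged_cas`, and `aba_demo` are hypothetical, and packing a 32-bit value with a 32-bit version into one 64-bit word is just one simple layout. Real implementations often tag pointers instead, and must consider that the version counter can eventually wrap around.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Packs a 32-bit value (low half) and a 32-bit version (high half)
/// into one 64-bit word so a single CAS covers both.
fn pack(value: u32, version: u32) -> u64 {
    ((version as u64) << 32) | (value as u64)
}

/// Splits a packed word back into (value, version).
fn unpack(word: u64) -> (u32, u32) {
    (word as u32, (word >> 32) as u32)
}

/// Writes `new_value` with the version bumped by one, but only if the
/// cell still holds exactly `expected` (same value AND same version).
fn tagged_cas(cell: &AtomicU64, expected: u64, new_value: u32) -> Result<u64, u64> {
    let (_, version) = unpack(expected);
    let desired = pack(new_value, version.wrapping_add(1));
    cell.compare_exchange(expected, desired, Ordering::SeqCst, Ordering::SeqCst)
}

/// Replays A -> B -> A and reports whether a stale CAS still succeeds,
/// first on a plain value, then on a version-tagged one.
fn aba_demo() -> (bool, bool) {
    const A: u32 = 10;
    const B: u32 = 20;

    // Plain value: the stale reader cannot tell that anything happened.
    let plain = AtomicU64::new(A as u64);
    let stale = plain.load(Ordering::SeqCst);
    plain.store(B as u64, Ordering::SeqCst); // A -> B
    plain.store(A as u64, Ordering::SeqCst); // B -> A
    let plain_succeeds = plain
        .compare_exchange(stale, 30, Ordering::SeqCst, Ordering::SeqCst)
        .is_ok();

    // Tagged value: each update bumps the version, so A(v0) != A(v2).
    let tagged = AtomicU64::new(pack(A, 0));
    let stale = tagged.load(Ordering::SeqCst);
    tagged_cas(&tagged, stale, B).expect("A -> B should succeed");
    let current = tagged.load(Ordering::SeqCst);
    tagged_cas(&tagged, current, A).expect("B -> A should succeed");
    let tagged_succeeds = tagged_cas(&tagged, stale, 30).is_ok();

    (plain_succeeds, tagged_succeeds)
}
```

`aba_demo()` returns `(true, false)`: the plain CAS is fooled by the A to B to A round trip, while the tagged CAS fails because the version moved from 0 to 2 even though the value is back at A.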

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → 
Boolean Algebra → Boolean Type and Truth Values → Comparison Operators and Boolean Tests → Logical Operators and Boolean Algebra → Boolean Algebra and Fundamental Laws → Logic Gates Fundamentals → Implementing Boolean Functions with Gates → Karnaugh Map Simplification → Combinational Circuit Design → Flip-Flops and Latches → Binary Counters: Design and Analysis → Binary Arithmetic → Fixed-Point Number Representation → Two's Complement Representation → Overflow and Underflow Detection → Binary Adders: Half-Adders and Full-Adders → Full Adder and Carry Propagation → Carry Lookahead Adder Design → Half Adder Circuit Design → Multiplication Circuit Design → Sequential Circuit Design → Registers and Register Files → Instruction Set Architecture (ISA) → Kernel Architecture and OS Structure → System Calls and User/Kernel Mode → Processes and the Process Control Block → Process Creation: fork() and exec() → Process Termination and Resource Cleanup → Process States and State Transitions → Threads and Concurrency → The Critical Section Problem and Race Conditions → Atomic Operations and Compare-and-Swap

Longest path: 100 steps · 364 total prerequisite topics

Prerequisites (2)

The Critical Section Problem and Race Conditions (hard)
Kernel Mode and Privilege Levels (soft)

Leads To (2)

Spinlocks and Busy-Waiting Synchronization (soft)
Test-and-Set and Atomic Primitives (soft)