Two-Phase Commit Protocol

Graduate Depth 68 in the knowledge graph I know this Set as goal
Unlocks 2 downstream topics
transactions commit protocol coordinator

Core Idea

Two-phase commit (2PC) coordinates distributed transactions: in the prepare phase, a coordinator asks all participants if they can commit (they lock resources and say yes/no); in the commit phase, it tells them to commit or abort. It ensures atomicity but blocks resources during the prepare phase and becomes unavailable if the coordinator crashes during commit.

How It's Best Learned

Trace through a successful 2PC and a failure scenario (coordinator crashes after prepare, before commit decision). Understand why participants must log before responding 'yes' and why the coordinator must log the commit decision before sending commit messages.

Common Misconceptions

Explainer

From your study of the consensus problem, you know that getting distributed nodes to agree on a value is fundamentally difficult — the FLP impossibility result shows that no deterministic protocol can guarantee agreement in an asynchronous system with even one crash. The two-phase commit protocol (2PC) sidesteps this impossibility by accepting a specific tradeoff: it guarantees atomicity (all commit or all abort) but sacrifices availability when the coordinator fails. Understanding this tradeoff is the key to understanding both when 2PC is the right tool and when it is not.

The protocol has two phases, each named for what the coordinator does. In the prepare phase, the coordinator sends a "prepare" message to every participant. Each participant must decide whether it can commit — it acquires locks on all relevant resources, writes a prepare record to its local log (so it can recover after a crash), and responds with either "yes" (I promise I can commit if asked) or "no" (I cannot commit). A "yes" vote is a binding promise: the participant has guaranteed that it can commit no matter what happens next. This is why logging before responding is critical — if the participant crashes after voting yes, it must be able to honor that vote after recovery.

In the commit phase, the coordinator collects all votes. If every participant voted yes, the coordinator writes a commit record to its own log, then sends "commit" to all participants. If any participant voted no (or timed out), the coordinator writes an abort record and sends "abort." Each participant, upon receiving the decision, applies or discards the transaction and releases its locks. The coordinator's log entry is the single point of truth — once the commit record is written, the transaction is committed regardless of subsequent failures.

The vulnerability of 2PC lies in the window between a participant voting yes and receiving the coordinator's decision. During this interval, the participant has promised to commit but does not yet know the outcome. If the coordinator crashes, the participant is blocked — it cannot commit (because maybe another participant voted no) and it cannot abort (because it promised to commit if asked). It must hold its locks and wait for the coordinator to recover. This blocking window is the fundamental weakness of 2PC, and it is why the protocol is unsuitable for long-running transactions or environments where coordinator failure is likely. The three-phase commit protocol attempts to address this by adding an intermediate phase, though it introduces its own complexity.

Despite this limitation, 2PC remains the workhorse protocol for distributed transactions in traditional relational databases. When the coordinator is a highly available database engine and transactions last milliseconds, the blocking window is brief and the risk is manageable. The protocol is simple, well-understood, and provides true ACID atomicity across multiple resource managers — a guarantee that weaker alternatives like sagas cannot match.

Practice Questions 5 questions

Prerequisite Chain

Counting to 10Counting to 20Understanding ZeroThe Number ZeroCounting to FiveOne-to-One CorrespondenceCombining Small Groups Within 5Addition Within 10Addition Within 20Two-Digit Addition Without RegroupingTwo-Digit Addition with RegroupingAddition Within 100Repeated Addition as MultiplicationMultiplication Facts Within 100Division as Equal SharingDivision as Grouping (Measurement Division)Division: Grouping (Repeated Subtraction) ModelDivision: Fair Sharing ModelDivision as Equal SharingDivision as GroupingBasic Division FactsDivision Facts Within 100Two-Digit by One-Digit DivisionDivision with RemaindersRemainders and Quotients in DivisionDivision Word ProblemsIntroduction to Long DivisionFactors and MultiplesPrime and Composite NumbersEquivalent FractionsRelating Fractions and DecimalsDecimal Place ValueReading and Writing DecimalsComparing and Ordering DecimalsAdding and Subtracting DecimalsMultiplying DecimalsDividing DecimalsDividing FractionsMixed Number ArithmeticOrder of OperationsOperators and ExpressionsArithmetic Operators and Operator PrecedenceComparison Operators and Boolean TestsLogical Operators and Boolean AlgebraBoolean Algebra and Fundamental LawsCombinational Circuit DesignFlip-Flops and LatchesBinary Counters: Design and AnalysisBinary ArithmeticFixed-Point Number RepresentationTwo's Complement RepresentationOverflow and Underflow DetectionBinary Adders: Half-Adders and Full-AddersFull Adder and Carry PropagationCarry Lookahead Adder DesignHalf Adder Circuit DesignMultiplication Circuit DesignSequential Circuit DesignRegisters and Register FilesInstruction Set Architecture (ISA)Kernel Architecture and OS StructureSystem Calls and User/Kernel ModeProcesses and the Process Control BlockLogical Clocks and Event OrderingVector Clocks and Capturing CausalityHappened-Before Relation and Causal OrderingConsistency Models in Distributed SystemsReplication Strategies and Trade-offsTwo-Phase Commit Protocol

Longest path: 69 steps · 252 total prerequisite topics

Prerequisites (2)

Leads To (2)