Consistency models define what values a read can return after a write in a replicated system. Strong models (linearizability, sequential consistency) provide intuitive semantics but incur coordination overhead. Weaker models (eventual consistency, causal consistency) improve availability and latency by tolerating temporary disagreement between replicas and conflicts between concurrent writes.
From your study of distributed systems, you know that replication — storing the same data on multiple nodes — is fundamental to fault tolerance and scalability. But replication introduces a problem: what happens when a client writes to one replica and reads from another? What value should the read return? Consistency models are the formal answer to this question. They define the contract between the system and its clients about which behaviors are observable.
The strongest commonly used model is linearizability (sometimes called atomic consistency). It requires that every operation appear to take effect instantaneously at a single point between its invocation and its response, and that the resulting order respect real time: if a write completes before a read begins, the read must see the new value (or a later one). This is the intuition most programmers have about a variable: you write 5, then you read 5. Achieving linearizability across replicas requires coordination. Before a read returns, the system must contact enough replicas to be sure it has the latest value, typically a quorum sized so that it overlaps every write quorum. Quorum overlap alone is not always sufficient; practical designs also write the newest value back to lagging replicas during the read, or route operations through a consensus protocol.
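The quorum idea above can be sketched as a single-process simulation. This is a minimal illustration under stated assumptions, not a production protocol: the names `Replica` and `QuorumRegister`, the integer version counter, and the `reachable` argument (the indices of replicas that happen to respond) are all hypothetical, and real systems must also handle concurrent writers and partial failures.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    version: int = 0
    value: object = None


class QuorumRegister:
    """Toy replicated register with write quorum w and read quorum r (assumed design)."""

    def __init__(self, n, w, r):
        # w + r > n: every read quorum overlaps every write quorum.
        # 2 * w > n: every write quorum overlaps the previous one, so a
        # writer always observes the latest version number.
        if w + r <= n or 2 * w <= n:
            raise ValueError("need w + r > n and 2 * w > n")
        self.replicas = [Replica() for _ in range(n)]
        self.w, self.r = w, r

    def write(self, value, reachable):
        if len(reachable) < self.w:
            raise RuntimeError("write quorum unavailable")
        version = max(self.replicas[i].version for i in reachable) + 1
        for i in reachable:
            self.replicas[i] = Replica(version, value)

    def read(self, reachable):
        if len(reachable) < self.r:
            raise RuntimeError("read quorum unavailable")
        newest = max((self.replicas[i] for i in reachable),
                     key=lambda rep: rep.version)
        # Write the newest value back to lagging replicas before returning,
        # so a later read cannot observe an older value.
        for i in reachable:
            if self.replicas[i].version < newest.version:
                self.replicas[i] = Replica(newest.version, newest.value)
        return newest.value


reg = QuorumRegister(n=3, w=2, r=2)
reg.write("x", reachable=[0, 1])      # replica 2 misses the write
latest = reg.read(reachable=[1, 2])   # overlaps the write quorum at replica 1
```

With three replicas and quorums of two, the read contacts replica 1, which took part in the write, so `latest` is `"x"` even though replica 2 was stale when the read began.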
Sequential consistency is slightly weaker. It requires that all clients observe operations in the same global order, and that this order preserve each client's own program order, but the order does not need to match wall-clock time. Client A's write might appear to happen "before" client B's write in the global sequence even if B's write completed earlier in real time. Dropping the real-time constraint can allow more efficient implementations, but the semantics are harder to reason about.
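To make the difference from linearizability concrete, here is a brute-force checker for tiny histories of a single register. It is only a sketch: the `Op` record, its `start` and `end` timestamps, and the function names are assumptions introduced for illustration, and trying every permutation is feasible only for a handful of operations.

```python
from itertools import permutations
from typing import NamedTuple


class Op(NamedTuple):
    client: str
    kind: str       # "w" for write, "r" for read
    value: object   # value written, or value the read returned
    start: float    # invocation time
    end: float      # response time


def legal(order):
    """Register semantics: each read returns the most recent write (initially None)."""
    current = None
    for op in order:
        if op.kind == "w":
            current = op.value
        elif op.value != current:
            return False
    return True


def respects_program_order(order):
    """Each client's operations appear in the order that client issued them."""
    last_start = {}
    for op in order:
        if op.client in last_start and op.start < last_start[op.client]:
            return False
        last_start[op.client] = op.start
    return True


def respects_real_time(order):
    """If one operation finished before another started, it must come first."""
    return not any(later.end < earlier.start
                   for i, earlier in enumerate(order)
                   for later in order[i + 1:])


def sequentially_consistent(history):
    return any(legal(p) and respects_program_order(p)
               for p in permutations(history))


def linearizable(history):
    return any(legal(p) and respects_real_time(p)
               for p in permutations(history))


# Client A's write completes at time 1; client B's read starts at time 2
# yet still returns the initial value None.
stale_read = [Op("A", "w", 1, 0, 1), Op("B", "r", None, 2, 3)]
print(sequentially_consistent(stale_read))  # True: order the read before the write
print(linearizable(stale_read))             # False: real time forces write first
```

The stale read is acceptable under sequential consistency because some global order (read, then write) explains it without reordering any single client's operations. Linearizability rejects it because the write had already completed when the read began.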
Eventual consistency is much weaker, and often more practical for geographically distributed systems. It guarantees only that, if no new writes occur, all replicas will eventually converge to the same value. There is no bound on *when*: a read may see a stale value for seconds or minutes. This allows replicas to accept writes locally and synchronize asynchronously, yielding very low write latency. Because replicas accept writes independently, concurrent writes can conflict, so the system also needs a rule for resolving them, such as keeping the write with the latest timestamp. Systems like DNS and shopping cart databases often use eventual consistency.
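Convergence of this kind can be sketched with a last-writer-wins register and a simple synchronization pass. The class `LWWReplica`, the `(counter, name)` stamp used to break ties, and the `anti_entropy` helper are illustrative assumptions; last-writer-wins is only one of several possible conflict-resolution rules, and it silently discards the losing write.

```python
from dataclasses import dataclass


@dataclass
class LWWReplica:
    name: str
    value: object = None
    # (logical counter, replica name); the name breaks ties deterministically.
    stamp: tuple = (0, "")

    def write(self, value):
        """Accept a write locally, without contacting any other replica."""
        self.stamp = (self.stamp[0] + 1, self.name)
        self.value = value

    def merge(self, other):
        """Adopt the other replica's state if its stamp is newer."""
        if other.stamp > self.stamp:
            self.value, self.stamp = other.value, other.stamp


def anti_entropy(replicas):
    """One all-pairs synchronization pass; afterwards every replica agrees."""
    for a in replicas:
        for b in replicas:
            a.merge(b)


a, b, c = LWWReplica("a"), LWWReplica("b"), LWWReplica("c")
a.write("x")                                 # accepted locally at a
b.write("y")                                 # concurrent write at b
before = [rep.value for rep in (a, b, c)]    # replicas disagree
anti_entropy([a, b, c])
after = [rep.value for rep in (a, b, c)]     # replicas have converged
```

Before synchronization the three replicas return three different answers. After one pass they all hold `"y"`, because the stamp `(1, "b")` compares greater than `(1, "a")`; the write of `"x"` is lost, which is the price of this particular resolution rule.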
Between these extremes, causal consistency offers a useful middle ground: if operation A causally precedes operation B (e.g., you read a message, then reply to it), then any client that sees B must also have seen A. Operations with no causal relationship are concurrent, and different clients may observe them in different orders. This preserves the logical flow of causally related events without requiring global coordination. The choice of consistency model is ultimately an engineering tradeoff between correctness guarantees, latency, and availability, and it must match the actual requirements of the application.
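The causal rule described above (a reply must never become visible before the message it answers) can be sketched with explicit dependency tracking. This is one possible approach under assumed names: `CausalInbox`, `receive`, and the `deps` argument are hypothetical, and many systems track causality with vector clocks instead of explicit dependency sets.

```python
class CausalInbox:
    """Buffers incoming messages until everything they depend on is visible."""

    def __init__(self):
        self.delivered = []   # message ids, in the order they became visible
        self.pending = []     # (msg_id, deps) still waiting on dependencies

    def receive(self, msg_id, deps=()):
        """Accept a message from the network, possibly out of causal order."""
        self.pending.append((msg_id, frozenset(deps)))
        self._drain()

    def _drain(self):
        # Keep delivering until no pending message has all of its
        # dependencies satisfied.
        progress = True
        while progress:
            progress = False
            for item in list(self.pending):
                msg_id, deps = item
                if deps <= set(self.delivered):
                    self.delivered.append(msg_id)
                    self.pending.remove(item)
                    progress = True


inbox = CausalInbox()
inbox.receive("reply", deps={"post"})   # arrives first, but depends on "post"
visible_early = list(inbox.delivered)   # [] because the reply is held back
inbox.receive("post")                   # dependency arrives
visible_late = list(inbox.delivered)    # ["post", "reply"]
```

The replica never needs a global order: it only delays a message until its own causal predecessors have been delivered, and messages with no dependency on each other can become visible in whatever order they arrive.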