Questions: Read Repair and Anti-Entropy Mechanisms
5 questions to test your understanding
Question 1 Multiple Choice
A key storing a rarely-accessed configuration value was updated on replica A but missed by replica B during a network partition. Six months pass with no client ever reading this key. Which mechanism would have corrected the inconsistency during that time?
A. Read repair, because it detects inconsistencies whenever a client reads the key from multiple replicas
B. Anti-entropy, because it proactively compares and repairs replicas regardless of whether the key is read
C. Both mechanisms would have corrected it within seconds of the partition healing
D. Neither: eventual consistency only guarantees convergence for actively read data
Answer: B
Read repair is opportunistic: it only fires when a client actually reads the key and triggers a comparison across replicas. If no client reads the key for six months, read repair never runs on it. Anti-entropy is a background process that systematically compares replicas on a schedule (hourly, daily, or as configured) and repairs divergent data regardless of access patterns. This is precisely why the two mechanisms complement each other: read repair handles hot data, anti-entropy handles cold data.
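The scenario in this question can be sketched in a few lines. This is a minimal illustration, not any particular database's implementation: the `Versioned` record, the `anti_entropy_sweep` function, and the use of integer timestamps to pick the newer write are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # assumption: a larger timestamp means a newer write


def anti_entropy_sweep(a: dict, b: dict) -> list:
    """Compare every key on two replicas and copy the newer version across.

    Returns the sorted list of keys that had to be repaired.
    """
    repaired = []
    for key in sorted(set(a) | set(b)):
        va, vb = a.get(key), b.get(key)
        if va == vb:
            continue  # replicas already agree on this key
        # Pick the newer version; a missing entry never wins.
        newest = max((v for v in (va, vb) if v is not None),
                     key=lambda v: v.timestamp)
        a[key] = newest
        b[key] = newest
        repaired.append(key)
    return repaired


# Replica B missed the update during the partition, and no client reads the key.
replica_a = {"config/timeout": Versioned("30s", timestamp=2)}
replica_b = {"config/timeout": Versioned("10s", timestamp=1)}

repaired_keys = anti_entropy_sweep(replica_a, replica_b)
```

No read ever touches `config/timeout` here, yet the sweep still brings replica B up to date, which is the behavior option B describes.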
Question 2 Multiple Choice
When a client reads a key and the coordinator receives different versions from two replicas, what does read repair do?
A. Returns the most recent version to the client and logs the discrepancy for later background repair
B. Returns the most recent version to the client and immediately writes it back to the stale replica
C. Aborts the read and waits for the replicas to converge before retrying
D. Returns both versions to the client and lets the application choose which to use
Answer: B
Read repair happens as part of handling the read: upon detecting a version mismatch, the coordinator identifies the most recent version (via vector clocks or timestamps), returns it to the client, and writes it back to the stale replica. This is the 'repair on read' pattern. Whether the write-back completes before the response is sent or runs asynchronously right after it depends on the system and its configuration. Option A describes a deferred approach that leaves the fix to a later background process, which is not what read repair does. Option D would push conflict resolution to the application, which is a different design choice (as in the original Dynamo design, where conflicting versions can be returned to the client for application-side reconciliation).
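A rough sketch of the coordinator logic described above, under simplifying assumptions: versions are compared by a plain integer timestamp (not vector clocks), the write-back is done inline before returning, and the `Replica` class and `read_with_repair` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # assumption: a larger timestamp means a newer write


class Replica:
    """Hypothetical in-memory replica used only for illustration."""

    def __init__(self, name, data=None):
        self.name = name
        self.data = dict(data or {})

    def get(self, key):
        return self.data.get(key)

    def put(self, key, versioned):
        self.data[key] = versioned


def read_with_repair(key, replicas):
    """Read from all replicas, repair any stale ones, return the newest value."""
    responses = [(replica, replica.get(key)) for replica in replicas]
    present = [version for _, version in responses if version is not None]
    if not present:
        return None  # no replica has the key
    newest = max(present, key=lambda v: v.timestamp)
    for replica, version in responses:
        if version != newest:
            replica.put(key, newest)  # write the winning version back
    return newest.value


a = Replica("A", {"k": Versioned("new", timestamp=2)})
b = Replica("B", {"k": Versioned("old", timestamp=1)})
result = read_with_repair("k", [a, b])
```

After the call, the client has the newest value and replica B holds the same version as replica A. A real coordinator would usually contact only as many replicas as the read consistency level requires.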
Question 3 True / False
Read repair can only fix inconsistencies for keys that clients actually read, leaving cold (infrequently accessed) data potentially inconsistent indefinitely if no background repair process exists.
True
False
Answer: True
This is the fundamental limitation of read repair as a standalone consistency mechanism. It piggybacks on client reads, so it only runs when data is accessed. Keys that are rarely or never read — archived records, configuration values, audit logs — can remain in a divergent state indefinitely. Anti-entropy exists precisely to fill this gap by repairing all data systematically, regardless of access frequency.
Question 4 True / False
Anti-entropy must run very frequently, at least every few seconds, to maintain eventual consistency guarantees in production systems.
True
False
Answer: False
Eventual consistency only promises that replicas will *eventually* converge; it sets no bound on how quickly. In many production deployments of Dynamo-style systems such as Cassandra, anti-entropy runs on schedules as infrequent as once per hour or once per day, sometimes less often, and the guarantee still holds. The tradeoff is between repair latency (how long data stays inconsistent) and resource usage (CPU, I/O, network). Systems choose the frequency based on their consistency requirements and operational constraints, not a hard minimum.
Question 5 Short Answer
Why is read repair alone insufficient to guarantee eventual consistency, and what does anti-entropy add to the picture?
Model answer: Read repair only triggers when a client reads a key, so data that is rarely or never accessed can remain inconsistent indefinitely — the guarantee of eventual convergence breaks down for cold data. Anti-entropy adds a background sweep that compares all replicas on a schedule and repairs divergent keys regardless of whether anyone has read them. Together, the two mechanisms cover the full dataset: read repair handles the hot path quickly, anti-entropy handles the cold path on a schedule.
The complementary design is deliberate. Read repair is cheap (it piggybacks on operations already happening) but coverage-limited. Anti-entropy is thorough (scans everything) but expensive if run too frequently. Using Merkle trees, anti-entropy can efficiently identify *which* key ranges differ without comparing every key-value pair, making it practical even for large datasets. The combination — read repair for freshness on hot data, anti-entropy for completeness on cold data — gives eventual consistency systems a practical convergence guarantee with tunable cost.
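The Merkle tree comparison mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions: keys are hashed into a fixed number of buckets (`NUM_BUCKETS`, a hypothetical parameter that must be a power of two), and each leaf hashes one bucket's contents. Real systems typically build trees over token or key ranges and differ in many details.

```python
import hashlib

NUM_BUCKETS = 8  # assumption: a power of two so the tree is perfectly balanced


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def bucket_of(key: str) -> int:
    """Map a key to a leaf bucket using the first byte of its hash."""
    return _h(key.encode())[0] % NUM_BUCKETS


def build_tree(store: dict) -> list:
    """Return the tree as a list of levels: leaf hashes first, root last."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for key in sorted(store):
        buckets[bucket_of(key)].append(f"{key}={store[key]}")
    level = [_h("\n".join(items).encode()) for items in buckets]
    levels = [level]
    while len(level) > 1:
        # Each parent hashes the concatenation of its two children.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_buckets(tree_a: list, tree_b: list) -> list:
    """Walk down from the root, descending only where hashes differ."""
    suspects = [0]  # node indices at the current level, starting at the root
    for depth in range(len(tree_a) - 1, -1, -1):
        suspects = [i for i in suspects if tree_a[depth][i] != tree_b[depth][i]]
        if depth > 0:
            suspects = [child for i in suspects for child in (2 * i, 2 * i + 1)]
    return suspects


replica_a = {f"key{i}": f"v{i}" for i in range(100)}
replica_b = dict(replica_a)
replica_b["key42"] = "stale"  # one divergent key

diff = differing_buckets(build_tree(replica_a), build_tree(replica_b))
```

With one divergent key, `diff` contains only the bucket that holds `key42`, so only that bucket's keys need a pairwise comparison. When the root hashes match, the walk stops immediately and nothing else is compared.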