A distributed database places all three of its replicas on different machines within the same rack. Which failure scenario will this strategy NOT survive?
A. A single disk failure on one machine
B. A software bug that corrupts data on one node
C. A top-of-rack network switch failure that cuts off the entire rack
D. A network partition that isolates one machine from the others
Answer: C
Placing replicas on different machines within the same rack tolerates individual machine failures, because each machine holds an independent copy. However, the top-of-rack switch is a shared single point of failure for every machine in that rack. A switch failure, a power distribution unit failure, or a cooling failure affecting the rack takes all of its machines offline simultaneously, so all three replicas become unavailable at once. Rack-aware placement spreads replicas across different racks so that any single rack failure leaves at least one replica accessible.
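The rack-failure reasoning can be checked mechanically. The sketch below is a minimal illustration, assuming a hypothetical mapping from replica name to hosting rack; the function and variable names are invented for this example and do not come from any particular database.

```python
def survives_rack_failure(placement: dict[str, str]) -> bool:
    """Return True if at least one replica remains after any single rack fails.

    `placement` maps a replica name to the rack that hosts it (hypothetical names).
    """
    racks = set(placement.values())
    # Losing any one rack leaves a replica only if replicas span two or more racks.
    return len(racks) >= 2


# All three replicas on different machines in the same rack.
same_rack = {"replica-1": "rack-a", "replica-2": "rack-a", "replica-3": "rack-a"}
# Rack-aware placement: replicas spread across racks.
rack_aware = {"replica-1": "rack-a", "replica-2": "rack-b", "replica-3": "rack-c"}

print(survives_rack_failure(same_rack))   # False
print(survives_rack_failure(rack_aware))  # True
```

The check only counts distinct racks, so it models a rack as the fault domain; the same idea would apply to datacenters or regions by swapping the mapping.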
Question 2 Multiple Choice
A team needs a quorum of 2 out of 3 replicas to acknowledge each write and wants to minimize write latency. Which placement strategy best achieves their goal?
A. Place all 3 replicas in geographically distant regions (US, EU, Asia) to maximize fault tolerance
B. Place 2 replicas in the same local region and 1 replica in a remote region, so quorum writes stay local
C. Place all 3 replicas in the same datacenter on different racks for minimum latency
D. Place 1 replica per region; any region can serve reads, reducing global write coordination
Answer: B
A quorum of 2 out of 3 means a write completes as soon as the 2 fastest replicas acknowledge it. If 2 replicas are in the same local region, quorum writes complete with local round-trip latency, because only the local replicas need to respond. The third replica in a distant region provides disaster recovery but is not on the critical path for achieving quorum. If all 3 replicas are on different continents, every write must wait for at least one cross-region round trip, which adds on the order of a hundred milliseconds or more. Placement directly controls whether quorum operations are fast or slow.
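The effect of placement on quorum latency can be sketched as picking the quorum-th fastest acknowledgment. This is a simplified model, assuming the coordinator contacts all replicas in parallel; the round-trip times are illustrative values, not measurements, and `quorum_write_latency` is a hypothetical helper.

```python
def quorum_write_latency(ack_rtts_ms: list[float], quorum: int) -> float:
    """Latency until `quorum` replicas have acknowledged, assuming the
    coordinator sends the write to all replicas in parallel."""
    if not 1 <= quorum <= len(ack_rtts_ms):
        raise ValueError("quorum must be between 1 and the number of replicas")
    # The write completes when the quorum-th fastest acknowledgment arrives.
    return sorted(ack_rtts_ms)[quorum - 1]


# Illustrative round-trip times in milliseconds (assumed, not measured).
two_local_one_remote = [1.0, 2.0, 120.0]  # two replicas near the coordinator
all_remote_regions = [1.0, 90.0, 150.0]   # coordinator co-located with one replica

print(quorum_write_latency(two_local_one_remote, quorum=2))  # 2.0
print(quorum_write_latency(all_remote_regions, quorum=2))    # 90.0
```

In the first placement the remote replica never sits on the critical path; in the second, the second acknowledgment has to cross a region boundary.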
Question 3 True / False
Geographic distribution of replicas across multiple datacenters typically reduces read latency, because clients can read from the nearest replica.
T. True
F. False
Answer: False
Geographic distribution can reduce read latency for clients near a replica, but it increases write latency: cross-region acknowledgments add round-trip delays of tens to hundreds of milliseconds. Furthermore, if the replication protocol requires a quorum for reads (not just writes), geographic distribution may increase read latency when the quorum must span regions. The tradeoff is explicit: geographic distribution improves disaster recovery and regional read latency but worsens write latency and cross-region consistency operations. No single placement strategy optimizes all dimensions simultaneously.
Question 4 True / False
Load-aware placement can be applied independently of fault-domain-aware placement — a system can optimize for load balance without considering rack or geographic topology.
T. True
F. False
Answer: False
Load-aware placement and fault-domain-aware placement interact and must be applied together. A purely load-aware strategy might assign multiple replicas to the same rack because those nodes happen to be underloaded, inadvertently concentrating data in a single fault domain and defeating the purpose of replication. Effective placement strategies combine both dimensions: fault-domain diversity (rack, datacenter) sets hard constraints on replica distribution, and load-aware placement optimizes for performance within those constraints. The two form a joint optimization problem, not independent concerns.
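One way to combine the two dimensions is a greedy pass that treats rack diversity as a hard constraint and load as the ordering criterion. The sketch below is an illustration under stated assumptions, not the algorithm of any specific system: `Node`, `place_replicas`, and the load figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    rack: str
    load: float  # fraction of capacity in use, 0.0 to 1.0


def place_replicas(nodes: list[Node], replicas: int) -> list[Node]:
    """Pick the least-loaded nodes subject to at most one replica per rack."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    chosen: list[Node] = []
    used_racks: set[str] = set()
    # Load-aware ordering: consider the least-loaded nodes first.
    for node in sorted(nodes, key=lambda n: n.load):
        # Hard constraint: never place two replicas in the same rack.
        if node.rack in used_racks:
            continue
        chosen.append(node)
        used_racks.add(node.rack)
        if len(chosen) == replicas:
            return chosen
    raise ValueError("not enough distinct racks for the requested replica count")


nodes = [
    Node("n1", "rack-a", 0.10),
    Node("n2", "rack-a", 0.15),
    Node("n3", "rack-a", 0.20),
    Node("n4", "rack-b", 0.60),
    Node("n5", "rack-c", 0.70),
]

# A purely load-aware choice would pick n1, n2, n3, all of them in rack-a.
print([n.name for n in place_replicas(nodes, replicas=3)])  # ['n1', 'n4', 'n5']
```

The result accepts busier nodes (n4, n5) in exchange for surviving the loss of rack-a, which is the tradeoff the explanation describes.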
Question 5 Short Answer
Explain why replica placement involves inherent tradeoffs, and describe one specific tradeoff a system architect must accept when choosing geographic distribution.
Model answer: Replica placement optimizes along multiple dimensions that conflict: fault tolerance, write latency, read latency, and cost. Geographic distribution across datacenters maximizes fault tolerance, since the system survives an entire datacenter failure, but it directly increases write latency because writes must replicate across a wide-area network before quorum is achieved. A write requiring 2 of 3 replicas to acknowledge, where replicas are on different continents, always pays at least one trans-oceanic round trip (roughly 80–150 ms). The architect accepts higher write latency in exchange for datacenter-level fault tolerance. This tradeoff cannot be eliminated, because the latency floor is imposed by the speed of light.
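The speed-of-light floor can be estimated with simple arithmetic. The sketch below assumes light travels through optical fiber at about two thirds of its vacuum speed and uses an illustrative straight-line distance of roughly 5,600 km (about New York to London); both figures are assumptions for this example, and real routes are longer.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # in vacuum
FIBER_VELOCITY_FACTOR = 2 / 3          # assumed: light in fiber travels at about 2/3 c


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over a straight fiber path of this length."""
    speed_km_per_s = SPEED_OF_LIGHT_KM_PER_S * FIBER_VELOCITY_FACTOR
    return 2 * distance_km / speed_km_per_s * 1000


# Assumed illustrative distance in kilometres.
print(round(min_round_trip_ms(5_600), 1))  # about 56 ms
```

Measured round trips are higher than this bound because cables do not follow straight lines and switching adds delay, which is consistent with the 80–150 ms range quoted above.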
The key insight is that there is no universally optimal placement strategy. The right answer depends on which failure modes the application must survive and what latency it can tolerate. HDFS rack-aware placement is optimized for a single-datacenter deployment where rack failures are the dominant risk. Global databases like Google Spanner accept high write latency in exchange for multi-continental durability. Understanding what failures you are designing for is the prerequisite to any placement decision.