Replica placement determines where copies of data are stored in the system. Strategies include geographic distribution to keep data close to users and survive datacenter failures, rack-awareness to tolerate correlated failures within a datacenter, and load-aware placement to avoid hot nodes. Placement decisions affect availability guarantees, network usage, and both read and write latency.
You already understand from primary-backup replication and quorum-based replication *why* we replicate data and *how* replicas coordinate. Replica placement answers the next question: *where* should those copies physically live? This decision has enormous consequences for latency, fault tolerance, and cost — and the right answer depends on what failures you need to survive.
The simplest placement strategy puts all replicas on different machines in the same rack. This tolerates individual machine failures but not rack-level events: a top-of-rack switch failure or a power distribution unit failure takes out every replica simultaneously. Rack-aware placement addresses this by spreading replicas across racks within a datacenter. HDFS, for example, by default places the first replica on the writer's local node, the second on a node in a different rack, and the third on a different node in that second rack. This survives any single rack failure while keeping one replica nearby for fast reads, and because two of the three replicas share a rack, each write crosses racks only once.
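The HDFS-style policy described above can be sketched in a few lines. This is a minimal illustration, not HDFS's actual implementation: the `topology` mapping (node name to rack name), the function names, and the random choice of remote rack are all assumptions made for the example.

```python
import random
from collections import defaultdict

def place_replicas(writer, topology, rng=random):
    """Pick three nodes: the writer's node, then two nodes on one remote rack.

    `topology` is a hypothetical dict mapping node name -> rack name.
    """
    by_rack = defaultdict(list)
    for node, rack in topology.items():
        by_rack[rack].append(node)

    local_rack = topology[writer]
    # The remote rack must hold two replicas, so it needs at least two nodes.
    remote_racks = [rack for rack, nodes in by_rack.items()
                    if rack != local_rack and len(nodes) >= 2]
    if not remote_racks:
        raise ValueError("need a remote rack with at least two nodes")

    remote_rack = rng.choice(sorted(remote_racks))
    second, third = rng.sample(sorted(by_rack[remote_rack]), 2)
    return [writer, second, third]

def survives_rack_failure(replicas, topology):
    """True if at least one replica remains after any single rack fails."""
    racks = [topology[node] for node in replicas]
    return all(any(rack != failed for rack in racks) for failed in set(racks))

topology = {"n1": "r1", "n2": "r1", "n3": "r2", "n4": "r2", "n5": "r3", "n6": "r3"}
replicas = place_replicas("n1", topology)
print(replicas, survives_rack_failure(replicas, topology))
```

The check in `survives_rack_failure` makes the contrast with same-rack placement concrete: a replica set such as `["n1", "n2"]`, which lives entirely on rack `r1`, fails it.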
Geographic placement extends this logic to datacenter-level failures. Placing replicas in different regions (US-East, EU-West, Asia-Pacific) means your data survives even if an entire datacenter or region goes offline, but cross-region replication adds significant latency to writes. If your quorum requires a majority of replicas to acknowledge a write, and those replicas are spread across continents, every write waits on at least one intercontinental round trip, typically tens to hundreds of milliseconds. This is why many systems offer tunable placement: you might keep two replicas in your primary region and a third in a remote region for disaster recovery. With three replicas a majority is two, so the two local replicas can acknowledge a write without waiting for the remote one, and you accept that the remote replica lags slightly behind.
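To make the quorum trade-off concrete, here is a small sketch that estimates write latency from per-replica round-trip times. It assumes the coordinator contacts all replicas in parallel and that network round-trip time dominates (no disk or queueing delay). The function name and the RTT figures are illustrative, not measurements.

```python
def quorum_write_latency(rtts_ms, quorum=None):
    """Time until `quorum` replicas have acknowledged a write.

    `rtts_ms` holds one round-trip time per replica, in milliseconds.
    If `quorum` is omitted, a simple majority is used.
    """
    if quorum is None:
        quorum = len(rtts_ms) // 2 + 1
    if not 1 <= quorum <= len(rtts_ms):
        raise ValueError("quorum must be between 1 and the replica count")
    # The write completes when the quorum-th fastest acknowledgement arrives.
    return sorted(rtts_ms)[quorum - 1]

# Illustrative RTTs in milliseconds (assumed values).
spread_across_continents = [1, 80, 150]  # one local replica, two remote
two_local_one_remote = [1, 2, 150]       # two replicas in the primary region

print(quorum_write_latency(spread_across_continents))  # 80
print(quorum_write_latency(two_local_one_remote))      # 2
```

Under these assumptions, moving one replica into the primary region takes the remote round trip off the write path entirely, while the third copy still provides disaster recovery.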
Load-aware placement adds a dynamic dimension. Even with perfect geographic and rack distribution, some nodes become hot spots when popular data concentrates on them. Load-aware strategies monitor CPU, disk, and network utilization and steer new replica assignments toward underloaded nodes. This interacts with your replication protocol: a quorum read completes only when enough replicas have responded, so if the replicas it needs sit on overloaded nodes, tail latency rises even when the system is nominally healthy. The best placement strategies combine all three dimensions (fault domain diversity, geographic distribution, and load balancing), weighted according to the application's specific requirements for latency, durability, and availability.
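One way to combine the three dimensions is a greedy selection with a weighted cost: each candidate is penalized for sharing a region or rack with replicas already chosen, and for its current load. The sketch below is one possible formulation under assumed weights. The `Node` type, `choose_replicas`, and the weight values are hypothetical and would need tuning for a real workload.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    region: str
    rack: str
    load: float  # utilization in [0, 1], e.g. the max of CPU, disk, network

def choose_replicas(nodes, count, w_region=1.0, w_rack=0.5, w_load=1.0):
    """Greedily pick `count` nodes, trading off diversity against load."""
    chosen = []
    candidates = list(nodes)

    def cost(node):
        # Count already-chosen replicas that share this node's fault domains.
        same_region = sum(c.region == node.region for c in chosen)
        same_rack = sum((c.region, c.rack) == (node.region, node.rack)
                        for c in chosen)
        return w_region * same_region + w_rack * same_rack + w_load * node.load

    while len(chosen) < count:
        if not candidates:
            raise ValueError("not enough nodes for the requested replica count")
        # Break cost ties by name so the result is deterministic.
        best = min(candidates, key=lambda n: (cost(n), n.name))
        chosen.append(best)
        candidates.remove(best)
    return chosen

nodes = [
    Node("a", "us-east", "r1", 0.9),
    Node("b", "us-east", "r1", 0.2),
    Node("c", "us-east", "r2", 0.3),
    Node("d", "eu-west", "r1", 0.6),
]
print([n.name for n in choose_replicas(nodes, 3)])  # ['b', 'd', 'c']
```

With these weights the second pick is the moderately loaded `eu-west` node rather than the lightly loaded node `c`, because sharing a region costs more than the load difference. Setting `w_region=0` makes the same code prefer `c` second, which illustrates how the weighting encodes the application's priorities.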