A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Consistent Hashing

Graduate · Depth 87 in the knowledge graph

3 topics build on this

476 prerequisites beneath it

Core Idea

Consistent hashing maps both keys and nodes to a ring; a key is assigned to the nearest node clockwise. When a node joins or leaves, only keys in a contiguous range need reassignment, minimizing data movement. This enables dynamic scaling without disrupting unaffected keys and is used in caches (Memcached), CDNs, and DHTs.

Explainer

You already know how hash tables work: hash the key, compute an index with modular arithmetic (`hash(key) % n`), and store the value at that index. This works beautifully on a single machine. But in a distributed system with *n* server nodes, the same approach — `hash(key) % n` to pick a server — has a fatal flaw. When you add or remove a server, *n* changes, and nearly every key maps to a different server. If you go from 10 to 11 servers, roughly 90% of your keys need to move. For a distributed cache, that means a near-total cache miss storm; for a storage system, it means massive data migration.
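The "roughly 90%" figure can be checked directly. A key stays on the same server only when `hash % 10` and `hash % 11` agree, which happens for hash values whose remainder modulo 110 is below 10, or 10 out of every 110 values (about 9%). The sketch below measures this empirically. The helper names (`stable_hash`, `moved_fraction`) and the choice of SHA-256 are illustrative assumptions, not part of any particular system.

```python
import hashlib

def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash of a string.

    Python's built-in hash() for str is salted per process, so a
    cryptographic digest is used here purely to get repeatable values.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose server index changes under hash % n."""
    moved = sum(
        1 for k in keys
        if stable_hash(k) % old_n != stable_hash(k) % new_n
    )
    return moved / len(keys)

# Hypothetical workload: 20,000 synthetic keys.
keys = [f"key-{i}" for i in range(20_000)]
fraction = moved_fraction(keys, 10, 11)
print(f"{fraction:.1%} of keys change servers when going from 10 to 11")
```

With a well-mixed hash the measured fraction should land close to 10/11, matching the estimate in the text.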

Consistent hashing solves this by arranging the hash space into a ring (imagine the numbers 0 through 2³² - 1 wrapped into a circle). Both keys and nodes are hashed onto this ring using the same hash function. To find which node owns a key, you start at the key's position on the ring and walk clockwise until you hit a node — that node is responsible for the key. The elegant consequence is that when a node joins, it takes over only the keys in the arc between it and the next node counterclockwise. When a node leaves, only its keys need to be reassigned to the next node clockwise. In both cases, the vast majority of keys stay exactly where they are.
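The clockwise walk described above can be sketched with a sorted list of node positions and a binary search. This is a minimal illustration under stated assumptions: the class name `HashRing`, the node labels, and the use of the first 4 bytes of SHA-256 as a 32-bit ring position are all hypothetical choices, and each node occupies exactly one point on the ring.

```python
import bisect
import hashlib

def ring_position(label: str) -> int:
    """Map a string to a point on the ring 0 .. 2**32 - 1."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")

class HashRing:
    def __init__(self, nodes=()):
        self._positions = []   # sorted ring positions of all nodes
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        pos = ring_position(node)
        if pos in self._owner:
            raise ValueError(f"position collision for {node!r}")
        bisect.insort(self._positions, pos)
        self._owner[pos] = node

    def remove_node(self, node: str) -> None:
        pos = ring_position(node)
        if self._owner.get(pos) != node:
            raise KeyError(node)
        self._positions.remove(pos)
        del self._owner[pos]

    def lookup(self, key: str) -> str:
        """Return the first node at or clockwise from the key's position."""
        if not self._positions:
            raise LookupError("ring has no nodes")
        idx = bisect.bisect_left(self._positions, ring_position(key))
        if idx == len(self._positions):
            idx = 0  # walked past the largest position: wrap around
        return self._owner[self._positions[idx]]

# Usage: adding a node only moves keys onto that new node.
ring = HashRing(["node-A", "node-B", "node-C"])
keys = [f"key-{i}" for i in range(10_000)]
before = {k: ring.lookup(k) for k in keys}
ring.add_node("node-D")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved")
print("new owners of moved keys:", {after[k] for k in moved})
```

Every key that changes owner ends up on the newly added node, and removing that node again restores the original assignment. A production ring would typically also handle replication and position collisions more gracefully than this sketch does.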

The naive version has a practical problem: with only a few nodes, the arcs between them can be very uneven, leading to severe load imbalance — one node might own 60% of the key space while another owns 5%. The standard fix is virtual nodes (vnodes): instead of placing each physical node at one point on the ring, you place it at many points (say, 100-200) by hashing variations of its identifier (e.g., "nodeA-1", "nodeA-2", ...). This spreads each node's responsibility across many small arcs, and the law of large numbers smooths out the distribution. When a physical node leaves, its load is distributed across many other nodes rather than dumped onto a single successor.
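To see the effect of virtual nodes, the sketch below places each physical node at several ring points by hashing labels of the form "nodeA-1", "nodeA-2", and so on, then compares how a synthetic key set is shared out with 1 point per node versus 150. The function names, node labels, key set, and hash choice are assumptions made for illustration only.

```python
import bisect
import hashlib
from collections import Counter

def ring_position(label: str) -> int:
    """Map a string to a point on the ring 0 .. 2**32 - 1."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")

def build_ring(nodes, vnodes_per_node: int):
    """Return parallel lists: sorted ring positions and their physical owners."""
    points = sorted(
        (ring_position(f"{node}-{i}"), node)
        for node in nodes
        for i in range(1, vnodes_per_node + 1)
    )
    positions = [pos for pos, _ in points]
    owners = [node for _, node in points]
    return positions, owners

def lookup(positions, owners, key: str) -> str:
    """First virtual node at or clockwise from the key, wrapping at the end."""
    idx = bisect.bisect_left(positions, ring_position(key)) % len(positions)
    return owners[idx]

def load_shares(nodes, vnodes_per_node: int, keys):
    """Fraction of the given keys owned by each physical node."""
    positions, owners = build_ring(nodes, vnodes_per_node)
    counts = Counter(lookup(positions, owners, k) for k in keys)
    return {node: counts[node] / len(keys) for node in nodes}

nodes = ["nodeA", "nodeB", "nodeC", "nodeD"]
keys = [f"key-{i}" for i in range(20_000)]
single = load_shares(nodes, 1, keys)
virtual = load_shares(nodes, 150, keys)
print("1 point per node:   ", single)
print("150 vnodes per node:", virtual)
```

With four nodes a perfectly even split would give each node 25% of the keys. The 150-vnode shares should sit close to that, while the single-point shares depend entirely on where four hash values happen to fall.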

Consistent hashing is foundational infrastructure in distributed systems. Amazon's Dynamo uses it to partition data across storage nodes. Memcached client libraries commonly use it (for example, the ketama scheme) to distribute cache keys across servers. CDNs use it to route requests to edge servers. Redis Cluster, by contrast, does not use a hash ring: it maps keys to 16,384 fixed hash slots that are assigned to nodes. The core insight is simple but profound: by decoupling the hash space from the number of nodes, you make the system elastic. Nodes can come and go with minimal disruption, which is exactly the property you need in systems designed to scale horizontally and tolerate failures.

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → 
Boolean Algebra → Boolean Type and Truth Values → Comparison Operators and Boolean Tests → Logical Operators and Boolean Algebra → Conditional Statements → Defining and Calling Functions → Functions: Decomposing Problems → Function Parameters and Argument Passing → Return Values → Variable Scope → Introduction to Classes → Objects and Instances → Methods and Attributes → Algorithm Design Basics → Tree Structure and Node Properties → Binary Trees → Binary Tree Properties: Height, Balance, Completeness → Amortized Analysis → Hash Tables → Consistent Hashing

Longest path: 88 steps · 476 total prerequisite topics

Prerequisites (3)

Hash Tables (hard) · Introduction to Distributed Systems (soft) · Modular Arithmetic and Congruences (soft)

Leads To (2)

Data Sharding and Partitioning Strategies (hard) · Distributed Hash Tables and DHT (hard)