A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Hash Tables: Collision Resolution by Chaining

College · Depth 88 in the knowledge graph

6 topics build on this

471 prerequisites beneath it

Core Idea

Chaining stores colliding keys in a linked list at each bucket. Search, insert, and delete take O(1 + α) expected time, where α = n/m is the load factor (n keys in m buckets). A higher α means longer chains on average; rehashing into a larger table when α exceeds a threshold maintains performance.

Explainer

From your study of hash functions, you know that a hash function maps keys to array indices (buckets), enabling O(1) expected-time lookups. But no matter how good the hash function is, collisions (two different keys mapping to the same bucket) have to be handled: they are guaranteed once the number of stored keys exceeds the number of buckets, and they become likely well before that point. The chaining strategy handles collisions in the most intuitive way: each bucket holds not a single key, but a linked list (or other collection) of all keys that hash to that index. When a collision occurs, the new key is simply added to the list at that bucket.
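
A minimal sketch of the collision problem, assuming a toy hash function `key % m` chosen only for illustration (the names `bucket_index` and `count_collisions` are hypothetical). It suggests two things: a collision can occur with only a few keys, and with more keys than buckets at least one collision cannot be avoided.

```python
def bucket_index(key: int, m: int) -> int:
    """Toy hash function (an assumption for illustration): key modulo m."""
    return key % m


def count_collisions(keys, m):
    """Count keys that land in a bucket already used by an earlier key."""
    occupied = set()
    collisions = 0
    for key in keys:
        idx = bucket_index(key, m)
        if idx in occupied:
            collisions += 1
        else:
            occupied.add(idx)
    return collisions


m = 8

# Only 3 keys in 8 buckets, yet 10 and 18 both map to bucket 2.
few_keys = [10, 18, 5]
print(count_collisions(few_keys, m))   # 1

# 9 distinct keys in 8 buckets: at least one collision is unavoidable.
many_keys = list(range(9))
print(count_collisions(many_keys, m))  # 1
```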

The operations under chaining are straightforward. To insert a key, compute its hash to find the bucket, then prepend the key to that bucket's list. This takes O(1) time, provided the key is known not to be in the table already; otherwise the chain must be searched first. To search for a key, hash it to find the bucket, then walk the linked list comparing keys until you find a match or reach the end. To delete, search for the key and remove it from the list. The cost of search and delete depends on the length of the chain at the target bucket. If the hash function distributes keys uniformly across m buckets and there are n total keys, the expected chain length is α = n/m, called the load factor. This means search takes O(1 + α) expected time: O(1) to compute the hash and access the bucket, plus O(α) to traverse the chain.
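
The three operations can be sketched as follows. This is an illustrative implementation under stated assumptions, not a reference one: the class names `Node` and `ChainedHashTable` are hypothetical, Python's built-in `hash()` stands in for the hash function, each key carries a value, and the table never resizes. Here `insert` checks the chain for an existing key before prepending, so it costs O(1 + α) rather than the O(1) of a blind prepend.

```python
class Node:
    """One link in a bucket's chain."""

    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node


class ChainedHashTable:
    """Fixed-size hash table with separate chaining (no resizing)."""

    def __init__(self, num_buckets=8):
        self.buckets = [None] * num_buckets  # head node of each chain
        self.size = 0                        # n, the number of stored keys

    def _index(self, key):
        # Assumption: Python's built-in hash() plays the role of the hash function.
        return hash(key) % len(self.buckets)

    def search(self, key):
        """Walk the chain; return the value, or None if the key is absent."""
        node = self.buckets[self._index(key)]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return None

    def insert(self, key, value):
        """Update an existing key, otherwise prepend a new node."""
        idx = self._index(key)
        node = self.buckets[idx]
        while node is not None:              # duplicate check: O(chain length)
            if node.key == key:
                node.value = value
                return
            node = node.next
        self.buckets[idx] = Node(key, value, self.buckets[idx])  # O(1) prepend
        self.size += 1

    def delete(self, key):
        """Unlink the node holding key; return True if it was present."""
        idx = self._index(key)
        prev, node = None, self.buckets[idx]
        while node is not None:
            if node.key == key:
                if prev is None:
                    self.buckets[idx] = node.next
                else:
                    prev.next = node.next
                self.size -= 1
                return True
            prev, node = node, node.next
        return False

    def load_factor(self):
        """alpha = n / m."""
        return self.size / len(self.buckets)


table = ChainedHashTable(num_buckets=4)
for k in (1, 5, 9):          # all three keys hash to bucket 1 when m = 4
    table.insert(k, str(k))
print(table.search(5))       # 5
print(table.delete(5))       # True
print(table.search(5))       # None
print(table.load_factor())   # 0.5
```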

The load factor is the single most important number governing a chained hash table's performance. When α is small (say, 0.5 to 1.0), most chains are very short (zero or one elements) and operations are effectively O(1). As α grows, chains lengthen and expected search time grows in proportion. The worst case is O(n), reached when all keys land in the same bucket, which a poor hash function can cause at any load factor. To keep chains short, implementations rehash when α crosses a threshold: allocate a new, larger array (typically double the size), recompute the bucket for every existing key, and insert them into the new array. This is an O(n) operation, but because the array doubles each time, rehashes become rare enough that the amortized cost of insertion remains O(1). The choice of rehash threshold balances memory usage against chain length: a lower threshold wastes more space but keeps chains shorter.
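
Rehashing on a load-factor threshold might look like the sketch below. The assumptions are illustrative: the class name `ResizingChainedSet` and the counter `rehash_count` are hypothetical, the threshold of 1.0 is an arbitrary choice, Python lists serve as the chains, and the built-in `hash()` stands in for the hash function.

```python
class ResizingChainedSet:
    """Set of keys using chaining; doubles the bucket array when the
    load factor exceeds max_load (the threshold value is an assumption)."""

    def __init__(self, num_buckets=4, max_load=1.0):
        self.buckets = [[] for _ in range(num_buckets)]
        self.size = 0
        self.max_load = max_load
        self.rehash_count = 0  # bookkeeping for the demo only

    def _index(self, key, m):
        return hash(key) % m

    def contains(self, key):
        return key in self.buckets[self._index(key, len(self.buckets))]

    def add(self, key):
        if self.contains(key):
            return
        self.buckets[self._index(key, len(self.buckets))].append(key)
        self.size += 1
        # Rehash once alpha = n / m crosses the threshold.
        if self.size / len(self.buckets) > self.max_load:
            self._rehash(2 * len(self.buckets))

    def _rehash(self, new_m):
        """O(n): recompute every key's bucket in a larger array."""
        new_buckets = [[] for _ in range(new_m)]
        for chain in self.buckets:
            for key in chain:
                new_buckets[self._index(key, new_m)].append(key)
        self.buckets = new_buckets
        self.rehash_count += 1


s = ResizingChainedSet(num_buckets=4, max_load=1.0)
for k in range(100):
    s.add(k)
print(len(s.buckets))                          # 128
print(s.rehash_count)                          # 5
print(s.size / len(s.buckets) <= s.max_load)   # True
```

In this run, 100 insertions trigger only 5 rehashes (at sizes 5, 9, 17, 33, and 65), which is the pattern behind the amortized O(1) insertion cost.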

Chaining has several practical advantages over its main alternative, open addressing (where colliding keys probe for empty slots within the array itself). Chaining degrades gracefully as the load factor increases: performance worsens linearly in α rather than catastrophically. Deletion is simple and does not create complications like "tombstone" markers. And the chains can use any collection type: a linked list is simplest, but high-performance implementations sometimes use balanced BSTs (as Java's HashMap does for long chains) or dynamic arrays for better cache locality. The tradeoff is that each chain node requires a pointer, which adds memory overhead and makes chaining less cache-friendly than open addressing at low load factors. Understanding chaining gives you a clear mental model of how hash tables handle the collision problem, and sets the stage for studying open addressing as the alternative approach.
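
The linear degradation can be observed with a small simulation. This is a sketch under the assumption that a seeded pseudo-random bucket choice models a uniformly distributing hash function; the function name `chain_lengths` is hypothetical. The mean chain length equals α exactly, since the n keys are spread over m buckets, while the longest chain depends on the random draw and is printed without a predicted value.

```python
import random


def chain_lengths(n, m, seed=0):
    """Drop n keys into m buckets uniformly at random (a stand-in for a
    well-behaved hash function); return each bucket's chain length."""
    rng = random.Random(seed)
    lengths = [0] * m
    for _ in range(n):
        lengths[rng.randrange(m)] += 1
    return lengths


m = 1000
for alpha in (0.5, 1, 2, 4, 8):
    lengths = chain_lengths(int(alpha * m), m)
    mean = sum(lengths) / m  # equals alpha, the cost of an unsuccessful search
    print(f"alpha={alpha}: mean chain length={mean}, longest chain={max(lengths)}")
```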

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → 
Boolean Algebra → Boolean Type and Truth Values → Comparison Operators and Boolean Tests → Logical Operators and Boolean Algebra → Conditional Statements → Defining and Calling Functions → Functions: Decomposing Problems → Function Parameters and Argument Passing → Return Values → Variable Scope → Introduction to Classes → Objects and Instances → Methods and Attributes → Algorithm Design Basics → Tree Structure and Node Properties → Binary Trees → Binary Tree Properties: Height, Balance, Completeness → Amortized Analysis → Hash Tables → Hash Function Design: Properties and Requirements → Hash Tables: Collision Resolution by Chaining

Longest path: 89 steps · 471 total prerequisite topics

Prerequisites (2)

Hash Function Design: Properties and Requirements (hard) · Hash Tables (soft)

Leads To (1)

Universal and Perfect Hashing (soft)