A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Hash Tables: Collision Resolution by Open Addressing

College · Depth 88 in the knowledge graph

471 prerequisites beneath it


Core Idea

Open addressing probes for an empty slot when a collision occurs. Linear probing (i+1, i+2, ...) is simple but suffers from clustering. Quadratic probing (i+1², i+2², ...) and double hashing (step size taken from a second hash function) reduce clustering. The load factor α must stay below a resize threshold, typically somewhere between 0.5 and 0.75.

Explainer

From your study of hash functions and hash tables, you know that a hash function maps keys to array indices and that collisions — two different keys mapping to the same index — are inevitable when the key space is larger than the table. The question is what to do when a collision occurs. In open addressing, the answer is: look for another empty slot *within the same array*. Unlike chaining (which stores colliding keys in linked lists), open addressing keeps everything in a single contiguous array, which gives it excellent cache performance since the CPU can prefetch nearby slots.

Linear probing is the simplest scheme: if slot h(k) is occupied, try h(k)+1, then h(k)+2, and so on (wrapping around at the end). It is cache-friendly because the probe sequence accesses consecutive memory locations. However, it suffers from primary clustering — occupied slots tend to clump together into long runs. Once a cluster forms, any new key that hashes into any position within the cluster will extend it further, making the cluster grow faster than expected. As the table fills up, these clusters merge into massive contiguous blocks, and the expected number of probes per operation grows sharply.
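
The probing rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than a production table: it assumes integer keys, uses the hypothetical hash h(k) = k mod m, and the names `linear_probes`, `insert`, and `search` are invented for this example.

```python
def linear_probes(home, m):
    """Yield home, home+1, home+2, ... wrapping around modulo m."""
    for i in range(m):
        yield (home + i) % m


def insert(table, key):
    """Store key in the first free slot of its probe sequence.

    Returns the number of slots examined. Assumes h(k) = k % m.
    """
    m = len(table)
    for probes, idx in enumerate(linear_probes(key % m, m), start=1):
        if table[idx] is None:
            table[idx] = key
            return probes
    raise RuntimeError("table is full")


def search(table, key):
    """Return the slot index holding key, or None if it is absent."""
    m = len(table)
    for idx in linear_probes(key % m, m):
        if table[idx] is None:   # an empty slot ends the probe sequence
            return None
        if table[idx] == key:
            return idx
    return None


table = [None] * 11
for key in (22, 33, 44):   # all three hash to slot 0 and form a run 0..2
    insert(table, key)

# Key 1 hashes to slot 1, which lies inside the run, so it is pushed to
# slot 3 and makes the cluster one slot longer.
probes_for_1 = insert(table, 1)
print(probes_for_1, table[:5])   # 3 [22, 33, 44, 1, None]
```

Key 1 never collided with 22, 33, or 44 at its home slot, yet it still pays three probes and lengthens the run. That is primary clustering in miniature.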

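Quadratic probing addresses clustering by spacing out the probe sequence: try h(k)+1², h(k)+2², h(k)+3², and so on, with every index taken modulo the table size m. Because the jumps grow larger, keys that collide at the same initial slot spread across the table rather than piling up in adjacent cells. This eliminates primary clustering but introduces secondary clustering: keys with the same initial hash value still follow the same probe sequence, so they compete with each other. A quadratic sequence also does not necessarily visit every slot; with a prime table size it is guaranteed to reach an empty slot only while the table is less than half full.

Double hashing goes further by using a second, independent hash function to determine the probe step: the sequence is h₁(k), h₁(k)+h₂(k), h₁(k)+2·h₂(k), and so on, again modulo m. The step h₂(k) must never be zero, and it should be relatively prime to m (for example, m prime and 1 ≤ h₂(k) < m) so that the sequence can reach every slot. Since different keys (even those with the same h₁ value) will typically have different h₂ values, both primary and secondary clustering are largely avoided, and the probe sequences are effectively unique per key.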
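
To make the difference between the two schemes concrete, the sketch below generates both probe sequences for two keys that share a home slot. The hash functions are assumptions chosen for illustration only (h₁(k) = k mod m and h₂(k) = 1 + k mod (m−1)), and the function names are hypothetical.

```python
from itertools import islice


def quadratic_probes(home, m):
    """Yield home + i*i (mod m) for i = 0, 1, 2, ..."""
    for i in range(m):
        yield (home + i * i) % m


def double_hash_probes(key, m):
    """Yield h1 + i*h2 (mod m), with illustrative h1 and h2.

    Assumed hash functions (not from the text): h1(k) = k % m and
    h2(k) = 1 + k % (m - 1), so the step is never zero. With m prime,
    every step is relatively prime to m and all m slots are reached.
    """
    h1 = key % m
    h2 = 1 + key % (m - 1)
    for i in range(m):
        yield (h1 + i * h2) % m


m = 11  # prime table size
# Keys 3 and 14 share the home slot 3.
print(list(islice(quadratic_probes(3 % m, m), 5)))    # [3, 4, 7, 1, 8]
print(list(islice(quadratic_probes(14 % m, m), 5)))   # [3, 4, 7, 1, 8]
print(list(islice(double_hash_probes(3, m), 5)))      # [3, 7, 0, 4, 8]
print(list(islice(double_hash_probes(14, m), 5)))     # [3, 8, 2, 7, 1]

# Distinct slots reached in m probes:
print(len(set(quadratic_probes(3, m))))               # 6
print(len(set(double_hash_probes(3, m))))             # 11
```

The two quadratic sequences are identical, which is secondary clustering, while the double-hashing sequences diverge after the first probe. The last two lines show that this quadratic sequence touches only 6 of the 11 slots, whereas the double-hashing sequence reaches all of them.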

The load factor α = n/m (number of stored keys divided by table size) is the critical performance parameter. For linear probing with a good hash function, the expected number of probes for an unsuccessful search or an insertion is roughly (1 + 1/(1-α)²)/2, while a successful search needs roughly (1 + 1/(1-α))/2. At α = 0.5, an unsuccessful search takes about 2.5 probes, which is fast. At α = 0.9, it jumps to about 50 probes, which is unacceptable. This is why open-addressing tables resize (typically doubling) when α exceeds a threshold, commonly 0.5 for linear probing or 0.75 for double hashing.

Deletion is also tricky: you cannot simply empty a slot, because that would break the probe sequences of keys that were inserted later and probed past that slot. Instead, deleted slots are marked with a tombstone sentinel value that tells searches to keep probing but allows insertions to reuse the space. Accumulating too many tombstones degrades performance, which is another reason periodic resizing (which rebuilds the table without tombstones) is important.


Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → 
Boolean Algebra → Boolean Type and Truth Values → Comparison Operators and Boolean Tests → Logical Operators and Boolean Algebra → Conditional Statements → Defining and Calling Functions → Functions: Decomposing Problems → Function Parameters and Argument Passing → Return Values → Variable Scope → Introduction to Classes → Objects and Instances → Methods and Attributes → Algorithm Design Basics → Tree Structure and Node Properties → Binary Trees → Binary Tree Properties: Height, Balance, Completeness → Amortized Analysis → Hash Tables → Hash Function Design: Properties and Requirements → Hash Tables: Collision Resolution by Open Addressing

Longest path: 89 steps · 471 total prerequisite topics

Prerequisites (2)

Hash Function Design: Properties and Requirements (hard) · Hash Tables (soft)

Leads To (0)

No topics depend on this one yet.