Hash Indexes

College Depth 64 in the knowledge graph I know this Set as goal
Unlocks 9 downstream topics
hash index equality lookup hash table index structure point query

Core Idea

Hash indexes use a hash function to map key values to bucket locations, enabling O(1) average-case equality lookups that are faster than B-tree traversals for point queries. However, because the hash function destroys key ordering, hash indexes cannot support range queries, prefix matching, or ordered scans. Dynamic hashing schemes (extendible hashing, linear hashing) allow the index to grow and shrink without a full rebuild as the dataset changes.

How It's Best Learned

Implement a simple extendible hash index on a mock dataset and test equality lookups vs. range scans. Compare execution plans between a B-tree-indexed column and a hash-indexed column for both equality and range queries.

Common Misconceptions

Explainer

You already know the two core ideas behind hash indexes from your prerequisites: indexing speeds up queries by avoiding full table scans, and hash tables map keys to positions using a hash function. A hash index applies this same principle to database storage — instead of scanning every row to find where `email = '[email protected]'`, the database hashes the search key, jumps directly to the corresponding bucket, and finds matching rows in O(1) average time.

The mechanics work like an in-memory hash table, adapted for disk. A hash function takes the indexed column's value and produces a bucket number. Each bucket stores pointers to the actual rows on disk (or the rows themselves in some implementations). When you execute `SELECT * FROM users WHERE email = '[email protected]'`, the database hashes `'[email protected]'`, looks up the bucket, and follows the pointer to the row. No tree traversal, no binary search — just a direct lookup. For equality queries (exact match on a key), this is faster than a B-tree, which requires traversing multiple levels of a tree structure.

The critical limitation is that hashing destroys ordering. Two keys that are adjacent in sort order (like `'alice'` and `'bob'`) might hash to completely different buckets. This means hash indexes are useless for range queries (`WHERE age > 30`), prefix searches (`WHERE name LIKE 'Ali%'`), or anything requiring sorted output (`ORDER BY`). A B-tree preserves key order in its leaf nodes and handles all these operations naturally. This is why B-trees remain the default index type in most databases — they're slightly slower for equality lookups but vastly more versatile.

As your dataset grows, a static hash table with a fixed number of buckets becomes inefficient — too many entries per bucket degrade lookups. Dynamic hashing schemes solve this. Extendible hashing uses a directory of pointers to buckets, doubling the directory and splitting only the overflowing bucket when needed. Linear hashing splits buckets one at a time in a round-robin fashion, spreading the cost of growth evenly over insertions. Both approaches let the index grow smoothly without the stop-the-world cost of rehashing every key at once — an important property for databases that must remain responsive during writes.

Practice Questions 5 questions

Prerequisite Chain

Counting to 10Counting to 20Understanding ZeroThe Number ZeroCounting to FiveOne-to-One CorrespondenceCombining Small Groups Within 5Addition Within 10Addition Within 20Two-Digit Addition Without RegroupingTwo-Digit Addition with RegroupingAddition Within 100Repeated Addition as MultiplicationMultiplication Facts Within 100Division as Equal SharingDivision as Grouping (Measurement Division)Division: Grouping (Repeated Subtraction) ModelDivision: Fair Sharing ModelDivision as Equal SharingDivision as GroupingBasic Division FactsDivision Facts Within 100Two-Digit by One-Digit DivisionDivision with RemaindersRemainders and Quotients in DivisionDivision Word ProblemsIntroduction to Long DivisionFactors and MultiplesPrime and Composite NumbersEquivalent FractionsRelating Fractions and DecimalsDecimal Place ValueReading and Writing DecimalsComparing and Ordering DecimalsAdding and Subtracting DecimalsMultiplying DecimalsDividing DecimalsDividing FractionsMixed Number ArithmeticOrder of OperationsInteger Order of OperationsVariable ExpressionsCombining Like TermsOne-Step EquationsTwo-Step EquationsSolving Multi-Step EquationsEquations with Variables on Both SidesLiteral EquationsSlope-Intercept FormPoint-Slope FormWriting Linear EquationsParallel and Perpendicular Line SlopesGraphing Linear EquationsPiecewise FunctionsStep FunctionsComposition of FunctionsInverse FunctionsRadical Functions and GraphsRational ExponentsExponential Functions and GraphsLogarithms IntroductionTime and Space ComplexityAmortized AnalysisHash TablesHash Indexes

Longest path: 65 steps · 345 total prerequisite topics

Prerequisites (2)

Leads To (2)