A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

DISTINCT: Eliminating Duplicate Rows

College · Depth 73 in the knowledge graph

334 prerequisites beneath it

Core Idea

The DISTINCT keyword removes duplicate rows from query results, keeping only unique combinations of the selected columns. It is useful for exploratory analysis to understand the range of values in a dataset.

How It's Best Learned

Start with simple single-column DISTINCT queries, then apply it to multi-column selects to understand how uniqueness is determined.

Common Misconceptions

DISTINCT does not modify the underlying table; it only removes duplicate rows from the result set. In some databases (PostgreSQL and SQL Server, for example), a `SELECT DISTINCT` query can only ORDER BY expressions that also appear in the SELECT list.

Explainer

When you run a SELECT query, the result set can contain duplicate rows — especially after joins or when selecting a subset of columns. If you select just the `city` column from a million-row customer table, you might get the same city name thousands of times. DISTINCT tells the database to collapse these duplicates, returning only one row for each unique combination of values in your selected columns.
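The `city` example above can be tried on a small scale. This is a minimal sketch using Python's built-in `sqlite3` module; the `customers` table, its columns, and its five rows are made up for illustration, and other databases may return rows in a different order.

```python
import sqlite3

# Hypothetical in-memory table; names and rows are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ana", "Portland"), ("Ben", "Austin"), ("Cy", "Portland"),
     ("Di", "Austin"), ("Ed", "Boise")],
)

# Without DISTINCT: one output row per customer, so city names repeat.
all_cities = [row[0] for row in conn.execute("SELECT city FROM customers")]

# With DISTINCT: one output row per unique city. The result is sorted in
# Python because DISTINCT by itself does not promise any row order.
unique_cities = sorted(
    row[0] for row in conn.execute("SELECT DISTINCT city FROM customers")
)

# The table is untouched; DISTINCT only shapes the result set.
row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

print(len(all_cities), unique_cities, row_count)
```

The plain query yields five city values, the DISTINCT query yields three, and the table still holds all five rows afterwards.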

The key insight is that DISTINCT operates on the entire row of your result set, not on a single column. If you write `SELECT DISTINCT city, state FROM customers`, a row is considered a duplicate only if both the city and state match. Portland, Oregon and Portland, Maine are distinct rows even though the city name is the same. This means adding more columns to a DISTINCT query can never produce fewer rows and usually produces more, because there are more ways for combinations to be unique.
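To see row-level uniqueness in action, the sketch below runs DISTINCT over one, two, and three columns of the same data. The table contents and the `distinct_rows` helper are hypothetical and exist only for this illustration.

```python
import sqlite3

# Hypothetical sample data: two Portlands in different states, plus one
# repeated (city, state) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Ana", "Portland", "OR"),
        ("Ben", "Portland", "ME"),
        ("Cy", "Portland", "OR"),
        ("Di", "Salem", "OR"),
    ],
)


def distinct_rows(sql):
    """Run a query and return its rows sorted, so comparisons are stable."""
    return sorted(conn.execute(sql).fetchall())


# Uniqueness is judged on the whole selected row, so each extra column
# gives rows more ways to differ.
one_col = distinct_rows("SELECT DISTINCT city FROM customers")
two_col = distinct_rows("SELECT DISTINCT city, state FROM customers")
three_col = distinct_rows("SELECT DISTINCT name, city, state FROM customers")

print(len(one_col), len(two_col), len(three_col))
```

With this data the row counts grow from 2 to 3 to 4 as columns are added: the two Portlands merge when only `city` is selected and separate again once `state` is included.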

DISTINCT is most valuable during exploratory analysis, when you want to understand what values exist in a column before writing more complex queries. "What departments do we have?" (`SELECT DISTINCT department FROM employees`) or "Which product-category combinations exist?" are natural DISTINCT questions. It is also useful for quick sanity checks: if `SELECT COUNT(*)` returns 10,000 but `SELECT COUNT(DISTINCT customer_id)` on the same table returns only 8,500, you know some customers appear multiple times (or, because `COUNT(DISTINCT ...)` skips NULLs, that some rows have no `customer_id`).
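The sanity check described above can be sketched as follows. The `orders` table and its values are assumptions for the example; the follow-up `GROUP BY ... HAVING` query is one common way to see which ids repeat, not something the check itself requires.

```python
import sqlite3

# Hypothetical orders table in which some customers have several orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer_id) VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (3,)],
)

# Both queries return a single row holding a single number.
total_rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
unique_customers = conn.execute(
    "SELECT COUNT(DISTINCT customer_id) FROM orders"
).fetchone()[0]

# A gap between the two counts means some customer_id values repeat.
# Grouping shows which ones and how often.
repeats = conn.execute(
    """
    SELECT customer_id, COUNT(*) AS n
    FROM orders
    GROUP BY customer_id
    HAVING COUNT(*) > 1
    ORDER BY customer_id
    """
).fetchall()

print(total_rows, unique_customers, repeats)
```

Here six rows map to three unique customers, and the grouped query reports that customers 1 and 3 account for the difference.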

A common antipattern is using DISTINCT as a band-aid to hide a query bug. If a JOIN produces unexpected duplicates, slapping DISTINCT on the SELECT hides the symptom without fixing the cause, which is usually a missing join condition or an unintended many-to-many relationship. When you find yourself reaching for DISTINCT to "fix" duplicate rows, pause and ask whether the duplicates indicate a problem in your query logic rather than a legitimate need for deduplication. Also be aware that DISTINCT has a performance cost: the database typically has to sort or hash the entire result set to identify duplicates, which can be expensive on large datasets.
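The band-aid pattern is easier to recognize with a concrete case. In this sketch, a hypothetical `addresses` table stores both a billing and a shipping address for one customer, and the join omits the condition on `kind`. All table names, columns, and amounts are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount INTEGER);
    CREATE TABLE addresses (customer_id INTEGER, kind TEXT, street TEXT);
    INSERT INTO orders VALUES (1, 10, 100), (2, 10, 50), (3, 20, 25);
    INSERT INTO addresses VALUES
        (10, 'billing', '1 Main St'),
        (10, 'shipping', '2 Oak Ave'),
        (20, 'billing', '3 Elm Rd');
    """
)

# Buggy join: customer 10 has two addresses, so each of their orders
# appears twice in the joined result.
buggy_join = "FROM orders o JOIN addresses a ON a.customer_id = o.customer_id"
# Fixed join: the missing condition restricts each customer to one address.
fixed_join = buggy_join + " AND a.kind = 'billing'"


def order_ids(select, join):
    """Return the order ids produced by a query, sorted for comparison."""
    return sorted(row[0] for row in conn.execute(f"{select} {join}"))


buggy_ids = order_ids("SELECT o.order_id", buggy_join)
bandaid_ids = order_ids("SELECT DISTINCT o.order_id", buggy_join)
fixed_ids = order_ids("SELECT o.order_id", fixed_join)

# DISTINCT makes the id list look right, but the join underneath is still
# wrong: any aggregate over it counts the duplicated rows.
buggy_total = conn.execute(f"SELECT SUM(o.amount) {buggy_join}").fetchone()[0]
fixed_total = conn.execute(f"SELECT SUM(o.amount) {fixed_join}").fetchone()[0]

print(buggy_ids, bandaid_ids, fixed_ids, buggy_total, fixed_total)
```

The DISTINCT version and the corrected join return the same three order ids, but only the corrected join gives the right total (175 rather than 325), which is why fixing the join condition is preferable to masking the duplicates.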

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → Set Operations: Union, Intersection, and Complement → Relational Algebra → SQL: SELECT Statement and Basic Queries → SQL: WHERE Clause and Filtering → DELETE Statements: Removing Rows with Conditions → DISTINCT: Eliminating Duplicate Rows

Longest path: 74 steps · 334 total prerequisite topics

Prerequisites (2)

SQL: SELECT Statement and Basic Queries (hard)
DELETE Statements: Removing Rows with Conditions (soft)

Leads To (0)

No topics depend on this one yet.