The comparative method is the principal technique for establishing genetic relationships among languages and reconstructing their common ancestor (proto-language). It works by identifying cognates — words in related languages descended from a shared ancestral form — and establishing systematic sound correspondences between them. Because sound change is regular (the Neogrammarian hypothesis), each proto-sound yields predictable reflexes in each daughter language, allowing linguists to work backward from attested forms to reconstruct unattested proto-forms. The method produces not just individual reconstructed words but an entire reconstructed phonological system, and in favorable cases, aspects of morphology and syntax. Proto-Indo-European, reconstructed through two centuries of comparative work, remains the most thoroughly developed proto-language.
Work through a simplified reconstruction exercise with data from three or four related languages — align cognate sets, identify the regular sound correspondences, and reconstruct the proto-forms. Compare the Romance reflexes of Latin words (e.g., Latin "noctem" > Spanish "noche," French "nuit," Italian "notte") to see regular correspondences in a family where the proto-language is actually attested. Practice distinguishing true cognates from loanwords and chance resemblances.
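As a starting point for the Romance exercise, the following is a minimal sketch of a consistency check on one correspondence, the reflex of Latin medial -ct-. The sets other than "noctem" (octo, factum, tectum) are added here for illustration, and the names `CT_REFLEX`, `COGNATE_SETS`, and `mismatches` are hypothetical. The check works on spelling only, so it is a rough stand-in for a comparison of phonemic transcriptions.

```python
# Expected spelling of the reflex of Latin medial -ct- in each language,
# read off the "noctem" set (an assumption made for this sketch).
CT_REFLEX = {"Spanish": "ch", "French": "it", "Italian": "tt"}

# Latin source word -> attested Romance reflexes (standard orthography).
COGNATE_SETS = {
    "noctem": {"Spanish": "noche", "French": "nuit", "Italian": "notte"},
    "octo": {"Spanish": "ocho", "French": "huit", "Italian": "otto"},
    "factum": {"Spanish": "hecho", "French": "fait", "Italian": "fatto"},
    "tectum": {"Spanish": "techo", "French": "toit", "Italian": "tetto"},
}


def mismatches(sets, reflex):
    """List (latin, language, form) triples that lack the expected -ct- reflex."""
    return [
        (latin, lang, form)
        for latin, forms in sets.items()
        for lang, form in forms.items()
        if reflex[lang] not in form
    ]


problems = mismatches(COGNATE_SETS, CT_REFLEX)
print(problems or "every set shows the expected -ct- reflexes")
```

A substring test like this cannot tell where in the word the reflex occurs, so a real exercise would align the segments position by position before comparing them.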
From your work in historical linguistics and sound change, you know two foundational facts: languages evolve over time from earlier forms, and phonological change is regular — a given sound shifts in the same direction across all words in a given phonetic environment. The comparative method is the technique that converts those two facts into a reconstruction engine. If regular change produces predictable differences, then working backward from attested differences should let us infer the earlier form that produced them. That is exactly what the method does.
The procedure has four steps. First, identify cognates: words in different languages that share form and meaning because they descend from a common ancestor, not because of borrowing or coincidence. Latin "pater," Greek "patḗr," Sanskrit "pitár," and English "father" are cognates; they are not merely similar-sounding but match in meaning and in systematically related form across geographically distant branches of a single family, Indo-European. Second, align the cognate sets in columns and compare them systematically. Third, identify sound correspondences: the regular, predictable relationships between the sounds in each language at each position. Latin /p/ consistently corresponds to English /f/ at the start of words: pater/father, ped-/foot, piscis/fish. Fourth, posit a proto-sound that would have evolved into each attested reflex under known sound change rules. In this case, the Proto-Indo-European initial consonant was most likely *\*p*, which Latin, Greek, and Sanskrit preserved and which the Germanic languages, English among them, shifted to /f/ (Grimm's Law).
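Steps two and three can be sketched in a few lines of code. The example below tallies word-initial correspondences for the Latin and English pairs cited above. It is a toy illustration: the names `COGNATES` and `initial_correspondences` are hypothetical, and the forms are ordinary spellings rather than phonemic transcriptions.

```python
from collections import Counter

# Latin stem and English cognate, taken from the examples in the text.
COGNATES = [
    ("pater", "father"),
    ("ped", "foot"),
    ("piscis", "fish"),
]


def initial_correspondences(pairs):
    """Tally which word-initial segment in language A lines up with which in B."""
    return Counter((a[0], b[0]) for a, b in pairs)


counts = initial_correspondences(COGNATES)
for (latin, english), n in counts.most_common():
    print(f"Latin {latin}- : English {english}-  ({n} sets)")
```

A correspondence that recurs across many independent sets, as p : f does here, is the evidence the fourth step builds on. A pairing seen only once carries little weight.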
The crucial principle is the Neogrammarian hypothesis: sound change is exceptionless. When you find a word that appears to violate your correspondence table, the options are not "the sound change was irregular" but rather "this word was borrowed after the change," "it was borrowed from a dialect that underwent the change differently," "the form was reshaped by analogy," "the change was conditioned by an environment I have not yet identified," or "I have misidentified the cognate." The rigidity of this assumption is what makes reconstruction possible: it constrains the hypothesis space enough that a system can be recovered.
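In practice, this principle turns apparent exceptions into a list of items that need an explanation. The sketch below flags pairs that break an assumed Latin p : English f correspondence. The pairs with "pedal" and "paternal" are added here for illustration (both are later loans into English from Latin, so they keep /p/), and the names `EXPECTED`, `CANDIDATES`, and `residue` are hypothetical.

```python
# Assumed correspondence table: Latin initial -> expected English initial.
EXPECTED = {"p": "f"}

# Candidate pairs to test; the last two are loanwords, not inherited cognates.
CANDIDATES = [
    ("pater", "father"),
    ("piscis", "fish"),
    ("ped", "pedal"),
    ("pater", "paternal"),
]


def residue(pairs, expected):
    """Return the pairs whose initial segments break the expected correspondence."""
    return [
        (a, b)
        for a, b in pairs
        if a[0] in expected and b[0] != expected[a[0]]
    ]


flagged = residue(CANDIDATES, EXPECTED)
for latin, english in flagged:
    print(f"{latin} : {english} breaks p : f, so check for borrowing or analogy")
```

The code only identifies the residue. Deciding whether a flagged item is a loan, an analogical form, or a sign that the table itself is wrong remains a matter of linguistic argument.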
What the method ultimately produces is not a recording of the proto-language but a system of correspondences: a set of reconstructed phonemes (written with an asterisk to indicate they are not attested, only inferred) that stand in predictable relationships to attested forms. Proto-Indo-European *\*bher-* (to carry) is attested nowhere but predicts Sanskrit "bhar-", Greek "pher-", Latin "fer-", and English "bear", and it does predict them, correctly. The reconstructed form is less a claim about actual sounds spoken thousands of years ago than a formula that encodes the relationship between daughter languages. Its power is not realism but predictive precision.
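The "formula" reading of a reconstruction can be made concrete by running the sound changes forward. The sketch below derives the daughter stems from *\*bher-* with one heavily simplified rule per language. The rule lists are assumptions made for illustration and not a full account of any branch, the names `PROTO`, `RULES`, and `derive` are hypothetical, and Old English "ber-" (as in "beran") stands in for modern "bear," whose spelling reflects later vowel developments.

```python
# Reconstructed stem, written without the asterisk for string handling.
PROTO = "bher"

# One simplified sound change per language, as (old, new) substring rewrites.
RULES = {
    "Sanskrit": [("e", "a")],       # *e > a; the aspirate bh is kept
    "Greek": [("bh", "ph")],        # voiced aspirate > voiceless aspirate
    "Latin": [("bh", "f")],         # word-initial *bh > f
    "Old English": [("bh", "b")],   # Germanic: *bh > b
}


def derive(proto, rules):
    """Apply each (old, new) rewrite in order and return the resulting form."""
    form = proto
    for old, new in rules:
        form = form.replace(old, new)
    return form


reflexes = {lang: derive(PROTO, rules) for lang, rules in RULES.items()}
for lang, form in reflexes.items():
    print(f"*{PROTO}- > {lang} {form}-")
```

If a derived stem failed to match the attested one, either the reconstruction or the rule set would need revising, which is the sense in which the reconstructed form makes testable predictions.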