Syntacticians use empirical diagnostics to determine which words form grammatical units. Substitution (replacing a phrase with a pronoun), clefting (it's ____ that...), movement (can the phrase move?), and coordination reveal hidden phrase structure. These tests provide objective criteria for parsing ambiguous strings and uncovering constituents not marked by prosody or word boundaries.
Apply multiple tests to the same sentence; a phrase that passes one test should pass others. Contrast with sequences that fail most tests to sharpen intuitions.
From your work with syntactic structure, you know that sentences are organized into hierarchically nested phrases — not just a flat string of words. But how do syntacticians actually *discover* that structure? They can't look inside a speaker's head. Instead, they run linguistic experiments on the surface string: constituency tests are those experiments.
The core insight is that grammatical constituents behave differently from arbitrary word sequences. A constituent — a noun phrase, verb phrase, prepositional phrase — is a unit that the grammar treats as a single element. Non-constituent sequences are just accidental adjacencies. The tests exploit this distinction by probing whether a word string can do things only units can do.
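One way to make the distinction concrete is to treat a parse as a tree and ask which word strings correspond to a node. The sketch below is illustrative only: the bracketing in `TREE`, the category labels, and the helper names `leaves` and `constituent_strings` are assumptions made for this example, not a claim about the one correct analysis of the sentence.

```python
# Hypothetical bracketing of the example sentence (labels are assumptions).
TREE = ("S",
        ("NP",
         ("Det", "the"), ("Adj", "old"), ("N", "man"),
         ("PP",
          ("P", "from"),
          ("NP", ("Det", "the"), ("N", "corner"), ("N", "store")))),
        ("VP", ("V", "laughed")))


def leaves(node):
    """Return the words dominated by a node, in left-to-right order."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]          # a word-level node such as ("N", "man")
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def constituent_strings(node):
    """Collect the word string under every node of the tree."""
    label, *children = node
    found = {" ".join(leaves(node))}
    for child in children:
        if not isinstance(child, str):
            found |= constituent_strings(child)
    return found


units = constituent_strings(TREE)
print("the old man from the corner store" in units)  # corresponds to a node
print("the old man from" in units)                   # adjacent words, no node
```

Under this assumed tree, "the old man from" is a run of adjacent words that no single node dominates, which is what "accidental adjacency" amounts to. The tests described in the text are how one would argue for such a tree in the first place.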
Substitution (or pronominalization) is the most intuitive test: replace the candidate string with a pronoun or other pro-form. "The old man from the corner store laughed" → "He laughed." A single pronoun substitutes cleanly for the string "the old man from the corner store," which supports treating it as a constituent (a noun phrase). If the substitution yields an ungrammatical sentence, the string is likely not a constituent.
Clefting (the "It's ___ that..." construction) places a constituent in focus position: "It was the old man from the corner store that laughed." Try clefting a non-constituent such as "the old man from": "It was the old man from that the corner store laughed." The result is ungrammatical, and that failure is the diagnostic signal.
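Substitution and clefting can both be viewed as mechanical rewrites of the word string, with a speaker's judgment applied to the output. The following is a minimal sketch under that assumption; the function names `substitute` and `cleft` and the index-based span convention are hypothetical choices for illustration. The code only builds candidate sentences and does not decide whether they are grammatical.

```python
def substitute(words, start, end, pro_form):
    """Replace words[start:end] with a single pro-form."""
    return " ".join(words[:start] + [pro_form] + words[end:])


def cleft(words, start, end):
    """Put words[start:end] in the focus slot of an 'it was ___ that ...' frame."""
    focus = words[start:end]
    rest = words[:start] + words[end:]
    return " ".join(["it", "was"] + focus + ["that"] + rest)


sentence = "the old man from the corner store laughed".split()

# Candidate constituent: the first seven words (the whole subject).
print(substitute(sentence, 0, 7, "he"))
print(cleft(sentence, 0, 7))

# Candidate non-constituent: "the old man from".
print(cleft(sentence, 0, 4))
```

The first two outputs are sentences a speaker would accept; the third is the kind of string a speaker would reject. Keeping generation separate from judgment mirrors how the tests are used: the rewrite is mechanical, the acceptability call is empirical.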
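Movement tests work because syntax typically allows whole constituents, but not arbitrary word chunks, to front, topicalize, or undergo heavy-NP shift. "From the corner store, the old man laughed": the prepositional phrase moves to the front as a unit, which supports its constituent status. (Out of context the fronted phrase is most naturally read as modifying the laughing rather than the man, so movement results should be checked against the other tests.)
Coordination is among the most widely applicable tests: as a rule, only constituents of the same type can be conjoined. "[The old man from the corner store] and [the woman with the red umbrella] laughed": two NPs are coordinated, which supports treating both as constituents of the same type. Coordination can also overgenerate, since some strings that fail the other tests can still be conjoined, so it is best used alongside them.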
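Fronting and coordination can be sketched the same way, as string rewrites whose output is then judged by a speaker. The helpers `front` and `coordinate` below are hypothetical names introduced for this illustration, and the comma after the fronted phrase is just a presentational choice.

```python
def front(words, start, end):
    """Move words[start:end] to the beginning of the sentence."""
    moved = words[start:end]
    rest = words[:start] + words[end:]
    return " ".join(moved) + ", " + " ".join(rest)


def coordinate(words, end, other, conj="and"):
    """Conjoin the phrase ending at index `end` with a second phrase `other`."""
    return " ".join(words[:end] + [conj] + other + words[end:])


sentence = "the old man from the corner store laughed".split()

# Fronting the prepositional phrase (words 3 to 6).
print(front(sentence, 3, 7))

# Fronting an arbitrary chunk ("man from the") for contrast.
print(front(sentence, 2, 5))

# Coordinating the subject with a second noun phrase.
other_np = "the woman with the red umbrella".split()
print(coordinate(sentence, 7, other_np))
```

The second output is word salad, which is the expected result when the moved span is not a unit. As with the other tests, the code produces candidates only; whether each one is acceptable is a judgment the analyst has to collect.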
The methodological principle behind all of this is converging evidence: a genuine constituent should pass multiple tests, while a non-constituent should fail most of them. When tests diverge, with one passing and another failing, it signals either that an independent factor such as word order or phonology is blocking the test (so the failure does not show non-constituency) or that the case is subtler and requires additional analysis. Constituency testing transforms syntax from a set of labels to apply into an empirical practice of discovering hidden structure.
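The converging-evidence principle can be expressed as a simple tally over test outcomes. This is a rough sketch, not an established procedure: the function name `verdict`, the three-way classification, and the all-or-nothing thresholds are assumptions chosen for clarity, and the True/False values are illustrative speaker judgments entered by hand.

```python
def verdict(judgments):
    """Summarize pass/fail judgments from several constituency tests.

    `judgments` maps a test name to True (acceptable) or False (unacceptable).
    """
    passed = sum(judgments.values())
    total = len(judgments)
    if passed == total:
        return "likely a constituent"
    if passed == 0:
        return "likely not a constituent"
    # Divergent results: look for an independent blocking factor first.
    return "mixed evidence, needs further analysis"


# Illustrative judgments for "the old man from the corner store".
np_results = {"substitution": True, "clefting": True, "coordination": True}

# Illustrative judgments for "the old man from".
chunk_results = {"substitution": False, "clefting": False}

print(verdict(np_results))
print(verdict(chunk_results))
print(verdict({"substitution": True, "movement": False}))
```

The middle category matters most in practice: a mixed result is a prompt to ask why one test failed, not a reason to discard the string as a non-constituent.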