Stylometry uses computational analysis to identify authorial 'style' by measuring linguistic features (word frequencies, sentence length, punctuation patterns) across texts. Stylometric methods can solve attribution problems, identify ghostwriting, and reveal patterns invisible to close reading. In comparative literature, stylometry enables large-scale analysis of stylistic variation across languages, periods, and traditions. However, stylometry raises philosophical questions: Can style be meaningfully quantified? What is hidden when literature becomes numerical data?
Run stylometric analysis on a corpus of texts and interpret the results. Compare algorithmic findings with interpretive readings. Consider what stylometric evidence supports and what it obscures.
A common misconception is that stylometry reveals objective truth about texts. In fact, stylometric measures rest on interpretive choices (which features to measure?), and their meaning depends on theoretical framing. Quantification doesn't ensure objectivity.
You know from Moretti's distant reading that literary scholarship can operate on large corpora rather than individual texts, using aggregation to reveal patterns invisible to close reading. Stylometry is one of the most developed quantitative methods within this tradition, and it applies a specific wager: that authors leave measurable traces in the surface features of their prose — word frequencies, function word distributions, sentence length patterns, punctuation habits — and that these traces are stable enough to identify authorship even when content varies. The analogy is forensic: just as handwriting has distinctive features even when the message changes, writing style carries authorial fingerprints.
The best-known application is authorship attribution: determining who wrote a disputed or anonymous text. The Federalist Papers case is canonical. Statistical analysis of function word frequencies (words like "the," "of," "by") supported the attribution of the disputed papers to Madison rather than Hamilton. The reasoning is that writers use function words largely unconsciously, so their frequencies are harder to fake than the choice of content words. The technique has also been applied to Shakespeare's collaborators, Elena Ferrante's identity, and the detection of ghostwritten books. The insight is that style is not just what you consciously choose to say; it is also the unconscious rhythm of how you say it.
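The function-word idea can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the Federalist Papers studies: the word list, the distance measure (mean absolute difference of relative frequencies), the helper names, and the toy texts are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list.
# Real studies typically use dozens to hundreds of words.
FUNCTION_WORDS = ["the", "of", "by", "to", "and", "upon", "on", "in"]

def function_word_profile(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]

def profile_distance(a, b):
    """Mean absolute difference between two frequency profiles."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def closest_author(disputed, candidates):
    """Rank candidate authors by profile distance to the disputed text."""
    target = function_word_profile(disputed)
    distances = {
        name: profile_distance(target, function_word_profile(sample))
        for name, sample in candidates.items()
    }
    return min(distances, key=distances.get), distances

# Invented toy samples: one writer prefers "upon", the other "on".
candidates = {
    "author_a": "upon the hill upon the road upon the river",
    "author_b": "on the hill on the road on the river",
}
disputed = "upon the bridge upon the tower"

best, distances = closest_author(disputed, candidates)
print(best, distances)
```

The content words differ between the disputed text and both samples (bridge and tower versus hill, road and river), yet the preference for "upon" over "on" still separates the candidates. Every step here, from the word list to the distance measure, is a choice the analyst makes before any number appears.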
Stylometry raises immediate philosophical questions that any serious practitioner must engage. The first is which features to measure. Choosing word frequency over sentence rhythm, or counting punctuation rather than ignoring it, is not a neutral decision: each choice encodes assumptions about what constitutes "style." Different feature sets can yield different authorship conclusions for the same texts. Stylometric analysis is therefore not the mechanical production of truth; it is a series of interpretive choices about what counts as evidence, followed by computation, followed by more interpretation of what the numbers mean. Quantification does not remove the interpreter; it embeds the interpreter's assumptions in the algorithm.
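A toy sketch can make the dependence on feature choice concrete. The two features below (mean sentence length and commas per word), the nearest-value comparison, and the three invented passages are assumptions chosen for illustration. They are built so that the two features disagree, which real corpora may or may not do.

```python
import re

def mean_sentence_length(text):
    """Average number of words per sentence, splitting on . ! ?"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    lengths = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]
    return sum(lengths) / len(lengths)

def comma_rate(text):
    """Commas per word."""
    words = re.findall(r"[A-Za-z']+", text)
    return text.count(",") / len(words) if words else 0.0

def nearest(disputed, candidates, feature):
    """Return the candidate whose feature value is closest to the disputed text's."""
    target = feature(disputed)
    return min(candidates, key=lambda name: abs(feature(candidates[name]) - target))

# Invented passages: author_a writes short, comma-free sentences;
# author_b writes one long sentence with many commas.
candidates = {
    "author_a": "The sea was calm. The boat was small. We rowed home.",
    "author_b": "The sea, calm and grey, held the small boat, "
                "and we rowed, slowly, toward home.",
}
disputed = "Rain fell, cold. We ran, fast. Doors shut, hard."

by_length = nearest(disputed, candidates, mean_sentence_length)
by_commas = nearest(disputed, candidates, comma_rate)
print("by sentence length:", by_length)
print("by comma rate:", by_commas)
```

The same disputed passage is assigned to a different candidate depending on which feature is measured. Neither answer is wrong as arithmetic; they differ because each feature encodes a different assumption about where style lives.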
The deeper question is: what is "style" when made computational? Close reading assumes that style is meaningful: Hemingway's short sentences carry thematic weight, and Faulkner's long ones enact consciousness. Stylometry treats style as a byproduct of cognitive habit, largely unconscious and content-independent. These are genuinely different theories of what literary style is and does. The most sophisticated work in the field holds both, using computational methods to identify large-scale patterns and then returning to close reading to interpret what those patterns mean. The numbers answer "who?" and point toward "what pattern?"; interpretation answers "so what?" Distant and close reading are not rivals. They are sequential tools, each doing what the other cannot.