Preregistration involves publicly specifying hypotheses, variables, design, and analytical approach before data collection, creating accountability and distinguishing confirmatory hypothesis tests from exploratory post-hoc analyses. By constraining researcher degrees of freedom, it limits the opportunity for p-hacking while still allowing transparent exploration, provided that exploration is labeled as such. This practice improves reproducibility and protects against selective reporting.
Your background in research ethics and research question formulation establishes what good research is supposed to be: a transparent test of a specific prediction. Preregistration is the mechanism that enforces that standard at the moment it is most vulnerable — when the researcher sits down with data and has countless small decisions to make.
The core problem is researcher degrees of freedom: the large number of legitimate-looking analytic choices that face a researcher during and after data collection. Which participants to exclude? Which covariates to include? Which of several outcome measures to report as primary? Should you transform that skewed variable? When should you stop collecting data? Each choice seems defensible in isolation. But when every choice is made after observing how it affects the results (consciously or not), the researcher is no longer testing a hypothesis. They are searching the data for a pattern and then reporting it as if it were predicted. This process, known as p-hacking, inflates false positive rates far above the nominal 5% level. The replication crisis in psychology was in large part a consequence of the widespread, often unconscious, exploitation of researcher degrees of freedom.
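The inflation described above can be illustrated with a small simulation. The sketch below is a toy model, not a reconstruction of any particular study: it assumes normally distributed data with known unit variance, a simple two-sample z-test, two independent outcome measures, and one round of optional stopping. All function names and parameter values are made up for illustration. No true effect exists in either scenario, so every "significant" result is a false positive.

```python
import random
from statistics import NormalDist, mean

ALPHA = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided critical value, about 1.96


def z_significant(a, b):
    """Two-sample z-test assuming known unit variance; True if p < ALPHA."""
    se = (1 / len(a) + 1 / len(b)) ** 0.5
    return abs(mean(a) - mean(b)) / se > Z_CRIT


def draw(n, rng):
    """n standard-normal observations: there is no true group difference."""
    return [rng.gauss(0, 1) for _ in range(n)]


def preregistered_study(rng, n=20):
    # One outcome, one fixed sample size, one test, all decided in advance.
    return z_significant(draw(n, rng), draw(n, rng))


def flexible_study(rng, n=20, extra=10):
    # Two outcome measures, each a (control, treatment) pair of samples.
    # They are independent here for simplicity (an assumption of this sketch).
    outcomes = [(draw(n, rng), draw(n, rng)) for _ in range(2)]
    if any(z_significant(c, t) for c, t in outcomes):
        return True  # report whichever outcome "worked"
    # Optional stopping: nothing significant yet, so collect more and retest.
    outcomes = [(c + draw(extra, rng), t + draw(extra, rng)) for c, t in outcomes]
    return any(z_significant(c, t) for c, t in outcomes)


def false_positive_rate(study, n_sims=5000, seed=1):
    """Share of simulated null studies that report a significant result."""
    rng = random.Random(seed)
    return sum(study(rng) for _ in range(n_sims)) / n_sims


print("preregistered:", false_positive_rate(preregistered_study))
print("flexible:     ", false_positive_rate(flexible_study))
```

The preregistered rate should land close to the nominal 0.05, while the flexible rate should come out clearly higher. Choosing between two independent outcomes alone raises the expected rate to 1 − 0.95² ≈ 0.0975, and optional stopping adds to that. Real analyses typically involve many more such choices than the two modeled here.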
Preregistration closes this loophole by requiring researchers to commit publicly — before data collection — to their hypotheses, primary variables, sample size, exclusion criteria, and analysis plan. The commitment is filed on a registry such as OSF (Open Science Framework), timestamped, and retrievable. When a paper is later published, readers and reviewers can inspect what was predicted in advance. Analyses that match the preregistered plan are confirmatory: they constitute a genuine hypothesis test with interpretable error rates. Analyses that deviate from or go beyond the plan are exploratory: they generate hypotheses for future study but do not confirm them.
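The confirmatory/exploratory distinction can be made concrete with a toy data structure. The sketch below is purely illustrative: the `Plan` class, its fields, and the `label` function are hypothetical and do not correspond to OSF's actual registration format or API. It assumes a plan can be summarized by a few text fields, and it uses a SHA-256 fingerprint as a stand-in for the idea that a filed plan cannot be changed without the change being detectable.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Plan:
    """Hypothetical, simplified preregistration record (not a registry format)."""
    hypothesis: str
    primary_outcome: str
    sample_size: int
    exclusion_rule: str
    analysis: str

    def fingerprint(self) -> str:
        """SHA-256 of the plan's canonical JSON; changes if any field changes."""
        canonical = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def label(plan: Plan, outcome: str, analysis: str) -> str:
    """Label an analysis by whether it matches the registered plan."""
    matches = outcome == plan.primary_outcome and analysis == plan.analysis
    return "confirmatory" if matches else "exploratory"


# Example values are invented for illustration.
plan = Plan(
    hypothesis="Treatment group scores higher than control",
    primary_outcome="recall_score",
    sample_size=120,
    exclusion_rule="failed attention check",
    analysis="two-sample t-test",
)

print(label(plan, "recall_score", "two-sample t-test"))   # confirmatory
print(label(plan, "reaction_time", "two-sample t-test"))  # exploratory
```

In this toy version an analysis is confirmatory only if both the outcome and the test match the plan exactly. A real comparison would also need to check sample size, exclusions, and any documented deviations, which usually calls for human judgment rather than string matching.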
The key clarification is that preregistration does not prohibit exploration — it requires that exploration be *labeled* as such. Curiosity and hypothesis generation are essential to science; the problem was never exploration itself, but presenting exploratory findings as confirmatory. A preregistered study that finds something unexpected in an unplanned analysis has discovered something interesting and worth pursuing — but that finding requires its own confirmatory test before it counts as established knowledge.