The replication crisis in psychology and medicine revealed a specific statistical manipulation that peer review failed to catch. What is 'p-hacking'?
Think about your answer, then reveal below.
Model answer: P-hacking refers to manipulating the analysis of a dataset -- adding or removing variables, stopping data collection when results are significant, reporting only significant subgroup analyses -- until a p-value below 0.05 is achieved, then reporting the result as if the analysis had been pre-specified. Because peer reviewers rarely see raw data or analysis code, they cannot detect this. The result is published findings that appear statistically rigorous but are actually exploratory patterns inflated by selective reporting. Large replication studies, such as the 2015 Open Science Collaboration project that reproduced only 39% of published psychology findings, exposed how widespread this problem was.
P-hacking is not necessarily conscious fraud -- researchers may genuinely believe the manipulation is justified. The problem is structural: publication bias rewards positive results, peer review cannot detect exploratory data mining, and statistical thresholds (p < 0.05) create a bright line that incentivizes crossing it by any means.