Questions: Scientific Method and Empirical Inquiry in Psychology
5 questions to test your understanding
Question 1 Multiple Choice
A researcher notices an unexpected correlation in her existing dataset and then runs a significance test on that same data, reporting p < 0.05. Why is this result weaker evidence than it appears?
A. The same data both generated the hypothesis and tested it, inflating the false positive rate well beyond what the p-value suggests
B. p < 0.05 is never sufficient evidence in psychology regardless of how the hypothesis was formed
C. Exploratory analyses cannot use significance tests; only qualitative methods are appropriate for pattern detection
D. The result is only weak because the sample size is probably too small
Answer: A
This is the exploratory/confirmatory confusion at the heart of the replication crisis. When you notice a pattern in data and then test that same pattern on the same data, you are double-dipping: the data generated the hypothesis and 'confirmed' it simultaneously. This practice inflates the false positive rate well above the nominal significance level, because the 0.05 threshold is calibrated for hypotheses specified in advance, not for patterns selected after looking at the data. A genuinely confirmatory result requires a pre-registered hypothesis tested on fresh data.
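The inflation described here can be illustrated with a small simulation. The sketch below is not taken from any particular study: the sample size (30), the number of candidate variables (20), the helper names, and the use of the Fisher z approximation for the significance test are all assumptions chosen for illustration. Every variable is pure noise, so any 'significant' correlation is a false positive by construction.

```python
import math
import random
from statistics import NormalDist, fmean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_significant(r, n, alpha=0.05):
    """Two-sided test of r = 0 using the Fisher z approximation (an assumption)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


def draw_dataset(rng, n, k):
    """One outcome and k predictors, all independent noise (no true effects)."""
    outcome = [rng.gauss(0, 1) for _ in range(n)]
    predictors = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(k)]
    return outcome, predictors


def simulate(n_sims=2000, n=30, k=20, seed=1):
    """Return (false positive rate on the same data, rate on fresh data)."""
    rng = random.Random(seed)
    same_data_hits = 0
    fresh_data_hits = 0
    for _ in range(n_sims):
        # Exploratory step: scan all predictors and keep the strongest one.
        outcome, predictors = draw_dataset(rng, n, k)
        rs = [pearson_r(p, outcome) for p in predictors]
        best = max(range(k), key=lambda j: abs(rs[j]))
        # Double-dipping: test the selected pattern on the data that suggested it.
        same_data_hits += is_significant(rs[best], n)
        # Confirmatory step: test the same predictor in a new sample.
        new_outcome, new_predictors = draw_dataset(rng, n, k)
        fresh_r = pearson_r(new_predictors[best], new_outcome)
        fresh_data_hits += is_significant(fresh_r, n)
    return same_data_hits / n_sims, fresh_data_hits / n_sims


same_rate, fresh_rate = simulate()
print(f"False positive rate, same data:  {same_rate:.2f}")
print(f"False positive rate, fresh data: {fresh_rate:.2f}")
```

Under these assumptions the same-data rate should land far above 0.05, since picking the strongest of 20 noise correlations behaves like running 20 tests and reporting only the best one, while the fresh-data rate should stay near the nominal 0.05. The exact figures depend on the seed and on the assumed number of candidate variables.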
Question 2 Multiple Choice
Which of the following claims is most clearly falsifiable in the scientific sense?
A. Adults who exercise aerobically 3 times per week for 8 weeks score higher on a validated attention task than matched sedentary controls
B. People generally feel better when they live in accordance with their values
C. The mind has hidden depths that conscious introspection cannot access
D. Kindness makes the world a better place
Answer: A
Falsifiability requires specifying what observations would count against the claim. Option A does this explicitly: it names a specific procedure, a measurable outcome, and a comparison group, so a null or reversed result would count against it. Options B–D are too vague to test as stated: 'feeling better' and 'better place' are undefined, and 'hidden depths' is compatible with any observable data. A claim is falsifiable if and only if you can describe what data would refute it.
Question 3 True / False
A single well-designed study with statistically significant results is sufficient to establish a psychological finding as scientifically reliable.
True
False
Answer: False
A single study provides evidence under one set of conditions with one sample, one operationalization, and one set of analysis choices. Many factors can produce a significant result that does not replicate — sampling variation, analytic flexibility, demand characteristics, or publication bias. The replication crisis demonstrated that many 'established' findings from well-designed studies failed independent replication. Scientific reliability requires replication across different laboratories, samples, and methods — one study, no matter how well designed, cannot establish a finding on its own.
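A short calculation shows how sampling variation alone limits what one significant result can establish. The sketch below computes the share of significant results that reflect real effects from three inputs. The function name and all three input values (10% of tested hypotheses true, 50% power, alpha of 0.05) are illustrative assumptions, not estimates for any real literature.

```python
def positive_predictive_value(prior, power, alpha):
    """Share of significant results that are true effects.

    prior: assumed proportion of tested hypotheses that are actually true
    power: assumed probability of detecting a true effect
    alpha: significance threshold (false positive rate for null effects)
    """
    true_positives = prior * power
    false_positives = (1 - prior) * alpha
    return true_positives / (true_positives + false_positives)


# Illustrative inputs only; none of these values come from the text above.
ppv = positive_predictive_value(prior=0.10, power=0.50, alpha=0.05)
print(f"Share of significant findings that are real: {ppv:.2f}")
```

With these assumed inputs, only about half of the significant findings correspond to real effects, and that is before analytic flexibility or publication bias is taken into account. Different assumptions about the prior and power change the figure substantially, which is one reason independent replication carries so much weight.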
Question 4 True / False
A claim that can in principle be shown to be wrong by observable evidence is more scientifically useful than one that is consistent with all possible observations.
True
False
Answer: True
This is Popper's falsifiability criterion. A claim consistent with every possible outcome carries no information — it cannot be tested and cannot guide inquiry. If a hypothesis would be confirmed no matter what data are collected, the data are doing no work. Falsifiability is what makes a claim testable, and testability is what makes science capable of updating its beliefs in response to evidence. Unfalsifiable claims may be meaningful in other ways, but they are not scientific.
Question 5 Short Answer
What is the difference between exploratory and confirmatory research, and why does this distinction matter when interpreting a statistically significant result?
Model answer: Exploratory research examines data to generate hypotheses. It is valuable, but the same data cannot also serve as the test of those hypotheses. Confirmatory research pre-specifies a hypothesis and analysis plan before data collection, then tests it on new data. The distinction matters because a significant result from exploratory analysis, obtained on the data that generated the hypothesis, is not a genuine test: the false positive rate is much higher than the nominal significance level suggests. Presenting exploratory findings as confirmatory tests is a major methodological error underlying many failed replications.
Pre-registration — publicly committing to hypothesis and analysis plan before data collection — is the main tool for maintaining this distinction. It prevents the analyst from unconsciously (or consciously) adjusting the hypothesis to fit the data after seeing results. When every 'significant' finding is actually an exploratory pattern dressed up as a confirmatory test, the published literature fills with spurious discoveries that do not replicate.