Questions: Applied Sociology and Program Evaluation
5 questions to test your understanding
Question 1 Multiple Choice
A nonprofit reports that 80% of its mentorship program participants graduated high school, compared to a citywide average of 65%. The program director concludes the program is effective. What is the central methodological problem with this conclusion?
A. The sample size is too small to draw any conclusion about program effectiveness
B. Graduation rates are not a valid outcome measure for mentorship programs
C. Program participants likely differ systematically from the citywide average (those who sought out mentorship may already have been more motivated), making the comparison invalid as evidence of causation
D. The evaluation needed a longer follow-up period before comparing outcomes
Answer: C
This is the selection bias problem at the heart of impact evaluation. People who enroll in programs are not random samples — they are typically more motivated, more connected, or already on a different trajectory than those who don't enroll. Comparing them to a population average cannot establish causation; it conflates the program effect with the pre-existing differences. A valid impact evaluation needs a credible counterfactual: what would have happened to these participants if the program hadn't existed? This requires a comparison group matched on relevant characteristics or, ideally, random assignment.
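A minimal Python sketch of how selection alone could reproduce the gap in Question 1. Only the 80% and 65% figures come from the question; the two groups ("motivated" and "other"), their shares, their baseline graduation rates, and the rule that only motivated students enroll are hypothetical assumptions chosen for illustration. The program's true effect is set to zero.

```python
# Hypothetical illustration: all shares and rates below are assumptions
# chosen so the totals match the 80% vs. 65% figures in Question 1.

# Share of the city's students in each (made-up) group.
CITY_SHARES = {"motivated": 0.25, "other": 0.75}

# Graduation rate each group would have WITHOUT any program.
BASELINE_GRAD_RATE = {"motivated": 0.80, "other": 0.60}

# Assumption: only motivated students seek out mentorship.
PARTICIPANT_SHARES = {"motivated": 1.0, "other": 0.0}

TRUE_PROGRAM_EFFECT = 0.0  # the program changes nothing in this scenario


def average_rate(shares, rates, effect=0.0):
    """Share-weighted graduation rate, plus an optional uniform program effect."""
    return sum(shares[group] * (rates[group] + effect) for group in shares)


citywide = average_rate(CITY_SHARES, BASELINE_GRAD_RATE)
enrolled = average_rate(PARTICIPANT_SHARES, BASELINE_GRAD_RATE, TRUE_PROGRAM_EFFECT)
naive_gap = enrolled - citywide

print(f"Citywide rate:    {citywide:.0%}")
print(f"Participant rate: {enrolled:.0%}")
print(f"Naive 'effect':   {naive_gap:.0%} (true effect: {TRUE_PROGRAM_EFFECT:.0%})")
```

Under these assumptions the naive comparison shows a 15-point advantage even though the program does nothing, which is why a matched comparison group or random assignment is needed to isolate the program's contribution.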
Question 2 Multiple Choice
A youth violence prevention program produces a statistically significant reduction of 1.2 arrests per 100 participants annually at a cost of $8,000 per participant. A policy director must decide whether to scale it up. What additional information is most essential for that decision?
A. Whether the study received IRB approval and followed ethical research protocols
B. Whether the p-value was below 0.01 rather than just 0.05
C. Whether a 1.2-arrest reduction is practically meaningful and cost-effective relative to alternative uses of the funds
D. Whether the program used a randomized controlled trial or a quasi-experimental design
Answer: C
Statistical significance indicates only that an effect this large would be unlikely to arise by chance if the true effect were zero. It says nothing about whether the effect is large enough to matter in practice or to justify the cost. A 1.2-arrest reduction per 100 participants might be transformative or trivial depending on baseline rates, the severity of those arrests, and what $8,000 per participant could achieve through alternative programs. Applied evaluation requires translating effect sizes into concrete, actionable terms and comparing them against costs and alternatives. Judging the gap between statistical and practical significance is a core professional skill in applied sociology.
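One way to put the effect size into actionable terms is to compute the cost per arrest averted. The sketch below uses only the two figures from Question 2 and assumes the $8,000 per participant covers the same one-year period as the arrest reduction; the function name is illustrative.

```python
def cost_per_arrest_averted(cost_per_participant, arrests_averted_per_100):
    """Dollars spent for each arrest averted, based on a cohort of 100 participants."""
    cohort_size = 100
    total_cost = cost_per_participant * cohort_size
    return total_cost / arrests_averted_per_100


# Figures from Question 2: $8,000 per participant, 1.2 fewer arrests per 100.
cost = cost_per_arrest_averted(8_000, 1.2)
print(f"Cost per arrest averted: ${cost:,.0f}")  # about $666,667
```

Whether roughly $667,000 per averted arrest is worthwhile cannot be read off the p-value. It depends on the same comparison against alternative uses of the funds that the correct answer points to.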
Question 3 True / False
A randomized controlled trial (RCT) is the gold standard for impact evaluation, but it is often infeasible or ethically problematic in social program settings, leading applied sociologists to use quasi-experimental designs.
T. True
F. False
Answer: True
Random assignment removes selection bias in expectation, which makes RCTs the most defensible design for causal inference. But randomly assigning people to receive or not receive social services (housing, medical care, legal aid) raises ethical objections; small programs may not have enough applicants to randomize; and political and organizational contexts often resist random denial of services. Applied sociologists therefore turn to alternatives such as difference-in-differences, regression discontinuity, and matched comparison groups, which approximate the logic of randomization using available data.
Question 4 True / False
If program participants show improved outcomes after completing a program, this before-and-after comparison is sufficient to conclude that the program caused the improvement.
T. True
F. False
Answer: False
Before-and-after comparisons are almost always confounded. Outcomes may have improved anyway due to time trends (the economy improved, crime declined citywide), regression to the mean (people seek programs when things are at their worst and naturally improve afterward), or maturation (participants would have developed these skills regardless). Without a credible comparison group — people similar to participants who did not receive the program — there is no way to separate program effect from these alternative explanations. This is why impact evaluation is distinct from outcome evaluation: outcome evaluation asks 'did things improve?'; impact evaluation asks 'did the program cause them to improve?'
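The arithmetic behind this point can be sketched with a simple difference-in-differences calculation, one of the quasi-experimental designs named in Question 3. All numbers below are hypothetical (arrests per 100 people in the year before and after the program), and the estimate is only credible under the assumption that both groups would have followed the same trend without the program.

```python
def before_after_change(pre, post):
    """Raw change in an outcome from the pre-period to the post-period."""
    return post - pre


def difference_in_differences(treated_pre, treated_post, comparison_pre, comparison_post):
    """Change among participants minus the change in the comparison group."""
    treated_change = before_after_change(treated_pre, treated_post)
    comparison_change = before_after_change(comparison_pre, comparison_post)
    return treated_change - comparison_change


# Hypothetical data: arrests per 100 people, one year before and one year after.
program_group = {"pre": 20, "post": 14}
comparison_group = {"pre": 20, "post": 17}  # similar people, no program

naive_estimate = before_after_change(program_group["pre"], program_group["post"])
did_estimate = difference_in_differences(
    program_group["pre"], program_group["post"],
    comparison_group["pre"], comparison_group["post"],
)

print(f"Before-and-after change:   {naive_estimate}")  # -6
print(f"Difference-in-differences: {did_estimate}")    # -3
```

In this made-up case, half of the apparent improvement also shows up in the comparison group, so attributing the full before-and-after change to the program would double the estimated effect.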
Question 5 Short Answer
Why is it important to distinguish between process evaluation and impact evaluation when assessing what a program is actually accomplishing?
Model answer: Process evaluation asks whether the program is operating as designed — are the right participants being reached, are staff following the protocol, are activities happening as planned? Impact evaluation asks whether the program is causing the intended changes in participants. Without distinguishing them, an organization can confuse implementation success with effectiveness: a program may be running perfectly (high process fidelity) while producing no impact, or may produce impact despite chaotic implementation. Separating the two questions clarifies whether a null result reflects a bad theory (the program activities don't cause the desired outcomes) or bad implementation (the program wasn't actually delivered as intended).
This distinction also matters for learning and improvement. If a program shows no impact, process evaluation data reveals whether the problem is implementation failure or theory failure — with very different remedies. If the program is reaching the wrong population (process failure), fix the outreach. If it is reaching the right population and doing everything right but outcomes aren't changing (theory failure), reconsider the underlying logic of how the intervention is supposed to work.