Questions: Machine Learning Applications in Social Science

5 questions to test your understanding

Question 1 Multiple Choice

A criminal justice agency builds an ML model achieving 89% accuracy at predicting recidivism, using features including prior convictions, neighborhood, and race. A critic argues the model is problematic even given its accuracy. What is the most substantive articulation of this critique?

A. ML models are never accurate enough for high-stakes decisions; a 99% threshold should be required
B. The model encodes historical patterns of discriminatory policing and sentencing; deploying it for parole decisions embeds those patterns into future outcomes and perpetuates the discrimination
C. Parole decisions should always be made by human judges regardless of algorithmic accuracy
D. The model violates privacy by using demographic data in its predictions
Question 2 Multiple Choice

A social scientist uses a gradient-boosted tree model to predict voter turnout with 91% accuracy and announces 'we now understand why people vote.' What is wrong with this conclusion?

A. 91% accuracy is below the threshold required for scientific conclusions about social behavior
B. Gradient-boosted trees cannot be applied to binary outcomes like voting
C. High predictive accuracy demonstrates that the model identifies causes, not just correlates, of voting
D. Predictive accuracy means the model finds patterns that forecast outcomes; it says nothing about the causal mechanisms that produce those outcomes
Question 3 True / False

A machine learning model that achieves high overall accuracy on a prediction task can be assumed to be performing fairly across most demographic subgroups.

T. True
F. False
Question 4 True / False

In social science, ML methods are most appropriately used for pattern discovery and generating hypotheses to investigate further, rather than as the final word on causal mechanisms.

T. True
F. False
Question 5 Short Answer

A researcher builds an ML model using neighborhood characteristics to predict mortgage default with 90% accuracy. Why can't this model be taken as evidence that neighborhood characteristics *cause* mortgage default?
