The Implicit Association Test (IAT) measures automatic associations between social categories and attributes by recording response latencies as participants sort paired concepts. While the IAT reliably detects implicit associations that explicit self-report measures miss, its test-retest reliability is moderate and its predictive validity for discriminatory behavior is modest, which limits its use as a direct measure of behavioral bias.
Learn the IAT's psychometric properties, limitations, and proper interpretation; examine meta-analyses of IAT predictive validity, understand relationships between implicit and explicit biases, and consider alternative measures of implicit bias.
Students think the IAT perfectly measures racism or sexism and strongly predicts behavior; actually, implicit and explicit biases are partly independent, IAT effects are moderate-sized, and predictive validity varies substantially across contexts and outcomes.
You already know from your study of stereotyping that people hold automatic associations linking social categories (race, gender, age) to attributes (competent/incompetent, dangerous/safe, warm/cold), and that these associations can influence behavior even when people sincerely endorse egalitarian values. The challenge for measurement is that people can't (or won't) accurately report these associations on a self-report questionnaire — either because the associations operate below conscious access, or because social desirability suppresses honest reporting. The Implicit Association Test was designed to get around this problem by measuring associations indirectly, through the one thing that is hard to control: response speed.
The logic of the IAT is elegant. Participants sort items into categories using two response keys. In a race IAT, the stereotype-incompatible block might pair one key with "Black faces + pleasant words" and the other with "White faces + unpleasant words"; in the compatible block, the pairings are reversed. The core assumption is that when two categories are strongly associated in memory, sorting them to the same key is easier (faster and more accurate) than when they are not associated. A person who has strong automatic positive associations with White faces and negative associations with Black faces should be faster in the White+pleasant / Black+unpleasant pairing. The difference in mean reaction time between the two blocks, divided by the standard deviation of that person's latencies across both blocks (the D-score), is the measure of implicit bias.
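The scoring idea can be sketched in a few lines of Python. This is a simplified illustration, not the full published scoring procedure: it assumes latencies in milliseconds, drops only very slow trials (the 10,000 ms cutoff is an assumed default), and omits error penalties and the separate handling of practice and test blocks. The function name `d_score` and the sample latencies are hypothetical.

```python
from statistics import mean, stdev


def d_score(compatible_ms, incompatible_ms, max_latency_ms=10_000):
    """Simplified D-score: mean latency difference divided by the
    standard deviation of all retained trials from both blocks."""
    # Drop implausibly slow trials (assumed cutoff, see lead-in).
    compatible = [t for t in compatible_ms if t <= max_latency_ms]
    incompatible = [t for t in incompatible_ms if t <= max_latency_ms]
    if len(compatible) < 2 or len(incompatible) < 2:
        raise ValueError("need at least two valid trials per block")

    # Standard deviation computed over both blocks together.
    inclusive_sd = stdev(compatible + incompatible)
    if inclusive_sd == 0:
        raise ValueError("latencies have zero variance")

    # Positive values mean slower responding in the incompatible block.
    return (mean(incompatible) - mean(compatible)) / inclusive_sd


# Made-up latencies for one hypothetical participant.
compatible = [620, 580, 650, 700, 610, 590]
incompatible = [780, 820, 760, 900, 840, 800]
d = d_score(compatible, incompatible)
print(f"D = {d:.2f}")
```

Dividing by the participant's own latency variability is what makes scores comparable across people who differ in overall response speed; a raw millisecond difference would not be.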
The IAT's strengths are real: it is hard to fake, produces reliable group-level differences in the expected directions (most participants in majority-White countries show implicit preference for White over Black faces), and correlates only modestly with explicit self-report measures — meaning it captures something different. Its weaknesses, however, matter enormously for how it should and should not be used. Test-retest reliability is moderate (around .40–.50 for the race IAT), meaning that an individual's score fluctuates substantially across sessions. This limits its use as a stable individual difference measure. More importantly, predictive validity for actual discriminatory behavior — hiring decisions, medical treatment recommendations, police use of force — is modest (meta-analytic correlations around .15 to .25) and varies substantially across contexts.
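To make these magnitudes concrete, the sketch below converts the figures quoted above into two more intuitive quantities. It relies on standard classical-test-theory formulas that are not stated in the text: the squared correlation as the proportion of variance explained, and the standard error of measurement as sqrt(1 - reliability) in standard-deviation units. The function names are hypothetical.

```python
import math


def variance_explained(r):
    """Proportion of outcome variance accounted for by a correlation r."""
    return r ** 2


def sem_in_sd_units(reliability):
    """Standard error of measurement, expressed in SD units of the score."""
    return math.sqrt(1.0 - reliability)


# Predictive validity range quoted in the text.
for r in (0.15, 0.25):
    print(f"r = {r:.2f} -> {variance_explained(r):.1%} of behavioral variance")

# Test-retest reliability range quoted in the text.
for rel in (0.40, 0.50):
    print(f"reliability = {rel:.2f} -> SEM = {sem_in_sd_units(rel):.2f} SD")
```

Under these assumptions, correlations of .15 to .25 correspond to roughly 2% to 6% of the variance in behavior, and reliabilities of .40 to .50 imply measurement error of roughly 0.7 to 0.8 standard deviations around any one person's score.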
The most important interpretive caution is about levels of analysis. The IAT reliably detects associations at the group level: in large samples, IAT scores predict behavior better than chance. But the individual-level inference is much weaker. Telling a specific person "your IAT score shows you are biased and will discriminate" overstates the evidence dramatically. A widely held interpretation is that IAT scores reflect cultural exposure to stereotypes as much as personal prejudice: nearly everyone raised in a society with certain associations picks them up to some degree. What varies is whether those associations are endorsed, controlled, and allowed to influence behavior. This dissociation between implicit association and deliberate discrimination is why bias-reduction research has increasingly focused on structural interventions and behavioral constraints rather than on changing individual IAT scores.
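The gap between group-level and individual-level inference can be illustrated with a small calculation. The sketch assumes, purely for illustration, that IAT scores and a behavioral outcome are bivariate normal with correlation r; under that assumption the probability that a person falls on the same side of the median on both measures is 1/2 + arcsin(r)/π. The function name is hypothetical, and the correlations are the ones quoted earlier in the text.

```python
import math


def same_side_probability(r):
    """Probability that two bivariate-normal variables with correlation r
    fall on the same side of their respective medians."""
    return 0.5 + math.asin(r) / math.pi


for r in (0.0, 0.15, 0.25):
    p = same_side_probability(r)
    print(f"r = {r:.2f}: P(same side of median) = {p:.3f}")
```

With correlations of .15 to .25, knowing that someone scores above the median on the IAT would predict an above-median behavioral score about 55% to 58% of the time, against 50% by chance. That edge is detectable in a large sample but far too small to support a confident claim about any one person.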