Measurement error in exposures or outcomes biases risk estimates; the direction and magnitude depend on whether misclassification is random (non-differential) or systematic (differential). Non-differential misclassification of a binary exposure typically biases risk estimates toward the null; outcome misclassification may bias toward or away from the null, depending on whether its error rates differ by exposure status. Understanding the error mechanism (non-differential vs. differential, classical vs. Berkson) lets you predict the direction of bias. Validation studies and sensitivity analyses quantify its impact.
Use simulation to demonstrate differential vs. non-differential misclassification effects; conduct validation substudies estimating sensitivity and specificity.
Common misconceptions: that measurement error always biases toward the null (differential error can bias in either direction), and that validation substudies must be simple random samples of the full cohort.
Every epidemiologic measure — a dietary recall, a self-reported exposure, a disease code in administrative data — is an imperfect proxy for the true quantity of interest. Measurement error is the systematic or random gap between what you measured and what you wanted to measure. From your study of information bias, you already know that errors in data collection can distort risk estimates. This topic formalizes the underlying mechanisms, letting you predict not just *whether* bias will occur but *which direction* it will push your estimate.
The first and most important distinction is non-differential versus differential misclassification. Misclassification is non-differential when the error rates do not depend on the other variable in the association: exposure is misclassified with the same sensitivity and specificity in people who develop the outcome and in people who do not, or outcome is misclassified at the same rates in exposed and unexposed. For a binary exposure, the expected effect is bias toward the null: some exposed people are classified as unexposed and vice versa, which blurs the boundary between the groups and pulls their observed risks together. In a 2×2 table, non-differential exposure misclassification moves the apparent relative risk or odds ratio toward 1.0, even when the true association is strong. This is sometimes called attenuation bias, and it implies that studies with noisy exposure measurement tend to *underestimate* effect sizes. The guarantee is weaker than it sounds: it describes the expected bias for a binary exposure whose errors are independent of other errors, and with three or more exposure categories the estimate for an intermediate category can move away from the null.
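The attenuation can be checked with a few lines of arithmetic. The sketch below is illustrative only: the cohort counts, the sensitivity of 0.8, and the specificity of 0.9 are hypothetical values chosen for the example, and the helper names `misclassify_exposure` and `risk_ratio` are made up here. It computes expected (not sampled) counts after applying the same error rates to cases and non-cases.

```python
def misclassify_exposure(exposed, unexposed, sensitivity, specificity):
    """Expected (classified exposed, classified unexposed) counts."""
    classified_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    classified_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return classified_exposed, classified_unexposed


def risk_ratio(cases, noncases):
    """Risk ratio from (exposed, unexposed) counts of cases and non-cases."""
    risk_exposed = cases[0] / (cases[0] + noncases[0])
    risk_unexposed = cases[1] / (cases[1] + noncases[1])
    return risk_exposed / risk_unexposed


# Hypothetical true cohort, written as (exposed, unexposed) counts.
true_cases = (200, 100)      # risks of 0.20 and 0.10, so the true RR is 2.0
true_noncases = (800, 900)

# Assumed error rates, identical in cases and non-cases (non-differential).
sensitivity, specificity = 0.8, 0.9

observed_cases = misclassify_exposure(*true_cases, sensitivity, specificity)
observed_noncases = misclassify_exposure(*true_noncases, sensitivity, specificity)

true_rr = risk_ratio(true_cases, true_noncases)
observed_rr = risk_ratio(observed_cases, observed_noncases)

print(f"true RR = {true_rr:.2f}, observed RR = {observed_rr:.2f}")
```

With these assumed numbers the observed risk ratio is about 1.60 against a true value of 2.0: the estimate moves toward 1.0 but does not cross it.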
Differential misclassification occurs when the error rates depend on the other variable: exposure is misclassified differently in cases and non-cases (the usual concern in case-control studies), or outcome is ascertained differently in exposed and unexposed (the usual concern in cohort studies). Recall bias is the classic example: cases may recall past exposures more completely than controls. Because the error rates differ between the groups being compared, the bias can go either way, toward or away from the null. If cases over-report exposure, the odds ratio is artificially inflated. If controls over-report, it is deflated. There is no reliable rule of thumb; you must reason through the specific error mechanism in your study.
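The two over-reporting scenarios can be worked through on a single case-control table. This is a rough sketch under stated assumptions: the counts and error rates are hypothetical, over-reporting is modeled as imperfect specificity (some truly unexposed people report exposure), and the helper names are invented for illustration.

```python
def observed_counts(exposed, unexposed, sensitivity, specificity):
    """Expected (reported exposed, reported unexposed) counts."""
    reported_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    reported_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return reported_exposed, reported_unexposed


def odds_ratio(case_counts, control_counts):
    """Exposure odds ratio from (exposed, unexposed) counts."""
    a, b = case_counts
    c, d = control_counts
    return (a * d) / (b * c)


def observed_or(cases, controls, case_error, control_error):
    """Odds ratio after applying group-specific (sensitivity, specificity)."""
    return odds_ratio(
        observed_counts(*cases, *case_error),
        observed_counts(*controls, *control_error),
    )


# Hypothetical true table, written as (exposed, unexposed) counts.
cases = (150, 350)
controls = (100, 400)
true_or = odds_ratio(cases, controls)

perfect = (1.0, 1.0)  # (sensitivity, specificity) with no error

# Scenario A: 10% of truly unexposed cases report exposure; controls are accurate.
or_cases_over_report = observed_or(cases, controls, (1.0, 0.90), perfect)

# Scenario B: 5% of truly unexposed controls report exposure; cases are accurate.
or_controls_over_report = observed_or(cases, controls, perfect, (1.0, 0.95))

print(f"true OR               = {true_or:.2f}")
print(f"cases over-report     = {or_cases_over_report:.2f}")
print(f"controls over-report  = {or_controls_over_report:.2f}")
```

Under these assumptions the true odds ratio of about 1.71 rises to about 2.35 when cases over-report and falls to about 1.36 when controls over-report, so the same true association yields bias in opposite directions depending on which group carries the error.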
The classical versus Berkson error taxonomy is also useful. Classical error is what most people imagine: the measured value equals the true value plus independent random noise (X_measured = X_true + ε). It is the usual model for instrument readings and is often assumed for self-report data. Berkson error runs the other way: the true value equals the measured value plus noise (X_true = X_measured + ε). It is common when exposure is assigned from group-level data (e.g., pollution measurements from a monitoring station applied to everyone in a zip code). The two types behave differently: in a linear model, classical error attenuates the estimated slope, whereas Berkson error leaves the slope approximately unbiased but makes it less precise. They therefore call for different correction approaches.

Quantifying the impact requires a validation substudy: a random sample of your cohort in whom you measure both the imperfect proxy and a gold-standard measure. From the sensitivity and specificity (or, for continuous data, the reliability coefficient), you can estimate how much your observed association has been attenuated and correct for it using regression calibration or simulation extrapolation (SIMEX). The corrected estimate indicates the likely magnitude of the true effect behind the noisy measurement, subject to the assumed error model.
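A small simulation can make the classical/Berkson contrast and the correction step concrete. The sketch below is only an illustration under simple assumptions: a linear outcome model with a hypothetical true slope of 2.0, normally distributed exposure and noise with unit variance, and a validation subsample of 2,000 in which the true exposure is taken as known. The correction shown is the simplest linear form of regression calibration (divide the naive slope by the slope of true on measured exposure); it is not SIMEX, and all variable names are invented for the example.

```python
import random
import statistics

random.seed(42)
n = 20000
beta = 2.0  # assumed true slope of outcome on true exposure


def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Classical error: measured = true + noise.
x_true = [random.gauss(0, 1) for _ in range(n)]
w_classical = [x + random.gauss(0, 1) for x in x_true]
y_classical = [beta * x + random.gauss(0, 1) for x in x_true]
slope_classical = slope(w_classical, y_classical)  # attenuated, near beta / 2

# Berkson error: true = assigned + noise (e.g. a shared monitor reading).
w_assigned = [random.gauss(0, 1) for _ in range(n)]
x_berkson = [w + random.gauss(0, 1) for w in w_assigned]
y_berkson = [beta * x + random.gauss(0, 1) for x in x_berkson]
slope_berkson = slope(w_assigned, y_berkson)  # close to beta, but noisier

# Validation substudy: a random subsample with both measured and true exposure.
validation_ids = random.sample(range(n), 2000)
w_val = [w_classical[i] for i in validation_ids]
x_val = [x_true[i] for i in validation_ids]

# Calibration slope of true on measured exposure (the reliability ratio here).
calibration_slope = slope(w_val, x_val)
slope_corrected = slope_classical / calibration_slope

print(f"classical (naive) slope : {slope_classical:.2f}")
print(f"Berkson slope           : {slope_berkson:.2f}")
print(f"calibration slope       : {calibration_slope:.2f}")
print(f"corrected slope         : {slope_corrected:.2f}")
```

Because the noise variance equals the exposure variance in this setup, the reliability ratio is 0.5, so the naive classical slope should land near half the true value and the corrected slope near the true value, while the Berkson slope needs no correction. In real data the correction inherits the uncertainty of the validation estimate, which a full analysis would propagate into the confidence interval.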