Parallel forms have identical true scores and error variances for all examinees; tau-equivalent forms have identical true scores but potentially different error variances. These assumptions enable alternate-form reliability and test equating. Strictly parallel forms rarely exist in practice, making tau-equivalence a more realistic assumption for most testing applications.
Your earlier study of domain sampling theory established that any observed test score is composed of a true score — the stable underlying ability the test is trying to capture — and measurement error that varies randomly from administration to administration. When a testing program needs to give different students different versions of an exam (to prevent cheating), or to retest the same person over time (to avoid practice effects), a critical question arises: are the two forms actually measuring the same thing with the same precision? This is what the theory of parallel and tau-equivalent forms is designed to answer.
Strictly parallel forms are the most demanding standard. Two forms are parallel if, for every examinee in the population, (1) their true score on Form A equals their true score on Form B, and (2) the error variance on Form A equals the error variance on Form B. The first condition means both forms measure exactly the same underlying construct at the same difficulty. The second means neither form is more or less consistent than the other: measurement noise is identical. Under strict parallelism, the two forms have the same observed mean, the same observed variance, and the same correlations with any external criterion. The correlation between the two forms also equals the reliability of either one, which is what justifies using that correlation as an alternate-form reliability coefficient. In practice, strict parallelism is rarely achievable: even carefully matched test forms differ in item difficulty, wording effects, and the particular sample of domain content they happen to cover.
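The consequences of strict parallelism can be checked with a small simulation. The sketch below is illustrative only: the sample size, the true-score mean of 50, the true-score standard deviation of 10, and the error standard deviation of 5 are all assumed values, and the names `simulate_parallel_forms` and `pearson` are hypothetical helpers, not part of any testing library. With these settings the theoretical reliability is 100 / (100 + 25) = 0.8, and the correlation between the two simulated forms should land close to that value.

```python
import random
from statistics import fmean, pvariance


def pearson(x, y):
    """Pearson correlation computed from first principles."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def simulate_parallel_forms(n=20000, true_sd=10.0, error_sd=5.0, seed=0):
    """Simulate observed scores on two strictly parallel forms.

    Each examinee has one true score; each form adds independent error
    drawn with the SAME standard deviation (the parallel-forms condition).
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n)]
    form_a = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    form_b = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return true_scores, form_a, form_b


true_scores, form_a, form_b = simulate_parallel_forms()

# Reliability = true-score variance / observed-score variance.
theoretical_reliability = 10.0**2 / (10.0**2 + 5.0**2)

print("mean A, mean B:", round(fmean(form_a), 2), round(fmean(form_b), 2))
print("var A, var B:  ", round(pvariance(form_a), 1), round(pvariance(form_b), 1))
print("corr(A, B):    ", round(pearson(form_a, form_b), 3))
print("theoretical reliability:", theoretical_reliability)
```

Because the simulated errors are random, the observed means, variances, and correlation match their theoretical values only approximately; the agreement tightens as the sample size grows.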
Tau-equivalent forms relax one assumption. The true scores must still be identical across forms, so both forms capture the same underlying ability at the same difficulty, but the error variances are allowed to differ. Form A might be slightly more precise than Form B because its items happen to be less ambiguous, even though every examinee has the same true score on both forms. This is a more realistic assumption for real test development. Essentially tau-equivalent forms relax the constraint further still, allowing true scores on the two forms to differ by an additive constant (one form might be consistently harder) while still sharing a common underlying trait. Cronbach's alpha, which you will encounter in reliability theory, equals the reliability of a composite only when its components are essentially tau-equivalent (assuming their errors are uncorrelated); when the components are not essentially tau-equivalent, alpha underestimates reliability. That fact matters when interpreting what alpha does and does not guarantee.
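The distinctions above can also be made concrete with simulated data. The following is a minimal sketch under assumed values: a true-score standard deviation of 10, error standard deviations of 3 and 6, and a difficulty shift of 4 points. The helpers `covariance` and `cronbach_alpha` are hypothetical names written for this illustration. The sketch treats the two forms as the two components of a summed composite and compares alpha with the composite's reliability, which is 400 / 445 (about 0.899) under these settings.

```python
import random
from statistics import fmean, pvariance


def covariance(x, y):
    """Population covariance of two equal-length score lists."""
    mx, my = fmean(x), fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def cronbach_alpha(parts):
    """Coefficient alpha for the composite formed by summing the parts."""
    k = len(parts)
    totals = [sum(scores) for scores in zip(*parts)]
    part_variance_sum = sum(pvariance(p) for p in parts)
    return k / (k - 1) * (1.0 - part_variance_sum / pvariance(totals))


rng = random.Random(1)
n = 20000
true_scores = [rng.gauss(50.0, 10.0) for _ in range(n)]

# Tau-equivalent pair: same true score, different error SDs (3 vs 6).
form_a = [t + rng.gauss(0.0, 3.0) for t in true_scores]
form_b = [t + rng.gauss(0.0, 6.0) for t in true_scores]

# Essentially tau-equivalent to form_a: true score shifted down by a
# constant 4 points (a consistently harder form), error SD of 6.
form_c = [t - 4.0 + rng.gauss(0.0, 6.0) for t in true_scores]

# Error variances are recoverable here only because the simulation
# knows each examinee's true score; real data never provide this.
error_var_a = pvariance([x - t for x, t in zip(form_a, true_scores)])
error_var_b = pvariance([x - t for x, t in zip(form_b, true_scores)])

# Reliability of the A+B composite: variance of 2T over observed variance.
totals_ab = [a + b for a, b in zip(form_a, form_b)]
reliability_ab = pvariance([2.0 * t for t in true_scores]) / pvariance(totals_ab)

alpha_ab = cronbach_alpha([form_a, form_b])
alpha_ac = cronbach_alpha([form_a, form_c])

print("error variances (A, B):", round(error_var_a, 1), round(error_var_b, 1))
print("cov(A, B) vs var(T):   ", round(covariance(form_a, form_b), 1),
      round(pvariance(true_scores), 1))
print("mean(A) - mean(C):     ", round(fmean(form_a) - fmean(form_c), 2))
print("alpha(A, B), reliability:", round(alpha_ab, 3), round(reliability_ab, 3))
print("alpha(A, C):           ", round(alpha_ac, 3))
```

In this sketch the unequal error variances and the additive shift leave alpha essentially unchanged, which is consistent with alpha requiring only essential tau-equivalence rather than strict parallelism.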
The practical consequence of these distinctions shows up in test equating: the statistical procedures used to put scores from different forms on the same scale, so that a reported score of 70 means the same thing whether it was earned on Form A or Form B. Equating is justifiable only when the forms measure the same construct (at minimum, they should be essentially tau-equivalent). If that assumption fails, and Form A and Form B are actually tapping somewhat different skills, then equating produces scores that appear comparable but are not, undermining the fairness of any high-stakes decision based on those scores. The formal taxonomy of parallel, tau-equivalent, and essentially tau-equivalent forms gives test developers a principled framework for deciding which statistical procedures are appropriate, and for being transparent about the assumptions their score comparisons depend on.
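To make the idea of equating concrete, here is a minimal sketch of two common observed-score methods, mean equating and linear equating. It is not a description of any particular testing program's procedure. It assumes a random-groups design in which comparable groups took each form, and the score lists and function names (`mean_equate`, `linear_equate`) are hypothetical. Mean equating adjusts only for a constant difference in difficulty, which is the kind of difference essential tau-equivalence permits; linear equating additionally rescales for a difference in score spread.

```python
from statistics import fmean, pstdev


def mean_equate(score_a, scores_a, scores_b):
    """Place a Form A score on the Form B scale by shifting it by the
    difference between the two form means."""
    return score_a - fmean(scores_a) + fmean(scores_b)


def linear_equate(score_a, scores_a, scores_b):
    """Place a Form A score on the Form B scale by matching both the
    mean and the standard deviation of Form B."""
    slope = pstdev(scores_b) / pstdev(scores_a)
    return fmean(scores_b) + slope * (score_a - fmean(scores_a))


# Hypothetical toy data: scores from two randomly equivalent groups.
scores_form_a = [50, 60, 70, 80, 90]
scores_form_b = [56, 61, 66, 71, 76]

for raw in (50, 70, 90):
    print(
        raw,
        "mean-equated:", round(mean_equate(raw, scores_form_a, scores_form_b), 2),
        "linear-equated:", round(linear_equate(raw, scores_form_a, scores_form_b), 2),
    )
```

Neither function checks whether the two forms measure the same construct. The arithmetic will run on any two sets of scores, which is exactly why the equivalence assumptions have to be defended separately before the equated scores are interpreted.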