Why was the introduction of R-free considered one of the most important methodological advances in protein crystallography?
Think about your answer, then reveal below.
Model answer: Before R-free, the R-factor was the primary measure of model quality, but R can always be reduced by adding parameters to the model (more atoms, alternative conformations, individual or anisotropic B-factors, water molecules), regardless of whether these additions represent real structural features. There was no independent check on whether the model genuinely explained the diffraction data or merely fitted noise. R-free (Brünger, 1992) introduced cross-validation to crystallography by withholding a random subset of reflections (typically 5 to 10%) from refinement and computing the agreement only against this unseen test set. This immediately exposed overfitting: models with an artificially low R but a genuinely poor structure showed a high R-free. The impact was transformative: it changed how refinement was practiced (favoring restraints and conservative parameterization), how structures were evaluated by journals and the PDB, and how the community detected errors in published structures.
R-free shifted the culture of crystallographic refinement from R-factor minimization to genuine model improvement. Before R-free, some practitioners would add waters and alternative conformations aggressively to reduce R, producing models that looked good by R but contained many spurious features. Because parameters that fit only noise do not improve agreement with the withheld reflections, R-free penalizes overfitting and aligns the statistical incentive with the scientific goal of an accurate model.
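As a rough illustration of the bookkeeping described above, the sketch below computes the conventional R-factor, R = Σ| |F_obs| − |F_calc| | / Σ|F_obs|, separately over the working reflections and the withheld test reflections. The function names (`r_factor`, `r_work_and_r_free`), the `is_free` flag list, and the toy amplitudes are assumptions made for this example. It also assumes F_calc is already on the same scale as F_obs and that both sets are non-empty; real refinement programs handle scaling and flag assignment themselves.

```python
def r_factor(f_obs, f_calc):
    """Conventional R: sum of amplitude differences over sum of observed amplitudes."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_r_free(f_obs, f_calc, is_free):
    """Split reflections by the (hypothetical) free flag and score each set.

    Assumes at least one reflection in each set and pre-scaled amplitudes.
    """
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if free]
    r_work = r_factor(*zip(*work))  # reflections used in refinement
    r_free = r_factor(*zip(*test))  # reflections withheld from refinement
    return r_work, r_free


# Toy amplitudes (invented for illustration): the model matches the working
# reflections closely but misses the single withheld reflection.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0, 50.0]
f_calc = [98.0, 82.0, 59.0, 41.0, 20.0, 35.0]
is_free = [False, False, False, False, False, True]

r_work, r_free = r_work_and_r_free(f_obs, f_calc, is_free)
print(f"R-work = {r_work:.3f}, R-free = {r_free:.3f}")
# prints "R-work = 0.020, R-free = 0.300"
```

A large gap between the two values, as in this contrived case, is the signature of a model that fits the reflections it was refined against better than it predicts unseen ones.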