4 questions to test your understanding
Why does crystallographic refinement require geometric restraints, and what happens if they are removed?
The data-to-parameter ratio is the fundamental issue. At 2.0 Å resolution, a protein structure has roughly 1 observation per parameter (the ratio improves at higher resolution). Without restraints, the optimization has too much freedom: it can reduce the R-factor by moving atoms to positions that fit the noise in the data, producing unrealistic bond lengths and angles. Geometric restraints (derived from high-resolution small-molecule crystal structures, where bond lengths and angles are determined to 0.001 Å precision) act as additional 'observations' that regularize the optimization and prevent overfitting. At very high resolution (< 1.0 Å, where the data-to-parameter ratio exceeds ~5), restraints can be loosened because the data are sufficient to determine atomic positions independently. This is why small-molecule structures are refined with minimal or no restraints.
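The idea of restraints acting as extra 'observations' can be illustrated with a toy target function. The sketch below is not how any refinement program computes its target: the function name `restrained_target`, the residual values, the ideal bond length (1.46 Å), and its sigma (0.02 Å) are all illustrative assumptions. It only shows that a model which fits the data slightly better by distorting a bond wins without restraints and loses once the geometry term is included.

```python
import math

def restrained_target(data_residuals, coords, bonds, weight):
    """Toy refinement target: data term + weight * geometry term.

    data_residuals: list of (observed - calculated) values (arbitrary units).
    coords: dict mapping atom name -> (x, y, z) in Angstrom.
    bonds: list of (atom_i, atom_j, ideal_length, sigma) restraints.
    weight: relative weight of the geometry term (0 disables restraints).
    """
    data_term = sum(r * r for r in data_residuals)
    geom_term = 0.0
    for atom_i, atom_j, ideal, sigma in bonds:
        d = math.dist(coords[atom_i], coords[atom_j])
        geom_term += ((d - ideal) / sigma) ** 2
    return data_term + weight * geom_term

# Illustrative restraint: one bond with an assumed ideal length and sigma.
bonds = [("N", "CA", 1.46, 0.02)]

# Model A: sensible geometry, slightly worse fit to the (noisy) data.
model_a = {"N": (0.0, 0.0, 0.0), "CA": (1.47, 0.0, 0.0)}
residuals_a = [0.5, -0.4, 0.3]

# Model B: fits the noise better, but only by stretching the bond to 1.60 A.
model_b = {"N": (0.0, 0.0, 0.0), "CA": (1.60, 0.0, 0.0)}
residuals_b = [0.3, -0.2, 0.1]

for w in (0.0, 1.0):
    t_a = restrained_target(residuals_a, model_a, bonds, w)
    t_b = restrained_target(residuals_b, model_b, bonds, w)
    preferred = "A" if t_a < t_b else "B"
    print(f"weight={w}: target(A)={t_a:.2f}, target(B)={t_b:.2f}, preferred={preferred}")
```

With the weight set to zero the distorted model B has the lower target; with the geometry term switched on, model A is preferred.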
True or false: Real-space refinement in COOT and reciprocal-space refinement in REFMAC/PHENIX optimize the same objective function.
Answer: False
They optimize different but complementary objective functions. Reciprocal-space refinement (REFMAC, phenix.refine) minimizes a target function in reciprocal space — the difference between observed and calculated structure factor amplitudes (|F_obs| - |F_calc|), typically using maximum likelihood targets that weight observations by their estimated uncertainty. It adjusts all atomic parameters simultaneously through gradient-based optimization. Real-space refinement (COOT, phenix.real_space_refine) optimizes the fit of the model to the electron density map in real space — maximizing the correlation between model density and observed density, typically for local regions (a few residues at a time) during manual rebuilding. The crystallographer uses COOT to correct errors (flipped peptides, wrong rotamers, misplaced loops) that reciprocal-space refinement cannot fix because they require large-scale coordinate changes that cross energy barriers in the optimization landscape. The two approaches are complementary: reciprocal-space refinement optimizes globally, real-space rebuilding fixes local errors.
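To make the contrast between the two kinds of score concrete, here is a minimal sketch of one simple measure from each space. It uses the conventional R-factor as a stand-in for the reciprocal-space target (the programs named above typically use maximum likelihood targets, which are not reproduced here) and a Pearson correlation between density values as the real-space score. The function names and the input numbers are assumptions made for illustration.

```python
import math

def r_factor(f_obs, f_calc):
    """Conventional R-factor: sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

def real_space_cc(rho_obs, rho_calc):
    """Pearson correlation between observed and model density samples."""
    n = len(rho_obs)
    mean_o = sum(rho_obs) / n
    mean_c = sum(rho_calc) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(rho_obs, rho_calc))
    var_o = sum((o - mean_o) ** 2 for o in rho_obs)
    var_c = sum((c - mean_c) ** 2 for c in rho_calc)
    return cov / math.sqrt(var_o * var_c)

# Reciprocal space: one number summarizing agreement over all reflections.
f_obs = [100.0, 50.0, 25.0]
f_calc = [90.0, 55.0, 25.0]
print("R-factor:", round(r_factor(f_obs, f_calc), 4))

# Real space: density sampled at grid points around a few residues only.
rho_obs = [0.2, 0.9, 1.4, 0.8, 0.1]
rho_calc = [0.1, 1.0, 1.5, 0.7, 0.2]
print("Real-space CC:", round(real_space_cc(rho_obs, rho_calc), 4))
```

The R-factor is computed over every reflection at once, while the correlation can be evaluated on any local set of grid points, which mirrors the global versus local character of the two approaches.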
Describe the typical iterative workflow of model refinement and explain why multiple rounds are usually required.
A typical refinement might involve 5-20 rounds, depending on the initial model quality and resolution. Automated pipelines (AutoBuild in PHENIX, Buccaneer in CCP4) can handle many rebuilding tasks computationally, but challenging regions (crystal contacts, active sites, ligand binding poses, disordered loops) usually require expert manual intervention in COOT. The quality of the final model depends critically on the crystallographer's skill in map interpretation.
What is the difference between an Fo-Fc difference map and a 2Fo-Fc map, and why are both needed during refinement?
The Fo-Fc map is particularly important for identifying ligands and water molecules: a strong positive Fo-Fc peak (> 3 sigma) in the active site after refining the protein alone indicates a bound molecule. The shape of this peak, combined with the biochemical context, guides ligand identification and placement. Many errors in published structures stem from insufficient attention to Fo-Fc maps during refinement.
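The 3 sigma criterion mentioned above can be sketched as a simple threshold on map values expressed in units of the map's standard deviation. This is a toy illustration, not the peak search used by COOT or any other program: the function name `positive_peaks`, the flat list standing in for a 3D grid, and the density values are all assumptions.

```python
import statistics

def positive_peaks(density, threshold=3.0):
    """Return (index, sigma_level) for grid points above `threshold` sigma.

    density: flat list of difference-map values (a stand-in for a 3D grid).
    The sigma level is (value - mean) / standard deviation of the whole map.
    """
    mean = statistics.fmean(density)
    sigma = statistics.pstdev(density)
    if sigma == 0.0:
        return []  # a perfectly flat map has no peaks
    peaks = []
    for index, value in enumerate(density):
        level = (value - mean) / sigma
        if level > threshold:
            peaks.append((index, level))
    return peaks

# Hypothetical difference map: flat background with one strong positive feature,
# as might appear where an unmodeled ligand or water sits.
toy_map = [0.0] * 99 + [10.0]

for index, level in positive_peaks(toy_map):
    print(f"grid point {index}: {level:.1f} sigma")
```

In practice the flagged grid points would be clustered into connected blobs and inspected visually, since the shape of the density matters as much as its height.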