Questions: Quantile Regression and Distributional Effects
5 questions to test your understanding
Question 1 Multiple Choice
A researcher runs OLS on wages vs. education and finds a coefficient of 8%. She then runs quantile regression and finds coefficients of 3% at the 10th percentile, 7% at the 50th percentile, and 18% at the 90th percentile. What should she conclude that OLS alone could not tell her?
A. OLS is biased in this dataset and should be discarded in favor of quantile regression
B. The return to education is much larger for high earners than for low earners: education disproportionately benefits those already near the top of the conditional wage distribution
C. High-earning workers received more years of education on average, explaining the higher coefficient
D. The 90th percentile estimate is driven by outliers in the data, which is why it differs from OLS
Answer: B
The rising coefficients across quantiles reveal distributional heterogeneity that OLS averages away. Controlling for other factors, an additional year of education raises wages by 3% for workers near the bottom of the conditional wage distribution but by 18% near the top. OLS's 8% describes the effect on the conditional mean, which can loosely be read as an average over these quantile effects. The finding has real policy implications: if education mainly benefits already-advantaged workers, it may widen inequality even while raising average wages. Option A is wrong because differing quantile coefficients do not by themselves show that OLS is biased; OLS simply answers a different question. Option D is wrong because quantile regression minimizes an asymmetrically weighted absolute loss, which makes it less sensitive to outliers in Y than OLS, not more.
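The asymmetric loss mentioned above is usually called the check (or pinball) loss. The sketch below is a minimal illustration with no covariates, using a made-up list of wages: the constant that minimizes total pinball loss at a given quantile level is the corresponding sample quantile, and one extreme value leaves it unchanged. The names `pinball_loss`, `best_constant`, and the data are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss: weight tau on positive residuals, (1 - tau) on negative ones."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def best_constant(values, tau):
    """Return the sample value that minimizes the total pinball loss."""
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


# Hypothetical hourly wages; the last entry is an extreme value.
wages = [9, 10, 11, 12, 13, 14, 15, 16, 18, 20, 250]

q10 = best_constant(wages, 0.10)
q50 = best_constant(wages, 0.50)
q90 = best_constant(wages, 0.90)
print(q10, q50, q90)

# Shrinking the extreme value does not move any of the three minimizers.
wages_trimmed = wages[:-1] + [25]
same = [best_constant(wages_trimmed, t) for t in (0.10, 0.50, 0.90)]
print(same)
```

Quantile regression applies the same loss to residuals from a linear fit rather than to deviations from a constant.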
Question 2 Multiple Choice
A policymaker needs to know whether a job training program 'lifts the floor' for low-wage workers or primarily benefits middle-earners. Which approach best answers this question?
A. OLS regression of wages on program participation, controlling for demographics
B. Median regression only, since the median is more robust than the mean
C. Quantile regression at multiple quantiles (e.g., 10th, 25th, 50th, 75th), comparing coefficients across the distribution
D. Split the sample into low-wage and high-wage workers and run separate OLS regressions
Answer: C
The question is whether effects differ across the wage distribution, which is precisely what quantile regression reveals. Running QR at multiple quantiles shows whether the program coefficient is larger at the 10th percentile (lifting the floor) or at the 50th (helping the middle). OLS gives only an average effect. Median regression (option B) is more robust than OLS but still estimates only one point in the distribution. Splitting the sample (option D) compares different subgroups of workers, not points in the conditional distribution, which is a fundamentally different question. Because that split is made on the outcome itself, the separate OLS estimates can also be distorted by selection on the dependent variable.
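A minimal sketch of the multi-quantile comparison, under a simplifying assumption: with a single binary regressor and no other covariates, the quantile regression coefficient at a given quantile equals the difference between the two groups' sample quantiles (up to the quantile convention used). The wage lists and the helper `sample_quantile` are hypothetical, and the order-statistic convention here is one of several in use.

```python
import math


def sample_quantile(values, percent):
    """Sample quantile as an order statistic (a minimizer of the check loss)."""
    ordered = sorted(values)
    index = max(math.ceil(percent * len(ordered) / 100) - 1, 0)
    return ordered[index]


# Hypothetical wages for non-participants and participants (made-up data).
control = [8, 9, 10, 11, 12, 14, 16, 18, 20, 24,
           28, 32, 36, 40, 45, 50, 56, 62, 70, 80]
treated = [12, 13, 14, 14, 15, 16, 18, 19, 21, 25,
           29, 33, 36, 40, 45, 50, 56, 62, 70, 80]

# Program "coefficient" at each quantile: treated quantile minus control quantile.
effects = {
    p: sample_quantile(treated, p) - sample_quantile(control, p)
    for p in (10, 25, 50, 75)
}

# The OLS coefficient on a lone dummy is the difference in group means.
mean_effect = sum(treated) / len(treated) - sum(control) / len(control)

print(effects)
print(round(mean_effect, 2))
```

In this made-up data the effect shrinks as the quantile rises, which is the "lifts the floor" pattern; the single mean difference cannot show that. With additional covariates, a quantile regression routine from a statistics package would be needed instead of group quantiles.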
Question 3 True / False
The 75th percentile estimated by quantile regression of wages on education gives the 75th percentile of wages in the overall population.
T. True
F. False
Answer: False
Quantile regression estimates *conditional* quantiles, not unconditional ones. The fit at the 0.75 quantile describes the 75th percentile of wages *among workers with a given education level*. A worker at the 75th conditional percentile for one education level is not necessarily at the 75th percentile of all wages in the population, because the unconditional distribution mixes the conditional distributions according to how many workers have each education level. This distinction matters for policy interpretation: QR tells you about effects within education groups, not about the ranking of individuals in the full population.
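The distinction can be seen in a small made-up example with two education groups. The wage lists, group labels, and the helper `sample_quantile` are hypothetical; the sketch only compares within-group 75th percentiles with the pooled one.

```python
import math


def sample_quantile(values, percent):
    """Sample quantile as an order statistic."""
    ordered = sorted(values)
    index = max(math.ceil(percent * len(ordered) / 100) - 1, 0)
    return ordered[index]


# Hypothetical wages by education level (made-up data).
low_edu = [10, 12, 14, 16, 18, 20, 22, 24]
high_edu = [20, 25, 30, 35, 40, 45, 50, 55]
everyone = low_edu + high_edu

q75_low = sample_quantile(low_edu, 75)    # conditional on low education
q75_high = sample_quantile(high_edu, 75)  # conditional on high education
q75_all = sample_quantile(everyone, 75)   # unconditional

# Where does the low-education worker at the conditional 75th percentile
# stand in the pooled population?
rank_in_population = sum(w <= q75_low for w in everyone) / len(everyone)

print(q75_low, q75_high, q75_all)
print(rank_in_population)
```

Here the worker at the 75th conditional percentile of the low-education group sits below the median of the pooled sample, so neither conditional quantile coincides with the population's 75th percentile.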
Question 4 True / False
Median regression (quantile regression at the 50th percentile) is more robust to outliers in Y than OLS.
T. True
F. False
Answer: True
OLS minimizes the sum of *squared* residuals, so large residuals are penalized quadratically and a single very high or very low value of Y can shift the fit dramatically. Median regression minimizes the sum of *absolute* deviations, so outliers receive only linear weight. As a result, extreme Y values influence the median much less than they influence the mean. This robustness is especially valuable when the response variable is skewed or contains measurement errors that produce occasional extreme values, which is common in wage, income, and expenditure data.
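The simplest case makes the point concrete. With no covariates, the OLS fit is the sample mean and the median-regression fit is the sample median, so the sketch below compares how each reacts to one corrupted observation. The wage values are made up for illustration.

```python
import statistics

# Hypothetical wages (made-up data).
clean = [18, 20, 21, 22, 23, 24, 26]

# Same data with one entry error: 26 recorded as 2600.
contaminated = [18, 20, 21, 22, 23, 24, 2600]

mean_shift = statistics.mean(contaminated) - statistics.mean(clean)
median_shift = statistics.median(contaminated) - statistics.median(clean)

print(round(mean_shift, 1))  # intercept-only OLS fit moves a lot
print(median_shift)          # intercept-only median fit does not move
```

Note that this robustness concerns outliers in Y, as in the question; unusual values of the regressors can still exert leverage on a quantile regression fit.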
Question 5 Short Answer
Why should quantile regression be used as a complement to OLS rather than a replacement? What specific question does each method answer?
Model answer: OLS estimates the effect of X on the *conditional mean* of Y, that is, the average outcome for people with given characteristics. Quantile regression estimates the effect of X on a specific *conditional quantile* of Y. They answer different questions. OLS answers: 'On average, how does Y change with X?' Quantile regression answers: 'How does the relationship between X and Y differ across the conditional distribution of Y?' If QR coefficients are similar across quantiles, OLS captures the story efficiently. If they differ substantially, OLS hides important distributional heterogeneity: education may raise average wages while primarily benefiting top earners, or a policy may help the poor but hurt the middle class. These are patterns that OLS alone cannot reveal.
Neither method is universally superior. OLS is typically more efficient (lower variance) when effects are homogeneous across the distribution and the errors are not heavy-tailed. QR provides richer information when effects vary. The complementary approach, running both and comparing, checks whether the OLS average is misleading. If QR coefficients are roughly flat across quantiles, allowing for sampling error, the OLS estimate summarizes the relationship well. If they vary systematically, distributional heterogeneity is present, and the size of the variation indicates whether it is economically significant.
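The two questions correspond to two different minimization problems. The notation below is an assumption introduced here for illustration (it does not appear in the quiz): $y_i$ is the outcome, $x_i$ the regressor vector, $\beta$ the coefficient vector, and $\tau \in (0,1)$ the quantile level.

```latex
% OLS: squared loss, targets the conditional mean
\hat{\beta}_{\mathrm{OLS}} = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^{2}

% Quantile regression: check loss, targets the conditional tau-quantile
\hat{\beta}(\tau) = \arg\min_{\beta} \sum_{i=1}^{n} \rho_{\tau}\!\left( y_i - x_i^{\top}\beta \right),
\qquad
\rho_{\tau}(u) = u \left( \tau - \mathbf{1}\{u < 0\} \right)
```

At $\tau = 0.5$ the check loss reduces to $\rho_{0.5}(u) = |u|/2$, so the quantile regression problem becomes median regression, which minimizes the sum of absolute deviations.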