A dataset has the five-number summary: min=10, Q1=20, median=25, Q3=50, max=90. What does this suggest about the distribution's shape?
A. Symmetric: the quartiles are evenly spaced from the median
B. Left-skewed: the lower tail is longer than the upper tail
C. Right-skewed: the gap above the median (25 to 90) is larger than the gap below (10 to 25)
D. Bimodal: the large range between Q3 and max indicates two clusters
Answer: C
The spacing between the five values reveals skewness. Below the median, the spread from min to median is 25 − 10 = 15. Above the median, the spread from median to max is 90 − 25 = 65. The upper half covers a much wider range, so the distribution is right-skewed (long upper tail). The quartile gaps agree: Q3 to max spans 40 units versus 10 units for min to Q1, and median to Q3 spans 25 units versus 5 units for Q1 to median. Skewness is read from the relative spacing of the summary values, not from the values themselves, so no formula is needed.
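The gap comparison above can be sketched in a few lines of Python. This is only an illustration: the function name `describe_spacing` and the simple "larger half wins" rule are assumptions made for the example, not a standard test for skewness.

```python
def describe_spacing(minimum, q1, median, q3, maximum):
    """Return the four interval widths and a rough shape label.

    The label uses a deliberately crude rule (assumed for this sketch):
    compare the min-to-median spread with the median-to-max spread.
    """
    widths = {
        "min_to_q1": q1 - minimum,
        "q1_to_median": median - q1,
        "median_to_q3": q3 - median,
        "q3_to_max": maximum - q3,
    }
    lower_spread = median - minimum
    upper_spread = maximum - median
    if upper_spread > lower_spread:
        shape = "right-skewed"
    elif lower_spread > upper_spread:
        shape = "left-skewed"
    else:
        shape = "roughly symmetric"
    return widths, shape


# Five-number summary from the question: 10, 20, 25, 50, 90
widths, shape = describe_spacing(10, 20, 25, 50, 90)
print(widths)
print(shape)
```

For the summary in the question, the widths come out as 10, 5, 25, and 40, and the label is right-skewed. Real data near symmetry would call for a tolerance rather than a strict comparison.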
Question 2 Multiple Choice
A researcher reports salary data for 500 employees, including a CEO earning $12 million while most employees earn $50,000–$80,000. Which summary is most useful, and why?
A. Mean and standard deviation: they use all data points and are more precise
B. Five-number summary: the median and quartiles are resistant to the CEO's extreme salary
C. Mean and five-number summary both: they give identical pictures with large samples
D. Five-number summary, but only after removing the outlier from the dataset
Answer: B
The mean and standard deviation are pulled strongly by the $12M outlier, giving a misleadingly high 'average' salary. The median and IQR are based on rank, not magnitude, so the CEO's salary sets the maximum but can move Q1, the median, and Q3 by no more than one rank position among 500 values. The five-number summary gives an honest picture of the typical employee's salary. Option D reflects a common mistake: outliers should not be removed just because they distort the mean. The five-number summary handles this without deletion.
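A small Python sketch can make the contrast concrete. The payroll below is invented for illustration (499 salaries spread evenly over $50,000 to $80,000, plus the $12M CEO); it is not the researcher's data, and the names `staff` and `payroll` are hypothetical.

```python
from statistics import mean, median

# Hypothetical payroll: 499 employees spread evenly over $50,000-$80,000
# (made-up values for illustration only), plus one $12M CEO.
staff = [50_000 + round(30_000 * i / 498) for i in range(499)]
payroll = staff + [12_000_000]

print("without CEO: mean =", mean(staff), "median =", median(staff))
print("with CEO:    mean =", mean(payroll), "median =", median(payroll))
```

Under these assumed values the mean rises from $65,000 to $88,870, which is higher than every non-CEO salary, while the median moves only from $65,000 to $65,030.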
Question 3 True / False
In the five-number summary, each of the four intervals [min, Q1], [Q1, median], [median, Q3], and [Q3, max] contains the same number of data points.
T. True
F. False
Answer: True
This is the defining property of quartiles. Q1 is the 25th percentile, meaning 25% of the data falls below it. The median is the 50th percentile and Q3 is the 75th percentile. So each of the four intervals holds 25% of the observations (approximately so in small samples or when several values tie at a quartile). The intervals may differ widely in width (range of values), and a skewed distribution will have unequal widths, but the frequency in each interval is equal. Confusing width with frequency is the central misconception about the five-number summary.
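The difference between width and frequency can be checked on a small made-up dataset. The 11 values below are hypothetical, chosen so that the quartiles land on data points under the default ("exclusive") method of Python's `statistics.quantiles`; other quartile conventions can give slightly different cut points.

```python
from statistics import quantiles

# Hypothetical right-skewed sample (illustrative values only).
data = [10, 14, 20, 22, 24, 25, 30, 40, 50, 70, 90]

# Default method="exclusive"; with 11 values the cut points fall on
# the 3rd, 6th, and 9th sorted observations.
q1, med, q3 = quantiles(data, n=4)

# Observations strictly inside each interval (the quartiles themselves
# sit on the boundaries and are not counted).
interval_counts = [
    sum(x < q1 for x in data),
    sum(q1 < x < med for x in data),
    sum(med < x < q3 for x in data),
    sum(x > q3 for x in data),
]

# Range of values covered by each interval.
interval_widths = [q1 - min(data), med - q1, q3 - med, max(data) - q3]

print("counts:", interval_counts)
print("widths:", interval_widths)
```

Each interval holds the same number of observations (two here), yet the widths are 10, 5, 25, and 40: equal frequency, unequal spread.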
Question 4 True / False
Adding one very large outlier to a dataset will significantly shift the median and Q1 upward.
T. True
F. False
Answer: False
The median and quartiles are resistant (robust) to outliers because they are based on rank position, not magnitude. Adding one extreme value changes the rank ordering only slightly: it shifts the median by at most one rank position, which in a large dataset is a negligible change in value. For Q1 to move significantly, many values near Q1 would need to change. An outlier primarily affects the maximum (and possibly Q3 in small datasets) but leaves the middle of the distribution intact. This robustness is precisely why the five-number summary is preferred when outliers are possible.
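As a rough check of this claim, the sketch below compares the quartiles of a made-up dataset before and after one extreme value is appended. The values 1 to 99 and the outlier 10,000 are assumptions chosen for illustration, and the quartiles use the default method of `statistics.quantiles`.

```python
from statistics import quantiles

# Hypothetical data: the integers 1..99, then the same data plus one
# very large outlier (both invented for this example).
base = list(range(1, 100))
with_outlier = base + [10_000]

q1_before, med_before, q3_before = quantiles(base, n=4)
q1_after, med_after, q3_after = quantiles(with_outlier, n=4)

print("Q1:    ", q1_before, "->", q1_after)
print("median:", med_before, "->", med_after)
print("Q3:    ", q3_before, "->", q3_after)
print("max:   ", max(base), "->", max(with_outlier))
```

In this example each quartile moves by less than one unit, while the maximum jumps from 99 to 10,000.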
Question 5 Short Answer
How does the five-number summary reveal the skewness of a distribution without computing any formula?
Model answer: By comparing the spacing between the five values. If the gap from the median to Q3 (and from Q3 to the max) is larger than the gap from the median to Q1 (and from Q1 to the min), the distribution is right-skewed. If the gap below the median is larger, it is left-skewed. Equal spacing suggests symmetry. Since each interval contains the same number of data points, unequal spacing means some quarter of the data is stretched out over a larger range of values — which is exactly what skewness means.
This is the power of a rank-based summary: skewness shows up directly as asymmetric spacing rather than requiring calculation of a skewness statistic. A right-skewed distribution bunches data on the left and trails far to the right, so the upper intervals are wide while the lower intervals are narrow — visible immediately from the five numbers.