Questions: Descriptive Statistics: Summarizing Data
5 questions to test your understanding
Question 1 Multiple Choice
A company reports its employees' average salary as $85,000, but a typical employee earns only $52,000. What most likely explains this gap?
A. The mean was calculated incorrectly
B. A few very high executive salaries pull the mean far above the median
C. The median is not a valid measure of center for salary data
D. The standard deviation is unusually low, compressing the mean
Answer: B
In right-skewed distributions like income, a small number of very large values pull the mean upward while the median stays near the bulk of the data. Here $85,000 is the mean (sensitive to extreme values) and $52,000 is the median (resistant to them). The gap between the two is a diagnostic signal that the distribution is skewed, and the median better represents a typical employee's salary.
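The gap between the two measures can be reproduced with a small sketch. The salary list below is hypothetical, chosen only so that the mean and median match the figures in the question; it is not data from any real company.

```python
import statistics

# Hypothetical salaries (USD): nine staff members and one highly paid executive.
salaries = [45_000, 48_000, 50_000, 51_000, 52_000,
            52_000, 54_000, 56_000, 60_000, 382_000]

mean_salary = statistics.mean(salaries)      # pulled upward by the 382,000 value
median_salary = statistics.median(salaries)  # middle of the sorted values

print(f"mean:   {mean_salary:,.0f}")    # mean:   85,000
print(f"median: {median_salary:,.0f}")  # median: 52,000
```

Nine of the ten hypothetical employees earn less than the mean, which is why the median is the better description of a typical salary here.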
Question 2 Multiple Choice
A dataset of 10 values has a mean of 50 and a standard deviation of 3. One additional value of 500 is added. What happens?
A. Both the mean and median increase substantially
B. The mean increases substantially, but the median increases only slightly
C. The median increases substantially, but the mean increases only slightly
D. Neither the mean nor median changes because 500 is an outlier and is excluded
Answer: B
The mean is the balance point of all values, so the extreme value of 500 pulls it sharply upward: the total rises from 500 to 1,000 while the count rises from 10 to 11, moving the mean from 50 to about 90.9. The median, however, depends only on rank order. Adding one large value shifts the middle position by at most one spot, so the median changes only slightly. This illustrates the key contrast: the mean is sensitive to outliers, while the median is resistant. Option D is wrong because outliers are not automatically excluded from standard calculations.
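A quick check of this scenario, as a sketch: the ten values below are hypothetical, picked so that their mean is 50 and their population standard deviation is 3. The question does not specify the actual values, so the exact size of the median shift depends on this choice.

```python
import statistics

# Hypothetical data: ten values with mean 50 and population SD 3.
values = [45, 47, 47, 49, 49, 51, 51, 53, 53, 55]
with_outlier = values + [500]

mean_before = statistics.mean(values)
median_before = statistics.median(values)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(f"mean:   {mean_before:.1f} -> {mean_after:.1f}")      # mean:   50.0 -> 90.9
print(f"median: {median_before:.1f} -> {median_after:.1f}")  # median: 50.0 -> 51.0
```

With these values the mean jumps by about 41 while the median moves by 1, from the midpoint of the 5th and 6th sorted values to the 6th value alone.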
Question 3 True / False
For a strongly right-skewed distribution, the mean is typically greater than the median.
T. True
F. False
Answer: True
In a right-skewed distribution, the long tail stretches toward high values. Those extreme values pull the mean, which balances all observations, upward toward the tail. The median is determined only by rank order and is resistant to extreme values, so it stays closer to the bulk of the data. As a rule of thumb, mean > median suggests right skew and mean < median suggests left skew, although the rule can fail for some multimodal or discrete distributions.
Question 4 True / False
Dividing by n (rather than n − 1) when computing sample variance gives an unbiased estimate of the population variance.
TTrue
FFalse
Answer: False
Dividing by n produces a biased estimator that systematically underestimates population variance. This happens because the sample mean x̄ is computed from the same data, causing the sum of squared deviations to be slightly smaller than it would be around the true population mean μ. Dividing by n − 1 (Bessel's correction) compensates for this bias. Division by n is appropriate only when computing variance for a complete population, not a sample.
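The bias can be seen in a small simulation, sketched here under stated assumptions: the population is normal with variance 4, the sample size is 5, and the seed, trial count, and helper name `sum_squared_deviations` are illustrative choices. In theory the divide-by-n estimator averages (n − 1)/n times the population variance, which is 3.2 in this setup, while the divide-by-(n − 1) estimator averages 4.

```python
import random

def sum_squared_deviations(sample):
    """Sum of squared deviations from the sample mean."""
    mean = sum(sample) / len(sample)
    return sum((x - mean) ** 2 for x in sample)

random.seed(0)   # fixed seed so the run is repeatable
N = 5            # small samples make the bias easy to see
TRIALS = 20_000  # number of simulated samples

total_div_n = 0.0
total_div_n_minus_1 = 0.0
for _ in range(TRIALS):
    # Draw one sample from a normal population with sigma = 2 (variance 4).
    sample = [random.gauss(0.0, 2.0) for _ in range(N)]
    ss = sum_squared_deviations(sample)
    total_div_n += ss / N
    total_div_n_minus_1 += ss / (N - 1)

avg_div_n = total_div_n / TRIALS
avg_div_n_minus_1 = total_div_n_minus_1 / TRIALS

print(f"average estimate dividing by n:     {avg_div_n:.2f}")
print(f"average estimate dividing by n - 1: {avg_div_n_minus_1:.2f}")
```

The first average should land near 3.2 and the second near 4, up to simulation noise.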
Question 5 Short Answer
Why is the standard deviation preferred over the variance as a reported measure of spread, and what does a 'large' standard deviation actually mean?
Model answer: Standard deviation is the square root of variance, which restores the original units of measurement. Variance is in squared units (e.g., dollars² for income data), making it hard to interpret directly. A 'large' standard deviation means observations are widely dispersed around the mean; a 'small' one means they cluster tightly. Whether a standard deviation is 'large' is always context-dependent: an SD of 5 is large if the mean is 10, but negligible if the mean is 10,000.
This targets practical understanding: statistics are tools for communication, and units matter. The transition from variance to standard deviation is not cosmetic — it is what makes the number interpretable in the domain's original units. The follow-up about 'large' targets the common error of evaluating spread in isolation from the scale of the data.
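One way to make the scale comparison in the model answer concrete is to express the SD as a fraction of the mean, a ratio often called the coefficient of variation. The question itself does not name this ratio, so treat it as an illustrative addition; the helper below is a hypothetical sketch that assumes a positive mean.

```python
def coefficient_of_variation(sd, mean):
    """Return the SD as a fraction of the mean (assumes mean > 0)."""
    return sd / mean

cv_small_mean = coefficient_of_variation(5, 10)      # SD of 5 around a mean of 10
cv_large_mean = coefficient_of_variation(5, 10_000)  # SD of 5 around a mean of 10,000

print(f"{cv_small_mean:.2%}")  # 50.00%
print(f"{cv_large_mean:.2%}")  # 0.05%
```

The same SD of 5 is half the mean in the first case and one two-thousandth of it in the second.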