A density histogram has two adjacent bins. Bin A has width 5 and height 0.04. Bin B has width 10 and height 0.04. Which bin contains more data?
A. Bin A, because it has the same height but is narrower, indicating more concentrated data
B. Both bins contain the same proportion of data because they have the same height
C. Bin B, because its area (10 × 0.04 = 0.40) is twice Bin A's area (5 × 0.04 = 0.20)
D. Cannot be determined without knowing the total sample size
Answer: C
In a density histogram, the AREA of each bar (width × height), not the height alone, represents the proportion of data in that bin. Bin A: area = 5 × 0.04 = 0.20 (20% of data). Bin B: area = 10 × 0.04 = 0.40 (40% of data). Bin B contains twice as much data despite having the same height. This is the critical insight: equal heights do NOT mean equal frequencies when bin widths differ. Density = relative frequency / bin width, so height is a rate (proportion per unit of width), not a proportion or a count.
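The arithmetic in this explanation can be checked with a short sketch. The dictionary layout and the helper name `bin_proportion` are illustrative choices, not part of the question.

```python
# Bins from the question as (width, density height). The names are illustrative.
bins = {"A": (5, 0.04), "B": (10, 0.04)}


def bin_proportion(width, height):
    """Proportion of data in a density-histogram bin: area = width * height."""
    return width * height


proportions = {name: bin_proportion(w, h) for name, (w, h) in bins.items()}

# Same height, different widths: compare the areas, not the heights.
ratio = proportions["B"] / proportions["A"]
print(proportions, ratio)
```

The two bars share a height of 0.04, yet the ratio of their areas is 2, which is the comparison the question asks for.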
Question 2 Multiple Choice
A histogram of household incomes in a city is strongly right-skewed. Which statement best describes the relationship between the mean and median?
A. The mean equals the median, because both measure the center of the distribution
B. The median exceeds the mean, because more families are below average than above
C. The mean exceeds the median, because a few very high incomes pull the mean upward
D. They cannot be compared without knowing the exact distribution
Answer: C
In a right-skewed distribution, a small number of very large values pull the mean to the right of the bulk of the data. The median, which depends only on rank order, is resistant to these extreme values and better represents a 'typical' income. This is why median household income is typically reported rather than mean income: a handful of extremely high incomes inflates the mean. The long right tail of income data is a classic example of right skew.
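A small numerical sketch shows how one extreme value moves the mean but barely moves the median. The income figures below are invented for illustration and do not come from any real city.

```python
from statistics import mean, median

# Hypothetical household incomes in thousands; the last value is one very high earner.
incomes = [28, 35, 41, 47, 52, 58, 66, 75, 90, 1200]

income_mean = mean(incomes)      # pulled upward by the single large value
income_median = median(incomes)  # depends only on the middle ranks

# Drop the extreme value to see how much each summary changes.
trimmed = incomes[:-1]
trimmed_mean = mean(trimmed)
trimmed_median = median(trimmed)

print(income_mean, income_median)
print(trimmed_mean, trimmed_median)
```

With the high earner included, the mean is about three times the median. Removing that one household cuts the mean by roughly two thirds, while the median shifts only from 55 to 52.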
Question 3 True / False
Changing the bin width of a histogram built from the same data can make the distribution appear symmetric or skewed.
T. True
F. False
Answer: True
Bin width is a major determinant of apparent distribution shape. Too few wide bins compress variation and can mask skewness. Too many narrow bins create jagged noise that obscures the underlying shape. With intermediate bins, the same dataset can appear roughly symmetric or noticeably skewed depending on where bin boundaries fall. This is why analysts try multiple bin widths before drawing conclusions about shape. The choice of bin width is a modeling decision, not an objective fact about the data.
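The effect of bin width can be illustrated with a tiny sketch. The data values and the helper `bin_counts` are made up for this example; the point is only that the same values yield different apparent shapes under two bin widths.

```python
def bin_counts(data, start, width, n_bins):
    """Count values in n_bins equal-width bins [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for x in data:
        i = int((x - start) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


# Hypothetical measurements, chosen to make the effect easy to see.
data = [2.5, 3.0, 3.5, 4.2, 4.6, 5.0, 5.4, 5.8, 7.0, 8.5, 9.5, 11.0]

wide = bin_counts(data, start=0, width=4, n_bins=3)
narrow = bin_counts(data, start=0, width=2, n_bins=6)

print("width 4:", wide)    # width 4: [3, 6, 3]
print("width 2:", narrow)  # width 2: [0, 3, 5, 1, 2, 1]
```

With three wide bins the counts are mirror-symmetric. With six narrower bins the same values show a peak between 4 and 6 and a tail stretching to the right.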
Question 4 True / False
In a density histogram, the height of each bar represents the proportion of data values in that bin.
T. True
F. False
Answer: False
In a density histogram, the AREA of each bar (height × width) represents the proportion, not the height alone. Height represents density (proportion per unit of the measured variable). The distinction changes how bars compare with one another only when bins have unequal widths: if all bins are the same width, height is proportional to area, so the relative heights match the relative proportions. Even then, a bar's height equals its proportion only when the bin width is exactly 1. The density scaling ensures that all bar areas sum to 1.0, making density histograms directly comparable to probability density functions.
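The density scaling can be sketched in a few lines of plain Python. The data, the bin edges, and the helper name `density_heights` are assumptions made for illustration only.

```python
def density_heights(data, edges):
    """Density height per bin [lo, hi): (count / n) / (hi - lo)."""
    n = len(data)
    heights = []
    for lo, hi in zip(edges, edges[1:]):
        count = sum(1 for x in data if lo <= x < hi)
        heights.append(count / n / (hi - lo))
    return heights


# Hypothetical data and unequal-width bins (widths 5, 5, and 10).
data = [1, 2, 3, 4, 6, 7, 12, 13, 17, 19]
edges = [0, 5, 10, 20]

heights = density_heights(data, edges)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
areas = [h * w for h, w in zip(heights, widths)]  # proportion of data per bin

print(heights)
print(areas, sum(areas))
```

In this sketch the second and third bars have the same height (about 0.04), but their areas are about 0.2 and 0.4, and the three areas together sum to 1.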
Question 5 Short Answer
Why does a gap in a histogram have a specific and meaningful interpretation, unlike gaps in a bar chart for categorical data?
Model answer: A histogram represents a continuous numerical variable divided into adjacent bins covering a contiguous range. Because the bins are adjacent and cover every value in the range without overlap, a gap (a bin with zero height) means no observations fell in that interval. It is not a display artifact or a missing category; it is a real absence of data in that range. In a bar chart for categorical data, the categories have no numeric position on the axis and the bars are separated by convention, so the spaces between bars carry no information about the data. In a histogram, the spatial position of each bar on the number line is meaningful.
This is why histograms have touching bars (no spaces between them) while categorical bar charts typically have spaces. The touching-bar convention signals that the x-axis is continuous and that any visible gap genuinely represents a data-free interval.
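As a rough sketch of this idea, the snippet below bins a small made-up dataset and reports the intervals that received no observations. The helper name `empty_bins` and the values are hypothetical.

```python
def empty_bins(data, start, width, n_bins):
    """Return the (lo, hi) intervals of equal-width bins that contain no data."""
    counts = [0] * n_bins
    for x in data:
        i = int((x - start) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return [
        (start + i * width, start + (i + 1) * width)
        for i, c in enumerate(counts)
        if c == 0
    ]


# Hypothetical measurements with nothing between 4 and 8.
values = [1, 2, 2, 3, 8, 9, 9]

gaps = empty_bins(values, start=0, width=2, n_bins=5)
print(gaps)  # [(4, 6), (6, 8)]
```

Each reported interval is a stretch of the number line where no value was observed, which is exactly what a visible gap in the histogram would show.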