A bag contains 5 red and 5 blue marbles. You draw 3 marbles one at a time without replacement. Can the binomial distribution model the number of red marbles you draw?
A. Yes: there are a fixed number of trials (3) and two outcomes (red or not red) per draw
B. No: drawing without replacement changes the probability of success on each trial, violating the independence requirement
C. Yes: as long as you count only two outcomes per trial, binomial applies
D. No: binomial requires more than 3 trials to produce a valid distribution
The binomial distribution requires a fixed number of trials, independent trials, and a constant success probability p. Drawing without replacement satisfies the first condition but violates the other two: the probability of drawing red depends on what has already been drawn (after one red is drawn, only 4 of the 9 remaining marbles are red, so the chance of red on the next draw falls from 5/10 to 4/9), which means the draws are not independent. The correct model for sampling without replacement from a finite population is the hypergeometric distribution. This is one of the most common errors in applying the binomial: the setup looks right (binary outcomes, a fixed number of draws), but the dependence created by sampling without replacement disqualifies it.
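To see how far off the binomial would be for this bag, the sketch below computes both distributions exactly. The function names `hypergeom_pmf` and `binom_pmf` are illustrative helpers written for this example, not part of any library, and p = 1/2 is simply the first-draw probability 5/10 that a mistaken binomial model would assume.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, total, reds, draws):
    """P(exactly k reds) when drawing `draws` marbles without replacement."""
    return Fraction(comb(reds, k) * comb(total - reds, draws - k), comb(total, draws))

def binom_pmf(k, n, p):
    """P(exactly k successes) in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Bag from the question: 10 marbles, 5 of them red, 3 draws.
hyper = [hypergeom_pmf(k, 10, 5, 3) for k in range(4)]
# What a (mistaken) binomial model with p = 5/10 would give.
binom = [binom_pmf(k, 3, Fraction(1, 2)) for k in range(4)]

for k in range(4):
    print(f"k={k}: hypergeometric={hyper[k]}, binomial={binom[k]}")
```

The two models share the same mean here, but the hypergeometric puts less weight on the extremes (1/12 versus 1/8 for three reds), because each red drawn makes the next red less likely.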
Question 2 Multiple Choice
In the binomial PMF, P(X = k) = C(n,k) × p^k × (1−p)^(n−k), what does the binomial coefficient C(n,k) count?
A. The probability that the first k trials all succeed
B. The number of distinct arrangements of k successes among n trials
C. The expected number of successes in n trials with probability p
D. The ratio of the probability of k successes to the probability of k failures
C(n,k), read 'n choose k', counts the number of distinct ways to place k successes among n positions. Each specific sequence of k successes and (n−k) failures has probability p^k × (1−p)^(n−k) by the multiplication rule for independent events. The C(n,k) sequences are mutually exclusive and all have this same probability (because trials are independent and identically distributed), so adding their probabilities is the same as multiplying the common value by C(n,k). C(n,k) is not a probability itself; it is a count of equally likely arrangements. Understanding why C(n,k) appears is the key to deriving the formula from first principles rather than memorizing it.
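The counting argument can be checked by brute force. The sketch below lists every success/failure sequence of length n, keeps those with exactly k successes, and compares the count with C(n,k). The values n = 5, k = 2, p = 0.3 and the helper name `sequences_with_k_successes` are arbitrary choices made for illustration.

```python
from itertools import product
from math import comb, isclose

def sequences_with_k_successes(n, k):
    """All length-n sequences of 0 (failure) / 1 (success) with exactly k successes."""
    return [seq for seq in product((0, 1), repeat=n) if sum(seq) == k]

n, k, p = 5, 2, 0.3  # illustrative values only

seqs = sequences_with_k_successes(n, k)

# Every such sequence has the same probability under independence.
per_sequence = p**k * (1 - p) ** (n - k)

# Adding the probabilities of the mutually exclusive sequences gives P(X = k).
pmf_by_enumeration = len(seqs) * per_sequence
pmf_by_formula = comb(n, k) * p**k * (1 - p) ** (n - k)

print(len(seqs), comb(n, k))
print(pmf_by_enumeration, pmf_by_formula)
```

Enumeration grows as 2^n, so it is only practical for small n; the point is that the coefficient is nothing more than the length of that list.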
Question 3 True / False
The variance of a binomial distribution with parameters n and p is largest when p = 0.5.
T. True
F. False
Answer: True
The binomial variance is np(1−p). For fixed n, this is maximized by maximizing p(1−p). The derivative of p(1−p) with respect to p is 1 − 2p, which is zero at p = 0.5, and the second derivative (−2) is negative, so this is a maximum. Intuitively, this makes sense: when p is near 0 or 1, you are nearly certain of the outcome on each trial (almost always failure or almost always success), so there is little uncertainty and thus low variance. When p = 0.5, each trial is maximally uncertain, and the distribution of successes is most spread out. This is also why the Bernoulli distribution, the n = 1 case, has maximum variance at p = 0.5.
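A quick numerical check of the same claim, as a sketch: evaluate np(1−p) on a grid of p values and see where it peaks. The choice n = 20 and the grid spacing of 0.01 are arbitrary assumptions for illustration.

```python
def binomial_variance(n, p):
    """Variance of a binomial distribution with parameters n and p."""
    return n * p * (1 - p)

n = 20  # illustrative value; the location of the peak does not depend on n
grid = [i / 100 for i in range(101)]  # p = 0.00, 0.01, ..., 1.00

# The p on the grid with the largest variance.
best_p = max(grid, key=lambda p: binomial_variance(n, p))

print(best_p, binomial_variance(n, best_p))
print(binomial_variance(n, 0.0), binomial_variance(n, 1.0))
```

The variance is symmetric about p = 0.5 and drops to zero at both ends of the grid, matching the intuition in the explanation.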
Question 4 True / False
The variance of a binomial distribution with n trials and success probability p equals np.
T. True
F. False
Answer: False
The variance is np(1−p), not np; np alone is the mean. Students often confuse these because both involve n and p. The product p(1−p) reflects that variance is reduced when outcomes are more predictable: as p approaches 1, the factor (1−p) shrinks toward 0, and as p approaches 0, the factor p does, so the variance vanishes at both extremes. The mean np grows with p (more likely successes means more expected successes), but the variance peaks at p = 0.5 and falls to zero at p = 0 and p = 1. A useful check: at p = 1, every trial succeeds with certainty, so the variance must be 0. The expression np gives n at p = 1, which cannot be the variance; np(1−p) gives 0, which is correct.
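The distinction can be verified directly from the PMF without relying on either closed-form expression. The sketch below computes the mean and variance as weighted sums over k. The helper names `binom_pmf` and `moments` and the test values n = 10, p = 0.3 are illustrative assumptions.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial distribution with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def moments(n, p):
    """Mean and variance computed directly from the PMF, not from np or np(1-p)."""
    pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, var

n, p = 10, 0.3  # illustrative values
mean, var = moments(n, p)
print(mean, n * p)            # mean agrees with np
print(var, n * p * (1 - p))   # variance agrees with np(1-p), not np

# Boundary check from the explanation: at p = 1 the variance must be 0.
print(moments(n, 1.0))
```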
Question 5 Short Answer
Why must trials be independent for the binomial distribution to apply, and what happens to the probability calculation if they are not?
Model answer: Independence is required because the binomial PMF is derived by multiplying probabilities across trials: the probability of a specific sequence with k successes is p^k × (1−p)^(n−k) only if each trial's outcome does not affect the others. If trials are dependent, the probability of each outcome changes based on previous outcomes, so you cannot multiply fixed values of p and (1−p); you need conditional probabilities that differ from trial to trial. The formula no longer applies, and using it anyway gives an incorrect probability. Sampling without replacement is the canonical case: each draw changes the composition of the population, so subsequent probabilities shift, and the hypergeometric distribution applies instead.
The mathematical derivation makes the requirement explicit: we write P(X = k) = C(n,k) × p^k × (1−p)^(n−k) by adding up C(n,k) sequences, each with probability p^k × (1−p)^(n−k). Assigning that probability to each sequence assumes independent events. When trials are dependent, the joint probability of a sequence is not the product of the marginal probabilities, so the per-sequence value p^k × (1−p)^(n−k) is wrong and the formula built on it fails. Always verify independence (and constant p) before applying the binomial.
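For the marble bag from Question 1, the conditional probabilities can be multiplied out one draw at a time. The sketch below does this with exact fractions. The function name `sequence_prob_without_replacement` and the 'R'/'B' encoding are assumptions made for this example, and it expects a sequence no longer than the number of marbles in the bag.

```python
from fractions import Fraction

def sequence_prob_without_replacement(seq, reds, blues):
    """Probability of drawing the exact color sequence `seq` ('R' or 'B'),
    multiplying the conditional probability of each draw given the earlier ones."""
    prob = Fraction(1)
    for color in seq:
        total = reds + blues
        if color == "R":
            prob *= Fraction(reds, total)
            reds -= 1   # one fewer red left in the bag
        else:
            prob *= Fraction(blues, total)
            blues -= 1  # one fewer blue left in the bag
    return prob

# Exact probability of red, red, blue from a bag of 5 red and 5 blue.
dependent = sequence_prob_without_replacement("RRB", 5, 5)

# What the binomial derivation would assign to the same sequence with p = 1/2.
independent = Fraction(1, 2) ** 2 * Fraction(1, 2) ** 1

print(dependent, independent)

# Adding all orderings with two reds gives P(exactly 2 reds) without replacement.
two_reds = sum(sequence_prob_without_replacement(s, 5, 5) for s in ("RRB", "RBR", "BRR"))
print(two_reds)
```

The per-sequence probability is 5/36 rather than 1/8, which is why plugging p = 0.5 into the binomial formula gives the wrong answer here. Summing the three orderings gives 5/12 for exactly two reds, the value the hypergeometric distribution assigns.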