A joint probability table shows P(X=1, Y=2) = 0.12 and P(Y=2) = 0.30. What is P(X=1 | Y=2)?
A. 0.036: multiply the joint by the marginal to condition
B. 0.40: divide the joint probability by the marginal of Y
C. 2.5: divide the marginal of Y by the joint probability
D. 0.12: the joint probability already represents the conditional
Answer: B
The definition of conditional probability is P(X=x | Y=y) = P(X=x, Y=y) / P(Y=y). Here that is 0.12 / 0.30 = 0.40. Option A uses the wrong operation (multiplication instead of division). Option C inverts the ratio and yields 2.5, which cannot be a probability because it exceeds 1. Option D confuses the joint probability with the conditional: the joint has not been normalized to the subpopulation where Y=2.
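As a minimal sketch of the arithmetic above, the following Python snippet applies the definition to the two numbers from the question and also evaluates the two distractor calculations. The function name `conditional_probability` and the variable names are illustrative assumptions, not part of the quiz.

```python
def conditional_probability(joint: float, marginal: float) -> float:
    """Return P(X=x | Y=y) = P(X=x, Y=y) / P(Y=y)."""
    return joint / marginal

# Values from the question: P(X=1, Y=2) = 0.12 and P(Y=2) = 0.30
p_joint = 0.12
p_marginal = 0.30

correct = conditional_probability(p_joint, p_marginal)  # option B
option_a = p_joint * p_marginal                         # wrong: multiplies
option_c = p_marginal / p_joint                         # wrong: inverted ratio

print(round(correct, 2))   # 0.4
print(option_c > 1)        # True, so option C cannot be a probability
```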
Question 2 Multiple Choice
You compute the conditional distribution of X given Y=y for three different values of y, and find that all three conditional distributions are identical. What can you conclude?
A. You made an error: if all conditionals are the same, the joint table must be uniform
B. X and Y are independent: knowing the value of Y provides no information about X's distribution
C. Y must be a constant random variable taking only one value
D. X must be a constant random variable
Answer: B
Independence means the conditional distribution of X given Y=y equals the marginal distribution of X for every value y with P(Y=y) > 0. Suppose the three values are all the values Y takes with positive probability, and the conditionals share a common distribution q. Then P(X=x) = Σ_y P(X=x | Y=y) P(Y=y) = q(x), so each conditional equals the marginal, which is exactly independence. Option A is wrong: the joint can assign unequal probabilities to cells while still having identical conditional distributions (multiply the common conditional by a different P(Y=y) for each slice). Options C and D are unjustified: neither variable need be constant.
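The point about option A can be checked on a small made-up table. The sketch below assumes a hypothetical joint distribution in which Y takes the three values "a", "b", "c" and X takes the values 0 and 1. The cell probabilities are unequal, yet every conditional distribution of X comes out the same and matches the marginal of X. All names and numbers here are illustrative.

```python
from fractions import Fraction as F

# Hypothetical joint table: joint[y][x] = P(X=x, Y=y). The cells are not uniform.
joint = {
    "a": {0: F(1, 8),  1: F(3, 8)},
    "b": {0: F(3, 40), 1: F(9, 40)},
    "c": {0: F(1, 20), 1: F(3, 20)},
}

def conditional_given_y(table, y):
    """Normalize the Y=y slice by its total, which is P(Y=y)."""
    p_y = sum(table[y].values())
    return {x: p / p_y for x, p in table[y].items()}

conditionals = [conditional_given_y(joint, y) for y in joint]
all_identical = all(c == conditionals[0] for c in conditionals)

# Marginal of X: sum each column over the values of Y.
marginal_x = {x: sum(joint[y][x] for y in joint) for x in (0, 1)}

print(all_identical)                  # True
print(marginal_x == conditionals[0])  # True
```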
Question 3 True / False
The conditional distribution P(X | Y = y) is always well-defined for any value y that Y can take.
T. True
F. False
Answer: False
The conditional distribution requires dividing by P(Y=y). If P(Y=y) = 0, this division is undefined. In the discrete case, the problem arises when conditioning on a value y that has zero probability. In the continuous case, every individual value has probability zero, which is why continuous conditional distributions are defined via the density ratio f(x,y)/f_Y(y), and even that ratio requires f_Y(y) > 0.
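For the discrete case, the zero-probability problem can be sketched as an explicit guard. The snippet below assumes a hypothetical joint table stored as a dictionary from (x, y) pairs to probabilities. The function name and the table are illustrative, not taken from the quiz.

```python
def conditional_distribution(joint, y):
    """Return P(X | Y=y) from a dict mapping (x, y) pairs to probabilities."""
    p_y = sum(p for (_, y_val), p in joint.items() if y_val == y)
    if p_y == 0:
        # Dividing by P(Y=y) = 0 is undefined, so refuse to condition.
        raise ValueError(f"P(Y={y}) = 0, so P(X | Y={y}) is undefined")
    return {x: p / p_y for (x, y_val), p in joint.items() if y_val == y}

# Hypothetical table in which Y=1 has zero probability.
joint = {(0, 0): 0.5, (1, 0): 0.5, (0, 1): 0.0, (1, 1): 0.0}

given_y0 = conditional_distribution(joint, 0)   # well-defined
try:
    conditional_distribution(joint, 1)
    undefined = False
except ValueError:
    undefined = True
```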
Question 4 True / False
If X and Y are independent random variables, then the conditional distribution of X given Y = y is identical to the marginal distribution of X.
T. True
F. False
Answer: True
Independence means P(X=x, Y=y) = P(X=x) × P(Y=y) for all x and y. For any y with P(Y=y) > 0, dividing both sides by P(Y=y) gives P(X=x | Y=y) = P(X=x). The conditional distribution collapses to the marginal: knowing Y tells you nothing new about X. Equivalently, every Y=y slice of the joint table looks the same after normalization.
Question 5 Short Answer
Why must you divide by the marginal probability P(Y=y) when computing the conditional distribution P(X | Y=y), and what does this normalization represent intuitively?
Model answer: Dividing by P(Y=y) rescales the joint probabilities so they sum to 1 within the subpopulation where Y=y. Intuitively, conditioning on Y=y means you zoom in to only those outcomes where Y took value y. Within that restricted population, the relative probabilities of different X values are given by the joint probabilities in that slice — but those slice probabilities don't sum to 1 on their own (they sum to P(Y=y)). Dividing by P(Y=y) restores the sum-to-1 property, making the result a valid probability distribution for X within the Y=y subpopulation.
This normalization is the same operation as in elementary conditional probability, P(A|B) = P(A∩B)/P(B): the denominator ensures the conditional measure is a proper probability. Without it, the slice values P(X=x, Y=y) summed over all x would total P(Y=y) instead of 1, so they would not form a valid probability distribution.
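The normalization can be illustrated numerically. The sketch below assumes a hypothetical Y=y slice of a joint table whose entries sum to 0.30. Only the 0.12 entry echoes Question 1; the other two cells are made up for illustration.

```python
import math

# Hypothetical slice of a joint table at Y=y: slice_y[x] = P(X=x, Y=y)
slice_y = {1: 0.12, 2: 0.03, 3: 0.15}

# The raw slice sums to the marginal P(Y=y), not to 1.
p_y = sum(slice_y.values())

# Dividing each entry by P(Y=y) yields a proper distribution over X.
conditional = {x: p / p_y for x, p in slice_y.items()}
total = sum(conditional.values())

print(math.isclose(p_y, 0.30))   # True
print(math.isclose(total, 1.0))  # True
```

The relative sizes of the entries are unchanged by the division; only their total is rescaled from P(Y=y) to 1.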