Questions: Conditional Distributions of Random Variables
5 questions to test your understanding
Question 1 Multiple Choice
The joint PDF of (X, Y) is f(x, y) = 2 for 0 < x < y < 1. A student wants the conditional distribution of Y given X = 0.3. Which approach is correct?
A. Use f(0.3, y) directly as the conditional PDF for all valid y
B. Compute f_{Y|X}(y | 0.3) = f(0.3, y) / f_X(0.3), normalizing the slice at X = 0.3 to integrate to 1
C. Restrict the marginal f_Y(y) to values near y = 0.3
D. The conditional distribution equals the joint because a conditional is just a restriction
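Answer: B
The conditional PDF is f_{Y|X}(y|x) = f(x,y) / f_X(x). You take the joint density evaluated at the fixed x-value (the 'slice'), then divide by the marginal f_X(x) to normalize it into a proper distribution that integrates to 1 over y. Here f_X(0.3) is the integral of 2 dy from y = 0.3 to y = 1, which is 1.4, so f_{Y|X}(y | 0.3) = 2 / 1.4 = 1/0.7 for 0.3 < y < 1: given X = 0.3, Y is uniform on (0.3, 1). Using f(0.3, y) unnormalized (option A) gives a function that integrates to 1.4 rather than 1; it is proportional to the conditional but not a valid PDF. The marginal f_Y(y) (option C) ignores the conditioning information entirely. Option D fails for the same reason as A: restriction without renormalization does not yield a probability distribution.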
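The slice-then-normalize recipe for Question 1 can be checked numerically. The sketch below is only an illustration: the helper names (`joint_pdf`, `integrate`, `conditional_pdf`) and the midpoint-rule integrator are assumptions introduced here, not part of the question.

```python
def joint_pdf(x, y):
    """Joint density from Question 1: 2 on the triangle 0 < x < y < 1, else 0."""
    return 2.0 if 0.0 < x < y < 1.0 else 0.0


def integrate(func, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width


x0 = 0.3

# Marginal f_X(x0): the total mass of the slice of the joint density at x = x0.
slice_mass = integrate(lambda y: joint_pdf(x0, y), 0.0, 1.0)


def conditional_pdf(y):
    """f_{Y|X}(y | x0): the slice divided by its own mass."""
    return joint_pdf(x0, y) / slice_mass


# The normalized slice should integrate to (approximately) 1.
conditional_mass = integrate(conditional_pdf, 0.0, 1.0)

print(f"mass of raw slice f(0.3, y):  {slice_mass:.4f}")
print(f"mass of conditional PDF:      {conditional_mass:.4f}")
print(f"conditional density at y=0.5: {conditional_pdf(0.5):.4f}")
```

The raw slice should carry a mass close to 1.4, while the normalized version should integrate to about 1 and take the constant value 1/0.7 on (0.3, 1).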
Question 2 Multiple Choice
X and Y are independent random variables. Which of the following is always true about their conditional distribution?
A. The conditional distribution f_{Y|X}(y|x) equals the joint distribution f(x,y)
B. The conditional distribution f_{Y|X}(y|x) equals the marginal f_Y(y) for every x
C. The conditional distribution f_{Y|X}(y|x) equals the marginal f_X(x)
D. Conditioning on X = x always reduces the variance of Y
Answer: B
Independence means knowing X provides zero information about Y. Formally, the joint factors as f(x,y) = f_X(x)·f_Y(y), so f_{Y|X}(y|x) = f(x,y)/f_X(x) = f_Y(y) wherever f_X(x) > 0. The conditional equals the marginal: conditioning changed nothing. In particular, conditioning leaves the variance of Y unchanged, so option D is false. This is the cleanest characterization of independence in terms of conditional distributions: two variables are independent if and only if every conditional distribution equals the corresponding marginal.
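A small numerical check of this factorization argument, as a sketch only. The two marginals used here (f_X(x) = 2x and f_Y(y) = 3y² on (0, 1)) are hypothetical choices made for illustration, as are the helper names.

```python
def f_x(x):
    """Hypothetical marginal of X: density 2x on (0, 1)."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0


def f_y(y):
    """Hypothetical marginal of Y: density 3y^2 on (0, 1)."""
    return 3.0 * y * y if 0.0 < y < 1.0 else 0.0


def joint(x, y):
    """Independence: the joint density is the product of the marginals."""
    return f_x(x) * f_y(y)


def integrate(func, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width


def conditional(y, x):
    """f_{Y|X}(y | x) computed from the joint, without using independence."""
    marginal_at_x = integrate(lambda t: joint(x, t), 0.0, 1.0)
    return joint(x, y) / marginal_at_x


# Largest difference between the conditional and the marginal of Y
# over a few test points; it should be close to zero.
max_gap = max(
    abs(conditional(y, x) - f_y(y))
    for x in (0.2, 0.5, 0.9)
    for y in (0.1, 0.4, 0.8)
)
print(f"largest |f_Y|X(y|x) - f_Y(y)| found: {max_gap:.2e}")
```

Whatever x is chosen, the computed conditional should agree with f_Y(y) up to the integration error.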
Question 3 True / False
The conditional PDF f_{Y|X}(y|x) must integrate to 1 over all y for each fixed value of x.
T. True
F. False
Answer: True
A conditional distribution is a proper probability distribution in its own right — it describes the behavior of Y in the restricted world where X = x is certain. Like any PDF, it must integrate to 1. The division by f_X(x) in the formula f(x,y)/f_X(x) is exactly what ensures this normalization: it converts the unnormalized joint 'slice' at x into a valid distribution. Without this normalization, you have a function proportional to the conditional but not a probability distribution.
Question 4 True / False
The conditional distribution of Y given X = x is simply the joint distribution restricted to the region where X is near x, with no need for renormalization.
T. True
F. False
Answer: False
Restriction alone does not produce a probability distribution. For fixed x, the 'slice' f(x,y) integrates over y to f_X(x), which in general is not 1; its total mass reflects how much probability density sits at that x-value. Dividing by f_X(x) renormalizes the slice into a proper distribution for the conditional world where X = x is certain. Skipping normalization gives a function with the right shape but the wrong total mass.
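The same point can be seen by simulation. The sketch below reuses the joint density from Question 1 (f(x, y) = 2 on 0 < x < y < 1) and keeps only samples whose X falls in a narrow window around 0.3. The window half-width, sample size, and seed are arbitrary assumptions. The kept samples are a small fraction of the total, which is the "wrong total mass" of the restriction; averaging over the kept count rather than the full sample size is the renormalization step.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
x0, half_width, n_samples = 0.3, 0.01, 200_000

kept_y = []
for _ in range(n_samples):
    u, v = rng.random(), rng.random()
    # (min, max) of two independent uniforms is uniform on the
    # triangle 0 < x < y < 1, i.e. it has joint density 2 there.
    x, y = min(u, v), max(u, v)
    if abs(x - x0) < half_width:  # restriction: keep samples with X near x0
        kept_y.append(y)

# Mass of the restricted region: roughly window width times f_X(0.3).
fraction_kept = len(kept_y) / n_samples

# Renormalization: divide by the number kept, not by n_samples.
mean_y_given_x = sum(kept_y) / len(kept_y)

print(f"fraction of samples kept:       {fraction_kept:.4f}")
print(f"estimated E[Y | X near 0.3]:    {mean_y_given_x:.4f}")
```

The kept fraction should land near 0.02 × 1.4 = 0.028, and the conditional mean should be close to 0.65, the midpoint of (0.3, 1).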
Question 5 Short Answer
Explain geometrically what the formula f_{Y|X}(y|x) = f(x,y) / f_X(x) is doing.
Model answer: The joint density f(x,y) defines a surface over the (x,y) plane. Fixing X = x means taking a vertical slice through this surface — you get a one-dimensional curve over y showing how the joint density behaves at that x. This curve is proportional to the conditional distribution of Y given X = x, but it doesn't integrate to 1 because the joint density is spread across all x values. Dividing by f_X(x) — the total 'mass' of that slice — rescales it into a proper PDF.
Geometrically, you are zooming into the cross-section of the joint density at x and renormalizing so that this cross-section represents a complete probability story for Y in the restricted world where X = x is given. The marginal f_X(x) is the area under that cross-section, obtained by integrating the joint density over y; it tells you 'how much' of the joint density lives at this x-value. Dividing by it removes that overall scale factor, leaving only the shape of Y's distribution given X = x.
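To see the scale factor and the shape separately, it helps to use a joint density whose cross-sections change with x. The sketch below assumes the illustrative density f(x, y) = x + y on the unit square, which is not taken from the questions above; the helper names are likewise hypothetical.

```python
def joint_pdf(x, y):
    """Illustrative joint density (an assumption): x + y on the unit square."""
    return x + y if 0.0 < x < 1.0 and 0.0 < y < 1.0 else 0.0


def integrate(func, lo, hi, steps=1_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width


def slice_area(x):
    """Area under the cross-section at x, i.e. the marginal f_X(x)."""
    return integrate(lambda y: joint_pdf(x, y), 0.0, 1.0)


def conditional_pdf(y, x):
    """Cross-section at x rescaled by its own area."""
    return joint_pdf(x, y) / slice_area(x)


for x in (0.1, 0.9):
    area = slice_area(x)
    rescaled = integrate(lambda y: conditional_pdf(y, x), 0.0, 1.0, steps=200)
    print(
        f"x = {x}: slice area = {area:.3f}, "
        f"area after rescaling = {rescaled:.3f}, "
        f"conditional density at y = 0.9: {conditional_pdf(0.9, x):.3f}"
    )
```

For this density the slice area is x + 1/2, so the two cross-sections carry different mass (about 0.6 and 1.4) before rescaling and unit mass afterwards, while the rescaled curves still differ in shape.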