A joint distribution has X and Y each taking values {1, 2}. Two students construct different joint distributions that both give X and Y the same marginals (X and Y each uniform on {1,2}). Student A makes X and Y independent; Student B makes X = Y with probability 1. What does this demonstrate?
A. The two students must have made an error, since identical marginals force the same joint distribution
B. Different joint distributions can share identical marginals: the marginals do not determine the joint, because they contain no information about the relationship between the variables
C. The marginals are incorrect, because X = Y with probability 1 contradicts a uniform marginal for X
D. This situation is impossible; joint distributions with the same marginals must be identical
Option B is correct: both constructions are valid. In Student A's version, P(X=1,Y=1) = P(X=1,Y=2) = P(X=2,Y=1) = P(X=2,Y=2) = 1/4. In Student B's version, P(X=1,Y=1) = P(X=2,Y=2) = 1/2 and the cross terms vanish: P(X=1,Y=2) = P(X=2,Y=1) = 0. Both have the same marginals: P(X=1) = P(X=2) = P(Y=1) = P(Y=2) = 1/2. Yet one has perfectly correlated variables and the other has independent variables. The joint distribution contains strictly more information than the two marginals taken together.
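The two constructions can be checked mechanically. The sketch below is a minimal illustration in plain Python; the names joint_a, joint_b and marginals are hypothetical, chosen only for this example. It stores each joint as a table keyed by (x, y) and sums out one variable to obtain each marginal.

```python
from fractions import Fraction

half, quarter = Fraction(1, 2), Fraction(1, 4)

# Joint tables keyed by (x, y); the names are illustrative only.
joint_a = {(1, 1): quarter, (1, 2): quarter, (2, 1): quarter, (2, 2): quarter}  # Student A: independent
joint_b = {(1, 1): half, (1, 2): Fraction(0), (2, 1): Fraction(0), (2, 2): half}  # Student B: X = Y


def marginals(joint):
    """Return (P(X=x), P(Y=y)) dictionaries by summing out the other variable."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0) + p
        p_y[y] = p_y.get(y, 0) + p
    return p_x, p_y


print(marginals(joint_a) == marginals(joint_b))  # True: identical marginals
print(joint_a == joint_b)                        # False: different joints
```

Exact fractions are used instead of floats so that the equality comparisons are not affected by rounding.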
Question 2 Multiple Choice
Given a continuous joint density f(x, y), how do you compute the marginal density f_X(x)?
A. Set y = 0 and evaluate: f_X(x) = f(x, 0)
B. Divide by the marginal of Y: f_X(x) = f(x, y) / f_Y(y)
C. Integrate over all values of y: f_X(x) = ∫ f(x, y) dy
D. Average over the range of x: f_X(x) = (1/(b−a)) ∫ₐᵇ f(x, y) dx
Option C is correct. Marginalization works by 'integrating out' the variable you don't care about: fix x and integrate f(x, y) over all possible y values. No extra weighting is needed, because f(x, y) already encodes how likely each (x, y) pair is. The result is a function of x alone that accumulates all the probability density at x regardless of what y does. Setting y = 0 (option A) gives the density only along the x-axis, not the marginal. Option B gives the conditional density of X given Y = y, which still depends on y. Option D integrates over the wrong variable, so the result is a function of y rather than x.
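To make the integration concrete, here is a small numerical sketch. The density f(x, y) = x + y on the unit square is an assumed example, not one taken from the question, and the helper names density and marginal_x are hypothetical. For this density the exact marginal is f_X(x) = x + 1/2, which the midpoint rule below should approximate closely.

```python
def density(x, y):
    """Assumed example joint density: f(x, y) = x + y on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def marginal_x(x, y_low=0.0, y_high=1.0, steps=1000):
    """Approximate f_X(x) = integral of f(x, y) dy using the midpoint rule."""
    width = (y_high - y_low) / steps
    total = 0.0
    for i in range(steps):
        y_mid = y_low + (i + 0.5) * width
        total += density(x, y_mid)
    return total * width


# Integrating out y gives roughly x + 1/2; plugging in y = 0 gives just x.
print(marginal_x(0.3), density(0.3, 0.0))
```

The second printed value is what "set y = 0" would produce for this density, which differs from the marginal.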
Question 3 True / False
If you know the marginal distributions of X and Y separately, you cannot in general reconstruct their joint distribution.
T. True
F. False
Answer: True
The joint distribution encodes the full relationship between X and Y, including any dependencies. The marginals discard this relational information, keeping only each variable's individual behavior. Knowing that X is uniform on [0,1] and Y is uniform on [0,1] is consistent with X and Y being independent, perfectly correlated, negatively correlated, or related in any number of other ways. Recovery of the joint is only possible if you additionally know the dependency structure — for example, that X and Y are independent, which allows the factorization P(X,Y) = P(X)·P(Y).
Question 4 True / False
Two random variables are independent if and only if their joint distribution equals the product of their marginal distributions at every point.
T. True
F. False
Answer: True
This is the formal definition of independence for random variables. Independence means that knowing the value of X gives no information about Y (and vice versa). Mathematically, this is equivalent to the factorization P(X=x, Y=y) = P(X=x)·P(Y=y) for all x and y (discrete case), or f(x,y) = f_X(x)·f_Y(y) for all x and y (continuous case). When this factorization fails, the variables are dependent, and the joint contains information not present in either marginal.
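For a discrete table, the factorization condition can be tested cell by cell. The following is a minimal sketch; the function name is_independent and the two example tables are illustrative assumptions, with exact fractions so that the equality test is reliable.

```python
from fractions import Fraction


def is_independent(joint):
    """Check P(X=x, Y=y) == P(X=x) * P(Y=y) for every (x, y) cell of the table."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    # Marginals: sum out the other variable.
    p_x = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(joint.get((x, y), 0) == p_x[x] * p_y[y] for x in xs for y in ys)


q, h = Fraction(1, 4), Fraction(1, 2)
uniform = {(1, 1): q, (1, 2): q, (2, 1): q, (2, 2): q}
diagonal = {(1, 1): h, (2, 2): h}  # missing cells are treated as probability 0

print(is_independent(uniform))   # True
print(is_independent(diagonal))  # False
```

A single cell where the product rule fails is enough to make the variables dependent, which is why the check uses all(...) over every cell.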
Question 5 Short Answer
Why does knowing both marginal distributions of X and Y not tell you whether X and Y are positively correlated, negatively correlated, or independent?
Model answer: The marginal distributions describe each variable in isolation by summing or integrating out the other. This process discards the information about how X and Y co-vary. Correlation and dependence live in the joint distribution, specifically in whether the probability of an (x, y) pair differs from the product of the individual probabilities. Two completely different joint distributions can produce identical marginals, so the marginals alone are in general insufficient to determine the relationship between the variables.
A helpful analogy: the marginal of X is like the row-sums of a probability table, and the marginal of Y is the column-sums. Many different tables can have the same row-sums and column-sums. The structure *inside* the table — which cells have high or low probability — is exactly what encodes correlation and dependence. The marginals see only the totals, not the interior.
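The table analogy can be sketched with a one-parameter family of 2x2 tables on {1, 2} x {1, 2}. The parameter t and the helper names table and covariance are assumptions introduced for illustration. Every table in the family has the same row sums and column sums of 1/2, while the covariance between X and Y changes with t.

```python
from fractions import Fraction


def table(t):
    """2x2 joint on {1, 2} x {1, 2} with uniform marginals; t must lie in [-1/4, 1/4]."""
    q = Fraction(1, 4)
    return {(1, 1): q + t, (1, 2): q - t, (2, 1): q - t, (2, 2): q + t}


def covariance(joint):
    """Cov(X, Y) = E[XY] - E[X] E[Y], computed directly from the table."""
    e_x = sum(x * p for (x, _), p in joint.items())
    e_y = sum(y * p for (_, y), p in joint.items())
    e_xy = sum(x * y * p for (x, y), p in joint.items())
    return e_xy - e_x * e_y


for t in (Fraction(-1, 4), Fraction(0), Fraction(1, 4)):
    joint = table(t)
    row_sums = [joint[(x, 1)] + joint[(x, 2)] for x in (1, 2)]
    col_sums = [joint[(1, y)] + joint[(2, y)] for y in (1, 2)]
    print(row_sums, col_sums, covariance(joint))
```

In this family the covariance works out to t itself, so t = 0 recovers the independent table and t = 1/4 recovers the X = Y table, all with identical totals.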