You are given the marginal distributions of X and Y completely. What can you determine about their joint distribution p(x,y)?
A. The joint is fully determined; the marginals contain all information about the pair
B. The joint can be recovered by multiplying the two marginals together
C. The joint cannot be fully determined; many different joints share the same marginals
D. The joint is determined only when X and Y take the same set of values
Marginals tell you what each variable does in isolation — they are the 'shadows' of the joint, obtained by summing out the other variable. But the same marginals are consistent with many different joint distributions: X and Y could be positively correlated, negatively correlated, or independent, all while sharing identical marginals. Option B describes exactly the independence case — if X and Y happen to be independent, then the joint does equal the product of marginals. But assuming independence when it hasn't been established is the key error to avoid.
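As a rough illustration of the explanation above, the sketch below builds a one-parameter family of joints on {0,1}×{0,1} that all have uniform marginals but whose covariance changes sign. The helper names and the parameter `e` are made up for this example; they are not part of the quiz.

```python
from fractions import Fraction

def joint_with_uniform_marginals(e):
    """Hypothetical helper: 2x2 joint, rows index x, columns index y.

    Any e in [-1/4, 1/4] gives a valid joint whose marginals are both uniform.
    """
    q = Fraction(1, 4)
    return [[q + e, q - e],
            [q - e, q + e]]

def marginals(joint):
    p_x = [sum(row) for row in joint]        # sum out y
    p_y = [sum(col) for col in zip(*joint)]  # sum out x
    return p_x, p_y

def covariance(joint):
    # X and Y take the values 0 and 1, so the indices double as values.
    p_x, p_y = marginals(joint)
    mean_x = sum(x * p for x, p in enumerate(p_x))
    mean_y = sum(y * p for y, p in enumerate(p_y))
    mean_xy = sum(x * y * p
                  for x, row in enumerate(joint)
                  for y, p in enumerate(row))
    return mean_xy - mean_x * mean_y

# Negative dependence, independence, positive dependence: same marginals each time.
for e in (Fraction(-1, 8), Fraction(0), Fraction(1, 8)):
    joint = joint_with_uniform_marginals(e)
    print(e, marginals(joint), covariance(joint))
```

In this family the covariance works out to `e` itself, so the marginals stay fixed while the dependence varies freely.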
Question 2 Multiple Choice
For discrete X and Y, you verify that p(x,y) = p_X(x)·p_Y(y) holds for 95% of the pairs (x,y) in the support. What can you conclude?
A. X and Y are approximately independent
B. X and Y are independent for practical purposes
C. X and Y are not necessarily independent; independence requires the factoring to hold for all pairs
D. X and Y are independent if the 5% of failing pairs have small probability
Independence is an all-or-nothing condition: p(x,y) = p_X(x)·p_Y(y) must hold for every pair (x,y) without exception. A single violating pair means the variables are dependent. There is no such thing as 'almost independent' in the formal sense — a distribution either factors exactly or it does not. Options A, B, and D reflect the intuition that 'close enough' should count, but in probability theory, dependence is determined by the structure of the full joint, not by a majority of pairs.
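The all-pairs requirement can be sketched as a check that collects every pair where the factorization fails. The 3×3 table below is an invented example (not from the quiz): it starts from an independent uniform joint and perturbs four cells so that the marginals stay uniform while the factorization breaks. Exact `Fraction` arithmetic is used so that equality tests are meaningful; with floats a tolerance would be needed.

```python
from fractions import Fraction

def marginals(joint):
    p_x = [sum(row) for row in joint]        # sum out y
    p_y = [sum(col) for col in zip(*joint)]  # sum out x
    return p_x, p_y

def failing_pairs(joint):
    """Return every (x, y) index pair where p(x,y) != p_X(x) * p_Y(y)."""
    p_x, p_y = marginals(joint)
    return [(x, y)
            for x, row in enumerate(joint)
            for y, p in enumerate(row)
            if p != p_x[x] * p_y[y]]

def is_independent(joint):
    # Independence holds only if no pair fails.
    return not failing_pairs(joint)

n = Fraction(1, 9)
e = Fraction(1, 18)
# Invented joint: five of the nine cells still equal the product of marginals.
joint = [[n + e, n - e, n],
         [n - e, n + e, n],
         [n,     n,     n]]

print(failing_pairs(joint))
print(is_independent(joint))
```

Even though a majority of cells factor correctly here, the function reports dependence, which matches the reasoning behind option C.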
Question 3 True / False
Summing the joint PMF p(x,y) over all values of y yields the marginal PMF p_X(x).
T. True
F. False
Answer: True
This is the defining operation for computing marginals from a joint distribution. By summing out y, you ask: regardless of what Y is doing, what is the probability that X = x? The result — p_X(x) = Σ_y p(x,y) — is the marginal distribution of X. For continuous variables, the analogous operation is integration: f_X(x) = ∫ f(x,y) dy. The marginal is literally the 'margin' of the joint table — what you'd get by collapsing the table into a single column.
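A minimal sketch of summing out a variable, assuming the joint PMF is stored as a dictionary keyed by (x, y). The probabilities are made-up illustration values and the function name `marginal` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Illustrative joint PMF (made-up numbers), keyed by (x, y).
joint = {
    (0, 0): Fraction(3, 10), (0, 1): Fraction(2, 10), (0, 2): Fraction(1, 10),
    (1, 0): Fraction(1, 10), (1, 1): Fraction(1, 10), (1, 2): Fraction(2, 10),
}

def marginal(joint, axis):
    """Sum the joint over the other variable; axis=0 keeps x, axis=1 keeps y."""
    result = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for pair, p in joint.items():
        result[pair[axis]] += p
    return dict(result)

p_x = marginal(joint, 0)  # p_X(x) = sum over y of p(x, y)
p_y = marginal(joint, 1)  # p_Y(y) = sum over x of p(x, y)
print(p_x)
print(p_y)
```

Each marginal sums to 1, as any PMF must, because the joint itself sums to 1.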
Question 4 True / False
If two pairs of random variables (X, Y) have identical marginal distributions, they must have the same joint distribution.
T. True
F. False
Answer: False
This is the core misconception about marginals. Marginals describe each variable individually; the joint describes how they interact. Two entirely different dependency structures can produce identical marginals. For example, if X and Y are both uniform on {0,1}, you could have: (a) an independent joint where p(0,0)=p(0,1)=p(1,0)=p(1,1)=0.25, or (b) a perfectly correlated joint where p(0,0)=p(1,1)=0.5 and p(0,1)=p(1,0)=0. Both have uniform marginals, but completely different joints.
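The two joints (a) and (b) from the explanation can be checked directly. This is only a sketch: the variable and function names are chosen for illustration, and rows index x while columns index y.

```python
def marginals(joint):
    p_x = [sum(row) for row in joint]        # sum out y
    p_y = [sum(col) for col in zip(*joint)]  # sum out x
    return p_x, p_y

def product_of_marginals(joint):
    """The joint that X and Y would have if they were independent."""
    p_x, p_y = marginals(joint)
    return [[px * py for py in p_y] for px in p_x]

# (a) independent joint and (b) perfectly correlated joint from the text.
independent = [[0.25, 0.25],
               [0.25, 0.25]]
correlated = [[0.5, 0.0],
              [0.0, 0.5]]

print(marginals(independent) == marginals(correlated))
print(product_of_marginals(independent) == independent)
print(product_of_marginals(correlated) == correlated)
```

The probabilities used here are exactly representable as floats, so direct equality is safe in this case; for general values a tolerance or exact fractions would be the more careful choice.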
Question 5 Short Answer
Explain why you cannot reconstruct a joint distribution from its marginals alone. What does the joint tell you that the marginals do not?
Model answer: The marginals tell you the individual behavior of each variable separately, but say nothing about how the variables relate to each other. The joint distribution captures the dependency structure: whether high values of X tend to coincide with high or low values of Y, and how strongly. This relationship information is entirely absent from the marginals. To reconstruct the joint, you would need additional information such as the conditional distribution of Y given X, since p(x,y) = p_X(x)·p(y|x). Summary statistics such as the covariance are generally not enough on their own; they pin down the joint only within special families such as the bivariate normal.
The gap between 'marginals' and 'joint' is precisely the concept of statistical dependence. If X and Y are known to be independent, the joint is completely determined by the marginals (it is their product). Without that assumption, knowing each variable individually does not tell you how they interact. This is why joint distributions are the fundamental object in multivariate probability: all correlation, regression, and conditional reasoning stems from the additional information the joint contains beyond what the marginals show.
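To make the role of the extra information concrete, the sketch below rebuilds a joint from the marginal of X together with a conditional distribution of Y given X, using p(x,y) = p_X(x)·p(y|x). Both conditional tables are invented for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

half = Fraction(1, 2)
p_x = [half, half]  # marginal of X on {0, 1}

# Two invented conditionals p(y | x); row x holds the distribution of Y given X = x.
cond_ignores_x = [[half, half],
                  [half, half]]
cond_tracks_x = [[Fraction(3, 4), Fraction(1, 4)],
                 [Fraction(1, 4), Fraction(3, 4)]]

def joint_from_conditional(p_x, p_y_given_x):
    """p(x, y) = p_X(x) * p(y | x)."""
    return [[p_x[x] * p for p in row] for x, row in enumerate(p_y_given_x)]

def marginal_y(joint):
    return [sum(col) for col in zip(*joint)]  # sum out x

joint_a = joint_from_conditional(p_x, cond_ignores_x)
joint_b = joint_from_conditional(p_x, cond_tracks_x)

print(joint_a, marginal_y(joint_a))
print(joint_b, marginal_y(joint_b))
```

Both reconstructions have the same marginals for X and Y, yet the joints differ, so it is the conditional that carries the dependence information.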