Conditional probability P(A|B) is the probability of event A given that event B has occurred. It is defined as P(A|B) = P(A ∩ B) / P(B) when P(B) > 0. Conditioning on new information updates the sample space to only those outcomes where the conditioning event occurred, rescaling probabilities accordingly.
From the probability axioms, you know that every event has a probability between 0 and 1 and that the probabilities of all outcomes in the sample space sum to 1. Conditional probability extends this framework: when new information arrives, it eliminates outcomes that are no longer possible, and the probabilities of the remaining outcomes must be rescaled so they still sum to 1.
The formal definition is P(A|B) = P(A ∩ B) / P(B). To see why this makes sense, imagine rolling a fair six-sided die. The full sample space is {1, 2, 3, 4, 5, 6}. If you learn the result is even (event B = {2, 4, 6}), the outcomes 1, 3, 5 are impossible, so your effective sample space shrinks to {2, 4, 6}. Now, what is the probability the result exceeds 4 (event A = {5, 6})? Among the even outcomes only 6 qualifies, so P(A|B) = 1/3. Verify with the formula: P(A ∩ B) = P({6}) = 1/6 and P(B) = 3/6 = 1/2, so P(A|B) = (1/6)/(1/2) = 1/3. The formula performs exactly the "shrink then rescale" operation: the numerator keeps only the part of A that lies inside B, and dividing by P(B) rescales so that the probabilities within B sum to 1.
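The die calculation above can be reproduced by direct enumeration. This is a minimal sketch that assumes equally likely outcomes; the helper names `prob` and `conditional` are chosen here for illustration and do not come from any library.

```python
from fractions import Fraction


def prob(event, sample_space):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


def conditional(a, b, sample_space):
    """P(A|B) = P(A ∩ B) / P(B); undefined when P(B) = 0."""
    p_b = prob(b, sample_space)
    if p_b == 0:
        raise ValueError("P(B) must be positive to condition on B")
    return prob(a & b, sample_space) / p_b


omega = {1, 2, 3, 4, 5, 6}   # fair six-sided die
even = {2, 4, 6}             # event B
above_four = {5, 6}          # event A

print(conditional(above_four, even, omega))  # 1/3
```

Using `Fraction` keeps the arithmetic exact, so the result matches the hand calculation (1/6)/(1/2) = 1/3 without floating-point rounding.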
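A critical misconception to avoid: P(A|B) is generally not equal to P(B|A). Consider a diagnostic test: P(positive test | disease) might be 0.95, meaning the test is sensitive. But P(disease | positive test), the probability you actually have the disease given a positive result, depends on how rare the disease is in the population and on how often the test flags healthy people. Suppose only 1 in 1000 people have the disease and, as an illustrative assumption, the test gives a false positive for 5% of healthy people. Among 100,000 people, about 95 of the 100 who are sick test positive, but so do about 4,995 of the 99,900 who are healthy, so P(disease | positive test) ≈ 95/5090 ≈ 0.019. Most positives are false alarms even though the test is 95% sensitive. This asymmetry is counterintuitive enough to surprise even trained professionals, and relating the two conditional probabilities is exactly the job of Bayes' theorem, which you will encounter next.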
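The base-rate effect in the diagnostic example can be checked numerically. The sketch below applies the definition P(disease | positive) = P(disease ∩ positive) / P(positive), splitting P(positive) into true and false positives. The 5% false-positive rate is an assumed figure for illustration, not a property of any particular test, and the function name `posterior` is hypothetical.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive) from the definition of conditional probability."""
    # P(disease and positive)
    true_pos = prior * sensitivity
    # P(no disease and positive)
    false_pos = (1 - prior) * false_positive_rate
    # P(positive) is the sum of the two disjoint cases
    return true_pos / (true_pos + false_pos)


# 1 in 1000 prevalence, 95% sensitivity, assumed 5% false-positive rate
p = posterior(prior=0.001, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(disease | positive) = {p:.4f}")  # about 0.0187
```

Under these assumptions, fewer than 2 in 100 positive results correspond to an actual case, even though P(positive | disease) is 0.95. Raising `prior` or lowering `false_positive_rate` shows how strongly the answer depends on both.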
Conditional probability also gives a formal definition of independence. Two events A and B are independent exactly when P(A ∩ B) = P(A)P(B). When P(B) > 0, this is equivalent to P(A|B) = P(A): knowing B occurred gives no information about A. You may have encountered independence informally before; conditional probability is what makes the definition precise. When conditioning on B has no effect on A's probability, the events are unrelated in a probabilistic sense.
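The product test for independence can be tried on the fair-die events. This is a small sketch assuming equally likely outcomes; the helper names `prob` and `independent` and the event `above_three` are introduced here for illustration.

```python
from fractions import Fraction


def prob(event, sample_space):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


def independent(a, b, sample_space):
    """Product test: A and B are independent iff P(A ∩ B) = P(A) * P(B)."""
    return prob(a & b, sample_space) == prob(a, sample_space) * prob(b, sample_space)


omega = {1, 2, 3, 4, 5, 6}   # fair six-sided die
even = {2, 4, 6}
above_four = {5, 6}
above_three = {4, 5, 6}

print(independent(above_four, even, omega))   # True:  1/6 == 1/3 * 1/2
print(independent(above_three, even, omega))  # False: 1/3 != 1/2 * 1/2
```

Note that "exceeds 4" and "even" turn out to be independent: learning the roll is even leaves the probability of exceeding 4 at 1/3. By contrast, "exceeds 3" and "even" are dependent, since conditioning on an even roll raises that probability from 1/2 to 2/3.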