A statistical ensemble is a probability distribution over all microstates consistent with given macroscopic constraints. Different constraints yield different ensembles. The fundamental postulate of statistical mechanics states that for an isolated system in equilibrium, all microstates consistent with the constraints are equally probable; this defines the microcanonical ensemble and underpins the use of ensemble averages to compute macroscopic properties.
From kinetic theory and your study of entropy, you know that macroscopic systems consist of enormous numbers of particles whose exact microscopic state is unknowable in practice and, for macroscopic purposes, irrelevant. Statistical mechanics begins by acknowledging this ignorance explicitly. A microstate specifies the complete microscopic configuration: every particle's position and momentum in classical mechanics, or the full quantum state of the system in quantum mechanics. A macrostate specifies only the few measurable quantities we care about: total energy E, volume V, particle number N. For any macrostate of a macroscopic system, there is an astronomically large number of compatible microstates.
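To make the counting concrete, here is a minimal Python sketch for a toy system that is an assumption of this example rather than something taken from the text: N two-state spins, with the macrostate given only by the number of up spins. The function name `multiplicity` is illustrative.

```python
import math

def multiplicity(n_spins: int, n_up: int) -> int:
    """Number of microstates (spin arrangements) with exactly n_up spins up."""
    return math.comb(n_spins, n_up)

# Toy assumption: 100 spins, macrostate "half of them up".
# The count is already of order 1e29, far too many to enumerate.
print(f"{multiplicity(100, 50):.3e}")

# Summing over all macrostates recovers the total number of microstates, 2**N.
total = sum(multiplicity(10, k) for k in range(11))
print(total == 2**10)
```

Even this tiny system shows why we describe a macrostate statistically instead of tracking individual microstates.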
A statistical ensemble is the conceptual tool for handling this: imagine making a huge number of copies of your system, all prepared with the same macroscopic constraints but distributed over all compatible microstates. The ensemble assigns a probability to each microstate. Macroscopic observables are computed as ensemble averages — expectation values over this probability distribution. The choice of ensemble depends on what constraints you impose: which quantities are fixed (E, V, N, T, μ, P) and which can fluctuate. This is not a matter of taste; it reflects the actual physical situation.
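An ensemble average is just a probability-weighted sum over microstates. The sketch below illustrates this for a hypothetical ensemble of three spins with every microstate equally likely; the helper `ensemble_average` and the choice of observables are assumptions made for illustration.

```python
from itertools import product

def ensemble_average(observable, microstates, probabilities):
    """Expectation value: sum over microstates of P(state) * observable(state)."""
    return sum(p * observable(s) for s, p in zip(microstates, probabilities))

# Toy ensemble (an assumption for illustration): 3 spins, each +1 or -1,
# with all 2**3 microstates given the same probability.
states = list(product([+1, -1], repeat=3))
uniform = [1 / len(states)] * len(states)

# Observable 1: total magnetization of a microstate (sum of its spins).
mean_magnetization = ensemble_average(sum, states, uniform)

# Observable 2: squared magnetization, which measures the spread around the mean.
mean_square = ensemble_average(lambda s: sum(s) ** 2, states, uniform)

print(mean_magnetization, mean_square)
```

Swapping `uniform` for a different list of probabilities changes the ensemble without changing how the average is computed.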
The three fundamental ensembles correspond to three physical situations. The microcanonical ensemble describes an isolated system with fixed E, V, N. The fundamental postulate gives equal probability to every compatible microstate, and the entropy is S = k ln Ω, where Ω is the number of compatible microstates and k is Boltzmann's constant. The canonical ensemble describes a system in thermal contact with a heat bath at temperature T: E can fluctuate, but V and N are fixed. The bath enforces a Boltzmann distribution over microstates: P_i ∝ e^{−E_i/kT}. The grand canonical ensemble allows both energy and particle exchange with a reservoir at temperature T and chemical potential μ, with V fixed; the weights become P_i ∝ e^{−(E_i − μN_i)/kT}, where N_i is the particle number in microstate i. Each ensemble is the right tool for a different experimental setup.
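The Boltzmann distribution of the canonical ensemble can be sketched in a few lines of Python. The three energy levels and the two temperatures below are hypothetical values chosen for illustration, in units where kT is a plain number; the function name is likewise an assumption.

```python
import math

def boltzmann_probabilities(energies, kT):
    """Canonical probabilities P_i = exp(-E_i / kT) / Z for discrete levels."""
    # Shifting by the lowest energy avoids overflow; the shift cancels in the ratio.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)  # partition function of the shifted levels
    return [w / z for w in weights]

levels = [0.0, 1.0, 2.0]  # hypothetical energy levels

cold = boltzmann_probabilities(levels, kT=0.1)    # kT much smaller than the level spacing
hot = boltzmann_probabilities(levels, kT=100.0)   # kT much larger than the level spacing

print(cold)  # nearly all probability sits in the lowest level
print(hot)   # the three levels are almost equally populated
```

The high-temperature limit approaches the equal weighting of the microcanonical picture, while the low-temperature limit concentrates the system in its ground state.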
A key insight is that all three ensembles give identical predictions for bulk thermodynamic quantities in the thermodynamic limit (N → ∞), at least for ordinary systems with short-range interactions: in this sense they are equivalent. The fluctuations in E in the canonical ensemble are of order 1/√N relative to the mean, which is negligible for N ~ 10²³. For macroscopic averages you are therefore free to work in whichever ensemble is most convenient mathematically, even if it does not match the physical setup; the choice matters again for small systems and for the fluctuations themselves. The canonical ensemble's partition function Z = Σ_i e^{−βE_i}, with β = 1/kT and the sum running over microstates i, is typically the easiest starting point because it encodes all thermodynamic information: the free energy is F = −kT ln Z, and the other thermodynamic quantities follow by differentiation, for example ⟨E⟩ = −∂ ln Z/∂β. Building intuition for which ensemble to deploy and how to extract thermodynamics from partition functions is the core skill of statistical mechanics.
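As a rough numerical check of these statements, the sketch below uses a model that is an assumption of this example: N independent two-level units with energies 0 and eps, in units where k = 1 so that β = 1/T. It computes F = −kT ln Z, obtains the mean energy as −∂ ln Z/∂β by finite differences, and shows the 1/√N scaling of the relative energy fluctuation. All function names are illustrative.

```python
import math

def log_partition(beta, eps, n):
    """ln Z for n independent two-level units: Z = (1 + exp(-beta*eps))**n."""
    return n * math.log1p(math.exp(-beta * eps))

def free_energy(beta, eps, n):
    """F = -kT ln Z, with kT = 1/beta (units where k = 1)."""
    return -log_partition(beta, eps, n) / beta

def mean_energy(beta, eps, n, h=1e-5):
    """<E> = -d(ln Z)/d(beta), estimated by a central finite difference."""
    up = log_partition(beta + h, eps, n)
    down = log_partition(beta - h, eps, n)
    return -(up - down) / (2 * h)

def relative_energy_fluctuation(beta, eps, n):
    """Standard deviation of E divided by its mean, from the exact two-level result."""
    p = math.exp(-beta * eps) / (1 + math.exp(-beta * eps))  # excited-state probability
    mean = n * eps * p
    variance = n * eps**2 * p * (1 - p)  # independent units: variances add
    return math.sqrt(variance) / mean

# Each factor of 100 in N should shrink the relative fluctuation by a factor of 10.
for n in (100, 10_000, 1_000_000):
    print(n, relative_energy_fluctuation(beta=1.0, eps=1.0, n=n))
```

For this model the mean energy from differentiating ln Z agrees with the direct result N·eps·p, which is a small instance of thermodynamics following from the partition function.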