Multiple regression extends OLS to include several explanatory variables: y = β₀ + β₁x₁ + β₂x₂ + … + βₖxₖ + u. Each coefficient βⱼ represents the partial effect of xⱼ on y holding all other regressors constant; this 'ceteris paribus' interpretation is the central analytical payoff. In matrix form, the estimator is β̂ = (X'X)⁻¹X'y, which requires (X'X) to be invertible (no perfect multicollinearity). Adding a control variable changes the estimated coefficient on an included regressor only when, in the sample, the control is correlated with that regressor and also has a nonzero estimated partial effect on the dependent variable.
Compare simple and multiple regression estimates on the same dataset: seeing how the coefficient on education in a wage regression changes when experience is added illustrates what 'holding constant' means in practice.
You already know bivariate regression: a single explanatory variable x₁ predicts y via ŷ = β̂₀ + β̂₁x₁, with OLS minimizing the sum of squared residuals. Multiple regression extends this to k explanatory variables, y = β₀ + β₁x₁ + β₂x₂ + … + βₖxₖ + u, and the conceptual payoff is enormous. Including additional regressors allows each coefficient to represent a partial effect: β₁ is the change in the expected value of y for a one-unit increase in x₁ *holding all other regressors constant*, and β̂₁ is its OLS estimate. This "ceteris paribus" interpretation is what lets economists isolate the effect of one variable from the confounding influence of others.
The wage-education example makes the logic concrete. A bivariate regression of wages on education gives a coefficient that captures not just education's own effect but also the wage effects of other determinants that are correlated with education (like experience or family background). When you add experience to the model, the education coefficient changes, and that change is informative. It tells you that part of the original estimate was actually attributable to the correlation between education and experience. The new coefficient estimates the effect of education when comparing workers with the same years of experience.
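The comparison described above can be sketched on simulated data. This is an illustration only: the data-generating process, the variable names (`educ`, `exper`, `wage`), the helper `ols`, and every number are assumptions made up for the example, not estimates from real wage data. The sketch also checks a standard in-sample identity: the simple-regression coefficient on education equals the multiple-regression coefficient plus the experience coefficient times the slope from regressing experience on education.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical data-generating process (all numbers are invented for illustration).
educ = rng.normal(13, 2, n)
# Assumption: more schooling means fewer years of work, so educ and exper
# are negatively correlated.
exper = 30 - 1.2 * educ + rng.normal(0, 3, n)
wage = 1.0 + 0.8 * educ + 0.3 * exper + rng.normal(0, 2, n)


def ols(y, *regressors):
    """Return OLS coefficients with the intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


b_simple = ols(wage, educ)         # wage on educ only
b_multi = ols(wage, educ, exper)   # wage on educ and exper
delta = ols(exper, educ)           # auxiliary regression: exper on educ

print("simple regression, educ coefficient:  ", b_simple[1])
print("multiple regression, educ coefficient:", b_multi[1])
print("part attributable to exper:           ", b_multi[2] * delta[1])
```

Under these assumptions the simple-regression coefficient should land well below the multiple-regression one, because it absorbs the wage effect of the experience that more-educated workers in this simulation tend to lack.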
In matrix notation, the OLS estimator is β̂ = (X'X)⁻¹X'y, where X is the n × (k+1) matrix of regressors including the constant column, and y is the n × 1 outcome vector. This formula generalizes the bivariate formula and makes the required conditions explicit: (X'X) must be invertible, which fails under perfect multicollinearity. You have seen matrix inverses in your prerequisites; here the condition det(X'X) ≠ 0 is the non-redundancy requirement — no regressor can be an exact linear combination of the others.
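A minimal sketch of the matrix formula, using simulated data with invented coefficients. The names `x1`, `x2`, `x3`, and `X_bad` are hypothetical. The code solves the normal equations (X'X)b = X'y directly rather than forming the inverse, which is the usual numerical practice, and then shows that appending a regressor that is an exact linear combination of the others leaves the column rank unchanged, so (X'X) is no longer invertible.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Simulated regressors and outcome (coefficients chosen arbitrarily).
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 0.5 * x2 + rng.normal(size=n)

# X is n x (k+1): a column of ones for the constant, then the k regressors.
X = np.column_stack([np.ones(n), x1, x2])

# beta_hat = (X'X)^{-1} X'y, computed by solving the normal equations.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print("beta_hat:", beta_hat)

# Perfect multicollinearity: x3 is an exact linear combination of x1 and x2.
x3 = 2 * x1 - x2
X_bad = np.column_stack([X, x3])

rank_good = np.linalg.matrix_rank(X)
rank_bad = np.linalg.matrix_rank(X_bad)
print("columns:", X.shape[1], "rank:", rank_good)          # full column rank
print("columns:", X_bad.shape[1], "rank:", rank_bad)       # rank deficient
```

When the rank is smaller than the number of columns, the coefficients are not uniquely determined, and the redundant regressor has to be dropped before the formula can be applied.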
The "more controls is always better" intuition is wrong and important to resist. Adding a variable changes the coefficient estimates on the included regressors only if, in the sample, it is correlated with those regressors and has a nonzero estimated partial effect on the outcome. Adding a truly irrelevant variable leaves coefficients unchanged in expectation but, when it is correlated with the included regressors, inflates their standard errors, reducing your ability to detect real effects. Adding an endogenous control, one that is itself caused by your regressor, can introduce bias that wasn't there before, a phenomenon you'll study deeply when you reach omitted variable bias and simultaneity.
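The claim about irrelevant variables can be checked with a small Monte Carlo sketch. Everything here is an assumption chosen for illustration: the true slope of 1.0, the sample size, the number of replications, and an irrelevant regressor `z` built to have a correlation of roughly 0.9 with `x` while having no effect on `y`.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 200, 500
true_beta1 = 1.0  # assumed true slope on x

est_short, est_long = [], []
for _ in range(reps):
    x = rng.normal(size=n)
    # z is correlated with x (about 0.9) but does not enter the equation for y.
    z = 0.9 * x + np.sqrt(1 - 0.9**2) * rng.normal(size=n)
    y = 0.5 + true_beta1 * x + rng.normal(size=n)

    X_short = np.column_stack([np.ones(n), x])       # y on x
    X_long = np.column_stack([np.ones(n), x, z])     # y on x and irrelevant z

    est_short.append(np.linalg.lstsq(X_short, y, rcond=None)[0][1])
    est_long.append(np.linalg.lstsq(X_long, y, rcond=None)[0][1])

est_short, est_long = np.array(est_short), np.array(est_long)
mean_short, mean_long = est_short.mean(), est_long.mean()
sd_short, sd_long = est_short.std(ddof=1), est_long.std(ddof=1)

print("mean of slope estimates, without z:", mean_short)
print("mean of slope estimates, with z:   ", mean_long)
print("spread of estimates, without z:    ", sd_short)
print("spread of estimates, with z:       ", sd_long)
```

Both averages should sit close to the assumed true slope, while the spread of the estimates is noticeably larger once `z` is included. That spread is the sampling variability that the standard error is meant to measure.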
Multiple regression is the workhorse of empirical economics. From here, you'll study how to test whether a group of coefficients is jointly significant (F-tests), what happens when you omit a relevant variable, and how to handle categorical variables with dummies. Every one of those topics is an extension of the partial-effect logic you are building here.