
A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Fairness in Machine Learning

Graduate · Depth 83 in the knowledge graph

101 topics build on this

399 prerequisites beneath it



Core Idea

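Fairness in machine learning addresses systematic bias in model predictions that disadvantages protected groups. Common fairness definitions include demographic parity (equal positive prediction rates across groups), equalized odds (equal false positive and false negative rates across groups), and calibration (predicted probabilities that match observed outcome rates within each group). Achieving fairness requires measuring bias, selecting fairness metrics appropriate to the application, and intervening through changes to the training data, the training procedure, or the model's outputs.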

Explainer

From AI ethics, you understand that machine learning systems can perpetuate and amplify societal biases present in their training data. Fairness in machine learning takes this concern from the conceptual level to the technical: it provides formal mathematical definitions of what "fair" means, methods to measure whether a model meets those definitions, and interventions to correct unfairness. The challenge is that fairness is not a single concept. There are multiple competing definitions, and landmark impossibility results show that several of the most widely used ones cannot all be satisfied at once when groups differ in their base rates.

Demographic parity (also called statistical parity) requires that the model's positive prediction rate is equal across groups: for instance, that a hiring algorithm recommends the same proportion of male and female candidates. This sounds straightforward, but it has a serious limitation: it ignores the true outcomes of individuals. If one group genuinely has a higher rate of the target outcome (e.g., a medical condition), enforcing equal prediction rates forces the model to over-predict for the lower-rate group, under-predict for the higher-rate group, or both, which generally costs accuracy. Demographic parity is blind to whether the predictions are *correct*; it only looks at rates.
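As a rough illustration of how demographic parity could be measured, the sketch below computes the positive prediction rate per group and the largest gap between groups. The function names and the toy data are made up for this example and do not come from any particular library.

```python
from collections import defaultdict


def positive_rate_by_group(y_pred, groups):
    """Return {group: fraction of that group's members predicted positive}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, g in zip(y_pred, groups):
        totals[g] += 1
        positives[g] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(y_pred, groups):
    """Largest difference in positive prediction rate between any two groups."""
    rates = positive_rate_by_group(y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Toy data (hypothetical): 1 = recommended, 0 = not recommended.
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = positive_rate_by_group(y_pred, groups)
gap = demographic_parity_gap(y_pred, groups)
print(rates, gap)  # group "a" is recommended at 0.75, group "b" at 0.25
```

Note that neither function takes the true labels as input, which is exactly the limitation described above: a gap of zero says nothing about whether the predictions are right.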

Equalized odds addresses this by requiring that the model's error rates (both false positive and false negative rates) are equal across groups. This is a more nuanced criterion: it allows overall prediction rates to differ if the groups genuinely differ in base rates, but demands that the model makes mistakes at the same rate for each group. A related criterion, equal opportunity, relaxes this to require only equal true positive rates, so that qualified individuals in all groups have the same chance of receiving a positive prediction. Calibration requires that among all individuals who receive a predicted probability of, say, 70%, approximately 70% actually have the positive outcome regardless of group membership. Impossibility results due to Chouldechova and to Kleinberg, Mullainathan, and Raghavan show that when base rates differ across groups and the model is not a perfect predictor, you cannot simultaneously achieve calibration, equal false positive rates, and equal false negative rates. You must choose which form of fairness matters most for your application.
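To make the error-rate criteria concrete, here is a minimal sketch that computes false positive and false negative rates per group and the between-group gaps. The helper names and the data are hypothetical, and the code assumes binary labels with at least one positive and one negative example in every group.

```python
def error_rates_by_group(y_true, y_pred, groups):
    """Return {group: (false_positive_rate, false_negative_rate)}.

    Assumes every group contains at least one positive and one negative label.
    """
    counts = {}  # group -> [false positives, negatives, false negatives, positives]
    for truth, pred, g in zip(y_true, y_pred, groups):
        c = counts.setdefault(g, [0, 0, 0, 0])
        if truth == 0:
            c[1] += 1
            c[0] += int(pred == 1)
        else:
            c[3] += 1
            c[2] += int(pred == 0)
    return {g: (fp / neg, fn / pos) for g, (fp, neg, fn, pos) in counts.items()}


def equalized_odds_gaps(y_true, y_pred, groups):
    """Return (FPR gap, FNR gap): the largest between-group difference in each."""
    rates = error_rates_by_group(y_true, y_pred, groups)
    fprs = [fpr for fpr, _ in rates.values()]
    fnrs = [fnr for _, fnr in rates.values()]
    return max(fprs) - min(fprs), max(fnrs) - min(fnrs)


# Toy data (hypothetical): four individuals in each of two groups.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = error_rates_by_group(y_true, y_pred, groups)
fpr_gap, fnr_gap = equalized_odds_gaps(y_true, y_pred, groups)
print(rates, fpr_gap, fnr_gap)
```

Equalized odds asks for both gaps to be (near) zero. Equal opportunity looks only at the false negative gap, since the true positive rate is one minus the false negative rate.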

Interventions to improve fairness operate at three stages. Pre-processing modifies the training data to reduce bias before the model ever sees it; techniques include resampling, reweighting, or transforming features to remove correlations with protected attributes. In-processing modifies the learning algorithm itself, adding fairness constraints or regularization terms to the objective function so the model optimizes for both accuracy and fairness simultaneously. Post-processing adjusts the model's predictions after training, applying group-specific thresholds to equalize the desired fairness metric. Each approach involves tradeoffs: pre-processing may discard useful information, in-processing makes training more complex, and post-processing corrects outputs without changing what the model has learned, so it can feel like a patch rather than a fix. The choice depends on which kind of fairness matters most in context: criminal justice, lending, healthcare, and hiring each prioritize different definitions because the costs of different types of errors differ sharply across these domains.
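The post-processing idea of group-specific thresholds can be sketched as follows. This is a simplified illustration that targets equal opportunity only (equal true positive rates), not a full equalized-odds procedure; the function names, the target rate, and the scores are all assumptions made for the example.

```python
import math


def threshold_for_tpr(scores, y_true, target_tpr):
    """Largest threshold t such that predicting `score >= t` reaches target_tpr."""
    pos_scores = sorted((s for s, t in zip(scores, y_true) if t == 1), reverse=True)
    if not pos_scores:
        raise ValueError("group has no positive examples")
    k = max(1, math.ceil(target_tpr * len(pos_scores)))  # positives to capture
    return pos_scores[k - 1]


def fit_group_thresholds(scores, y_true, groups, target_tpr):
    """Pick one threshold per group so each group reaches the same target TPR."""
    by_group = {}
    for s, t, g in zip(scores, y_true, groups):
        group_scores, group_labels = by_group.setdefault(g, ([], []))
        group_scores.append(s)
        group_labels.append(t)
    return {g: threshold_for_tpr(s, t, target_tpr) for g, (s, t) in by_group.items()}


def predict(scores, groups, thresholds):
    """Apply each individual's group threshold to their score."""
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]


def tpr_by_group(y_true, y_pred, groups):
    """Return {group: true positive rate}."""
    hits, pos = {}, {}
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            pos[g] = pos.get(g, 0) + 1
            hits[g] = hits.get(g, 0) + int(p == 1)
    return {g: hits[g] / pos[g] for g in pos}


# Toy scores (hypothetical): the model scores group "b" lower overall.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.6, 0.5, 0.3, 0.2, 0.4, 0.1]
y_true = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0]
groups = ["a"] * 6 + ["b"] * 6

# One shared threshold gives the groups different true positive rates.
shared_tpr = tpr_by_group(y_true, predict(scores, groups, {"a": 0.5, "b": 0.5}), groups)

# Group-specific thresholds equalize the true positive rate at the target.
thresholds = fit_group_thresholds(scores, y_true, groups, target_tpr=0.5)
adjusted_tpr = tpr_by_group(y_true, predict(scores, groups, thresholds), groups)
print(shared_tpr, thresholds, adjusted_tpr)
```

In this toy case the shared threshold favors group "a", while the fitted thresholds bring both groups to the same true positive rate. The sketch also shows a practical cost of the approach: `predict` needs each individual's group at decision time, and thresholds fitted on one dataset are not guaranteed to equalize rates on another.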


Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → 
Boolean Algebra → Boolean Type and Truth Values → Comparison Operators and Boolean Tests → Logical Operators and Boolean Algebra → Conditional Statements → Defining and Calling Functions → Functions: Decomposing Problems → Function Parameters and Argument Passing → Return Values → Variable Scope → Introduction to Classes → Objects and Instances → Methods and Attributes → Algorithm Design Basics → AI Ethics, Fairness, and Bias → Fairness in Machine Learning

Longest path: 84 steps · 399 total prerequisite topics

Prerequisites (1)

AI Ethics, Fairness, and Bias (hard)

Leads To (1)

Linear Regression in Machine Learning (soft)