Ethics & Fairness · 8 min read

ML Fairness

Biased data in, biased predictions out
  • Data bias: historical prejudice baked into training data
  • Algorithmic bias: model amplifies existing patterns
  • Fairness metrics: multiple definitions, no single right answer

Imagine a company wants to build an AI hiring tool. They feed it 50 years of hiring decisions: resumes that were accepted and resumes that were rejected. The model learns the patterns and starts making its own recommendations.

There's just one problem: those 50 years of decisions were made by humans with biases. The company historically hired mostly men for engineering roles. Resumes with women's colleges, women's sports teams, or even the word "women's" were less likely to be accepted, not because the candidates were worse but because of historical discrimination.

The AI doesn't know this. It just sees a pattern: "resumes with these words tend to get rejected." So it learns the bias and automates it at scale. Instead of one biased manager, you now have a biased system rejecting thousands of candidates per day.

This isn't hypothetical: Amazon reportedly started building a resume-screening tool like this in 2014 and later scrapped it after discovering that it penalized resumes associated with women.

Where does bias come from?

1. Historical bias

The data reflects past decisions that were themselves biased. A loan approval model trained on historically discriminatory lending practices will perpetuate that discrimination.
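To make this concrete, here is a minimal sketch on simulated data. The variable names (`score`, `group_b`, `approved`) and the size of the historical penalty are assumptions made for illustration, not real lending data. A simple least-squares model fitted to the biased decisions recovers the penalty as a negative weight on group membership.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data: a qualification score and a group flag (1 = group B).
score = rng.normal(0.0, 1.0, n)
group_b = rng.integers(0, 2, n)

# Assumed historical process: approval depends on the score, but group B
# applicants were penalized regardless of their qualifications.
logit = 1.5 * score - 1.0 * group_b
approved = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

# Fit a linear probability model by least squares:
# approved ~ intercept + score + group_b
X = np.column_stack([np.ones(n), score, group_b])
coef, *_ = np.linalg.lstsq(X, approved, rcond=None)

print(f"Weight on score:   {coef[1]:+.3f}")
print(f"Weight on group B: {coef[2]:+.3f}")  # negative: the penalty was learned
```

The model was never told to discriminate. It only minimized error against past decisions, and the past decisions contained the penalty.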

2. Representation bias

The training data doesn't represent everyone equally. Facial recognition trained mostly on light-skinned faces performs markedly worse on dark-skinned faces, not because the algorithm is racist but because it never got to learn from diverse examples.
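A representation check can be as simple as comparing each group's share of the training set with its share of the population the system will serve. The sketch below assumes a hypothetical helper `representation_gaps`, and the group labels and reference shares are made up for illustration.

```python
from collections import Counter

def representation_gaps(groups, reference_shares):
    """Return {group: (dataset_share, reference_share, difference)}."""
    counts = Counter(groups)
    total = len(groups)
    report = {}
    for group, ref_share in reference_shares.items():
        share = counts.get(group, 0) / total
        report[group] = (share, ref_share, share - ref_share)
    return report

# Hypothetical training set: 900 lighter-skinned faces, 100 darker-skinned faces.
train_groups = ["lighter"] * 900 + ["darker"] * 100
# Hypothetical share of each group among the people the system will be used on.
reference = {"lighter": 0.6, "darker": 0.4}

report = representation_gaps(train_groups, reference)
for group, (share, ref, gap) in report.items():
    print(f"{group:8s} dataset={share:.0%} reference={ref:.0%} gap={gap:+.0%}")
```

A large negative gap flags a group the model will see too rarely during training.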

3. Measurement bias

The way data is collected systematically disadvantages certain groups. Using "number of arrests" as a proxy for "criminality" encodes policing patterns, not actual crime rates. Neighborhoods that are policed more heavily will show more arrests.
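The arrest example can be simulated in a few lines. This is a sketch under stated assumptions: both neighborhoods are given the same true offense rate, and the only difference is an assumed probability that an offense leads to an arrest. All numbers are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
residents = 100_000

# Assumption: both neighborhoods have the same true offense rate.
true_offense_rate = 0.05
# Assumption: an offense is more likely to end in arrest where policing is heavier.
arrest_prob = {"lightly policed": 0.2, "heavily policed": 0.6}

arrest_rates = {}
for name, p_arrest in arrest_prob.items():
    offenses = rng.random(residents) < true_offense_rate
    arrests = offenses & (rng.random(residents) < p_arrest)
    arrest_rates[name] = arrests.mean()
    print(f"{name:16s} offense rate={offenses.mean():.1%} "
          f"arrest rate={arrests.mean():.1%}")
```

The offense rates come out nearly identical, yet the arrest rates differ by roughly a factor of three. A model trained on arrests as its label would learn the policing pattern.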

4. Aggregation bias

One model for everyone ignores that different groups may have different relationships in the data. A single medical model fitted to pooled data that's 80% male will mostly learn the relationships that hold for men, so it may give poor recommendations for female patients.
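The following sketch illustrates the effect with simulated data. It assumes a hypothetical dose-response relationship whose slope differs by sex, which is an invented setup, not a clinical claim. It compares one pooled model against a model fitted to the smaller group alone.

```python
import numpy as np

rng = np.random.default_rng(2)

def make_group(n, slope):
    """Hypothetical data: outcome depends on dose with a group-specific slope."""
    dose = rng.uniform(0, 10, n)
    outcome = slope * dose + rng.normal(0, 1, n)
    return dose, outcome

# Assumption: 80% of patients are male, and the slope differs between groups.
dose_m, out_m = make_group(800, slope=2.0)
dose_f, out_f = make_group(200, slope=1.0)

def fit_slope(dose, outcome):
    """Least-squares slope for a line through the origin."""
    return float(dose @ outcome / (dose @ dose))

def mse(slope, dose, outcome):
    """Mean squared error of the one-parameter model on the given data."""
    return float(np.mean((outcome - slope * dose) ** 2))

pooled = fit_slope(np.concatenate([dose_m, dose_f]),
                   np.concatenate([out_m, out_f]))
female_only = fit_slope(dose_f, out_f)

print(f"Pooled slope: {pooled:.2f}")
print(f"Female error, pooled model:    {mse(pooled, dose_f, out_f):.2f}")
print(f"Female error, per-group model: {mse(female_only, dose_f, out_f):.2f}")
```

The pooled slope lands close to the majority group's value, so the error for the minority group is far larger than it would be under a model that accounts for group differences.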

Detecting Bias in Predictions

import numpy as np

# Simulated hiring model predictions
# Scores from 0-100, higher = more likely to be hired
np.random.seed(42)

# Group A (majority in historical data): model learned to favor them
group_a_scores = np.random.normal(70, 10, 500).clip(0, 100)
# Group B (underrepresented): model systematically scores lower
group_b_scores = np.random.normal(55, 10, 500).clip(0, 100)

threshold = 65  # "hire" threshold
hire_rate_a = (group_a_scores >= threshold).mean()
hire_rate_b = (group_b_scores >= threshold).mean()
# Disparate impact ratio: group B's hire rate relative to group A's
impact_ratio = hire_rate_b / hire_rate_a

print(f"Group A hire rate: {hire_rate_a:.1%}")
print(f"Group B hire rate: {hire_rate_b:.1%}")
print(f"Disparate impact ratio: {impact_ratio:.2f}")
print("\n(4/5ths rule: ratio below 0.80 suggests adverse impact)")
print(f"Bias detected: {'YES' if impact_ratio < 0.80 else 'NO'}")
Output
Group A hire rate: 69.2%
Group B hire rate: 16.4%
Disparate impact ratio: 0.24

(4/5ths rule: ratio below 0.80 suggests adverse impact)
Bias detected: YES

Defining fairness (it's harder than you think)

There are multiple mathematical definitions of fairness, and here's the uncomfortable part: some of them conflict with each other. Unless the groups have equal base rates (the same underlying share of qualified candidates), you generally cannot satisfy all three definitions below at once.

  • Demographic parity: equal approval rates across groups. (But this might mean accepting less qualified candidates from one group.)
  • Equal opportunity: equal true positive rates across groups. (Qualified people from all groups are equally likely to be approved.)
  • Predictive parity: equal precision across groups. (If approved, you're equally likely to succeed regardless of group.)
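A short derivation shows why these three definitions pull against each other. The notation is introduced here for illustration: for a group g, p_g is the share of truly qualified people, s_g is the approval rate, TPR_g is the true positive rate, and PPV_g is the precision. Counting "qualified and approved" people in two ways gives:

```latex
% Fraction of group g that is both qualified and approved, counted two ways:
\[
\mathrm{TPR}_g \, p_g \;=\; \mathrm{PPV}_g \, s_g
\quad\Longrightarrow\quad
s_g \;=\; \frac{\mathrm{TPR}_g}{\mathrm{PPV}_g}\, p_g .
\]
% If TPR and PPV are equal (and nonzero) for groups A and B:
\[
\frac{s_A}{s_B} \;=\; \frac{p_A}{p_B} .
\]
```

So if equal opportunity and predictive parity both hold, the approval rates can only be equal (demographic parity) when the base rates p_A and p_B are equal too.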

There's no universally "right" definition. The choice depends on context, values, and the consequences of errors.
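Each definition corresponds to one number you can compute per group. Below is a minimal sketch, assuming binary labels and predictions held in NumPy arrays. The function name `fairness_report` and the toy data are hypothetical.

```python
import numpy as np

def fairness_report(y_true, y_pred, group):
    """Per-group selection rate, true positive rate, and precision.

    Note: a group with no positives (or no approvals) yields NaN for the
    corresponding metric; this sketch does not handle that case.
    """
    report = {}
    for g in np.unique(group):
        mask = group == g
        yt, yp = y_true[mask], y_pred[mask]
        report[str(g)] = {
            # Demographic parity compares selection rates.
            "selection_rate": float(yp.mean()),
            # Equal opportunity compares true positive rates.
            "tpr": float(yp[yt == 1].mean()),
            # Predictive parity compares precision.
            "precision": float(yt[yp == 1].mean()),
        }
    return report

# Hypothetical toy data: 1 = qualified / approved, 0 = not.
y_true = np.array([1, 1, 1, 0, 0, 1, 1, 0, 0, 0])
y_pred = np.array([1, 1, 0, 1, 0, 1, 0, 0, 0, 0])
group = np.array(["A"] * 5 + ["B"] * 5)

report = fairness_report(y_true, y_pred, group)
for g, metrics in report.items():
    print(g, {name: round(value, 2) for name, value in metrics.items()})
```

In this toy data, group B has higher precision but a lower selection rate and a lower true positive rate, so which group looks disadvantaged depends on the definition you pick.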

What can you do?

  1. Audit your data: check for representation gaps and historical biases
  2. Measure fairness: don't just check accuracy; check accuracy per group
  3. Mitigate: use techniques like re-sampling, re-weighting, or adversarial debiasing
  4. Be transparent: document your model's limitations and who it was tested on
  5. Keep humans in the loop: for high-stakes decisions, AI should assist, not replace, human judgment
Note: "The model is just learning from the data" is not a defense. If the data is biased, the model is biased, and deploying it automates discrimination at scale. The builders of ML systems are responsible for auditing and mitigating bias before deployment.
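As an illustration of the re-weighting idea from the list above, here is one common scheme, sketched under assumptions: each example is weighted by the ratio of the expected to the observed frequency of its (group, label) combination, so that group and label look statistically independent once the weights are applied. The function name `reweighing_weights` and the toy hiring data are hypothetical.

```python
import numpy as np

def reweighing_weights(group, label):
    """Weight each example by P(group) * P(label) / P(group, label)."""
    weights = np.empty(len(label), dtype=float)
    for g in np.unique(group):
        for y in np.unique(label):
            cell = (group == g) & (label == y)
            if cell.any():
                expected = (group == g).mean() * (label == y).mean()
                weights[cell] = expected / cell.mean()
    return weights

# Hypothetical historical labels: group A hired 60% of the time, group B 20%.
group = np.array(["A"] * 100 + ["B"] * 100)
label = np.array([1] * 60 + [0] * 40 + [1] * 20 + [0] * 80)

w = reweighing_weights(group, label)
weighted_rates = {}
for g in ("A", "B"):
    m = group == g
    weighted_rates[g] = float(np.average(label[m], weights=w[m]))
    print(f"Group {g}: raw hire rate={label[m].mean():.0%}, "
          f"weighted hire rate={weighted_rates[g]:.0%}")
```

The weights would then be passed as per-example weights to whatever training routine you use. Re-weighting only balances the labels the model sees; it does not guarantee that the trained model's predictions are fair, so the fairness measurements still need to be repeated afterward.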

Key Metrics

  • Data Audit (before training): check representation and labels. Catching bias at the source is the cheapest time to fix it.
  • Fairness Metrics (after training): measure accuracy per group. Overall accuracy hides group-level disparities.
  • Debiasing Techniques (during or after training): re-weight, re-sample, or constrain. Trade-off: reducing bias may slightly reduce overall accuracy.
  • Ongoing Monitoring (after deployment): run continuous fairness checks. Bias can emerge or shift as real-world data changes.
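To see how overall accuracy can hide a group-level disparity, consider this small sketch. The helper `accuracy_by_group` and the evaluation data are hypothetical, chosen so that the smaller group carries most of the errors.

```python
import numpy as np

def accuracy_by_group(y_true, y_pred, group):
    """Overall accuracy plus accuracy within each group."""
    correct = y_true == y_pred
    result = {"overall": float(correct.mean())}
    for g in np.unique(group):
        result[str(g)] = float(correct[group == g].mean())
    return result

# Hypothetical evaluation set: 90 examples from group A, 10 from group B.
group = np.array(["A"] * 90 + ["B"] * 10)
y_true = np.ones(100, dtype=int)
y_pred = y_true.copy()
y_pred[:5] = 0      # 5 mistakes among the 90 group A examples
y_pred[90:96] = 0   # 6 mistakes among the 10 group B examples

acc = accuracy_by_group(y_true, y_pred, group)
for name, value in acc.items():
    print(f"{name:8s} accuracy={value:.1%}")
```

An overall accuracy of 89% looks acceptable, while group B sits at 40%. Running the same check on each new batch of production data is one simple way to implement ongoing monitoring.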

Quick check

Amazon's hiring AI was biased against women. What was the root cause?