ML Fairness
Imagine a company wants to build an AI hiring tool. They feed it 50 years of hiring decisions: resumes that were accepted and resumes that were rejected. The model learns the patterns and starts making its own recommendations.
There's just one problem: those 50 years of decisions were made by humans with biases. The company historically hired mostly men for engineering roles. Resumes with women's colleges, women's sports teams, or even the word "women's" were less likely to be accepted, not because the candidates were worse, but because of historical discrimination.
The AI doesn't know this. It just sees a pattern: "resumes with these words tend to get rejected." So it learns the bias and automates it at scale. Instead of one biased manager, you now have a biased system rejecting thousands of candidates per day.
This isn't hypothetical. Amazon began building a resume-screening tool along these lines in 2014 and scrapped it after finding that it penalized resumes containing the word "women's", as Reuters reported in 2018.
Where does bias come from?
1. Historical bias
The data reflects past decisions that were themselves biased. A loan approval model trained on historically discriminatory lending practices will perpetuate that discrimination.
2. Representation bias
The training data doesn't represent everyone equally. Facial recognition trained mostly on light-skinned faces tends to make far more errors on dark-skinned faces, not because the algorithm is racist, but because it never got to learn from diverse examples.
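One way to spot a representation gap is to compare each group's share of the dataset with its share of the population the system is meant to serve. The sketch below assumes you already have a group label per example; the function name `representation_gaps`, the group names, and all the numbers are made up for illustration.

```python
from collections import Counter


def representation_gaps(groups, reference_shares):
    """Return {group: dataset_share - reference_share} for each reference group."""
    if not groups:
        raise ValueError("groups must not be empty")
    counts = Counter(groups)
    total = len(groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in reference_shares.items()
    }


# Hypothetical labels for a face dataset of 1,000 images.
dataset_groups = ["lighter"] * 800 + ["darker"] * 200
# Hypothetical target shares for the population the system will serve.
reference = {"lighter": 0.5, "darker": 0.5}

gaps = representation_gaps(dataset_groups, reference)
for group, gap in gaps.items():
    # A negative gap means the group is under-represented in the dataset.
    print(f"{group}: {gap:+.2f}")
```

A large negative gap is a hint that the model has seen too few examples from that group, which is worth checking before looking at any accuracy numbers.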
3. Measurement bias
The way data is collected systematically disadvantages certain groups. Using "number of arrests" as a proxy for "criminality" encodes policing patterns, not actual crime rates. Neighborhoods that are policed more heavily will show more arrests.
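The arrests example can be made concrete with a little arithmetic. This is a toy sketch, not a model of real policing: the function `expected_arrests`, the populations, and the rates are all invented, and "detection rate" is just a stand-in for how heavily a neighborhood is policed.

```python
def expected_arrests(population, offense_rate, detection_rate):
    """Expected arrests = people * true offense rate * chance an offense leads to an arrest."""
    return population * offense_rate * detection_rate


# Two hypothetical neighborhoods with the SAME true offense rate.
# Only the detection rate (policing intensity) differs.
lightly_policed = expected_arrests(population=10_000, offense_rate=0.05, detection_rate=0.10)
heavily_policed = expected_arrests(population=10_000, offense_rate=0.05, detection_rate=0.30)

arrest_ratio = heavily_policed / lightly_policed
print(f"lightly policed: {lightly_policed:.0f} arrests")
print(f"heavily policed: {heavily_policed:.0f} arrests")
print(f"ratio: {arrest_ratio:.1f}x")
```

Both neighborhoods have identical underlying behavior, yet the arrest counts differ by the ratio of the detection rates. A model trained on arrests would read that gap as a difference in crime.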
4. Aggregation bias
One model for everyone ignores that different groups may have different relationships in the data. A medical model trained on data that's 80% male may learn the relationship between symptoms and outcomes that holds for men and give poor recommendations for female patients.
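Here is a small sketch of how a single pooled model can fit the majority group far better than the minority. The data is entirely made up: a hypothetical biomarker `x` relates to an outcome `y` in opposite directions for two groups, and group A supplies 80% of the rows. The helpers `fit_line` and `mean_abs_error` are plain least squares written out by hand so the example has no dependencies.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(xs, ys, slope, intercept):
    """Average absolute gap between the line's predictions and the true values."""
    return sum(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: the same feature relates to the outcome differently per group.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
group_a_y = [2.0 * v for v in x]          # outcome rises with x
group_b_y = [12.0 - 2.0 * v for v in x]   # outcome falls with x

# Pooled training set: group A appears four times as often as group B (80% / 20%).
pooled_x = x * 4 + x
pooled_y = group_a_y * 4 + group_b_y
pooled = fit_line(pooled_x, pooled_y)

error_a = mean_abs_error(x, group_a_y, *pooled)
error_b = mean_abs_error(x, group_b_y, *pooled)
# For comparison: a model fitted only on group B.
error_b_own_model = mean_abs_error(x, group_b_y, *fit_line(x, group_b_y))

print(f"pooled model error on A: {error_a:.2f}")
print(f"pooled model error on B: {error_b:.2f}")
print(f"group-B-only model error on B: {error_b_own_model:.2f}")
```

The pooled line leans toward the majority group's pattern, so the minority group gets the larger error. Real data is rarely this clean, but comparing pooled and per-group fits in this way is a quick check for the same effect.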
Detecting Bias in Predictions
Defining fairness (it's harder than you think)
There are multiple mathematical definitions of fairness, and here's the uncomfortable part: some of them contradict each other. When the underlying rate of the outcome differs between groups, a model generally cannot satisfy all three of the definitions below at once.
- Demographic parity: equal approval rates across groups. (But this might mean accepting less qualified candidates from one group.)
- Equal opportunity: equal true positive rates across groups. (Qualified people from all groups are equally likely to be approved.)
- Predictive parity: equal precision across groups. (If approved, you're equally likely to succeed regardless of group.)
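Each definition above maps to a number you can compute per group: selection rate for demographic parity, true positive rate for equal opportunity, and precision for predictive parity. The sketch below assumes binary 0/1 labels and predictions plus a group label per row. The function name `group_metrics` and the toy data are illustrative, not taken from any particular library.

```python
def group_metrics(y_true, y_pred, groups):
    """Per-group selection rate, true positive rate, and precision.

    y_true and y_pred hold 0/1 values; groups holds each row's group name.
    A metric is None when its denominator is zero.
    """
    stats = {}
    for g in sorted(set(groups)):
        rows = [(t, p) for t, p, grp in zip(y_true, y_pred, groups) if grp == g]
        true_pos = sum(1 for t, p in rows if t == 1 and p == 1)
        actual_pos = sum(1 for t, _ in rows if t == 1)
        pred_pos = sum(1 for _, p in rows if p == 1)
        stats[g] = {
            # Demographic parity compares this across groups.
            "selection_rate": pred_pos / len(rows),
            # Equal opportunity compares this across groups.
            "tpr": true_pos / actual_pos if actual_pos else None,
            # Predictive parity compares this across groups.
            "precision": true_pos / pred_pos if pred_pos else None,
        }
    return stats


# Toy data: 8 applicants in group A, 8 in group B (1 = qualified / approved).
groups = ["A"] * 8 + ["B"] * 8
y_true = [1, 1, 1, 1, 0, 0, 0, 0] + [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0] + [1, 0, 0, 0, 0, 0, 0, 0]

metrics = group_metrics(y_true, y_pred, groups)
for g, m in metrics.items():
    print(g, m)
```

In this toy data, group B has a lower selection rate and a lower true positive rate but a higher precision than group A, so which group looks disadvantaged depends on the definition you pick.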
There's no universally "right" definition. The choice depends on context, values, and the consequences of errors.
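The tension between the definitions can be checked with arithmetic. In any group, the share of people who are true positives equals base rate times true positive rate, and it also equals selection rate times precision. Setting the two equal gives the selection rate that a given true positive rate and precision force. The sketch below assumes two hypothetical groups whose base rates (share of qualified people) differ; the function name and the numbers are made up.

```python
def implied_selection_rate(base_rate, tpr, precision):
    """Approval rate forced by a group's base rate, TPR, and precision.

    Derived from: base_rate * tpr == selection_rate * precision
    (both sides equal the share of the group that is a true positive).
    """
    return base_rate * tpr / precision


# Same TPR (equal opportunity) and same precision (predictive parity) in both groups,
# but different hypothetical base rates.
rate_a = implied_selection_rate(base_rate=0.5, tpr=0.8, precision=0.8)
rate_b = implied_selection_rate(base_rate=0.2, tpr=0.8, precision=0.8)

print(f"group A approval rate: {rate_a:.2f}")
print(f"group B approval rate: {rate_b:.2f}")
```

With equal true positive rates and equal precision, the approval rates come out proportional to the base rates, so demographic parity fails whenever the base rates differ. Forcing the approval rates to match would mean giving up one of the other two.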
What can you do?
- Audit your data: check for representation gaps and historical biases
- Measure fairness: don't just check accuracy; check accuracy per group
- Mitigate: use techniques like re-sampling, re-weighting, or adversarial debiasing
- Be transparent: document your model's limitations and who it was tested on
- Keep humans in the loop: for high-stakes decisions, AI should assist, not replace human judgment
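As one concrete take on the re-weighting idea from the list above, here is a minimal sketch of a common scheme, often called reweighing: each training row gets the weight P(group) * P(label) / P(group, label), so that group and label look statistically independent in the weighted data. The function names and the hiring data are hypothetical, and this is only one of several ways to re-weight.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight each row by P(group) * P(label) / P(group, label)."""
    n = len(groups)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    pair_counts = Counter(zip(groups, labels))
    return [
        group_counts[g] * label_counts[y] / (n * pair_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_positive_rate(groups, labels, weights, group):
    """Weighted share of positive labels within one group."""
    total = sum(w for g, w in zip(groups, weights) if g == group)
    positive = sum(w for g, y, w in zip(groups, labels, weights) if g == group and y == 1)
    return positive / total


# Hypothetical historical hiring data: 4 of 6 men hired, 1 of 4 women hired.
groups = ["men"] * 6 + ["women"] * 4
labels = [1, 1, 1, 1, 0, 0] + [1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
rate_men = weighted_positive_rate(groups, labels, weights, "men")
rate_women = weighted_positive_rate(groups, labels, weights, "women")
print(f"weighted hire rate, men: {rate_men:.2f}")
print(f"weighted hire rate, women: {rate_women:.2f}")
```

After weighting, both groups show the same hire rate in the training data. Many training APIs accept per-row weights (for example, the `sample_weight` argument that many scikit-learn estimators take in `fit`), so weights like these can be passed in without changing the model itself. Whether that is enough to fix the model's predictions still has to be measured per group.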