
Bias vs Variance

Too simple or too wobbly: find the sweet spot

  • High bias: underfitting (model too simple)
  • High variance: overfitting (model too complex)
  • Sweet spot: low bias + low variance

Imagine you're at a dartboard competition. Two of your friends are playing:

  • Alice throws every dart in a tight cluster, but the cluster is consistently to the left of the bullseye. She's precise but off-target.
  • Bob has darts scattered all over the board. Some hit near the center, others land in the wall. He's not biased toward any direction, but his throws are wildly inconsistent.

Alice has high bias, low variance. Bob has low bias, high variance. The goal? Throw like neither of them. You want your darts tight AND centered, and that's the sweet spot in machine learning too.

What is bias?

Bias is the error that comes from overly simplistic assumptions in your model. It's like fitting a straight line through data that's clearly curved. No matter how much data you give it, the model just can't capture the real pattern. This is called underfitting.

Imagine using a ruler to trace the outline of a cloud. The ruler isn't flexible enough: it'll always give you a straight line, no matter how curvy the cloud is.
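
The claim that more data does not rescue a high-bias model can be checked with a small simulation. The sketch below is an illustration under stated assumptions: the data come from y = x² on [0, 5] with Gaussian noise of standard deviation 2, and the helper name `sample` and the three sample sizes are made up for the demo. It fits a straight line with `np.polyfit` on larger and larger training sets and scores each fit on fresh points.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    """Draw n noisy points from the curved pattern y = x^2 (noise std 2)."""
    x = rng.uniform(0, 5, n)
    y = x**2 + rng.normal(0, 2, n)
    return x, y

# Fresh points that no model gets to train on.
x_test, y_test = sample(5000)

test_mse = {}
for n in (20, 200, 20000):
    x_train, y_train = sample(n)
    # Degree-1 fit: np.polyfit returns coefficients from highest power down.
    slope, intercept = np.polyfit(x_train, y_train, deg=1)
    pred = slope * x_test + intercept
    test_mse[n] = float(np.mean((pred - y_test) ** 2))
    print(f"n={n:>6}: straight-line test MSE = {test_mse[n]:.2f}")
```

Because the noise variance is 2² = 4, a well-matched model could get close to a test MSE of 4. The straight line should stay well above that at every sample size, since the leftover error is bias rather than a shortage of data.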

What is variance?

Variance is how much your model's predictions change when the training data changes. A high-variance model memorizes every twist, bump, and bit of noise in the data, so it performs amazingly on the training set but falls apart on new data. This is overfitting.

Imagine tracing that same cloud with a shaky hand and a super-fine pen. You capture every tiny turbulence, but your drawing looks completely different every time the wind shifts.
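
One way to see variance directly is to retrain the same kind of model on many different training sets and watch how much its prediction at one fixed point moves around. The sketch below is illustrative only. It assumes data from y = x² on [0, 5] with noise of standard deviation 2, and it uses NumPy's `Chebyshev.fit` as the least-squares polynomial fitter, a choice made here for numerical stability (it fits the same family of polynomials as a plain power basis). The names `prediction_spread`, `x0`, and `n_trials` are hypothetical.

```python
import numpy as np
from numpy.polynomial import Chebyshev

rng = np.random.default_rng(1)
x0 = 2.5        # fixed query point in the middle of the range
n_trials = 300  # number of independent training sets

def prediction_spread(degree):
    """Std of the prediction at x0 across many independent training sets."""
    preds = []
    for _ in range(n_trials):
        x = rng.uniform(0, 5, 20)
        y = x**2 + rng.normal(0, 2, 20)
        model = Chebyshev.fit(x, y, deg=degree)  # least-squares polynomial fit
        preds.append(model(x0))
    return float(np.std(preds))

spread = {d: prediction_spread(d) for d in (1, 2, 15)}
for d, s in spread.items():
    print(f"degree {d:>2}: std of prediction at x0 = {s:.2f}")
```

A larger spread means the fitted curve depends more on which 20 points happened to be drawn, which is exactly the shaky hand in the cloud analogy.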

Seeing Bias vs Variance with Polynomial Fits

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import mean_squared_error

# True pattern: y = x^2 (with noise)
np.random.seed(42)
X = np.random.uniform(0, 5, 20).reshape(-1, 1)
y = X.flatten()**2 + np.random.normal(0, 2, 20)

# Note: every MSE below is measured on the training points themselves.

# High bias: fit a straight line (degree 1)
model_simple = LinearRegression()
model_simple.fit(X, y)
print(f"Linear MSE: {mean_squared_error(y, model_simple.predict(X)):.2f}")

# Sweet spot: fit a quadratic (degree 2)
poly2 = PolynomialFeatures(degree=2)
model_good = LinearRegression().fit(poly2.fit_transform(X), y)
print(f"Quadratic MSE: {mean_squared_error(y, model_good.predict(poly2.transform(X))):.2f}")

# High variance: fit degree 15 polynomial
poly15 = PolynomialFeatures(degree=15)
model_overfit = LinearRegression().fit(poly15.fit_transform(X), y)
print(f"Degree-15 MSE: {mean_squared_error(y, model_overfit.predict(poly15.transform(X))):.2f}")
Output
Linear MSE: 18.43
Quadratic MSE: 3.12
Degree-15 MSE: 0.01
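
The three scores above are measured on the same 20 points the models were fit to, so they only show how well each model matches data it has already seen. A minimal sketch of scoring on held-out points follows. It is not the code that produced the output above: it assumes the same data pattern, uses NumPy's `Chebyshev.fit` instead of the scikit-learn calls to stay short and numerically stable, and the helpers `sample` and `mse` are hypothetical names.

```python
import numpy as np
from numpy.polynomial import Chebyshev

rng = np.random.default_rng(42)

def sample(n):
    """Draw n noisy points from y = x^2 on [0, 5] (noise std 2)."""
    x = rng.uniform(0, 5, n)
    y = x**2 + rng.normal(0, 2, n)
    return x, y

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on the given points."""
    return float(np.mean((model(x) - y) ** 2))

x_train, y_train = sample(20)   # what the model learns from
x_test, y_test = sample(200)    # new data it has never seen

scores = {}
for degree in (1, 2, 15):
    model = Chebyshev.fit(x_train, y_train, deg=degree)
    scores[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    train_err, test_err = scores[degree]
    print(f"degree {degree:>2}: train MSE = {train_err:.2f}, test MSE = {test_err:.2f}")
```

Training error can only go down as the degree rises, because each higher-degree model contains the lower-degree ones. Test error has no such guarantee, and that gap is what separates a good fit from a memorized one.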

The tradeoff

Here's the painful truth: reducing bias usually increases variance, and vice versa. It's a seesaw.

  • Make your model more complex → bias drops, but variance rises
  • Make your model simpler → variance drops, but bias rises

The art of machine learning is finding the balance point where total error (bias² + variance, plus noise you can't remove) is minimized. This is called the bias-variance tradeoff.
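
Written out, the total error splits into three pieces. The formula below is the standard decomposition for squared error, under the assumption that the data follow y = f(x) + ε with zero-mean noise of variance σ². The symbols f (true pattern), f̂ (fitted model), and σ² are notation introduced here, and the expectation is taken over random training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Only the first two terms depend on the model you choose. The noise term is a floor that no model can get under.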

How to spot the problem

Symptom | Problem | Fix
Bad on training AND test data | High bias (underfitting) | Use a more complex model, add features
Great on training, bad on test | High variance (overfitting) | Get more data, simplify model, regularize

Note: The degree-15 polynomial got near-zero training error, which looks amazing! But try it on new data and it'll predict nonsense. Low training error doesn't mean your model is good. Always check test performance.
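
One of the fixes in the table, regularization, can be sketched with ridge regression on degree-15 polynomial features. This is an illustration under assumptions, not a recipe: the data pattern is y = x² on [0, 5] with noise of standard deviation 2, the penalty strength 0.1 is an arbitrary untuned value, and the helpers `sample`, `features`, `fit_ridge`, and `mean_test_mse` are hypothetical names. Ridge is solved here with plain NumPy by appending penalty rows to the least-squares problem.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebvander

rng = np.random.default_rng(7)

def sample(n):
    """Draw n noisy points from y = x^2 on [0, 5] (noise std 2)."""
    x = rng.uniform(0, 5, n)
    return x, x**2 + rng.normal(0, 2, n)

def features(x, degree=15):
    """Polynomial features up to `degree`, with x rescaled from [0, 5] to [-1, 1]."""
    return chebvander(2 * x / 5 - 1, degree)

def fit_ridge(A, y, lam):
    """Minimize ||A w - y||^2 + lam * ||w||^2; lam = 0 is ordinary least squares.

    Note: this simple version also shrinks the intercept term.
    """
    p = A.shape[1]
    A_aug = np.vstack([A, np.sqrt(lam) * np.eye(p)])  # penalty rows
    y_aug = np.concatenate([y, np.zeros(p)])
    w, *_ = np.linalg.lstsq(A_aug, y_aug, rcond=None)
    return w

# One shared test set, many small training sets.
x_test, y_test = sample(2000)
A_test = features(x_test)

def mean_test_mse(lam, n_trials=50):
    """Average test MSE over n_trials training sets of 20 points each."""
    errors = []
    for _ in range(n_trials):
        x_train, y_train = sample(20)
        w = fit_ridge(features(x_train), y_train, lam)
        errors.append(np.mean((A_test @ w - y_test) ** 2))
    return float(np.mean(errors))

plain_mse = mean_test_mse(lam=0.0)
ridge_mse = mean_test_mse(lam=0.1)
print(f"degree 15, no penalty: mean test MSE = {plain_mse:.2f}")
print(f"degree 15, ridge 0.1 : mean test MSE = {ridge_mse:.2f}")
```

The model keeps all 16 coefficients in both runs. The penalty just discourages the large, noise-chasing coefficients, trading a little extra bias for a large drop in variance.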

Key Metrics

🎯 High Bias Model
Consistently wrong: misses the real pattern
  • Fast to train
  • Underfits the data

🌊 High Variance Model
Memorizes noise, so it fails on new data
  • Slow to train
  • Overfits the data

✅ Balanced Model
Captures the real pattern without memorizing noise
  • Moderate training time
  • Generalizes well

Quick check

A model performs poorly on both training and test data. What is the likely problem?