Overfitting & Underfitting

Goldilocks and the three models: too simple, too complex, just right

  • Underfitting: model too simple, misses patterns
  • Overfitting: model too complex, memorizes noise
  • Just right: generalizes well, balances bias & variance

Remember the story of Goldilocks? One porridge was too hot, one was too cold, and one was just right. Machine learning has the exact same problem, but with models instead of porridge.

Imagine you're studying for a history exam:

  • Student A reads the chapter titles and calls it a day. "Something happened in 1776... America, I think?" Way too shallow: they underfitted the material.
  • Student B memorizes the textbook word-for-word, including page numbers and typos. When the exam asks a slightly different question, they freeze. They overfitted: they memorized instead of understanding.
  • Student C understands the key themes, cause-and-effect relationships, and can apply them to new questions. Just right.

Your ML model needs to be Student C.

Underfitting: "I barely tried"

An underfitting model is too simple to capture the patterns in the data. It performs poorly on both training data and test data.

Think of fitting a straight line through data that clearly curves. The line doesn't match the training points, and it certainly won't match new points either.

Signs of underfitting:

  • Low training accuracy
  • Low test accuracy
  • The model is "too dumb" for the problem

Common causes:

  • Model is too simple (e.g., linear model for a nonlinear problem)
  • Not enough features
  • Too much regularization (we'll cover this later)
  • Not trained long enough
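
To make the straight-line-through-a-curve picture concrete, here is a minimal sketch. The quadratic data, the noise level, and the interleaved split are all assumptions chosen for illustration. It fits a line with `np.polyfit` and compares its error to the noise level in the data.

```python
import numpy as np

# Synthetic curved data (an assumption for illustration): y = x^2 + noise
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)
y = x**2 + rng.normal(0, 0.5, size=x.size)

# Interleaved split so train and test cover the same range
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))

# Degree-1 fit: a straight line through data that clearly curves
line = np.polyfit(x_train, y_train, deg=1)
line_train_mse = mse(y_train, np.polyval(line, x_train))
line_test_mse = mse(y_test, np.polyval(line, x_test))

# Variance of the injected noise: roughly the best error any model could reach here
noise_variance = 0.5**2

print(f"Line train MSE: {line_train_mse:.2f}")
print(f"Line test MSE:  {line_test_mse:.2f}")
print(f"Noise floor:    {noise_variance:.2f}")
```

Both errors land far above the noise floor, and they are similar to each other. That combination (bad on train, equally bad on test) is the signature of underfitting.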

Overfitting: "I memorized the textbook"

An overfitting model is too complex. It learns the training data perfectly, including the noise and random quirks that aren't real patterns. Then it bombs on new data because those quirks don't generalize.

Think of fitting a wild, squiggly curve that passes through every single training point. It looks perfect on paper, but it's capturing noise, not signal.

Signs of overfitting:

  • High training accuracy (often near perfect)
  • Much lower test accuracy
  • The gap between train and test scores is large

Common causes:

  • Model is too complex (too many parameters)
  • Not enough training data
  • Training for too long
  • No regularization

Seeing Overfit vs. Underfit in Action

import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# True pattern: y = 2x + noise
np.random.seed(42)
X = np.random.uniform(0, 10, 20).reshape(-1, 1)
y = 2 * X.ravel() + np.random.normal(0, 2, 20)

# Split: first 15 points for training, last 5 for testing
X_train, X_test = X[:15], X[15:]
y_train, y_test = y[:15], y[15:]

# Model 1: Too simple (constant, i.e. degree 0)
simple = DummyRegressor(strategy='mean')
simple.fit(X_train, y_train)
print("=== Underfitting (just predicts the mean) ===")
print(f"Train error: {mean_squared_error(y_train, simple.predict(X_train)):.1f}")
print(f"Test error: {mean_squared_error(y_test, simple.predict(X_test)):.1f}")

# Model 2: Just right (degree 1, linear)
right = LinearRegression()
right.fit(X_train, y_train)
print("\n=== Just Right (linear) ===")
print(f"Train error: {mean_squared_error(y_train, right.predict(X_train)):.1f}")
print(f"Test error: {mean_squared_error(y_test, right.predict(X_test)):.1f}")

# Model 3: Too complex (degree-15 polynomial, more coefficients than training points)
poly = PolynomialFeatures(degree=15)
X_train_poly = poly.fit_transform(X_train)
X_test_poly = poly.transform(X_test)
complex_model = LinearRegression()
complex_model.fit(X_train_poly, y_train)
print("\n=== Overfitting (degree-15 polynomial) ===")
print(f"Train error: {mean_squared_error(y_train, complex_model.predict(X_train_poly)):.1f}")
print(f"Test error: {mean_squared_error(y_test, complex_model.predict(X_test_poly)):.1f}")
Output
=== Underfitting (just predicts the mean) ===
Train error: 36.2
Test error:  32.8

=== Just Right (linear) ===
Train error: 3.4
Test error:  4.1

=== Overfitting (degree-15 polynomial) ===
Train error: 0.0
Test error:  9847.3

Key Metrics

🧊 Underfitting
Model is too simple: it can't even learn the training data
Low train score, low test score (high bias)

🔥 Overfitting
Model memorized training data, fails on new data
High train score, low test score (high variance)

✅ Good Fit
Model learned real patterns that generalize
High train score, high test score (balanced)

The Gap (key diagnostic)
Gap = train score - test score. Small gap = good. Large gap = overfitting.
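
The cards above can be turned into a rough decision rule. The sketch below is one hypothetical way to do it: the function name `diagnose` and the thresholds `good_enough` and `max_gap` are made up for illustration, and sensible values depend on the problem.

```python
def diagnose(train_score, test_score, good_enough=0.8, max_gap=0.1):
    """Label a fit from two accuracy-style scores in [0, 1] (higher is better).

    good_enough and max_gap are hypothetical thresholds; tune them per problem.
    """
    gap = train_score - test_score
    if train_score < good_enough:
        return "underfitting"  # can't even learn the training data
    if gap > max_gap:
        return "overfitting"   # learned the training set, fails on new data
    return "good fit"

print(diagnose(0.61, 0.58))  # underfitting
print(diagnose(0.98, 0.70))  # overfitting
print(diagnose(0.91, 0.88))  # good fit
```

The training score is checked first because a model that fails on its own training data is underfitting no matter how small the gap is.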

How to fix underfitting

  • Use a more complex model: switch from linear to polynomial, or from a shallow tree to a deeper one
  • Add more features: give the model more information to work with
  • Reduce regularization: let the model be more flexible
  • Train longer: the model might not have converged yet
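
As a small illustration of the "add more features" fix, the sketch below fits the same least-squares model twice on curved data: once with only `x`, and once with an extra `x**2` column. The data-generating formula and the helper names `design` and `fit_and_score` are assumptions for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 200)
y = x**2 + rng.normal(0, 0.5, size=x.size)  # assumed curved ground truth
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

def design(x, with_square):
    """Feature matrix: intercept and x, plus x^2 if requested."""
    cols = [np.ones_like(x), x]
    if with_square:
        cols.append(x**2)
    return np.column_stack(cols)

def fit_and_score(features_train, features_test):
    """Ordinary least squares on the training features; returns test MSE."""
    weights, *_ = np.linalg.lstsq(features_train, y_train, rcond=None)
    return float(np.mean((y_test - features_test @ weights) ** 2))

mse_linear = fit_and_score(design(x_train, False), design(x_test, False))
mse_with_square = fit_and_score(design(x_train, True), design(x_test, True))

print(f"Test MSE with x only:     {mse_linear:.2f}")
print(f"Test MSE with x and x^2:  {mse_with_square:.2f}")
```

The learning algorithm is identical in both runs. Only the information handed to the model changed, and that is enough to remove the underfit in this setup.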

How to fix overfitting

  • Get more training data: it's harder to memorize 100,000 examples than 100
  • Use a simpler model: fewer parameters means less room for memorization
  • Add regularization: penalize overly complex models (L1, L2, dropout)
  • Early stopping: stop training before the model starts memorizing
  • Cross-validation: evaluate on multiple train/test splits for a more robust estimate, so overfitting shows up reliably and you can pick a simpler setting
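
To show what "add regularization" does mechanically, here is a minimal sketch of L2 (ridge) regression written directly in NumPy. The data, the degree-9 polynomial features, and the three penalty strengths are assumptions chosen for illustration. For simplicity the intercept is penalized too, which a production implementation would usually avoid.

```python
import numpy as np

rng = np.random.default_rng(2)
x_train = np.sort(rng.uniform(0, 1, 12))
y_train = 2 * x_train + rng.normal(0, 0.2, size=12)  # assumed linear truth
x_test = np.linspace(0, 1, 50)
y_test = 2 * x_test + rng.normal(0, 0.2, size=50)

def poly_features(x, degree=9):
    """Columns 1, x, x^2, ..., x^degree (deliberately too flexible)."""
    return np.vander(x, degree + 1, increasing=True)

def ridge_fit(features, targets, lam):
    """Closed-form L2-regularized least squares: solve (X^T X + lam*I) w = X^T y."""
    n_features = features.shape[1]
    gram = features.T @ features + lam * np.eye(n_features)
    return np.linalg.solve(gram, features.T @ targets)

X_train, X_test = poly_features(x_train), poly_features(x_test)

coef_norms, test_errors = {}, {}
for lam in (1e-6, 1e-2, 1.0):
    weights = ridge_fit(X_train, y_train, lam)
    coef_norms[lam] = float(np.linalg.norm(weights))
    test_errors[lam] = float(np.mean((y_test - X_test @ weights) ** 2))
    print(f"lambda={lam:g}  ||w||={coef_norms[lam]:.2f}  test MSE={test_errors[lam]:.3f}")
```

A larger penalty always shrinks the coefficient vector, which is what takes the wiggle out of the curve. Whether test error improves depends on the data, so compare the printed test MSE values: too little penalty can overfit, and too much can push the model back toward underfitting.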

The bias-variance tradeoff

This tension has a formal name: the bias-variance tradeoff.

  • Bias = how much the model's assumptions cause it to miss patterns (underfitting)
  • Variance = how much the model's predictions change when trained on different data (overfitting)

You want both to be low, but reducing one tends to increase the other. The sweet spot is in the middle.
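
Bias and variance can be estimated by simulation: retrain the same kind of model on many noisy copies of a dataset, then look at how far the average prediction is from the truth (bias) and how much individual predictions scatter around that average (variance). The sketch below assumes a made-up ground truth `sin(3x)`, a fixed set of inputs, and polynomial models of degree 1, 3, and 9.

```python
import numpy as np

rng = np.random.default_rng(3)

def true_fn(x):
    return np.sin(3 * x)  # assumed ground truth for the simulation

x_train = np.linspace(-1, 1, 20)     # fixed inputs; only the noise changes per trial
x_eval = np.linspace(-0.9, 0.9, 50)  # points where predictions are compared
n_trials, noise_sd = 200, 0.3

def bias_and_variance(degree):
    """Estimate average squared bias and variance of a polynomial fit."""
    preds = np.empty((n_trials, x_eval.size))
    for t in range(n_trials):
        y_train = true_fn(x_train) + rng.normal(0, noise_sd, size=x_train.size)
        coeffs = np.polyfit(x_train, y_train, deg=degree)
        preds[t] = np.polyval(coeffs, x_eval)
    mean_pred = preds.mean(axis=0)
    bias_sq = float(np.mean((mean_pred - true_fn(x_eval)) ** 2))
    variance = float(np.mean(preds.var(axis=0)))
    return bias_sq, variance

results = {d: bias_and_variance(d) for d in (1, 3, 9)}
for d, (bias_sq, variance) in results.items():
    print(f"degree {d}: bias^2={bias_sq:.4f}  variance={variance:.4f}")
```

In this setup the straight line has the largest bias and the smallest variance, and the degree-9 polynomial shows the reverse.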

Note: A practical rule of thumb: if your training score is much higher than your test score, you're overfitting. If both scores are low, you're underfitting. Start simple, increase complexity gradually, and stop when the score on held-out data starts dropping, even if the training score keeps going up. Ideally that held-out data is a validation split kept separate from the final test set, so the test score remains an honest estimate.
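
The "start simple, increase complexity gradually" advice can be sketched as a loop over polynomial degrees. The data and the 40/20 split are assumptions for illustration. The held-out split is called a validation set here so that a separate test set could stay untouched for the final evaluation.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + rng.normal(0, 0.2, size=x.size)  # assumed ground truth

# Hold out a validation split used only for choosing complexity
x_train, y_train = x[:40], y[:40]
x_val, y_val = x[40:], y[40:]

def mse(y_true, y_pred):
    """Mean squared error (lower is better)."""
    return float(np.mean((y_true - y_pred) ** 2))

degrees = list(range(1, 11))
train_mse, val_mse = [], []
for d in degrees:
    coeffs = np.polyfit(x_train, y_train, deg=d)
    train_mse.append(mse(y_train, np.polyval(coeffs, x_train)))
    val_mse.append(mse(y_val, np.polyval(coeffs, x_val)))
    print(f"degree {d:2d}: train MSE={train_mse[-1]:.3f}  val MSE={val_mse[-1]:.3f}")

# Pick the complexity with the lowest validation error
best_degree = degrees[int(np.argmin(val_mse))]
print(f"Chosen degree: {best_degree}")
```

Training error can only go down as the degree grows, so it is useless for choosing complexity. The validation error is the number to watch, since it typically stops improving and then worsens once the model starts fitting noise.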

Quick check

Your model gets 99% accuracy on training data but 52% on test data. What's happening?