Logistic Regression

Imagine you're a bouncer at an exclusive club. Every person who walks up has certain features — age, dress code score, VIP status. Your job isn't to rate them on a scale of 1 to 100. It's binary: you're in, or you're out.

But here's the thing — you don't just flip a coin. You have a gut feeling that's actually pretty mathematical. You weigh each factor, add them up, and if the total crosses a threshold... the velvet rope opens.

That's logistic regression. Despite the name, you don't use it to predict a continuous number the way linear regression does. It's a classification algorithm: it estimates the probability that something belongs to a category, then picks the category.

From straight line to S-curve

Remember linear regression? It gives you a number: y = mx + b. But for yes/no problems, we need a probability — a number between 0 and 1.

Enter the sigmoid function. It takes any number (from negative infinity to positive infinity) and squishes it into the range (0, 1). The formula is:

sigmoid(z) = 1 / (1 + e^(-z))

Think of it like a dimmer switch that's been modified. No matter how far you turn the knob in either direction, the light never goes below 0% or above 100%. Turn it way to the left? It approaches 0 but never quite reaches it. Way to the right? Approaches 1.
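To make the squashing concrete, here is a small sketch that feeds a few arbitrary inputs through the sigmoid. The sample values in zs are picked only for illustration.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1 / (1 + np.exp(-z))

# Arbitrary sample inputs, from strongly negative to strongly positive
zs = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
probs = sigmoid(zs)

for z, p in zip(zs, probs):
    print(f"z = {z:6.1f} -> sigmoid(z) = {p:.5f}")
```

An input of 0 lands exactly on 0.5, and the outputs creep toward 0 and 1 at the extremes without reaching them.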

If the output is above 0.5, we predict "yes" (class 1). Otherwise, we predict "no" (class 0). The 0.5 cutoff is a convention, not a requirement.

The decision boundary

Here's where it gets cool. Logistic regression draws an invisible line (or plane, or hyperplane in higher dimensions) through your feature space. On one side: class 0. On the other side: class 1.

Imagine scattering red and blue marbles on a table. Logistic regression finds the best straight line that separates reds from blues. Points near the line? The model is less confident. Points far from the line? Very confident.
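Here is a rough sketch of the marble picture in code. The two-feature weights and the four points are hypothetical, chosen so the boundary is the line x1 = x2. It shows that "probability above 0.5" is the same test as "weighted sum above 0", and that confidence grows with distance from the line.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical model: the boundary is where w . x + b = 0, i.e. x1 = x2
w = np.array([1.0, -1.0])
b = 0.0

points = np.array([[0.1, 0.0],   # just on the class-1 side
                   [3.0, 0.0],   # far on the class-1 side
                   [0.0, 0.1],   # just on the class-0 side
                   [0.0, 3.0]])  # far on the class-0 side

z = points @ w + b                       # weighted sum for each point
signed_distance = z / np.linalg.norm(w)  # distance from the boundary line
probs = sigmoid(z)

for pt, d, p in zip(points, signed_distance, probs):
    predicted = 1 if p > 0.5 else 0
    print(f"point {pt}: distance {d:+.2f}, P(class 1) = {p:.3f}, class {predicted}")
```

Points that hug the line come out close to 0.5, while points far from it are pushed toward 0 or 1.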

The model doesn't just say "spam" or "not spam" — it says "I'm 94% sure this is spam." That probability is incredibly useful.
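One way that probability can be put to work, sketched under assumptions: the probabilities below are made up, and classify is a hypothetical helper, not part of any library. Raising the threshold makes the filter flag only the emails it is most sure about.

```python
import numpy as np

# Made-up spam probabilities, standing in for the output of a trained model
probs = np.array([0.05, 0.40, 0.55, 0.94])

def classify(probs, threshold=0.5):
    """Return 1 (spam) where the probability exceeds the threshold, else 0."""
    return (probs > threshold).astype(int)

print(classify(probs))       # default cutoff: [0 0 1 1]
print(classify(probs, 0.9))  # stricter cutoff: [0 0 0 1]
```

With the stricter cutoff, the borderline 0.55 email is let through and only the 0.94 email is flagged.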

Logistic Regression: Spam or Not?

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Features: [num_exclamation_marks, contains_free, num_links]
X = np.array([[1, 0, 1], [5, 1, 8], [0, 0, 2],
              [8, 1, 10], [1, 0, 0], [6, 1, 7]])
# True labels (0=not spam, 1=spam), listed for comparison; the loop below does not use them
y = np.array([0, 1, 0, 1, 0, 1])

# Pretrained weights (normally learned via gradient descent)
weights = np.array([0.3, 1.5, 0.2])
bias = -1.5

# Predict
for i in range(len(X)):
    z = np.dot(X[i], weights) + bias
    prob = sigmoid(z)
    label = "SPAM" if prob > 0.5 else "not spam"
    print(f"Email {i+1}: {prob:.2f} → {label}")

Output

Email 1: 0.27 → not spam
Email 2: 0.96 → SPAM
Email 3: 0.25 → not spam
Email 4: 0.99 → SPAM
Email 5: 0.23 → not spam
Email 6: 0.96 → SPAM
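The weights in the example above are hard-coded. As a minimal sketch of how gradient descent might learn weights from the same six emails: the learning rate and iteration count here are assumptions tuned for this toy dataset, and the learned values will not match the hard-coded ones.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def log_loss(p, y):
    """Average cross-entropy; clipping avoids log(0)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Same toy data: [num_exclamation_marks, contains_free, num_links]
X = np.array([[1, 0, 1], [5, 1, 8], [0, 0, 2],
              [8, 1, 10], [1, 0, 0], [6, 1, 7]], dtype=float)
y = np.array([0, 1, 0, 1, 0, 1], dtype=float)

weights = np.zeros(X.shape[1])  # start with no opinion about any feature
bias = 0.0
learning_rate = 0.05   # assumed value, small enough to be stable here
n_iterations = 5000    # assumed value

initial_loss = log_loss(sigmoid(X @ weights + bias), y)

for _ in range(n_iterations):
    p = sigmoid(X @ weights + bias)   # current predictions
    error = p - y                     # how far off each prediction is
    weights -= learning_rate * (X.T @ error) / len(y)  # gradient step for weights
    bias -= learning_rate * np.mean(error)             # gradient step for bias

final_probs = sigmoid(X @ weights + bias)
final_loss = log_loss(final_probs, y)
predictions = (final_probs > 0.5).astype(int)

print("learned weights:", weights, "bias:", bias)
print("loss before:", initial_loss, "after:", final_loss)
print("predictions:", predictions)
```

Each pass touches every example and every feature once, which is where the O(n * d * iterations) training cost comes from.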

Key Metrics

Training: iterative, O(n * d * iterations) for n examples and d features. Typically fit with gradient descent, which also works for linear regression.

Prediction: very fast, O(d) per example. One dot product plus one sigmoid.

Output: a probability between 0 and 1. Not just a label; you get a confidence score.

Decision boundary: linear (a straight line or plane). Can't capture curved boundaries without feature engineering.
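To illustrate the feature-engineering point, here is a small sketch in which squared features let the same linear machinery draw a circular boundary. The points, the add_squares helper, and the hand-picked weights are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def add_squares(X):
    """Append x1^2 and x2^2 to the raw features [x1, x2]."""
    return np.hstack([X, X ** 2])

# Two points inside the unit circle, two outside
points = np.array([[0.0, 0.0], [0.5, 0.5], [2.0, 0.0], [-1.5, 1.5]])

# Hand-picked weights so that z = 1 - x1^2 - x2^2.
# The boundary z = 0 is the unit circle: linear in the new features,
# curved in the original ones.
weights = np.array([0.0, 0.0, -1.0, -1.0])
bias = 1.0

probs = sigmoid(add_squares(points) @ weights + bias)
labels = (probs > 0.5).astype(int)
print(labels)  # [1 1 0 0]
```

The model itself is unchanged; only the inputs it sees are richer.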
