Logistic Regression
Imagine you're a bouncer at an exclusive club. Every person who walks up has certain features: age, dress code score, VIP status. Your job isn't to rate them on a scale of 1 to 100. It's binary: you're in, or you're out.
But here's the thing: you don't just flip a coin. You have a gut feeling that's actually pretty mathematical. You weigh each factor, add them up, and if the total crosses a threshold... the velvet rope opens.
That's logistic regression. Despite the name, it isn't used for regression in the everyday sense (predicting a number). It's a classification algorithm: it predicts which category something belongs to.
From straight line to S-curve
Remember linear regression? It gives you a number: y = mx + b. But for yes/no problems, we need a probability: a number between 0 and 1.
Enter the sigmoid function. It takes any number (from negative infinity to positive infinity) and squishes it into the range (0, 1). The formula is:
sigmoid(z) = 1 / (1 + e^(-z))
Think of it like a dimmer switch that's been modified. No matter how far you turn the knob in either direction, the light never goes below 0% or above 100%. Turn it way to the left? It approaches 0 but never quite reaches it. Way to the right? Approaches 1.
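To see the squishing in action, here is a minimal sketch of the sigmoid in plain Python. The function name and the sample inputs are just illustrative choices. The branch on the sign of z is a common trick (an addition here, not part of the formula above) that keeps math.exp from overflowing on very large inputs.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real number into the range 0 to 1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Turn the "knob" from far left to far right and watch the output.
for z in (-10, -2, 0, 2, 10):
    print(f"sigmoid({z:>3}) = {sigmoid(z):.5f}")
```

One caveat: the math says the curve never reaches 0 or 1, but floating-point numbers run out of precision, so for extreme inputs the computed value can round to exactly 0.0 or 1.0.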
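If the output is above 0.5, we predict "yes" (class 1). Below 0.5? "No" (class 0). The 0.5 cutoff is the usual default; an output of exactly 0.5 is a tie, and which way it goes is a matter of convention.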
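Putting the pieces together, a rough sketch of the bouncer's decision might look like this. The weights and bias are hypothetical, hand-picked numbers (a real model would learn them from data), and the function names are made up for this example. Sending a tie at exactly 0.5 to class 1 is also just a convention chosen here.

```python
import math

def predict_proba(features, weights, bias):
    """Probability of class 1: sigmoid of the weighted sum plus bias."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias, threshold=0.5):
    """Return 1 ("yes") if the probability reaches the threshold, else 0 ("no")."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical, hand-picked parameters (NOT learned from data).
# Features: age, dress code score (0-10), VIP status (0 or 1).
weights = [0.05, 0.4, 2.0]
bias = -4.0

guest = [30, 7, 0]  # 30 years old, dress score 7, not a VIP
p = predict_proba(guest, weights, bias)
print(f"P(admit) = {p:.2f}, decision = {predict(guest, weights, bias)}")
```

For this guest the weighted sum is 0.05*30 + 0.4*7 + 2.0*0 - 4.0 = 0.3, which is above zero, so the probability lands above 0.5 and the rope opens.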
The decision boundary
Here's where it gets cool. Logistic regression draws an invisible line (or plane, or hyperplane in higher dimensions) through your feature space. On one side: class 0. On the other side: class 1.
Imagine scattering red and blue marbles on a table. Logistic regression finds the straight line that separates reds from blues as well as it can (real data rarely splits perfectly). Points near the line? The model is less confident. Points far from the line? Very confident.
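The link between distance and confidence can be sketched in a few lines. This assumes a hypothetical two-feature model whose boundary is the line x1 + x2 = 4; the weights, bias, sample points, and function names are all invented for illustration. The boundary is exactly where the weighted sum is zero, which is where the sigmoid outputs 0.5.

```python
import math

def score(point, weights, bias):
    """Weighted sum plus bias; zero exactly on the decision boundary."""
    return sum(w * x for w, x in zip(weights, point)) + bias

def signed_distance(point, weights, bias):
    """Signed distance from the point to the boundary (positive on the class 1 side)."""
    norm = math.sqrt(sum(w * w for w in weights))
    return score(point, weights, bias) / norm

def probability(point, weights, bias):
    """Model's probability that the point belongs to class 1."""
    return 1.0 / (1.0 + math.exp(-score(point, weights, bias)))

# Hypothetical model: the boundary is the line x1 + x2 = 4.
weights = [1.0, 1.0]
bias = -4.0

for point in ([2.0, 2.0], [2.5, 2.0], [4.0, 4.0], [0.0, 0.0]):
    d = signed_distance(point, weights, bias)
    p = probability(point, weights, bias)
    print(f"{point}: distance = {d:+.2f}, P(class 1) = {p:.3f}")
```

The point sitting on the line gets exactly 0.5, a total shrug. The further a point moves to either side, the closer its probability gets to 0 or 1.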
The model doesn't just say "spam" or "not spam"; it says "I'm 94% sure this is spam." That probability is incredibly useful.