Perceptron

One neuron: multiply, add, decide

training: O(n * d * epochs), iterating through the data multiple times (n training examples, d input features)
prediction: O(d), one dot product plus a threshold
limitation: can only learn linearly separable patterns

Imagine a simple voting committee with three members. Each member has a different amount of influence (weight). Member A's vote counts double, Member B's vote counts triple, and Member C's vote counts once. They each cast a vote (1 for yes, 0 for no), you multiply each vote by that member's weight, add it all up, and if the total exceeds a threshold, the proposal passes.

That's a perceptron. It's the simplest possible neural network: just a single neuron. It takes inputs, multiplies each by a weight, sums them up, and if the sum crosses a threshold, it fires (outputs 1). Otherwise, it stays quiet (outputs 0).
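
Here is a rough sketch of the committee analogy in Python. The weights 2, 3, and 1 come from the description above; the threshold of 3 and the name `committee_passes` are assumptions picked purely for illustration.

```python
# Influence of members A, B, and C, as described in the analogy
WEIGHTS = [2, 3, 1]
# Assumed threshold (the text does not give one): the total must exceed 3
THRESHOLD = 3

def committee_passes(votes, weights=WEIGHTS, threshold=THRESHOLD):
    """Return 1 if the weighted vote total exceeds the threshold, else 0."""
    total = sum(v * w for v, w in zip(votes, weights))
    return 1 if total > threshold else 0

# A and C vote yes, B votes no: total = 2 + 0 + 1 = 3, which is not above 3
print(committee_passes([1, 0, 1]))  # 0
# B and C vote yes, A votes no: total = 0 + 3 + 1 = 4, which is above 3
print(committee_passes([0, 1, 1]))  # 1
```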

The math: dead simple

Here's what a perceptron computes:

  1. Multiply each input by its weight: x1*w1, x2*w2, ...
  2. Add them all up (plus a bias): sum = x1*w1 + x2*w2 + ... + b
  3. Decide: if sum > 0, output 1. Otherwise, output 0.

That's it. Three steps. Multiply, add, decide.
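
The three steps above can be sketched as one small function. The name `perceptron_predict` and the example weights are assumptions for illustration, not values from any trained model.

```python
def perceptron_predict(inputs, weights, bias):
    """Multiply, add, decide: return 1 if the weighted sum plus bias is positive."""
    # Steps 1 and 2: multiply each input by its weight, add them up, add the bias
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Step 3: decide
    return 1 if total > 0 else 0

# Assumed example values: 1.0*0.5 + 0.0*(-0.4) - 0.3 = 0.2, which is above 0
print(perceptron_predict([1.0, 0.0], [0.5, -0.4], -0.3))  # 1
# 0.0*0.5 + 1.0*(-0.4) - 0.3 = -0.7, which is not above 0
print(perceptron_predict([0.0, 1.0], [0.5, -0.4], -0.3))  # 0
```

Folding the threshold into a bias term, as step 2 does, means the decision is always a comparison against 0.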

How does it learn?

The perceptron learning algorithm is beautifully intuitive:

  1. Start with random weights
  2. Feed in a training example
  3. If the prediction is correct, do nothing
  4. If it predicted 0 but should be 1, increase the bias and the weights of the inputs that were on (make it more likely to fire next time)
  5. If it predicted 1 but should be 0, decrease the bias and those same weights (make it less likely to fire)
  6. Repeat for all training examples, multiple times

It's like a teacher nudging a student: "You should have said yes β€” pay more attention to these features next time."
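
To make a single nudge concrete, here is a minimal sketch of one learning step. The function name `perceptron_update`, the starting weights, and the learning rate `lr` of 0.1 are assumptions for illustration.

```python
def perceptron_update(weights, bias, x, target, lr=0.1):
    """Apply one perceptron learning step and return the new (weights, bias)."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    pred = 1 if total > 0 else 0
    # +1: should have fired, -1: should have stayed quiet, 0: already correct
    error = target - pred
    new_weights = [w + lr * error * xi for w, xi in zip(weights, x)]
    new_bias = bias + lr * error
    return new_weights, new_bias

# Assumed starting point: 0.2 + 0.1 - 0.5 is negative, so the neuron stays
# quiet on [1, 1], but the target is 1. Both weights and the bias go up by lr.
w, b = perceptron_update([0.2, 0.1], -0.5, [1, 1], target=1)
print(w, b)
```

Because the change to each weight is scaled by its input, a weight whose input was 0 is left alone: it played no part in the mistake.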

Perceptron Learning: AND Gate

import random

def perceptron_train(X, y, epochs=10, lr=0.1):
    # Start from random weights and a random bias
    weights = [random.uniform(-1, 1) for _ in range(len(X[0]))]
    bias = random.uniform(-1, 1)
    for epoch in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            # Forward pass
            total = sum(w * x for w, x in zip(weights, xi)) + bias
            pred = 1 if total > 0 else 0
            # Update if wrong
            error = yi - pred
            if error != 0:
                errors += 1
                weights = [w + lr * error * x
                           for w, x in zip(weights, xi)]
                bias += lr * error
        # A full pass with no mistakes means training is done
        if errors == 0:
            print(f"Converged at epoch {epoch + 1}!")
            break
    return weights, bias

# AND gate: both inputs must be 1
X = [[0,0], [0,1], [1,0], [1,1]]
y = [0, 0, 0, 1]
random.seed(42)
# From this random start, 10 epochs are not enough, so allow up to 100
w, b = perceptron_train(X, y, epochs=100)
print(f"Weights: [{w[0]:.2f}, {w[1]:.2f}], Bias: {b:.2f}")
for xi in X:
    total = sum(wi * xi_i for wi, xi_i in zip(w, xi)) + b
    print(f"{xi} → {1 if total > 0 else 0}")

Output
Converged at epoch 14!
Weights: [0.28, 0.25], Bias: -0.45
[0, 0] → 0
[0, 1] → 0
[1, 0] → 0
[1, 1] → 1

The XOR problem: the perceptron's fatal flaw

A perceptron can learn AND, OR, and NOT gates perfectly. But it cannot learn XOR (exclusive or: true when inputs differ).
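
As a quick sanity check that a single neuron can represent these three gates, here is a sketch with hand-picked weights. The values are assumptions (many other choices work just as well) and were not produced by training.

```python
def fire(inputs, weights, bias):
    """Perceptron output: 1 if the weighted sum plus bias is positive, else 0."""
    return 1 if sum(x * w for x, w in zip(inputs, weights)) + bias > 0 else 0

# Hand-picked (weights, bias) pairs, chosen only for illustration
AND_GATE = ([1, 1], -1.5)   # fires only when both inputs are on
OR_GATE = ([1, 1], -0.5)    # fires when at least one input is on
NOT_GATE = ([-1], 0.5)      # fires only when its single input is off

pairs = [[0, 0], [0, 1], [1, 0], [1, 1]]
print("AND:", [fire(p, *AND_GATE) for p in pairs])     # [0, 0, 0, 1]
print("OR: ", [fire(p, *OR_GATE) for p in pairs])      # [0, 1, 1, 1]
print("NOT:", [fire([v], *NOT_GATE) for v in (0, 1)])  # [1, 0]
```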

Why? Because XOR isn't linearly separable. You can't draw a single straight line to separate the "on" cases from the "off" cases. Try it: plot (0,0)=0, (0,1)=1, (1,0)=1, (1,1)=0. No single line works.
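
One way to convince yourself is a brute-force search. The sketch below tries every combination of two weights and a bias on an assumed grid (from -2.0 to 2.0 in steps of 0.25) and counts how many reproduce a given truth table. The name `count_solutions` and the grid are illustrative choices.

```python
from itertools import product

POINTS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def count_solutions(targets, grid):
    """Count (w1, w2, b) settings on the grid whose outputs match all four targets."""
    count = 0
    for w1, w2, b in product(grid, repeat=3):
        outputs = [1 if x1 * w1 + x2 * w2 + b > 0 else 0 for x1, x2 in POINTS]
        if outputs == targets:
            count += 1
    return count

# Assumed search grid: -2.0 to 2.0 in steps of 0.25
GRID = [i / 4 for i in range(-8, 9)]

print("AND solutions:", count_solutions([0, 0, 0, 1], GRID))  # more than zero
print("XOR solutions:", count_solutions([0, 1, 1, 0], GRID))  # 0
```

A grid search is not a proof, but the argument behind it is short. XOR needs b <= 0, w1 + b > 0, w2 + b > 0, and w1 + w2 + b <= 0. Adding the two middle inequalities gives w1 + w2 + 2b > 0, so w1 + w2 + b > -b >= 0, which contradicts the last one.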

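This limitation, analyzed by Minsky and Papert in their 1969 book Perceptrons, is widely cited as one of the factors behind the decline in neural network research that fed into the first "AI winter" of the 1970s. The solution? Stack multiple perceptrons into layers. That gives you a multi-layer neural network, which can solve XOR and much more.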
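
Here is a minimal sketch of that stacking idea, using the identity XOR = (x1 OR x2) AND NOT (x1 AND x2). The weights are set by hand rather than learned, and the names `neuron` and `xor_network` are assumptions for illustration.

```python
def neuron(inputs, weights, bias):
    """A single perceptron: 1 if the weighted sum plus bias is positive, else 0."""
    return 1 if sum(x * w for x, w in zip(inputs, weights)) + bias > 0 else 0

def xor_network(x1, x2):
    """XOR built from three perceptrons with hand-picked weights."""
    # Hidden layer: one neuron computes OR, the other computes NAND
    h_or = neuron([x1, x2], [1, 1], -0.5)
    h_nand = neuron([x1, x2], [-1, -1], 1.5)
    # Output layer: AND of the two hidden outputs
    return neuron([h_or, h_nand], [1, 1], -1.5)

# Prints 0, 1, 1, 0 for the four input pairs, in this order
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((a, b), "->", xor_network(a, b))
```

Each hidden neuron draws one straight line, and the output neuron combines the two, which is what a single perceptron cannot do.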

Note: The perceptron is where deep learning begins. Every neuron in a modern neural network is essentially a perceptron with a fancier activation function. Understand the perceptron, and you understand the building block of modern neural networks.

Key Metrics

Training: Fast, O(n * d * epochs). Guaranteed to converge if the data is linearly separable.
Prediction: Very fast, O(d). One dot product and a comparison.
Expressiveness: Limited, linear boundaries only. Cannot solve XOR or any other non-linearly separable problem.
Historical impact: Huge, the foundation of neural nets. Every modern neural network neuron is a fancy perceptron.

Quick check

What are the three steps a perceptron performs?