Neural Network Basics

Layers of neurons wired together
forward pass: O(number of weights), proportional to the total weight count
parameters: can range from thousands to billions
expressiveness: can approximate any continuous function

Imagine a factory assembly line. Raw materials (steel, rubber, glass) enter at one end. At the first station, workers cut and shape them. At the second station, others assemble pieces together. At the third, someone paints and polishes. Out the other end rolls a finished car.

A neural network works the same way. Raw data enters the input layer. It flows through one or more hidden layers, where each layer transforms the data in some useful way. Finally, the output layer produces the prediction.

Each "worker" at each station? That's a neuron — a tiny perceptron that multiplies, adds, and applies an activation function.

The three types of layers

Input layer

This is just your raw data. If you're classifying a 28x28 pixel image, your input layer has 784 neurons (one per pixel). They don't compute anything — they just pass the data forward.
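As a small illustration of the 784-neuron figure above, the sketch below flattens a 28x28 grid into a single input vector. The `image` array is a made-up stand-in for real pixel data, not something from a dataset.

```python
import numpy as np

# Hypothetical stand-in for a 28x28 grayscale image (values in [0, 1]).
image = np.linspace(0.0, 1.0, 28 * 28).reshape(28, 28)

# Flatten the 2-D grid into a 1-D vector: one input value per pixel.
input_layer = image.reshape(-1)

print(input_layer.shape)  # (784,)
```

The input layer does no arithmetic here; it only reshapes the data so the first hidden layer can consume it.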

Hidden layers

This is where the magic happens. Each neuron in a hidden layer:

  1. Receives values from every neuron in the previous layer
  2. Multiplies each by a weight
  3. Sums the products and adds a bias
  4. Passes the result through an activation function
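The four steps above can be sketched for a single neuron as follows. This is a minimal illustration, assuming a sigmoid activation; the input values, weights, and bias are arbitrary numbers chosen for the example.

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def neuron(inputs, weights, bias):
    weighted = inputs * weights      # step 2: multiply each input by its weight
    total = weighted.sum() + bias    # step 3: sum the products and add the bias
    return sigmoid(total)            # step 4: apply the activation function

# Step 1: values received from the previous layer (hypothetical numbers).
inputs = np.array([0.5, 0.8, 0.1])
weights = np.array([0.4, -0.6, 0.2])
bias = 0.1

activation = neuron(inputs, weights, bias)
print(f"{activation:.3f}")  # 0.460
```

A full hidden layer repeats this for every neuron, which is why it is usually written as one matrix multiplication instead of a loop.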

The first hidden layer might learn to detect edges. The second might combine edges into shapes. The third might recognize those shapes as "cat ear" or "dog nose." Each layer builds on the previous one, learning increasingly abstract features.

Output layer

The final layer gives you the answer. For classification, you might have one neuron per class (cat, dog, bird), and the one that fires strongest is the prediction.
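Picking the neuron that "fires strongest" amounts to taking the index of the largest output value. The sketch below assumes three classes and uses made-up activation values.

```python
import numpy as np

class_names = ["cat", "dog", "bird"]

# Hypothetical activations of the three output neurons.
output_activations = np.array([0.12, 0.81, 0.33])

# The prediction is the class whose neuron has the largest activation.
predicted_index = int(np.argmax(output_activations))
prediction = class_names[predicted_index]

print(prediction)  # dog
```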

Why "deep" learning?

A network with two or more hidden layers is called a "deep" neural network. That's literally it — "deep" just means "has more layers."

Why do more layers help? A single hidden layer has to capture every pattern in one step. Stacking layers lets the network compose simple patterns into complex ones:

  • Layer 1: Detects edges (horizontal, vertical, diagonal lines)
  • Layer 2: Combines edges into textures and shapes (circles, corners)
  • Layer 3: Combines shapes into parts (eyes, noses, wheels)
  • Layer 4: Combines parts into objects (faces, cars, cats)

Each layer is like a level of abstraction. The deeper you go, the more sophisticated the features become.
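Stacking layers can be sketched as a loop in which each layer's output becomes the next layer's input. The layer sizes, the random initialization, and the helper names `init_layers` and `forward_deep` are assumptions made for this illustration. An untrained network like this one will not actually detect edges or faces; the sketch only shows how data flows through the stack.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_layers(sizes, seed=0):
    # One (weights, biases) pair per connection between consecutive layers.
    rng = np.random.default_rng(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        layers.append((W, b))
    return layers

def forward_deep(x, layers):
    # Keep every intermediate activation so each level can be inspected.
    activations = [x]
    for W, b in layers:
        x = sigmoid(x @ W + b)
        activations.append(x)
    return activations

# Hypothetical sizes: 784 inputs, three hidden layers, 3 output classes.
sizes = [784, 64, 32, 16, 3]
layers = init_layers(sizes)
activations = forward_deep(np.zeros(784), layers)

print([a.shape[0] for a in activations])  # [784, 64, 32, 16, 3]
```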

A Tiny Neural Network from Scratch

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Network: 2 inputs → 2 hidden neurons → 1 output
np.random.seed(42)
W1 = np.random.randn(2, 2)  # input→hidden weights
b1 = np.zeros(2)            # hidden biases
W2 = np.random.randn(2, 1)  # hidden→output weights
b2 = np.zeros(1)            # output bias

def forward(x):
    # Hidden layer
    hidden = sigmoid(np.dot(x, W1) + b1)
    # Output layer
    output = sigmoid(np.dot(hidden, W2) + b2)
    return output, hidden

# Feed an input through the network
x = np.array([0.5, 0.8])
output, hidden = forward(x)
print(f"Input: {x}")
print(f"Hidden: [{hidden[0]:.3f}, {hidden[1]:.3f}]")
print(f"Output: {output[0]:.3f}")
Output
Input: [0.5 0.8]
Hidden: [0.683, 0.759]
Output: 0.416
Note: A neural network with just one hidden layer can, in theory, approximate any continuous function on a bounded input range to any desired accuracy, provided it has enough hidden neurons (the Universal Approximation Theorem). In practice, deeper networks often learn complex patterns more efficiently than very wide shallow ones, which is one reason modern architectures go deep.

Key Metrics

Forward Pass: matrix multiplications through each layer. Fast, O(total weights) per input.
Training (Backprop): must compute gradients for every weight. Slow, roughly O(total weights * epochs * n), where n is the number of training examples.
Expressiveness: can learn almost any pattern given enough neurons. Very high (universal approximator).
Interpretability: hard to understand what the hidden layers learned. Low (black box).
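Since both costs above scale with the total number of weights, it can help to count them. The sketch below assumes fully connected layers, with one weight per connection plus one bias per neuron; `count_parameters` is a hypothetical helper name.

```python
def count_parameters(sizes):
    # sizes lists the neurons per layer, input layer first.
    total = 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        total += n_in * n_out  # weights between the two layers
        total += n_out         # one bias per neuron in the receiving layer
    return total

print(count_parameters([2, 2, 1]))       # 9
print(count_parameters([784, 128, 10]))  # 101770
```

The first call matches a 2-2-1 network (4 + 2 + 2 + 1 values). The second shows how quickly the count grows once the input is an image.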

Quick check

What does each hidden layer in a neural network do?