BST Balancing
A BST's speed promise — \(O(\log n)\) search — only holds when the tree is balanced. An unbalanced BST can be just as slow as scanning a linked list. Understanding why balance matters is key to understanding AVL trees, Red-Black trees, and every other self-balancing structure.
What does "balanced" mean?
A tree is considered balanced when the height of the left and right subtrees of every node differs by at most a small constant (usually 1). The goal: keep the tree's overall height close to \(\log_2(n)\), where n is the number of nodes.
A perfectly balanced binary tree of n nodes has height ≈ \(\log_2(n)\). That means a tree of 1,000,000 nodes only needs about 20 levels to store all values — and you find any value in at most 20 comparisons.
← → arrow keys to step · click elements to interact
Check BST Balance & Height
The degenerate case: a linked list in disguise
Insert the values 1, 2, 3, 4, 5 into an empty BST in that order. Each new value is larger than all previous ones, so it always goes to the right of the last inserted node. You end up with:
This is called a degenerate BST. It has height 4 (n-1) instead of the optimal height 2 (\(\log_2\)5 ≈ 2.3). To find value 5, you traverse every node — \(O(n)\) instead of \(O(\log n)\).
Why does insertion order matter so much?
The shape of a BST is entirely determined by the order in which you insert values. The same set of values can produce trees of wildly different heights depending on insertion order:
- Insert [4, 2, 6, 1, 3, 5, 7] → perfectly balanced tree, height 2
- Insert [1, 2, 3, 4, 5, 6, 7] → degenerate chain, height 6
In real applications, you often cannot control the insertion order (e.g., incoming data is already sorted). That's why self-balancing trees are needed.
How balance is measured
The balance factor of a node is defined as the height of its left subtree minus the height of its right subtree:
For a balanced tree, every node's balance factor should be -1, 0, or +1. A balance factor of ±2 or more means the tree needs restructuring via rotations.
Self-balancing solutions
Two of the most widely used self-balancing BSTs are:
- AVL Trees — strictly maintain balance factor within {-1, 0, 1} after every insert or delete. Faster lookups but more rotations on insert/delete.
- Red-Black Trees — use color properties to maintain a looser balance guarantee. Fewer rotations on insert/delete, used in most language standard libraries (Java TreeMap, C++ std::map).