Trees5 min read

BST Balancing

An unbalanced BST is just a slow linked list in disguise
balanced search: \(O(\log n)\)unbalanced search: \(O(n)\)

A BST's speed promise — \(O(\log n)\) search — only holds when the tree is balanced. An unbalanced BST can be just as slow as scanning a linked list. Understanding why balance matters is key to understanding AVL trees, Red-Black trees, and every other self-balancing structure.

What does "balanced" mean?

A tree is considered balanced when the height of the left and right subtrees of every node differs by at most a small constant (usually 1). The goal: keep the tree's overall height close to \(\log_2(n)\), where n is the number of nodes.

A perfectly balanced binary tree of n nodes has height ≈ \(\log_2(n)\). That means a tree of 1,000,000 nodes only needs about 20 levels to store all values — and you find any value in at most 20 comparisons.

Watch BST balancing

← → arrow keys to step · click elements to interact

Check BST Balance & Height

class Node:
def __init__(self, val, left=None, right=None):
self.val = val; self.left = left; self.right = right
def height(node):
if not node: return 0
return 1 + max(height(node.left), height(node.right))
def is_balanced(node):
if not node: return True
lh = height(node.left)
rh = height(node.right)
if abs(lh - rh) > 1: return False
return is_balanced(node.left) and is_balanced(node.right)
# Balanced tree
# 4
# / \
# 2 6
# / \ / \
# 1 3 5 7
root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(is_balanced(root)) # True
print(height(root)) # 3
# Unbalanced (right-skewed)
skewed = Node(1, None, Node(2, None, Node(3, None, Node(4))))
print(is_balanced(skewed)) # False
Output
True
3
False
Note: A balanced tree of 1,000,000 nodes has height ~20. An unbalanced (degenerate) tree of 1,000,000 nodes has height 999,999. That's the difference between a web server responding in microseconds vs. seconds.

The degenerate case: a linked list in disguise

Insert the values 1, 2, 3, 4, 5 into an empty BST in that order. Each new value is larger than all previous ones, so it always goes to the right of the last inserted node. You end up with:

pseudocode
1
2
3
4
5
6
7
8
9
1
\
2
\
3
\
4
\
5

This is called a degenerate BST. It has height 4 (n-1) instead of the optimal height 2 (\(\log_2\)5 ≈ 2.3). To find value 5, you traverse every node — \(O(n)\) instead of \(O(\log n)\).

Why does insertion order matter so much?

The shape of a BST is entirely determined by the order in which you insert values. The same set of values can produce trees of wildly different heights depending on insertion order:

  • Insert [4, 2, 6, 1, 3, 5, 7] → perfectly balanced tree, height 2
  • Insert [1, 2, 3, 4, 5, 6, 7] → degenerate chain, height 6

In real applications, you often cannot control the insertion order (e.g., incoming data is already sorted). That's why self-balancing trees are needed.

How balance is measured

The balance factor of a node is defined as the height of its left subtree minus the height of its right subtree:

pseudocode
1
balance_factor(node) = height(node.left) - height(node.right)

For a balanced tree, every node's balance factor should be -1, 0, or +1. A balance factor of ±2 or more means the tree needs restructuring via rotations.

Self-balancing solutions

Two of the most widely used self-balancing BSTs are:

  • AVL Trees — strictly maintain balance factor within {-1, 0, 1} after every insert or delete. Faster lookups but more rotations on insert/delete.
  • Red-Black Trees — use color properties to maintain a looser balance guarantee. Fewer rotations on insert/delete, used in most language standard libraries (Java TreeMap, C++ std::map).

Complexity

Search (balanced BST)Height is log n — guaranteed by balancing
Fast\(O(\log n)\)
Search (unbalanced/degenerate)Worst case: tree is a single chain
Slow\(O(n)\)
Insert (balanced BST)Walk down log n levels, rotate if needed
Fast\(O(\log n)\)
SpaceOne node per value in both cases
Linear\(O(n)\)

Balance a Sorted Array into BST

class Node:
def __init__(self, val, left=None, right=None):
self.val = val; self.left = left; self.right = right
def sorted_array_to_bst(nums):
if not nums: return None
mid = len(nums) // 2
return Node(nums[mid],
sorted_array_to_bst(nums[:mid]),
sorted_array_to_bst(nums[mid+1:]))
def inorder(node):
if not node: return []
return inorder(node.left) + [node.val] + inorder(node.right)
def height(node):
if not node: return 0
return 1 + max(height(node.left), height(node.right))
root = sorted_array_to_bst([1,2,3,4,5,6,7])
print(inorder(root)) # [1, 2, 3, 4, 5, 6, 7]
print(height(root)) # 3 (perfectly balanced)
Output
[1, 2, 3, 4, 5, 6, 7]
3
Challenge

Quick check

What is the height of a degenerate BST with 8 nodes (all inserted in ascending order)?

Continue reading

Binary Search Tree
A sorted binary tree — left is smaller, right is bigger
AVL Tree
A self-balancing BST that rebalances itself after every insert or delete
Red-Black Tree
A self-balancing BST used inside Java's TreeMap and C++'s std::map