Graphs6 min read

Union-Find (Disjoint Set Union)

Track connected groups — merge them near-instantly
find: \(O(α(n)\)) ≈ \(O(1)\)union: \(O(α(n)\)) ≈ \(O(1)\)space: \(O(n)\)

Imagine a school where students start out as strangers. One by one, friendships are formed. At any moment you want to ask: are these two students in the same friend group? And: merge the friend groups of these two students.

Union-Find (also called Disjoint Set Union, DSU) is the data structure built for exactly this — tracking which elements belong to the same group, with near \(O(1)\) operations.

The core idea

Each element points to a representative (the group leader). Two elements are in the same group if and only if they share the same representative. Initially every element is its own leader — all isolated.

pseudocode
1
2
parent = [0, 1, 2, 3, 4] // each is its own leader
rank = [0, 0, 0, 0, 0] // all groups height 0
Watch nodes get connected and parent pointers update with path compression

← → arrow keys to step · click elements to interact

Union-Find Operations

class UnionFind:
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # path compression
return self.parent[x]
def union(self, x, y):
px, py = self.find(x), self.find(y)
if px == py: return
if self.rank[px] < self.rank[py]: px, py = py, px
self.parent[py] = px
if self.rank[px] == self.rank[py]: self.rank[px] += 1
def connected(self, x, y):
return self.find(x) == self.find(y)
uf = UnionFind(5)
uf.union(0, 1)
uf.union(1, 2)
uf.union(3, 4)
print(uf.connected(0, 2)) # True
print(uf.connected(0, 3)) # False
Output
True
False

Find with path compression

To find the representative of an element, follow parent pointers up to the root. Path compression is the key optimization: on the way back, make every element point directly to the root. Future queries on those elements take \(O(1)\).

pseudocode
1
2
3
4
function find(x):
if parent[x] !== x:
parent[x] = find(parent[x]) // path compression
return parent[x]

Union by rank

To merge two groups, attach one root under the other. Union by rank: always attach the shorter tree under the taller one. This keeps trees flat, keeping future finds fast. Equal ranks — pick either and increment the rank.

pseudocode
1
2
3
4
5
6
7
8
9
10
11
12
function union(a, b):
rootA = find(a)
rootB = find(b)
if rootA === rootB: return // already connected
if rank[rootA] < rank[rootB]:
parent[rootA] = rootB
else if rank[rootA] > rank[rootB]:
parent[rootB] = rootA
else:
parent[rootB] = rootA
rank[rootA]++

Why it is nearly \(O(1)\)

With both path compression and union by rank, operations run in \(O(α(n)\)) amortized time, where α is the inverse Ackermann function. For any practical input size, α(n) ≤ 4. In practice: constant time.

Union-Find operations: Find uses path compression, Union uses rank to keep trees flat
Note: Union-Find is the backbone of Kruskal's MST algorithm. It answers 'are these two nodes connected?' faster than running BFS or DFS each time.

Complexity

FindPath compression flattens the tree
Near-constant\(O(α(n)\))
UnionUnion by rank keeps trees short
Near-constant\(O(α(n)\))
SpaceOne parent and rank entry per element
Linear\(O(n)\)

Count Connected Components

class UnionFind:
def __init__(self, n):
self.parent = list(range(n))
self.count = n
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x])
return self.parent[x]
def union(self, x, y):
px, py = self.find(x), self.find(y)
if px == py: return
self.parent[py] = px
self.count -= 1
uf = UnionFind(6)
for a, b in [(0,1),(1,2),(3,4)]:
uf.union(a, b)
print(uf.count) # 3 (groups: {0,1,2}, {3,4}, {5})
Output
3
Challenge

Quick check

What does path compression do in the Find operation?

Continue reading

Minimum Spanning Tree
Connect all cities with the least total road construction cost
Graph Representation
How to store a map of cities and roads in memory
Depth-First Search
Follow one road as far as it goes before backtracking