Graphs6 min read

Union-Find (Disjoint Set Union)

Track connected groups — merge them near-instantly

find: \(O(α(n)\)) ≈ \(O(1)\)union: \(O(α(n)\)) ≈ \(O(1)\)space: \(O(n)\)

Imagine a school where students start out as strangers. One by one, friendships are formed. At any moment you want to ask: are these two students in the same friend group? And: merge the friend groups of these two students.

Union-Find (also called Disjoint Set Union, DSU) is the data structure built for exactly this — tracking which elements belong to the same group, with near \(O(1)\) operations.

The core idea

Each element points to a representative (the group leader). Two elements are in the same group if and only if they share the same representative. Initially every element is its own leader — all isolated.

pseudocode

parent = [0, 1, 2, 3, 4]  // each is its own leader
rank   = [0, 0, 0, 0, 0]  // all groups height 0

◈ Watch nodes get connected and parent pointers update with path compression

← → arrow keys to step · click elements to interact

Union-Find Operations

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py: return
        if self.rank[px] < self.rank[py]: px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]: self.rank[px] += 1

    def connected(self, x, y):
        return self.find(x) == self.find(y)

uf = UnionFind(5)
uf.union(0, 1)
uf.union(1, 2)
uf.union(3, 4)
print(uf.connected(0, 2))  # True
print(uf.connected(0, 3))  # False

Output

True
False

Find with path compression

To find the representative of an element, follow parent pointers up to the root. Path compression is the key optimization: on the way back, make every element point directly to the root. Future queries on those elements take \(O(1)\).

pseudocode

function find(x):
  if parent[x] !== x:
    parent[x] = find(parent[x])  // path compression
  return parent[x]

Union by rank

To merge two groups, attach one root under the other. Union by rank: always attach the shorter tree under the taller one. This keeps trees flat, keeping future finds fast. Equal ranks — pick either and increment the rank.

pseudocode

function union(a, b):
  rootA = find(a)
  rootB = find(b)
  if rootA === rootB: return  // already connected

  if rank[rootA] < rank[rootB]:
    parent[rootA] = rootB
  else if rank[rootA] > rank[rootB]:
    parent[rootB] = rootA
  else:
    parent[rootB] = rootA
    rank[rootA]++

Why it is nearly \(O(1)\)

With both path compression and union by rank, operations run in \(O(α(n)\)) amortized time, where α is the inverse Ackermann function. For any practical input size, α(n) ≤ 4. In practice: constant time.

Click chart to zoom

Union-Find operations: Find uses path compression, Union uses rank to keep trees flat

Note: Union-Find is the backbone of Kruskal's MST algorithm. It answers 'are these two nodes connected?' faster than running BFS or DFS each time.

Complexity

Find

Path compression flattens the tree

Near-constant \(O(α(n)\))

Union

Union by rank keeps trees short

Near-constant \(O(α(n)\))

Space

One parent and rank entry per element

Linear \(O(n)\)

Count Connected Components

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.count = n
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py: return
        self.parent[py] = px
        self.count -= 1

uf = UnionFind(6)
for a, b in [(0,1),(1,2),(3,4)]:
    uf.union(a, b)
print(uf.count)  # 3  (groups: {0,1,2}, {3,4}, {5})

Output

Challenge

Quick check

What does path compression do in the Find operation?

Minimum Spanning Tree

Connect all cities with the least total road construction cost

→

Graph Representation

How to store a map of cities and roads in memory

→

Depth-First Search

Follow one road as far as it goes before backtracking

→