Union-Find (also called Disjoint Set Union or DSU) is a data structure that tracks elements split into non-overlapping groups. It answers one question very fast: “Are these two elements in the same group?” It is the best tool for connected components problems and appears in many graph-related interview questions. In this article, you will learn how Union-Find works, why path compression makes it nearly O(1), and how to use it in coding interviews.
We show every example in Kotlin, Python, and Go.
What is Union-Find?
Imagine you have a group of people. Some are friends. If A is friends with B, and B is friends with C, then A, B, and C are all in the same friend group. Union-Find helps you manage these groups efficiently.
It supports two operations:
- Find(x): which group does x belong to? Returns the “representative” (root) of the group.
- Union(x, y): merge the groups that contain x and y into one group.
Naive Implementation
The simplest approach: each element has a parent. The root of a group points to itself.
Python:
class UnionFind:
def __init__(self, n):
self.parent = list(range(n))
def find(self, x):
while self.parent[x] != x:
x = self.parent[x]
return x
def union(self, x, y):
root_x = self.find(x)
root_y = self.find(y)
if root_x != root_y:
self.parent[root_x] = root_y
Kotlin:
class UnionFind(n: Int) {
val parent = IntArray(n) { it }
fun find(x: Int): Int {
var node = x
while (parent[node] != node) {
node = parent[node]
}
return node
}
fun union(x: Int, y: Int) {
val rootX = find(x)
val rootY = find(y)
if (rootX != rootY) {
parent[rootX] = rootY
}
}
}
Go:
type UnionFind struct {
Parent []int
}
func NewUnionFind(n int) *UnionFind {
parent := make([]int, n)
for i := range parent {
parent[i] = i
}
return &UnionFind{Parent: parent}
}
func (uf *UnionFind) Find(x int) int {
for uf.Parent[x] != x {
x = uf.Parent[x]
}
return x
}
func (uf *UnionFind) Union(x, y int) {
rootX := uf.Find(x)
rootY := uf.Find(y)
if rootX != rootY {
uf.Parent[rootX] = rootY
}
}
Problem: In the worst case, the tree can become a long chain (like a linked list). Then find takes O(n) time. We need two optimizations.
Optimization 1: Path Compression
When you call find(x), make every node on the path point directly to the root. This flattens the tree so future calls are faster.
Before find(4): After find(4):
0 0
| / | \
1 1 2 4
| |
2 3
|
3
|
4
Python:
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # Path compression
return self.parent[x]
Kotlin:
fun find(x: Int): Int {
if (parent[x] != x) {
parent[x] = find(parent[x]) // Path compression
}
return parent[x]
}
Go:
func (uf *UnionFind) Find(x int) int {
if uf.Parent[x] != x {
uf.Parent[x] = uf.Find(uf.Parent[x]) // Path compression
}
return uf.Parent[x]
}
Optimization 2: Union by Rank
When merging two groups, attach the shorter tree under the taller tree. This keeps the tree balanced.
Rank is an upper bound on the height of the tree. When two trees have the same rank, pick one as the root and increase its rank by 1.
Optimized Union-Find (Full Implementation)
Here is the complete implementation with both optimizations.
Python:
class UnionFind:
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
self.count = n # Number of groups
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x])
return self.parent[x]
def union(self, x, y):
root_x = self.find(x)
root_y = self.find(y)
if root_x == root_y:
return False # Already in the same group
if self.rank[root_x] < self.rank[root_y]:
self.parent[root_x] = root_y
elif self.rank[root_x] > self.rank[root_y]:
self.parent[root_y] = root_x
else:
self.parent[root_y] = root_x
self.rank[root_x] += 1
self.count -= 1
return True
def connected(self, x, y):
return self.find(x) == self.find(y)
Kotlin:
class UnionFind(n: Int) {
val parent = IntArray(n) { it }
val rank = IntArray(n)
var count = n
private set
fun find(x: Int): Int {
if (parent[x] != x) {
parent[x] = find(parent[x])
}
return parent[x]
}
fun union(x: Int, y: Int): Boolean {
val rootX = find(x)
val rootY = find(y)
if (rootX == rootY) return false
when {
rank[rootX] < rank[rootY] -> parent[rootX] = rootY
rank[rootX] > rank[rootY] -> parent[rootY] = rootX
else -> {
parent[rootY] = rootX
rank[rootX]++
}
}
count--
return true
}
fun connected(x: Int, y: Int) = find(x) == find(y)
}
Go:
type UnionFind struct {
Parent []int
Rank []int
Count int
}
func NewUnionFind(n int) *UnionFind {
parent := make([]int, n)
for i := range parent {
parent[i] = i
}
return &UnionFind{
Parent: parent,
Rank: make([]int, n),
Count: n,
}
}
func (uf *UnionFind) Find(x int) int {
if uf.Parent[x] != x {
uf.Parent[x] = uf.Find(uf.Parent[x])
}
return uf.Parent[x]
}
func (uf *UnionFind) Union(x, y int) bool {
rootX := uf.Find(x)
rootY := uf.Find(y)
if rootX == rootY {
return false
}
if uf.Rank[rootX] < uf.Rank[rootY] {
uf.Parent[rootX] = rootY
} else if uf.Rank[rootX] > uf.Rank[rootY] {
uf.Parent[rootY] = rootX
} else {
uf.Parent[rootY] = rootX
uf.Rank[rootX]++
}
uf.Count--
return true
}
func (uf *UnionFind) Connected(x, y int) bool {
return uf.Find(x) == uf.Find(y)
}
Time Complexity
With both path compression and union by rank, the amortized time per operation is O(alpha(n)), where alpha is the inverse Ackermann function. This grows so slowly that for all practical purposes, it is O(1).
| Operation | Amortized Time |
|---|---|
| Find | O(alpha(n)) ~ O(1) |
| Union | O(alpha(n)) ~ O(1) |
| Connected | O(alpha(n)) ~ O(1) |
| Space | O(n) |
Union-Find vs BFS/DFS
Both can solve connected components problems. When should you use which?
| Union-Find | BFS/DFS | |
|---|---|---|
| Dynamic edges (edges added over time) | Best choice | Must rebuild graph |
| Static graph | Works | Works |
| Count components | Easy (track count) | Need full traversal |
| Shortest path | Cannot do | BFS can |
| Cycle detection | Easy | Also easy |
Rule of thumb: if edges arrive one at a time and you need to answer connectivity queries, use Union-Find. If you need shortest paths or level-by-level traversal, use BFS.
Interview Problem: Number of Connected Components
Given n nodes and a list of edges, find the number of connected components. This is LeetCode #323.
Python:
def count_components(n, edges):
uf = UnionFind(n)
for u, v in edges:
uf.union(u, v)
return uf.count
# Example:
# n=5, edges=[[0,1],[1,2],[3,4]]
# Components: {0,1,2} and {3,4} -> answer is 2
print(count_components(5, [[0, 1], [1, 2], [3, 4]])) # Output: 2
Kotlin:
fun countComponents(n: Int, edges: Array<IntArray>): Int {
val uf = UnionFind(n)
for ((u, v) in edges) {
uf.union(u, v)
}
return uf.count
}
Go:
func countComponents(n int, edges [][]int) int {
uf := NewUnionFind(n)
for _, edge := range edges {
uf.Union(edge[0], edge[1])
}
return uf.Count
}
Interview Problem: Redundant Connection
Given a graph that started as a tree with n nodes, one extra edge was added. Find that edge. If there are multiple answers, return the one that appears last. This is LeetCode #684.
Strategy: Process edges one by one. If both endpoints are already in the same group, this edge creates a cycle — it is the redundant edge.
Python:
def find_redundant_connection(edges):
n = len(edges)
uf = UnionFind(n + 1) # Nodes are 1-indexed
for u, v in edges:
if not uf.union(u, v):
return [u, v] # This edge connects two already-connected nodes
return []
Kotlin:
fun findRedundantConnection(edges: Array<IntArray>): IntArray {
val uf = UnionFind(edges.size + 1)
for ((u, v) in edges) {
if (!uf.union(u, v)) {
return intArrayOf(u, v)
}
}
return intArrayOf()
}
Go:
func findRedundantConnection(edges [][]int) []int {
uf := NewUnionFind(len(edges) + 1)
for _, edge := range edges {
if !uf.Union(edge[0], edge[1]) {
return edge
}
}
return nil
}
Interview Problem: Number of Islands (Union-Find Approach)
You can also solve Number of Islands (LeetCode #200) with Union-Find instead of DFS. Treat each land cell as a node. Union adjacent land cells.
Python:
def num_islands_uf(grid):
if not grid:
return 0
rows, cols = len(grid), len(grid[0])
uf = UnionFind(rows * cols)
water = 0
for r in range(rows):
for c in range(cols):
if grid[r][c] == '0':
water += 1
continue
# Union with right and down neighbors
if c + 1 < cols and grid[r][c + 1] == '1':
uf.union(r * cols + c, r * cols + c + 1)
if r + 1 < rows and grid[r + 1][c] == '1':
uf.union(r * cols + c, (r + 1) * cols + c)
return uf.count - water
Why Interviewers Ask Union-Find Questions
Union-Find questions test:
- Knowing the right data structure — recognizing when Union-Find is better than BFS/DFS
- Optimization knowledge — path compression and union by rank show you understand amortized analysis
- Graph modeling — converting real-world problems into connectivity queries
- Clean implementation — Union-Find is short enough to implement from scratch during an interview
Practice Problems
Try these on LeetCode:
- Number of Connected Components (#323) — basic Union-Find
- Redundant Connection (#684) — detect cycle with Union-Find
- Accounts Merge (#721) — group accounts by email
- Number of Provinces (#547) — connected components in adjacency matrix
- Longest Consecutive Sequence (#128) — Union-Find or hash set approach
What’s Next?
In the next article, we bring everything together with When to Use Which Data Structure — a complete comparison guide and cheat sheet. You will learn the decision framework for choosing the right data structure for any problem.
Next: DSA Tutorial #10: When to Use Which Data Structure
Full series: DSA from Zero to Interview Ready