Union-Find — disjoint set union
A parent array plus two operations: find(x) returns the
representative of x's group, union(x, y) merges the two groups.
With path compression in find and union by rank in union, each operation is
amortised effectively O(α(n)) — for any input you'll ever see, that's a
constant.
1 · Why union-find / DSU matters
Union-find is the data structure that turns "merge equivalence
classes online" into a problem you can solve in near-constant
amortised time per operation. The theoretical bound is
O(α(n)) — inverse Ackermann — where α(n) ≤ 4 for any
n < 265536. For any input on the planet,
treat it as constant.
The senior-interview signal is unmistakable: "edges arriving incrementally" + "ask about connectivity" = union-find. The shape recurs across infrastructure (does removing this DC link disconnect the network?), social (which friend circles merge as new friendships form?), simulation (when does the percolating cluster span the grid?), and graph algorithms (Kruskal's MST is just sorted edges + union-find).
Real production examples
- Kruskal's minimum spanning tree. Sort all edges by weight; for each, union endpoints if they're in different components. Used in network design (laying fibre between data centers cheaply), clustering, and image segmentation. The DSU is the engine inside Kruskal.
- Image segmentation (Felzenszwalb-Huttenlocher). Pixels are nodes; edges connect adjacent pixels with weights = colour difference. Sort edges; union pixels whose difference is below an adaptive threshold. The remaining components are the segments. Used in many computer-vision pipelines.
- Social-network friend-of-friend. When two users become friends, union their connected-component IDs. "Are these two users connected via mutual friends?" reduces to "do they share a root?" — O(α(n)) per query at any scale.
- Percolation simulations. Simulate fluid spreading on a grid: at time t, mark cells as "open" in random order; ask "is there an open path from top to bottom?". Union-find maintains the connected components; the percolation threshold is the fraction of open cells when the top connects to the bottom. The classic Sedgewick demo problem.
- Identity resolution / customer deduplication. Stripe, Salesforce, etc. need to merge customer records that share an email, phone, or device ID. Each merge condition is a union. LC 721 (Accounts Merge) is the cleaned-up version of this real problem.
- Build systems / dependency cycle detection. In a tool like Bazel, when adding a new build edge between two targets, check whether it would create a cycle. Equivalent to union-find with the edge-direction tracked separately.
2 · How to recognise it
Union-Find is the right pattern when the problem is incremental: edges or relationships arrive over time and you keep being asked about connectivity.
- Dynamic connectivity (add edges, ask "are u and v connected?"). The bread-and-butter case. Each new edge → union; each query → two finds.
- Cycle detection in an undirected graph. If
union(u, v)returns "already in same set", the edge would close a cycle. LC 684 Redundant Connection is the canonical example. - Count connected components / count islands. Start with n components; each successful union decrements the count. Final value = number of components. LC 547 Number of Provinces, LC 200 Number of Islands (via DSU).
- Kruskal's minimum spanning tree. Sort edges by weight; for each, union endpoints if they're in different components. The MST is the set of edges that actually merged something.
- Percolation / grid connectivity. Open cells on a grid one at a time; ask "is the top connected to the bottom yet?". Map (r,c) → r*cols+c; union with neighbours when opened. Classic Sedgewick demo.
- "Smallest equivalent string" / "regions merged by rule" (LC 1061, LC 947). Equivalences arrive (these two characters are equivalent; or these two stones share a row/column); union them; answer based on component leader.
- Offline queries — process in reverse. "Edges that DISAPPEAR over time, with connectivity queries between" can be flipped: process events in REVERSE so deletes become adds. LC 803 Bricks Falling When Hit is the classic.
- Weighted union-find for ratio / offset queries. When the relation is "x is 2× y" (LC 399 Evaluate Division), store an offset along each parent pointer;
findreturns both root and cumulative offset. - "Earliest time when X and Y connect" (LC 1101). Sort events by time, union as you go, the answer for a (u, v) query is the time of the union that first put them in the same set.
3 · Intuition — every component has a root
A union-find structure is a forest of trees stored
implicitly in a parent array. Each tree's root represents one
component; find(x) walks parent pointers upward until
the root; union(a, b) attaches one root under the
other.
find answers questions about the invariant;
union maintains it. The optimisations — path
compression in find, union by rank in union — are what get the
amortised cost to α(n). Without them, a chain of n unions can
build a tree of depth n, making find O(n) per call.4 · Canonical example — Redundant Connection (LC 684)
Problem. You're given a list of edges that form a tree with one extra edge added. The extra edge creates exactly one cycle. Return that edge — if multiple answers exist, return the one that appears last in the input.
Input: edges = [[1,2], [1,3], [2,3]]
Output: [2, 3]
Explanation: edges [1,2] and [1,3] connect 1, 2, 3 into a tree.
Adding [2,3] closes the cycle 1 - 2 - 3 - 1.Process the edges in order. For each (u, v), ask: are u and v
already in the same component? If yes, this edge would close a cycle —
it's the redundant one, return it. If no, union them and continue. Because
the problem guarantees exactly one extra edge, the first cycle-closing edge
we see is the answer.
5 · Reference implementation
class DSU:
def __init__(self, n: int):
self.parent = list(range(n)) # each node is its own parent
self.rank = [0] * n # tree height upper bound
def find(self, x: int) -> int:
# path compression: point every node on the path directly at the root
root = x
while self.parent[root] != root:
root = self.parent[root]
while self.parent[x] != root:
self.parent[x], x = root, self.parent[x]
return root
def union(self, a: int, b: int) -> bool:
ra, rb = self.find(a), self.find(b)
if ra == rb:
return False # already connected; no merge happened
# union by rank: attach the shorter tree under the taller one
if self.rank[ra] < self.rank[rb]:
ra, rb = rb, ra
self.parent[rb] = ra
if self.rank[ra] == self.rank[rb]:
self.rank[ra] += 1
return True
def redundant_connection(edges: list[list[int]]) -> list[int]:
dsu = DSU(len(edges) + 1) # nodes are 1..n
for u, v in edges:
if not dsu.union(u, v):
return [u, v] # union returned False -> cycle closer
return []find returns the root, not a bool. Most
DSU operations need the root identifier — to compare two nodes, to attach
one tree under another, to label a component. Returning the root keeps the
API uniform. union returns a bool only as a convenience: True
if it actually merged, False if the two nodes were already in the same set.6 · Sister algorithms
Standard union-find handles "add edge, query connectivity" efficiently. When the problem deviates — deletions, dynamic trees, persistent rollback, weighted components — one of the following structures takes over.
- Link-cut trees (LCT). Sleator & Tarjan, 1985. Dynamic tree connectivity: link two trees, cut an edge, query path-sum / path-max / LCA / depth — all in O(log n) per operation. The generalisation of union-find to support deletions. Used in network-flow algorithms and competitive programming.
- Euler-tour trees. Another dynamic-tree structure. Maintain the Euler tour of a forest in a balanced BST; link/cut by splicing the tour. Slightly different trade-offs from LCT — easier in some scenarios.
- Persistent union-find. Each operation produces a new version without mutating the old. Useful for offline queries that need to roll back to a previous state. O(log n) per operation, O(n log n) memory.
- Union-find with rollback. Standard DSU with path compression DISABLED so the parent pointers can be undone. Each union pushes its mutation onto a stack;
rollback()pops and restores. Used in offline algorithms (divide-and-conquer on time, "what if we hadn't done this edge?"). - Weighted union-find (offset-tracking). Each parent pointer carries a weight;
findreturns both root and the accumulated weight from x to root. Solves "what's the ratio between a and b given a chain of ratios?" — LC 399 Evaluate Division — and any "relative offset" problem. - Small-to-large merging. For each component, store the actual element list. On union, iterate the smaller side and migrate elements into the larger. Total cost O(n log² n) — strictly worse than DSU's α(n) but conceptually simpler when you need to enumerate a component's members.
- Bipartite union-find / 2-coloring DSU. Track a parity bit alongside each node — "x and root are same colour (0) / opposite colour (1)". Detects bipartite-violating unions (LC 785 cousin), and underlies "are these constraints satisfiable" questions.
- Distributed DSU. Each node owns its own portion of the parent array; unions go through a coordinator. Used in distributed graph processing (Pregel-style frameworks) when the graph doesn't fit on one machine.
7 · Variations
The core parent[] array stays the same; what changes is how
you compress paths, how you pick the new root, and what extra data you hang
off each node.
| Variation | What changes | When to use |
|---|---|---|
| Path compression: iterative vs recursive | Iterative does two passes (find root, then re-point). Recursive does it in one elegant call but risks stack overflow on long chains. | Iterative for production / deep inputs. Recursive for whiteboard clarity. |
| Union by rank vs union by size | Rank tracks an upper bound on tree height. Size tracks the number of nodes. Both give the same α(n) bound; size is sometimes more useful when the problem asks for component sizes. | Union by size when you also need component_size(x) for free. |
| Weighted union-find (with offsets) | Each edge carries a weight; find also returns the offset from x to its root. Lets you answer "what's the relative weight of x vs y" if they're connected. | LC 399 Evaluate Division, currency-conversion-style problems. |
| Persistent union-find | Each operation produces a new version of the structure without mutating the old one. Useful for offline queries on changing graphs. | Competitive programming, rarely interviews. |
| Union-find on a grid (2D → 1D mapping) | Flatten (r, c) to r * cols + c. The parent array becomes 1D; neighbour iteration stays 2D. | LC 200 Number of Islands done with DSU, LC 305 Number of Islands II. |
7a · Path compression in action
Path compression is the optimisation that takes union-find from O(log n)
to nearly O(1) amortised. Every find() traversal re-roots
intermediate nodes directly to the component root, flattening the tree
as a side effect. Three states of the same forest:
7b · Two optimisations, ordered by impact
| Optimisation | What it does | Worst-case find |
|---|---|---|
| Plain quick-union | No optimisation; each union sets one parent. | O(n) — chain |
| Union by rank/size | Always attach shorter tree under taller. | O(log n) |
| Path compression alone | Find flattens every walked node to root. | O(log n) amortised |
| Both (standard) | Rank guides union, compression flattens find. | O(α(n)) ≈ O(1) |
class DSU:
def __init__(self, n: int):
self.parent = list(range(n))
self.rank = [0] * n
def find(self, x: int) -> int:
# PATH COMPRESSION — flatten on the way up
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x])
return self.parent[x]
def union(self, x: int, y: int) -> bool:
rx, ry = self.find(x), self.find(y)
if rx == ry: return False # already same component
# UNION BY RANK — taller tree wins
if self.rank[rx] < self.rank[ry]:
rx, ry = ry, rx
self.parent[ry] = rx
if self.rank[rx] == self.rank[ry]:
self.rank[rx] += 1
return True8 · Three worked problems — easy, medium-hard, hard
Three union-find problems that show the pattern's full range. Counting components (LC 547), union with a structural twist (LC 952), and the time-reversed offline trick (LC 803).
8a · LC 547 — Number of Provinces
Problem. Given an n×n matrix
isConnected where isConnected[i][j] = 1
means cities i and j are directly connected, return the number
of provinces (connected components).
Input: isConnected = [[1,1,0],
[1,1,0],
[0,0,1]]
Output: 2
Explanation: Cities 0 and 1 are connected; city 2 is alone. Two provinces.The canonical "count components" application. Initialise n singletons; for each connection edge, attempt to union; track a counter that starts at n and decrements on each successful merge.
func findCircleNum(isConnected [][]int) int {
n := len(isConnected)
parent := make([]int, n)
rank := make([]int, n)
for i := range parent {
parent[i] = i
}
count := n
var find func(x int) int
find = func(x int) int {
if parent[x] != x {
parent[x] = find(parent[x]) // path compression
}
return parent[x]
}
union := func(a, b int) {
ra, rb := find(a), find(b)
if ra == rb {
return
}
// Union by rank
if rank[ra] < rank[rb] {
ra, rb = rb, ra
}
parent[rb] = ra
if rank[ra] == rank[rb] {
rank[ra]++
}
count--
}
for i := 0; i < n; i++ {
for j := i + 1; j < n; j++ {
if isConnected[i][j] == 1 {
union(i, j)
}
}
}
return count
}8b · LC 952 — Largest Component Size by Common Factor
Problem. Given an array of unique positive integers, draw an edge between any two values that share a common factor > 1. Return the size of the largest connected component.
Input: nums = [4, 6, 15, 35]
Output: 4
Explanation: 4-6 share factor 2; 6-15 share factor 3; 15-35 share factor 5.
All four are connected.The naive approach: for every pair (i, j), check
gcd(nums[i], nums[j]) > 1 and union. O(n² log
max). For n = 2 × 10⁴, that's 4 × 10⁸ — too slow.
The trick: instead of unioning numbers directly, union each number with its prime factors. Two numbers that share a prime factor will end up in the same component because both unioned with that prime. The DSU has space for both "real numbers" (indexed 0..n-1) and "primes" (indexed by prime value).
func largestComponentSize(nums []int) int {
maxVal := 0
for _, v := range nums {
if v > maxVal {
maxVal = v
}
}
// DSU big enough to hold numbers AND their prime factors
size := maxVal + 1
parent := make([]int, size)
for i := range parent {
parent[i] = i
}
var find func(x int) int
find = func(x int) int {
if parent[x] != x {
parent[x] = find(parent[x])
}
return parent[x]
}
union := func(a, b int) {
ra, rb := find(a), find(b)
if ra != rb {
parent[ra] = rb
}
}
// For each number, union it with each of its prime factors
for _, v := range nums {
x := v
for p := 2; p*p <= x; p++ {
if x%p == 0 {
union(v, p)
for x%p == 0 {
x /= p
}
}
}
if x > 1 {
union(v, x) // remaining prime factor
}
}
// Count occurrences per component (only counting actual nums values)
count := make(map[int]int)
best := 0
for _, v := range nums {
root := find(v)
count[root]++
if count[root] > best {
best = count[root]
}
}
return best
}8c · LC 803 — Bricks Falling When Hit (HARD)
Problem. A 2D grid where 1 = brick, 0 = empty. Bricks in the top row are "stable"; other bricks are stable iff they're connected (via 4-directional adjacency) to a stable brick. Process a sequence of hits — each hit removes one brick and returns how many bricks fall as a consequence.
Input: grid = [[1,0,0,0],[1,1,1,0]]
hits = [[1,0]]
Output: [2]
Explanation: After removing grid[1][0], bricks at (1,1) and (1,2) lose
support and fall. Two bricks fall.The straightforward simulation — for each hit, remove the brick and re-run BFS to count fallen bricks — is O(hits × grid). On big inputs, too slow.
The offline trick: process hits in REVERSE. Removing a brick becomes adding it back. Start from the final state (grid with ALL hits applied). Compute the initial set of stable bricks via a DSU connected to a virtual "ceiling" node. Then walk the hits in reverse: for each, add the brick back, union it with its (now stable) neighbours, and the increase in the ceiling component's size − 1 is the number of bricks that would have fallen in the forward simulation.
func hitBricks(grid [][]int, hits [][]int) []int {
rows, cols := len(grid), len(grid[0])
// Virtual ceiling node at index rows*cols
CEILING := rows * cols
parent := make([]int, rows*cols+1)
size := make([]int, rows*cols+1)
for i := range parent {
parent[i] = i
size[i] = 1
}
var find func(x int) int
find = func(x int) int {
if parent[x] != x {
parent[x] = find(parent[x])
}
return parent[x]
}
union := func(a, b int) {
ra, rb := find(a), find(b)
if ra == rb {
return
}
parent[ra] = rb
size[rb] += size[ra]
}
idx := func(r, c int) int { return r*cols + c }
inside := func(r, c int) bool { return r >= 0 && r < rows && c >= 0 && c < cols }
// STEP 1: Apply all hits up front
for _, h := range hits {
if grid[h[0]][h[1]] == 1 {
grid[h[0]][h[1]] = -1 // mark "was hit"
}
}
// STEP 2: Build DSU over the post-hit grid; bricks in row 0 connect to CEILING
for r := 0; r < rows; r++ {
for c := 0; c < cols; c++ {
if grid[r][c] != 1 {
continue
}
if r == 0 {
union(idx(r, c), CEILING)
}
if inside(r-1, c) && grid[r-1][c] == 1 {
union(idx(r, c), idx(r-1, c))
}
if inside(r, c-1) && grid[r][c-1] == 1 {
union(idx(r, c), idx(r, c-1))
}
}
}
dirs := [][2]int{{-1, 0}, {1, 0}, {0, -1}, {0, 1}}
result := make([]int, len(hits))
// STEP 3: Walk hits in REVERSE, adding bricks back
for i := len(hits) - 1; i >= 0; i-- {
r, c := hits[i][0], hits[i][1]
if grid[r][c] != -1 {
continue // was already empty before hit
}
before := size[find(CEILING)]
grid[r][c] = 1
if r == 0 {
union(idx(r, c), CEILING)
}
for _, d := range dirs {
nr, nc := r+d[0], c+d[1]
if inside(nr, nc) && grid[nr][nc] == 1 {
union(idx(r, c), idx(nr, nc))
}
}
after := size[find(CEILING)]
if after > before {
result[i] = after - before - 1 // -1 because the just-added brick doesn't count as "fallen"
}
}
return result
}9 · How to solve hard union-find problems step-by-step
Five questions to answer before writing code. The DSU class is short; almost all the senior craft is in modelling the problem into elements and equivalences.
- What are the elements? Cells of a grid? Array indices? Strings (emails, accounts)? Numbers AND their prime factors (LC 952)? Decide what each integer in
parent[]represents. If elements are non-integers, plan a string-to-id mapping. - What's the equivalence relation? "Connected by an edge" is the obvious case. But it could be "share a common prime factor", "in the same row or column", "equal by transitive rule". Some equivalences are obvious; others (like the prime-factor twist in LC 952) need a structural insight.
- Is it online or offline? Online: queries arrive one at a time, you must answer each immediately — standard DSU. Offline: you have all queries upfront — you can sort them, reverse them, or pre-process. Offline unlocks tricks like "reverse the time axis so deletions become insertions" (LC 803).
- Does each component need extra data? Size (so you can answer "how big is the component containing x"), sum (for weighted-sum queries), min/max element (for "what's the smallest element in x's component", LC 1061). Store at the root; update on union.
- Plan the optimisations. Always include path compression AND union by rank (or size). These together give the α(n) bound; either alone gives O(log n). For competitive-programming-tight time limits, path compression by halving (point each node directly to its grandparent) is slightly weaker but cache-friendlier than full recursive compression.
find(x) = parent[x]" and forget compression. Senior
candidates pause to identify the right elements and equivalence
relation BEFORE writing any DSU code — and articulate the
online/offline distinction. The data structure itself is
intermediate-level; the modelling is what separates
implementations.10 · Path compression vs union by rank — which optimisations actually matter
Both optimisations are needed for the α(n) bound. The proof is Tarjan's 1975 paper; the practical takeaway is that omitting either drops you to O(log n) per operation.
| Variant | Worst-case find | Implementation effort |
|---|---|---|
| Quick-union (no opts) | O(n) — a chain | 5 lines |
| Path compression alone | O(log n) amortised | 6 lines (one recursive call) |
| Union by rank alone | O(log n) worst case | +2 lines (the rank array, comparison) |
| Both together | O(α(n)) amortised — effectively constant | ~12 lines total |
Two compression flavours: full vs halving
Path compression has two common implementations. Both achieve amortised α(n) (with union by rank); they differ in constant factor and recursion depth.
// FULL PATH COMPRESSION — recursive, every walked node points to root
func find(x int) int {
if parent[x] != x {
parent[x] = find(parent[x]) // recursive; every node on path -> root
}
return parent[x]
}
// PATH COMPRESSION BY HALVING — iterative, every node points to its grandparent
func find(x int) int {
for parent[x] != x {
parent[x] = parent[parent[x]] // grandparent jump
x = parent[x]
}
return x
}- Full compression (recursive). Every node on the path is pointed directly at the root. Strongest flattening; uses recursion (stack depth = tree height). Best amortised behaviour but worst stack usage.
- Halving (iterative). Each node is pointed at its grandparent. Slightly less aggressive flattening, but no recursion → no stack overflow risk, and the tighter inner loop is cache-friendlier. Same asymptotic complexity. Often the production choice.
- Splitting. Like halving, but points the child at its grandparent AND moves the pointer up by two each iteration. Subtle improvement; rarely used.
Union by rank vs union by size
Rank = upper bound on tree height. Size = number of nodes in the tree. Either gives the same α(n) bound when paired with path compression. Pick based on what you also need:
- Union by rank. Slightly simpler; rank changes only when two trees of equal rank merge. Use when you don't need component sizes for the problem itself.
- Union by size. Tracks the actual node count per component. Use when "how big is X's component" is a query you need to answer (LC 952 Largest Component Size, LC 1319 Make Network Connected). Free side-effect that often pays for itself.
11 · Offline trick — process queries in reverse
Union-find supports inserts but not deletes. Many problems naturally ask about deletes (bricks fall when hit, edges disappear, components break apart over time). The senior trick is to reverse the time axis — turn deletes into inserts — and answer the original queries in the right order at the end.
The recipe
- Apply all deletes upfront to compute the FINAL state.
- Build the DSU over that final state. Union surviving connections; record the initial answer (component count, component sizes, ceiling-connected count).
- Walk the deletes in reverse. For each, ADD the element back; union with neighbours; observe the change in the relevant quantity (e.g., increase in component size, decrease in component count, change in ceiling-connectivity).
- The change at step i, in reverse, is the answer to the original (forward-time) query i.
Why this works
The forward problem is "delete X, observe consequence Y". The reverse problem is "add X, observe consequence −Y". Both describe the same transition between two graph states; the reverse just happens to be tractable with union-find (insertions) while the forward isn't (deletions). The arithmetic of "what changed" is symmetric.
Where this trick applies
- LC 803 — Bricks Falling When Hit. The canonical worked example (§8c). Bricks "removed by hits" become bricks "added in reverse". The DSU tracks connectivity to the ceiling; the increase in ceiling-component-size on reverse-add is the answer.
- "Edges disappear over time; ask connectivity at each moment". Process events in reverse; deletions become insertions. Each query has a known "freeze time"; reorder them so each is answered at the right reverse-time step.
- Offline percolation. "When does the path from top to bottom break?" Reverse: start fully connected, remove edges, ask when the top-bottom connectivity is lost — but actually, the FORWARD percolation problem (adding cells) is what union-find handles directly. The reverse trick is for the dual question.
- Time-travel queries in a graph. "At time t, were u and v connected, given that edges (u, v) appear at known times"? Process queries in time order, union as you go. Or, sort queries by time and unionable edges by appearance time, walk both pointers.
12 · Five-problem progression
| # | LC | Problem | Why it's here |
|---|---|---|---|
| 1 | LC 323 | Number of Connected Components | The minimal DSU. Initialise n components; each successful union decrements the count. |
| 2 | LC 684 | Redundant Connection | The canonical cycle-detection use. First edge that fails to union is the answer. |
| 3 | LC 721 | Accounts Merge | Union accounts that share an email. Then group by root to emit the merged accounts. Practice mapping strings to integer ids. |
| 4 | LC 1319 | Make Network Connected | Count redundant cables, count components. If redundant cables ≥ components - 1, the answer is components - 1; otherwise -1. |
| 5 | LC 1202 | Smallest String With Swaps | Pairs of swappable indices form components; within each component, sort the characters. DSU plus a per-component sort. |
12a · The inverse Ackermann function — why α(n) is "constant"
Union-Find with path compression and union by rank has an amortised cost of O(α(n)) per operation, where α is the inverse Ackermann function. For any value of n you will ever encounter in this universe (n < 2^65536), α(n) ≤ 4. That is why complexity analysts call this "effectively constant" without scare quotes.
Why so good? Path compression flattens the tree on every find — repeated operations make subsequent finds cheaper. Union by rank ensures that the tree never grows tall when there's a flatter option. Together, the amortised work per operation drops to a rate that grows slower than any constant function you can write — including iterated logarithm, log* n, and similar slowly-growing functions.
The 1989 Tarjan proof is the canonical reference; the practical takeaway is that, for any input size you will see, union-find behaves like a constant-time data structure. This is why it dominates problems where you need to merge equivalence classes online (Kruskal's MST, percolation simulation, dynamic connectivity).
12b · Weighted Union-Find — when groups carry data
A weighted variant tracks the offset (or "weight") between each node and its parent. Used when the union operation conveys not just "x and y are in the same group" but "x is some specific distance/relationship from y". Examples:
Sentence similarity (LC 737). Words are grouped by similarity; the relation is symmetric and transitive. Standard union-find.
Evaluate Division (LC 399). Variables related by ratios: "a/b = 2.0, b/c = 3.0, find a/c". Weighted union-find: each node stores its ratio to its parent; find returns both the root and the cumulative ratio along the path.
Possible Bipartition / Coloring problems. "These pairs must be in different groups". A weighted union-find with weight in {0, 1} (same color / opposite color) finds contradictions automatically.
class WeightedUF:
def __init__(self, n):
self.parent = list(range(n))
self.weight = [0.0] * n # offset from this node to its parent
def find(self, x):
if self.parent[x] != x:
root = self.find(self.parent[x])
self.weight[x] += self.weight[self.parent[x]]
self.parent[x] = root
return self.parent[x]
def union(self, x, y, w):
# weight w means: x = y * w (or x is w units away from y)
rx, ry = self.find(x), self.find(y)
if rx == ry: return
# adjust weights so the relation holds
self.parent[rx] = ry
self.weight[rx] = self.weight[y] + w - self.weight[x]12c · Offline union-find — sweeping over a sorted event list
A surprising number of "what's the minimum cost to connect..." problems become trivial when you process edges in sorted order. The pattern:
1. Generate every relevant event (edge with a weight, query with a parameter).
2. Sort events by the parameter (weight, time, threshold).
3. Walk through events in order; for each, do a union or answer a query against the current state of the DSU.
Examples:
Kruskal's MST. Sort edges by weight; for each edge, union the endpoints if they're in different components. Total weight is the MST cost.
Number of connected components after each query (offline). Process queries in reverse — start from the final state, "undo" by adding edges back, answer each query at the appropriate point.
Earliest moment when two people are connected (LC 1101). Sort the friendship events by time; union as you go; the answer to "when are A and B connected" is the time of the union that first put them in the same group.
Min cost to connect all cities with a budget. Sort potential edges by cost; union until either the graph is connected or the budget is exhausted.
12d · When union-find isn't quite right
Three situations where union-find seems like the fit but is actually wrong:
You need to delete edges. Union-find supports only adding. For dynamic connectivity with deletions, you need link-cut trees or Euler-tour trees. Common in competitive programming, rare in interviews.
You need group sizes that change non-monotonically. Union-find groups only grow. If your problem requires groups to split, union-find cannot do it directly — you typically reverse the problem (process events in reverse, so splits become joins).
You need to enumerate elements of a group. Union-find tells you whether two nodes are in the same group; it does not give you the members of a group without an extra structure. Add a dict from root → list of members, but be careful about updating on union.
13 · Common pitfalls
- Skipping path compression. The code still works — find walks up the parent chain — but each operation can degrade to O(n) on adversarial inputs. With compression, the chain flattens as you go, and the amortised cost falls to α(n).
- Wrong order in union. If you write
parent[ra] = rbwithout considering rank, you can stack a tall tree on top of a short one and double its height. Always compare ranks (or sizes) first and attach the shorter tree under the taller root. - Forgetting union by rank. Path compression alone is enough for amortised O(log n); pairing it with union by rank tightens that to α(n). Both optimisations together is the standard form — leave either out and the bound loosens.
- Reaching for DSU when BFS/DFS is simpler. If the graph is static and given as an adjacency list, BFS or DFS labels connected components in one pass. DSU earns its keep when edges arrive incrementally or when you need answers between insertions.
- 2D grid mapping bugs. The flattening
r * cols + cis easy to get wrong — usingrowsinstead ofcols, or mixing uprandc. Worth writing a tiny helper and testing it on a 2×3 grid before trusting it. - Confusing
union(u, v)withunion(parent[u], parent[v]). Both work in the final result, but the latter bypasses the find-with-compression. If you pre-compute roots and reuse them, you skip the chance for path compression to flatten the tree. Always go throughfind. - The
if rank[u] < rank[v]direction. Easy to flip. Mnemonic: the TALLER tree wins — the shorter tree gets attached UNDER the taller's root. So whenrank[u] < rank[v], swap u and v so u becomes the taller; then attach v's root under u's. Trace a 2-element example to lock in the direction. - Recursive find vs iterative find on adversarial input. A recursive full-compression
findcan blow the stack if the chain is long enough (~10⁴ nodes in Python). The iterative-halving variant avoids this. For competitive programming, prefer iterative; for whiteboard clarity, recursive is fine. - Mixing rank and size in the same DSU. Pick one — they encode different things. Using rank for the union decision but size for queries is fine, but make sure size is updated correctly on every successful merge (and only on success).
- Not handling self-loops.
union(x, x)should return False (no merge). Most implementations get this right viaif ra == rb: return False, but it's worth checking. - Off-by-one in DSU size. Many problems use 1-indexed node IDs (nodes labelled 1..n). Allocate
parentwith sizen + 1, or remap to 0-indexed up front. Forgetting this is a silent OOB. - Slow string-to-id mapping in problems like LC 721 Accounts Merge. Use a dict to assign incrementing integer ids; reuse them across passes. Re-mapping strings inside the inner loop is a 5-10× slowdown.
14 · What to say in the interview
- "Dynamic connectivity / cycle in an undirected graph → union-find." Name the pattern. The interviewer wants to hear you recognise the incremental shape.
- "Parent array plus find and union." State the data structure and the two operations in one sentence.
- "Path compression in find, union by rank in union." Both optimisations together. Mention them explicitly — the bound depends on it.
- "Amortised O(α(n)) per operation, effectively constant." α is the inverse Ackermann function. For any n you'll ever see in practice, α(n) ≤ 4. Worth stating once and moving on.
- "Union returns False if the two nodes are already in the same set." That's the cycle-detection hook for problems like Redundant Connection.
- Edge cases to mention. Self-loops (
union(x, x)returns False immediately), duplicate edges, isolated nodes (each is its own component initially), node ids that don't start at zero (map them first).
15 · Where this gets asked
| Company | Common framing | Why it fits their stack |
|---|---|---|
| Number of Connected Components (LC 323), Redundant Connection (LC 684), Accounts Merge (LC 721) | Knowledge-graph + identity-resolution teams. | |
| Meta | Friend Circles / Number of Provinces (LC 547), Accounts Merge (LC 721) | Social graph: every "merge two communities" feature is a union-find. |
| Amazon | Most Stones Removed (LC 947), Satisfiability of Equality Equations (LC 990) | Inventory clustering + constraint solvers. |
| Microsoft | Graph Valid Tree (LC 261), Redundant Connection II (LC 685 — directed variant) | Bing's link graph + Azure subnet validation. |
| Bloomberg | Smallest String With Swaps (LC 1202), Sentence Similarity II (LC 737) | Reference-data deduplication. |
| Stripe / Datadog | Customer-id merging across signup paths, alert-group consolidation | Identity resolution + alert aggregation are textbook union-find. |
16 · Try it yourself
- Number of Connected Components · LC 323
- The canonical warm-up. Start with N components; every successful union decrements the count. Hint: maintain a
countvariable as you go — avoids a final scan to count distinct roots. - Redundant Connection · LC 684
- Cycle detection. The first edge that connects two already-connected nodes is the redundant one. Hint: union returns false on already-connected → that's the answer edge; no second pass needed.
- Accounts Merge · LC 721
- Multi-key entity resolution. Map every email to the first account-id you saw it on; union accounts that share any email. Hint: build the email→account map first, then iterate emails once to do unions.
- Stretch — Most Stones Removed · LC 947
- Union stones sharing a row or column; the answer is N − (number of components). Hint: encode rows and columns into a single int space (e.g. cols offset by 10000) so you can union them in the same DSU.
Further reading
- Sedgewick & Wayne — Algorithms, Chapter 1.5. The classic union-find chapter. Builds the structure from quick-find through weighted quick-union with path compression, with timings at each step.
- CLRS — Chapter 21. "Data structures for disjoint sets". The proof of the α(n) amortised bound is here; worth reading once for the analysis, even if the code is denser than Sedgewick's.
- Tarjan (1975) — "Efficiency of a good but not linear set union algorithm". The original paper that established the α(n) bound. Short and historically important.
- Adjacent: Graph algorithms. Where Kruskal's MST sits, with DSU as the engine. Also Prim's, Dijkstra, Bellman-Ford for the weighted cases.
- Adjacent: BFS and DFS. The right tools for static connectivity. Compare and contrast with DSU's incremental flavour.
- Semicolony: Algorithm visualiser. Step through DSU operations and watch the parent pointers re-route as path compression kicks in.