13 / 20 · Tier C

Patterns / 13

Union-Find — disjoint set union

A parent array plus two operations: find(x) returns the representative of x's group, union(x, y) merges the two groups. With path compression in find and union by rank in union, each operation is amortised effectively O(α(n)) — for any input you'll ever see, that's a constant.

1 · Why union-find / DSU matters

Union-find is the data structure that turns "merge equivalence classes online" into a problem you can solve in near-constant amortised time per operation. The theoretical bound is O(α(n)) — inverse Ackermann — where α(n) ≤ 4 for any n < 2⁶⁵⁵³⁶. For any input on the planet, treat it as constant.

The senior-interview signal is unmistakable: "edges arriving incrementally" + "ask about connectivity" = union-find. The shape recurs across infrastructure (does removing this DC link disconnect the network?), social (which friend circles merge as new friendships form?), simulation (when does the percolating cluster span the grid?), and graph algorithms (Kruskal's MST is just sorted edges + union-find).

Real production examples

Kruskal's minimum spanning tree. Sort all edges by weight; for each, union endpoints if they're in different components. Used in network design (laying fibre between data centers cheaply), clustering, and image segmentation. The DSU is the engine inside Kruskal.
Image segmentation (Felzenszwalb-Huttenlocher). Pixels are nodes; edges connect adjacent pixels with weights = colour difference. Sort edges; union pixels whose difference is below an adaptive threshold. The remaining components are the segments. Used in many computer-vision pipelines.
Social-network friend-of-friend. When two users become friends, union their connected-component IDs. "Are these two users connected via mutual friends?" reduces to "do they share a root?" — O(α(n)) per query at any scale.
Percolation simulations. Simulate fluid spreading on a grid: at time t, mark cells as "open" in random order; ask "is there an open path from top to bottom?". Union-find maintains the connected components; the percolation threshold is the fraction of open cells when the top connects to the bottom. The classic Sedgewick demo problem.
Identity resolution / customer deduplication. Stripe, Salesforce, etc. need to merge customer records that share an email, phone, or device ID. Each merge condition is a union. LC 721 (Accounts Merge) is the cleaned-up version of this real problem.
Build systems / dependency cycle detection. In a tool like Bazel, when adding a new build edge between two targets, check whether it would create a cycle. Equivalent to union-find with the edge-direction tracked separately.

Why senior interviewers love DSU. It's the rare data structure where the algorithm is short (15 lines), the complexity analysis is non-trivial (α(n) requires understanding amortised analysis), and the real-world applications are immediate. A candidate who can name path compression AND union by rank AND explain why the combination gives α(n) (not just log n) signals depth that 5 minutes of textbook reading can't fake.

2 · How to recognise it

Union-Find is the right pattern when the problem is incremental: edges or relationships arrive over time and you keep being asked about connectivity.

Dynamic connectivity (add edges, ask "are u and v connected?"). The bread-and-butter case. Each new edge → union; each query → two finds.
Cycle detection in an undirected graph. If union(u, v) returns "already in same set", the edge would close a cycle. LC 684 Redundant Connection is the canonical example.
Count connected components / count islands. Start with n components; each successful union decrements the count. Final value = number of components. LC 547 Number of Provinces, LC 200 Number of Islands (via DSU).
Kruskal's minimum spanning tree. Sort edges by weight; for each, union endpoints if they're in different components. The MST is the set of edges that actually merged something.
Percolation / grid connectivity. Open cells on a grid one at a time; ask "is the top connected to the bottom yet?". Map (r,c) → r*cols+c; union with neighbours when opened. Classic Sedgewick demo.
"Smallest equivalent string" / "regions merged by rule" (LC 1061, LC 947). Equivalences arrive (these two characters are equivalent; or these two stones share a row/column); union them; answer based on component leader.
Offline queries — process in reverse. "Edges that DISAPPEAR over time, with connectivity queries between" can be flipped: process events in REVERSE so deletes become adds. LC 803 Bricks Falling When Hit is the classic.
Weighted union-find for ratio / offset queries. When the relation is "x is 2× y" (LC 399 Evaluate Division), store an offset along each parent pointer; find returns both root and cumulative offset.
"Earliest time when X and Y connect" (LC 1101). Sort events by time, union as you go, the answer for a (u, v) query is the time of the union that first put them in the same set.

Union-Find vs BFS/DFS for connectivity. BFS and DFS work on a static graph: build the adjacency list, traverse, label components. Union-Find is incremental — it answers connectivity questions as edges arrive without rebuilding anything. If the graph is given once and never changes, BFS/DFS is usually simpler. If edges stream in, reach for DSU.

3 · Intuition — every component has a root

A union-find structure is a forest of trees stored implicitly in a parent array. Each tree's root represents one component; find(x) walks parent pointers upward until the root; union(a, b) attaches one root under the other.

Two operations, one invariant. The invariant: two nodes are in the same component iff they share the same root. find answers questions about the invariant; union maintains it. The optimisations — path compression in find, union by rank in union — are what get the amortised cost to α(n). Without them, a chain of n unions can build a tree of depth n, making find O(n) per call.

4 · Canonical example — Redundant Connection (LC 684)

Problem. You're given a list of edges that form a tree with one extra edge added. The extra edge creates exactly one cycle. Return that edge — if multiple answers exist, return the one that appears last in the input.

Input:  edges = [[1,2], [1,3], [2,3]]
Output: [2, 3]
Explanation: edges [1,2] and [1,3] connect 1, 2, 3 into a tree.
             Adding [2,3] closes the cycle 1 - 2 - 3 - 1.

Process the edges in order. For each (u, v), ask: are u and v already in the same component? If yes, this edge would close a cycle — it's the redundant one, return it. If no, union them and continue. Because the problem guarantees exactly one extra edge, the first cycle-closing edge we see is the answer.

5 · Reference implementation

class DSU:
    def __init__(self, n: int):
        self.parent = list(range(n))   # each node is its own parent
        self.rank   = [0] * n          # tree height upper bound

    def find(self, x: int) -> int:
        # path compression: point every node on the path directly at the root
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a: int, b: int) -> bool:
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False               # already connected; no merge happened
        # union by rank: attach the shorter tree under the taller one
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True


def redundant_connection(edges: list[list[int]]) -> list[int]:
    dsu = DSU(len(edges) + 1)          # nodes are 1..n
    for u, v in edges:
        if not dsu.union(u, v):
            return [u, v]              # union returned False -> cycle closer
    return []

Why find returns the root, not a bool. Most DSU operations need the root identifier — to compare two nodes, to attach one tree under another, to label a component. Returning the root keeps the API uniform. union returns a bool only as a convenience: True if it actually merged, False if the two nodes were already in the same set.

6 · Sister algorithms

Standard union-find handles "add edge, query connectivity" efficiently. When the problem deviates — deletions, dynamic trees, persistent rollback, weighted components — one of the following structures takes over.

Link-cut trees (LCT). Sleator & Tarjan, 1985. Dynamic tree connectivity: link two trees, cut an edge, query path-sum / path-max / LCA / depth — all in O(log n) per operation. The generalisation of union-find to support deletions. Used in network-flow algorithms and competitive programming.
Euler-tour trees. Another dynamic-tree structure. Maintain the Euler tour of a forest in a balanced BST; link/cut by splicing the tour. Slightly different trade-offs from LCT — easier in some scenarios.
Persistent union-find. Each operation produces a new version without mutating the old. Useful for offline queries that need to roll back to a previous state. O(log n) per operation, O(n log n) memory.
Union-find with rollback. Standard DSU with path compression DISABLED so the parent pointers can be undone. Each union pushes its mutation onto a stack; rollback() pops and restores. Used in offline algorithms (divide-and-conquer on time, "what if we hadn't done this edge?").
Weighted union-find (offset-tracking). Each parent pointer carries a weight; find returns both root and the accumulated weight from x to root. Solves "what's the ratio between a and b given a chain of ratios?" — LC 399 Evaluate Division — and any "relative offset" problem.
Small-to-large merging. For each component, store the actual element list. On union, iterate the smaller side and migrate elements into the larger. Total cost O(n log² n) — strictly worse than DSU's α(n) but conceptually simpler when you need to enumerate a component's members.
Bipartite union-find / 2-coloring DSU. Track a parity bit alongside each node — "x and root are same colour (0) / opposite colour (1)". Detects bipartite-violating unions (LC 785 cousin), and underlies "are these constraints satisfiable" questions.
Distributed DSU. Each node owns its own portion of the parent array; unions go through a coordinator. Used in distributed graph processing (Pregel-style frameworks) when the graph doesn't fit on one machine.

The escalation ladder. Standard DSU → weighted DSU (when offsets matter) → DSU with rollback (when you need to undo) → link-cut trees (when you need deletions on a tree) → Euler-tour trees (when LCT is the wrong shape). Most interview problems stop at standard or weighted DSU. Mentioning LCT by name when the problem talks about edge deletions is a credible senior signal.

7 · Variations

The core parent[] array stays the same; what changes is how you compress paths, how you pick the new root, and what extra data you hang off each node.

Variation	What changes	When to use
Path compression: iterative vs recursive	Iterative does two passes (find root, then re-point). Recursive does it in one elegant call but risks stack overflow on long chains.	Iterative for production / deep inputs. Recursive for whiteboard clarity.
Union by rank vs union by size	Rank tracks an upper bound on tree height. Size tracks the number of nodes. Both give the same α(n) bound; size is sometimes more useful when the problem asks for component sizes.	Union by size when you also need `component_size(x)` for free.
Weighted union-find (with offsets)	Each edge carries a weight; `find` also returns the offset from x to its root. Lets you answer "what's the relative weight of x vs y" if they're connected.	LC 399 Evaluate Division, currency-conversion-style problems.
Persistent union-find	Each operation produces a new version of the structure without mutating the old one. Useful for offline queries on changing graphs.	Competitive programming, rarely interviews.
Union-find on a grid (2D → 1D mapping)	Flatten `(r, c)` to `r * cols + c`. The parent array becomes 1D; neighbour iteration stays 2D.	LC 200 Number of Islands done with DSU, LC 305 Number of Islands II.

7a · Path compression in action

Path compression is the optimisation that takes union-find from O(log n) to nearly O(1) amortised. Every find() traversal re-roots intermediate nodes directly to the component root, flattening the tree as a side effect. Three states of the same forest:

Why α(n) and not log n. Without compression, a chain of n unions can build a tree of depth n, making find O(n). Union by rank alone gives O(log n). Path compression on top — every find flattens the tree it walked — drives the amortised cost to α(n), the inverse Ackermann function. For any n ≤ 2^65536, α(n) ≤ 4. In practice, everyone says "amortised O(1)" and treats α as a constant.

7b · Two optimisations, ordered by impact

Optimisation	What it does	Worst-case find
Plain quick-union	No optimisation; each union sets one parent.	O(n) — chain
Union by rank/size	Always attach shorter tree under taller.	O(log n)
Path compression alone	Find flattens every walked node to root.	O(log n) amortised
Both (standard)	Rank guides union, compression flattens find.	O(α(n)) ≈ O(1)

class DSU:
    def __init__(self, n: int):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x: int) -> int:
        # PATH COMPRESSION — flatten on the way up
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x: int, y: int) -> bool:
        rx, ry = self.find(x), self.find(y)
        if rx == ry: return False         # already same component
        # UNION BY RANK — taller tree wins
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

8 · Three worked problems — easy, medium-hard, hard

Three union-find problems that show the pattern's full range. Counting components (LC 547), union with a structural twist (LC 952), and the time-reversed offline trick (LC 803).

8a · LC 547 — Number of Provinces

Problem. Given an n×n matrix isConnected where isConnected[i][j] = 1 means cities i and j are directly connected, return the number of provinces (connected components).

Input:  isConnected = [[1,1,0],
                       [1,1,0],
                       [0,0,1]]
Output: 2
Explanation: Cities 0 and 1 are connected; city 2 is alone. Two provinces.

The canonical "count components" application. Initialise n singletons; for each connection edge, attempt to union; track a counter that starts at n and decrements on each successful merge.

func findCircleNum(isConnected [][]int) int {
    n := len(isConnected)
    parent := make([]int, n)
    rank := make([]int, n)
    for i := range parent {
        parent[i] = i
    }
    count := n

    var find func(x int) int
    find = func(x int) int {
        if parent[x] != x {
            parent[x] = find(parent[x]) // path compression
        }
        return parent[x]
    }

    union := func(a, b int) {
        ra, rb := find(a), find(b)
        if ra == rb {
            return
        }
        // Union by rank
        if rank[ra] < rank[rb] {
            ra, rb = rb, ra
        }
        parent[rb] = ra
        if rank[ra] == rank[rb] {
            rank[ra]++
        }
        count--
    }

    for i := 0; i < n; i++ {
        for j := i + 1; j < n; j++ {
            if isConnected[i][j] == 1 {
                union(i, j)
            }
        }
    }
    return count
}

Why DSU and not BFS here. Both work — BFS labels components in O(V + E). DSU has the same complexity but generalises trivially: if the next variant of the question adds "now process queries: 'are cities X and Y in the same province?'", DSU answers each query in O(α(n)) without re-running BFS. Choosing DSU over BFS for this question is a signal that you anticipate where the problem might extend.

8b · LC 952 — Largest Component Size by Common Factor

Problem. Given an array of unique positive integers, draw an edge between any two values that share a common factor > 1. Return the size of the largest connected component.

Input:  nums = [4, 6, 15, 35]
Output: 4
Explanation: 4-6 share factor 2; 6-15 share factor 3; 15-35 share factor 5.
             All four are connected.

The naive approach: for every pair (i, j), check gcd(nums[i], nums[j]) > 1 and union. O(n² log max). For n = 2 × 10⁴, that's 4 × 10⁸ — too slow.

The trick: instead of unioning numbers directly, union each number with its prime factors. Two numbers that share a prime factor will end up in the same component because both unioned with that prime. The DSU has space for both "real numbers" (indexed 0..n-1) and "primes" (indexed by prime value).

func largestComponentSize(nums []int) int {
    maxVal := 0
    for _, v := range nums {
        if v > maxVal {
            maxVal = v
        }
    }
    // DSU big enough to hold numbers AND their prime factors
    size := maxVal + 1
    parent := make([]int, size)
    for i := range parent {
        parent[i] = i
    }
    var find func(x int) int
    find = func(x int) int {
        if parent[x] != x {
            parent[x] = find(parent[x])
        }
        return parent[x]
    }
    union := func(a, b int) {
        ra, rb := find(a), find(b)
        if ra != rb {
            parent[ra] = rb
        }
    }

    // For each number, union it with each of its prime factors
    for _, v := range nums {
        x := v
        for p := 2; p*p <= x; p++ {
            if x%p == 0 {
                union(v, p)
                for x%p == 0 {
                    x /= p
                }
            }
        }
        if x > 1 {
            union(v, x) // remaining prime factor
        }
    }

    // Count occurrences per component (only counting actual nums values)
    count := make(map[int]int)
    best := 0
    for _, v := range nums {
        root := find(v)
        count[root]++
        if count[root] > best {
            best = count[root]
        }
    }
    return best
}

The "extra entities in the DSU" trick. Adding primes as first-class DSU elements lets two numbers be "transitively connected via a shared factor" without ever explicitly testing the pair. The DSU's size grows by O(max value), but the union work drops from O(n²) to O(n × √max). This is a recurring pattern: when direct pairwise edges are too expensive, find a smaller set of hub entities and union each real node with its hubs.

8c · LC 803 — Bricks Falling When Hit (HARD)

Problem. A 2D grid where 1 = brick, 0 = empty. Bricks in the top row are "stable"; other bricks are stable iff they're connected (via 4-directional adjacency) to a stable brick. Process a sequence of hits — each hit removes one brick and returns how many bricks fall as a consequence.

Input:  grid = [[1,0,0,0],[1,1,1,0]]
        hits = [[1,0]]
Output: [2]
Explanation: After removing grid[1][0], bricks at (1,1) and (1,2) lose
             support and fall. Two bricks fall.

The straightforward simulation — for each hit, remove the brick and re-run BFS to count fallen bricks — is O(hits × grid). On big inputs, too slow.

The offline trick: process hits in REVERSE. Removing a brick becomes adding it back. Start from the final state (grid with ALL hits applied). Compute the initial set of stable bricks via a DSU connected to a virtual "ceiling" node. Then walk the hits in reverse: for each, add the brick back, union it with its (now stable) neighbours, and the increase in the ceiling component's size − 1 is the number of bricks that would have fallen in the forward simulation.

func hitBricks(grid [][]int, hits [][]int) []int {
    rows, cols := len(grid), len(grid[0])
    // Virtual ceiling node at index rows*cols
    CEILING := rows * cols
    parent := make([]int, rows*cols+1)
    size := make([]int, rows*cols+1)
    for i := range parent {
        parent[i] = i
        size[i] = 1
    }

    var find func(x int) int
    find = func(x int) int {
        if parent[x] != x {
            parent[x] = find(parent[x])
        }
        return parent[x]
    }
    union := func(a, b int) {
        ra, rb := find(a), find(b)
        if ra == rb {
            return
        }
        parent[ra] = rb
        size[rb] += size[ra]
    }
    idx := func(r, c int) int { return r*cols + c }
    inside := func(r, c int) bool { return r >= 0 && r < rows && c >= 0 && c < cols }

    // STEP 1: Apply all hits up front
    for _, h := range hits {
        if grid[h[0]][h[1]] == 1 {
            grid[h[0]][h[1]] = -1 // mark "was hit"
        }
    }

    // STEP 2: Build DSU over the post-hit grid; bricks in row 0 connect to CEILING
    for r := 0; r < rows; r++ {
        for c := 0; c < cols; c++ {
            if grid[r][c] != 1 {
                continue
            }
            if r == 0 {
                union(idx(r, c), CEILING)
            }
            if inside(r-1, c) && grid[r-1][c] == 1 {
                union(idx(r, c), idx(r-1, c))
            }
            if inside(r, c-1) && grid[r][c-1] == 1 {
                union(idx(r, c), idx(r, c-1))
            }
        }
    }

    dirs := [][2]int{{-1, 0}, {1, 0}, {0, -1}, {0, 1}}
    result := make([]int, len(hits))

    // STEP 3: Walk hits in REVERSE, adding bricks back
    for i := len(hits) - 1; i >= 0; i-- {
        r, c := hits[i][0], hits[i][1]
        if grid[r][c] != -1 {
            continue // was already empty before hit
        }
        before := size[find(CEILING)]
        grid[r][c] = 1
        if r == 0 {
            union(idx(r, c), CEILING)
        }
        for _, d := range dirs {
            nr, nc := r+d[0], c+d[1]
            if inside(nr, nc) && grid[nr][nc] == 1 {
                union(idx(r, c), idx(nr, nc))
            }
        }
        after := size[find(CEILING)]
        if after > before {
            result[i] = after - before - 1 // -1 because the just-added brick doesn't count as "fallen"
        }
    }
    return result
}

The offline reverse trick is the entire algorithm. Forward — removing bricks — would require some kind of dynamic connectivity with deletions (link-cut trees). Reversed — adding bricks — is plain union-find. The technique generalises: whenever a problem asks about a sequence of deletions with connectivity queries, try processing in reverse so deletes become inserts. The transformation is the whole insight; the DSU code that follows is mechanical.

9 · How to solve hard union-find problems step-by-step

Five questions to answer before writing code. The DSU class is short; almost all the senior craft is in modelling the problem into elements and equivalences.

What are the elements? Cells of a grid? Array indices? Strings (emails, accounts)? Numbers AND their prime factors (LC 952)? Decide what each integer in parent[] represents. If elements are non-integers, plan a string-to-id mapping.
What's the equivalence relation? "Connected by an edge" is the obvious case. But it could be "share a common prime factor", "in the same row or column", "equal by transitive rule". Some equivalences are obvious; others (like the prime-factor twist in LC 952) need a structural insight.
Is it online or offline? Online: queries arrive one at a time, you must answer each immediately — standard DSU. Offline: you have all queries upfront — you can sort them, reverse them, or pre-process. Offline unlocks tricks like "reverse the time axis so deletions become insertions" (LC 803).
Does each component need extra data? Size (so you can answer "how big is the component containing x"), sum (for weighted-sum queries), min/max element (for "what's the smallest element in x's component", LC 1061). Store at the root; update on union.
Plan the optimisations. Always include path compression AND union by rank (or size). These together give the α(n) bound; either alone gives O(log n). For competitive-programming-tight time limits, path compression by halving (point each node directly to its grandparent) is slightly weaker but cache-friendlier than full recursive compression.

The senior signal. Junior candidates write "find(x) = parent[x]" and forget compression. Senior candidates pause to identify the right elements and equivalence relation BEFORE writing any DSU code — and articulate the online/offline distinction. The data structure itself is intermediate-level; the modelling is what separates implementations.

10 · Path compression vs union by rank — which optimisations actually matter

Both optimisations are needed for the α(n) bound. The proof is Tarjan's 1975 paper; the practical takeaway is that omitting either drops you to O(log n) per operation.

Variant	Worst-case find	Implementation effort
Quick-union (no opts)	O(n) — a chain	5 lines
Path compression alone	O(log n) amortised	6 lines (one recursive call)
Union by rank alone	O(log n) worst case	+2 lines (the rank array, comparison)
Both together	O(α(n)) amortised — effectively constant	~12 lines total

Two compression flavours: full vs halving

Path compression has two common implementations. Both achieve amortised α(n) (with union by rank); they differ in constant factor and recursion depth.

// FULL PATH COMPRESSION — recursive, every walked node points to root
func find(x int) int {
    if parent[x] != x {
        parent[x] = find(parent[x]) // recursive; every node on path -> root
    }
    return parent[x]
}

// PATH COMPRESSION BY HALVING — iterative, every node points to its grandparent
func find(x int) int {
    for parent[x] != x {
        parent[x] = parent[parent[x]] // grandparent jump
        x = parent[x]
    }
    return x
}

Full compression (recursive). Every node on the path is pointed directly at the root. Strongest flattening; uses recursion (stack depth = tree height). Best amortised behaviour but worst stack usage.
Halving (iterative). Each node is pointed at its grandparent. Slightly less aggressive flattening, but no recursion → no stack overflow risk, and the tighter inner loop is cache-friendlier. Same asymptotic complexity. Often the production choice.
Splitting. Like halving, but points the child at its grandparent AND moves the pointer up by two each iteration. Subtle improvement; rarely used.

Union by rank vs union by size

Rank = upper bound on tree height. Size = number of nodes in the tree. Either gives the same α(n) bound when paired with path compression. Pick based on what you also need:

Union by rank. Slightly simpler; rank changes only when two trees of equal rank merge. Use when you don't need component sizes for the problem itself.
Union by size. Tracks the actual node count per component. Use when "how big is X's component" is a query you need to answer (LC 952 Largest Component Size, LC 1319 Make Network Connected). Free side-effect that often pays for itself.

The pragmatic default. Path compression by halving + union by size. Both iterative, both give α(n), and you get free component-size queries. Mention the recursive full-compression variant if asked, but reach for halving in production code.

11 · Offline trick — process queries in reverse

Union-find supports inserts but not deletes. Many problems naturally ask about deletes (bricks fall when hit, edges disappear, components break apart over time). The senior trick is to reverse the time axis — turn deletes into inserts — and answer the original queries in the right order at the end.

The recipe

Apply all deletes upfront to compute the FINAL state.
Build the DSU over that final state. Union surviving connections; record the initial answer (component count, component sizes, ceiling-connected count).
Walk the deletes in reverse. For each, ADD the element back; union with neighbours; observe the change in the relevant quantity (e.g., increase in component size, decrease in component count, change in ceiling-connectivity).
The change at step i, in reverse, is the answer to the original (forward-time) query i.

Why this works

The forward problem is "delete X, observe consequence Y". The reverse problem is "add X, observe consequence −Y". Both describe the same transition between two graph states; the reverse just happens to be tractable with union-find (insertions) while the forward isn't (deletions). The arithmetic of "what changed" is symmetric.

Where this trick applies

LC 803 — Bricks Falling When Hit. The canonical worked example (§8c). Bricks "removed by hits" become bricks "added in reverse". The DSU tracks connectivity to the ceiling; the increase in ceiling-component-size on reverse-add is the answer.
"Edges disappear over time; ask connectivity at each moment". Process events in reverse; deletions become insertions. Each query has a known "freeze time"; reorder them so each is answered at the right reverse-time step.
Offline percolation. "When does the path from top to bottom break?" Reverse: start fully connected, remove edges, ask when the top-bottom connectivity is lost — but actually, the FORWARD percolation problem (adding cells) is what union-find handles directly. The reverse trick is for the dual question.
Time-travel queries in a graph. "At time t, were u and v connected, given that edges (u, v) appear at known times"? Process queries in time order, union as you go. Or, sort queries by time and unionable edges by appearance time, walk both pointers.

The trick named. "Offline + reverse the time axis" is one of two or three high-use senior-interview tricks. The other two are "binary search on the answer" and "sort and sweep". All three convert a hard online problem into an easy offline one by sacrificing latency for throughput. Worth naming explicitly in interviews ("this is an offline problem; I can reverse the hits to turn deletes into adds and use plain union-find").

12 · Five-problem progression

#	LC	Problem	Why it's here
1	LC 323	Number of Connected Components	The minimal DSU. Initialise n components; each successful union decrements the count.
2	LC 684	Redundant Connection	The canonical cycle-detection use. First edge that fails to union is the answer.
3	LC 721	Accounts Merge	Union accounts that share an email. Then group by root to emit the merged accounts. Practice mapping strings to integer ids.
4	LC 1319	Make Network Connected	Count redundant cables, count components. If redundant cables ≥ components - 1, the answer is components - 1; otherwise -1.
5	LC 1202	Smallest String With Swaps	Pairs of swappable indices form components; within each component, sort the characters. DSU plus a per-component sort.

12a · The inverse Ackermann function — why α(n) is "constant"

Union-Find with path compression and union by rank has an amortised cost of O(α(n)) per operation, where α is the inverse Ackermann function. For any value of n you will ever encounter in this universe (n < 2^65536), α(n) ≤ 4. That is why complexity analysts call this "effectively constant" without scare quotes.

Why so good? Path compression flattens the tree on every find — repeated operations make subsequent finds cheaper. Union by rank ensures that the tree never grows tall when there's a flatter option. Together, the amortised work per operation drops to a rate that grows slower than any constant function you can write — including iterated logarithm, log* n, and similar slowly-growing functions.

The 1989 Tarjan proof is the canonical reference; the practical takeaway is that, for any input size you will see, union-find behaves like a constant-time data structure. This is why it dominates problems where you need to merge equivalence classes online (Kruskal's MST, percolation simulation, dynamic connectivity).

12b · Weighted Union-Find — when groups carry data

A weighted variant tracks the offset (or "weight") between each node and its parent. Used when the union operation conveys not just "x and y are in the same group" but "x is some specific distance/relationship from y". Examples:

Sentence similarity (LC 737). Words are grouped by similarity; the relation is symmetric and transitive. Standard union-find.

Evaluate Division (LC 399). Variables related by ratios: "a/b = 2.0, b/c = 3.0, find a/c". Weighted union-find: each node stores its ratio to its parent; find returns both the root and the cumulative ratio along the path.

Possible Bipartition / Coloring problems. "These pairs must be in different groups". A weighted union-find with weight in {0, 1} (same color / opposite color) finds contradictions automatically.

python weighted_uf.py

12345678910111213141516171819

class WeightedUF:
    def __init__(self, n):
        self.parent = list(range(n))
        self.weight = [0.0] * n  # offset from this node to its parent

    def find(self, x):
        if self.parent[x] != x:
            root = self.find(self.parent[x])
            self.weight[x] += self.weight[self.parent[x]]
            self.parent[x] = root
        return self.parent[x]

    def union(self, x, y, w):
        # weight w means: x = y * w  (or x is w units away from y)
        rx, ry = self.find(x), self.find(y)
        if rx == ry: return
        # adjust weights so the relation holds
        self.parent[rx] = ry
        self.weight[rx] = self.weight[y] + w - self.weight[x]

12c · Offline union-find — sweeping over a sorted event list

A surprising number of "what's the minimum cost to connect..." problems become trivial when you process edges in sorted order. The pattern:

1. Generate every relevant event (edge with a weight, query with a parameter).

2. Sort events by the parameter (weight, time, threshold).

3. Walk through events in order; for each, do a union or answer a query against the current state of the DSU.

Examples:

Kruskal's MST. Sort edges by weight; for each edge, union the endpoints if they're in different components. Total weight is the MST cost.

Number of connected components after each query (offline). Process queries in reverse — start from the final state, "undo" by adding edges back, answer each query at the appropriate point.

Earliest moment when two people are connected (LC 1101). Sort the friendship events by time; union as you go; the answer to "when are A and B connected" is the time of the union that first put them in the same group.

Min cost to connect all cities with a budget. Sort potential edges by cost; union until either the graph is connected or the budget is exhausted.

12d · When union-find isn't quite right

Three situations where union-find seems like the fit but is actually wrong:

You need to delete edges. Union-find supports only adding. For dynamic connectivity with deletions, you need link-cut trees or Euler-tour trees. Common in competitive programming, rare in interviews.

You need group sizes that change non-monotonically. Union-find groups only grow. If your problem requires groups to split, union-find cannot do it directly — you typically reverse the problem (process events in reverse, so splits become joins).

You need to enumerate elements of a group. Union-find tells you whether two nodes are in the same group; it does not give you the members of a group without an extra structure. Add a dict from root → list of members, but be careful about updating on union.

13 · Common pitfalls

Skipping path compression. The code still works — find walks up the parent chain — but each operation can degrade to O(n) on adversarial inputs. With compression, the chain flattens as you go, and the amortised cost falls to α(n).
Wrong order in union. If you write parent[ra] = rb without considering rank, you can stack a tall tree on top of a short one and double its height. Always compare ranks (or sizes) first and attach the shorter tree under the taller root.
Forgetting union by rank. Path compression alone is enough for amortised O(log n); pairing it with union by rank tightens that to α(n). Both optimisations together is the standard form — leave either out and the bound loosens.
Reaching for DSU when BFS/DFS is simpler. If the graph is static and given as an adjacency list, BFS or DFS labels connected components in one pass. DSU earns its keep when edges arrive incrementally or when you need answers between insertions.
2D grid mapping bugs. The flattening r * cols + c is easy to get wrong — using rows instead of cols, or mixing up r and c. Worth writing a tiny helper and testing it on a 2×3 grid before trusting it.
Confusing union(u, v) with union(parent[u], parent[v]). Both work in the final result, but the latter bypasses the find-with-compression. If you pre-compute roots and reuse them, you skip the chance for path compression to flatten the tree. Always go through find.
The if rank[u] < rank[v] direction. Easy to flip. Mnemonic: the TALLER tree wins — the shorter tree gets attached UNDER the taller's root. So when rank[u] < rank[v], swap u and v so u becomes the taller; then attach v's root under u's. Trace a 2-element example to lock in the direction.
Recursive find vs iterative find on adversarial input. A recursive full-compression find can blow the stack if the chain is long enough (~10⁴ nodes in Python). The iterative-halving variant avoids this. For competitive programming, prefer iterative; for whiteboard clarity, recursive is fine.
Mixing rank and size in the same DSU. Pick one — they encode different things. Using rank for the union decision but size for queries is fine, but make sure size is updated correctly on every successful merge (and only on success).
Not handling self-loops. union(x, x) should return False (no merge). Most implementations get this right via if ra == rb: return False, but it's worth checking.
Off-by-one in DSU size. Many problems use 1-indexed node IDs (nodes labelled 1..n). Allocate parent with size n + 1, or remap to 0-indexed up front. Forgetting this is a silent OOB.
Slow string-to-id mapping in problems like LC 721 Accounts Merge. Use a dict to assign incrementing integer ids; reuse them across passes. Re-mapping strings inside the inner loop is a 5-10× slowdown.

14 · What to say in the interview

"Dynamic connectivity / cycle in an undirected graph → union-find." Name the pattern. The interviewer wants to hear you recognise the incremental shape.
"Parent array plus find and union." State the data structure and the two operations in one sentence.
"Path compression in find, union by rank in union." Both optimisations together. Mention them explicitly — the bound depends on it.
"Amortised O(α(n)) per operation, effectively constant." α is the inverse Ackermann function. For any n you'll ever see in practice, α(n) ≤ 4. Worth stating once and moving on.
"Union returns False if the two nodes are already in the same set." That's the cycle-detection hook for problems like Redundant Connection.
Edge cases to mention. Self-loops (union(x, x) returns False immediately), duplicate edges, isolated nodes (each is its own component initially), node ids that don't start at zero (map them first).

15 · Where this gets asked

Company	Common framing	Why it fits their stack
Google	Number of Connected Components (LC 323), Redundant Connection (LC 684), Accounts Merge (LC 721)	Knowledge-graph + identity-resolution teams.
Meta	Friend Circles / Number of Provinces (LC 547), Accounts Merge (LC 721)	Social graph: every "merge two communities" feature is a union-find.
Amazon	Most Stones Removed (LC 947), Satisfiability of Equality Equations (LC 990)	Inventory clustering + constraint solvers.
Microsoft	Graph Valid Tree (LC 261), Redundant Connection II (LC 685 — directed variant)	Bing's link graph + Azure subnet validation.
Bloomberg	Smallest String With Swaps (LC 1202), Sentence Similarity II (LC 737)	Reference-data deduplication.
Stripe / Datadog	Customer-id merging across signup paths, alert-group consolidation	Identity resolution + alert aggregation are textbook union-find.

Pattern of patterns. Three sub-shapes — (1) build connected components from an edge list, (2) detect a cycle as you build (union returns false → cycle), (3) weighted/sized union-find for "how many elements share my group". Master those three and most union-find questions are adaptations.

16 · Try it yourself

Number of Connected Components · LC 323: The canonical warm-up. Start with N components; every successful union decrements the count. Hint: maintain a count variable as you go — avoids a final scan to count distinct roots.
Redundant Connection · LC 684: Cycle detection. The first edge that connects two already-connected nodes is the redundant one. Hint: union returns false on already-connected → that's the answer edge; no second pass needed.
Accounts Merge · LC 721: Multi-key entity resolution. Map every email to the first account-id you saw it on; union accounts that share any email. Hint: build the email→account map first, then iterate emails once to do unions.
Stretch — Most Stones Removed · LC 947: Union stones sharing a row or column; the answer is N − (number of components). Hint: encode rows and columns into a single int space (e.g. cols offset by 10000) so you can union them in the same DSU.

How to practise. Write a generic DSU class from scratch in your language of choice, with path compression + union-by-rank. Once that's in your fingers, every union-find problem becomes "wrap the input as edges, call union/find, count components". The data structure is the hard part; the problems are wrappers.

Union-Find — disjoint set union

1 · Why union-find / DSU matters

Real production examples

2 · How to recognise it

3 · Intuition — every component has a root

4 · Canonical example — Redundant Connection (LC 684)

5 · Reference implementation

6 · Sister algorithms

7 · Variations

7a · Path compression in action

7b · Two optimisations, ordered by impact

8 · Three worked problems — easy, medium-hard, hard

8a · LC 547 — Number of Provinces

8b · LC 952 — Largest Component Size by Common Factor

8c · LC 803 — Bricks Falling When Hit (HARD)

9 · How to solve hard union-find problems step-by-step

10 · Path compression vs union by rank — which optimisations actually matter

Two compression flavours: full vs halving

Union by rank vs union by size

11 · Offline trick — process queries in reverse

The recipe

Why this works

Where this trick applies

12 · Five-problem progression

12a · The inverse Ackermann function — why α(n) is "constant"

12b · Weighted Union-Find — when groups carry data

12c · Offline union-find — sweeping over a sorted event list

12d · When union-find isn't quite right

13 · Common pitfalls

14 · What to say in the interview

15 · Where this gets asked

16 · Try it yourself

Further reading

Trie — prefix tree