23 / 24 · Tier C
Patterns / 23

Advanced DP

Vanilla DP covers most LeetCode mediums. The advanced shapes are what crack hards. Bitmask when N ≤ 20. Interval when the answer comes from merging subarrays. Digit DP when the input is the number 10¹⁸. Re-rooting when every node needs an answer. Convex hull trick and Knuth optimisation when the inner loop has a monotone structure. Each is a constraint-driven escalation on top of the regular DP recipe.


1 · Why these matter

Vanilla DP — 1D linear, 2D grid, knapsack — clears most LeetCode mediums. The advanced shapes start showing up on hards and in the senior-onsite tier where constraints look weird on purpose. Five recurring shapes worth investing in.

  • Bitmask DP. State is a subset of items, encoded as an integer. LC 847 Shortest Path Visiting All Nodes, LC 691 Stickers to Spell Word, LC 1255 Maximum Score Words Formed by Letters.
  • Interval DP. State is a range [l, r]; the answer combines subintervals split at some k. LC 312 Burst Balloons, LC 1000 Minimum Cost to Merge Stones, LC 1547 Minimum Cost to Cut a Stick.
  • Digit DP. Count integers in [L, R] with some digit property. LC 233 Number of Digit One, LC 600 Non-Negative Integers Without Consecutive Ones, LC 902 Numbers At Most N Given Digit Set.
  • Tree DP with re-rooting. Compute the answer at every node in O(n) total. LC 834 Sum of Distances in Tree, LC 310 Minimum Height Trees (variant), Codeforces problem set's bread and butter.
  • Convex hull trick. Linear-in-x DP transitions. Rare on LeetCode, common in competitive programming. Worth recognising the shape.
The constraint cheat-sheet. N ≤ 20 → bitmask. N ≤ 500 with "split somewhere" → interval DP, O(N²) or O(N³). Number up to 10¹⁸ with a digit property → digit DP. "Answer at every node of a tree" → re-rooting. Read the constraint before you reach for the algorithm.

2 · How to recognise each

Each technique has a one-line tell. If you can match the prompt to one of the five, the state usually falls out of the archetype.

TellTechniqueConstraint hint
"Visit all of a small set" / "permutation of items"Bitmask DPN ≤ 20 (sometimes 22 if memory's tight)
"Merge / burst / split until empty"Interval DPN ≤ 100 (O(N³)) or N ≤ 500 (O(N²) with optimisation)
"Count integers in [L, R] with property X"Digit DPL, R up to 10¹⁸; the digit count is the state size
"For each node, compute f(node) over the tree"Tree DP with re-rootingN up to 10⁵, but you need every node's answer
dp[i] = min over j of (dp[j] + a[j]·x[i] + b[j])Convex hull trick / Li ChaoN up to 10⁵ with linear-in-x transitions
dp[i][j] = min over k of (dp[i][k] + dp[k+1][j] + cost) with Monge costKnuth optimisationDrops O(N³) to O(N²) — niche

3 · Bitmask DP

Represent subsets of an n-element set as integers in [0, 2ⁿ). Bit i set means "item i is in the subset". The state is the subset (plus maybe one extra parameter like "currently at node u"). The transition flips one bit or iterates over submasks.

Subsets of 3 — each row = popcount00000001001001001000001101010110100110101100011110111101111011112⁴ = 16 subsets

For n = 4, that's 16 subsets. For n = 16, 65 536. For n = 20, ~10⁶, the practical ceiling. With one extra dimension (e.g., "and you're currently standing at node u"), the table is 2ⁿ × n. Past 20 the state space explodes.

Canonical bit operations

mask | (1 << i)     // add element i
mask &^ (1 << i)    // remove element i        (Go's bit-clear operator)
mask & (1 << i)     // test if i is in the subset (truthy when present)
bits.OnesCount(uint(mask))  // popcount — how many elements in the subset
mask == 0           // empty subset (base case for many recurrences)

// Iterate over every non-empty submask of mask:
for sub := mask; sub > 0; sub = (sub - 1) & mask {
    // sub is a subset of mask
}

LC 847 — Shortest Path Visiting All Nodes

Undirected graph of n ≤ 12 nodes. Return the length of the shortest path that visits every node (you can start and end anywhere, revisit nodes/edges). Classic TSP variant. State: (currentNode, visitedMask). Transition: extend to any neighbour, set its bit. BFS over the state space gives the shortest path in number of edges.

// LC 847 — BFS over (node, mask) states. O(n^2 * 2^n) time.
func shortestPathLength(graph [][]int) int {
    n := len(graph)
    full := (1 << n) - 1

    type state struct{ node, mask int }
    visited := make(map[state]bool, n<<n)
    q := []state{}

    // Start from every node simultaneously
    for i := 0; i < n; i++ {
        s := state{i, 1 << i}
        q = append(q, s)
        visited[s] = true
    }

    steps := 0
    for len(q) > 0 {
        next := make([]state, 0, len(q))
        for _, s := range q {
            if s.mask == full {
                return steps
            }
            for _, nb := range graph[s.node] {
                ns := state{nb, s.mask | (1 << nb)}
                if !visited[ns] {
                    visited[ns] = true
                    next = append(next, ns)
                }
            }
        }
        q = next
        steps++
    }
    return -1
}
Why BFS, not DP recursion. The graph is undirected and we want the shortest path in edges, not weighted distance. BFS over the state space gives that for free — every "layer" of the BFS is one more edge traversed. The state space has at most n × 2ⁿ nodes (here, 12 × 4096 ≈ 49 152), each visited once.

4 · Interval DP

State: dp[l][r] = answer on the subarray a[l..r]. Transition: try every split point k in (l, r) and combine the two halves. Or, in burst-balloons style, try every last element k in [l, r] to be processed and recurse on the two open subintervals.

The catch is iteration order. To compute dp[l][r] from subintervals, the subintervals must already be computed. Iterate by interval length from 2 upward — never by l or r directly.

LC 312 — Burst Balloons

You have n balloons each with a number. Bursting balloon i earns nums[i-1] * nums[i] * nums[i+1] (with implicit 1s outside the array). After bursting, neighbours close up. Maximise total coins. Naively 2ⁿ orderings; interval DP brings it to O(n³).

The trick. Instead of "which balloon to burst first", ask "which balloon in [l, r] is burst last". If k is last, its neighbours at the moment of bursting are nums[l-1] and nums[r+1] (the boundary balloons, which never get burst in this subproblem). The two halves [l, k-1] and [k+1, r] are independent.

// LC 312 — Burst Balloons. O(n^3) time, O(n^2) space.
func maxCoins(nums []int) int {
    // Pad with 1s on both ends to handle boundaries uniformly
    a := append([]int{1}, nums...)
    a = append(a, 1)
    n := len(a)

    // dp[l][r] = max coins from bursting all balloons strictly between l and r
    dp := make([][]int, n)
    for i := range dp {
        dp[i] = make([]int, n)
    }

    // Iterate by interval length — the key to correct ordering
    for length := 2; length < n; length++ {
        for l := 0; l+length < n; l++ {
            r := l + length
            best := 0
            // k is the LAST balloon to burst in the open interval (l, r)
            for k := l + 1; k < r; k++ {
                coins := a[l]*a[k]*a[r] + dp[l][k] + dp[k][r]
                if coins > best {
                    best = coins
                }
            }
            dp[l][r] = best
        }
    }
    return dp[0][n-1]
}
Why "last burst" works and "first burst" doesn't. If you ask "which is burst first", after bursting k the neighbours of all remaining balloons change, and the two halves are no longer independent. Reversing the question — "which is burst last" — fixes the neighbours of the last one to the boundary, and the subproblems decouple cleanly. This re-framing trick shows up in several interval DP problems (LC 1000 Merge Stones, LC 1547 Cut a Stick).

Other interval DP problems worth knowing

  • LC 1000 — Minimum Cost to Merge Stones. Merge K consecutive piles at a time. Three-dimensional state dp[l][r][p] for "merged into p piles".
  • LC 1547 — Minimum Cost to Cut a Stick. Sort cut positions, add the endpoints, interval-DP over the cut list.
  • Matrix Chain Multiplication (CLRS). The textbook example. Same shape as LC 312 — split at every k, combine the two halves.
  • LC 87 — Scramble String. 3D interval DP over two strings — every split position of one matches a (possibly swapped) split of the other.

5 · Digit DP

"Count integers in [L, R] with property P." If R is up to 10¹⁸ you can't iterate; digit DP processes the number one digit at a time. The state is (position, tight, leadingZero, problemSpecific):

  • position — current digit index, 0 to ~18.
  • tight — boolean; are we still bounded above by R's digits? If true, the next digit can be at most R's digit at that position; if false, anything 0–9.
  • leadingZero — boolean; are we still in the leading-zero prefix? Needed when "0001" should be treated as "1" for digit-property purposes.
  • problem-specific state — for "count of digit 1": the running count. For "no adjacent equal digits": the previous digit.

Compute count(R) - count(L - 1) to get the answer on [L, R].

LC 233 — Number of Digit One

Count occurrences of the digit 1 in all integers from 0 to n. There's a closed-form O(log n) formula, but the digit-DP framing generalises to harder problems.

// LC 233 — Number of Digit One via digit DP. O(log n) time.
func countDigitOne(n int) int {
    if n <= 0 {
        return 0
    }
    digits := []int{}
    for x := n; x > 0; x /= 10 {
        digits = append([]int{x % 10}, digits...)
    }
    L := len(digits)

    // memo[pos][cnt][tight] — leadingZero not needed since 1s don't appear in leading zeros
    memo := make(map[[3]int]int)
    var dp func(pos, cnt int, tight bool) int
    dp = func(pos, cnt int, tight bool) int {
        if pos == L {
            return cnt
        }
        t := 0
        if tight {
            t = 1
        }
        key := [3]int{pos, cnt, t}
        if !tight {
            if v, ok := memo[key]; ok {
                return v
            }
        }
        limit := 9
        if tight {
            limit = digits[pos]
        }
        total := 0
        for d := 0; d <= limit; d++ {
            nc := cnt
            if d == 1 {
                nc++
            }
            total += dp(pos+1, nc, tight && d == limit)
        }
        if !tight {
            memo[key] = total
        }
        return total
    }
    return dp(0, 0, true)
}
The memoisation rule. Only cache states where tight == false. Tight states depend on the specific input number — a tight state at position 5 of "12345" is different from a tight state at position 5 of "98765". Non-tight states are fully determined by (position, accumulated state) and can be shared across calls.

Other digit DP problems

  • LC 600 — Non-Negative Integers Without Consecutive Ones. Extra state: the previous bit. Classic teaching example.
  • LC 902 — Numbers At Most N Given Digit Set. Iterate only over allowed digits at each position.
  • LC 2376 — Count Special Integers. Extra state: bitmask of digits used so far (10 bits).
  • LC 1067 — Digit Count in Range. Same shape as LC 233 with the target digit parametrised.

6 · Tree DP with re-rooting

Standard tree DP picks a root, runs DFS, and answers a question about that root. Re-rooting asks: what if every node were the root? Naive "DFS from every node" is O(n²). Re-rooting gets you O(n) by doing two DFS passes — one rooted at an arbitrary node, then one that "rerolls" the answer as you walk from a parent to each child.

The two-pass recipe

  1. Pass 1 (post-order): root at node 0, compute down[u] = answer considering only the subtree at u.
  2. Pass 2 (pre-order): walk from root downward, computing ans[u] = answer if u were the root, using the parent's ans and the subtree contribution being "removed" from one side.

LC 834 — Sum of Distances in Tree

Tree with n nodes. For each node u, compute the sum of distances from u to every other node. O(n) total via re-rooting.

Let size[u] = subtree size at u (when rooted at 0). Let down[u] = sum of distances from u to all nodes in its subtree. Then for ans[root] = down[0]. Re-rooting from parent p to child c: when you move the root from p to c, every node in c's subtree gets one step closer (subtract size[c]) and every node outside gets one step farther (add n - size[c]). So ans[c] = ans[p] - size[c] + (n - size[c]).

root at p — ans[p] = down[p]pc·size[c] = 3root at c — ans[c] = ans[p] − size[c] + (n − size[c])pcflip: c's subtree closer, rest farther
// LC 834 — Sum of Distances in Tree. O(n) total via re-rooting.
func sumOfDistancesInTree(n int, edges [][]int) []int {
    adj := make([][]int, n)
    for _, e := range edges {
        adj[e[0]] = append(adj[e[0]], e[1])
        adj[e[1]] = append(adj[e[1]], e[0])
    }

    size := make([]int, n)   // size[u] = subtree size at u
    ans := make([]int, n)    // pass 1: ans[u] = sum of distances within subtree

    // PASS 1 — post-order: compute size and "down" answer
    var dfs1 func(u, parent int)
    dfs1 = func(u, parent int) {
        size[u] = 1
        for _, v := range adj[u] {
            if v == parent {
                continue
            }
            dfs1(v, u)
            size[u] += size[v]
            ans[u] += ans[v] + size[v]  // every node in v's subtree is one further from u
        }
    }
    dfs1(0, -1)

    // PASS 2 — pre-order: re-root from parent to child
    var dfs2 func(u, parent int)
    dfs2 = func(u, parent int) {
        for _, v := range adj[u] {
            if v == parent {
                continue
            }
            ans[v] = ans[u] - size[v] + (n - size[v])
            dfs2(v, u)
        }
    }
    dfs2(0, -1)

    return ans
}
The re-rooting formula derivation. When you re-root from u to its child v: every node in v's subtree (there are size[v] of them) is now one edge closer to the root → subtract size[v]. Every other node (there are n - size[v] of them) is one edge farther → add n - size[v]. Net change: n - 2·size[v]. One line, one DFS pass.

7 · Convex hull trick

When the DP transition has the shape dp[i] = min over j of (dp[j] + a[j] · x[i] + b[j]), each candidate j is a line y = a[j]·x + (dp[j] + b[j]) in some x variable. At query x[i] you want the line with the minimum value at x[i]. The lower envelope of all lines is the convex hull, and queries on it can be answered in O(log n) per query (or amortised O(1) if both insertions and queries are monotone).

Two implementations in practice:

  • Monotone CHT — both insert slopes and query x's are monotone. Maintain a deque; O(n) total.
  • Li Chao treesegment tree over the x-range, each node stores the line that dominates at its midpoint. O(log V) per insert/query. Less fiddly, easier to get right under time pressure.

LeetCode rarely needs this — most "minimise" problems with O(n²) DP either pass with the brute force or have a different optimisation (two pointers, monotonic queue). Worth recognising the shape so you don't waste time when an interviewer asks about it.

Where you'll actually see CHT. Codeforces and atcoder problems with the shape "split an array into k contiguous pieces, minimise the sum of f(piece) where f is linear in length". Stock problems with quadratic cost. Rarely on Google/Meta onsites — they prefer cleaner shapes.

8 · Knuth optimisation

Applies to recurrences of the form dp[i][j] = min over k in [i, j) of (dp[i][k] + dp[k+1][j] + cost(i, j)) when cost satisfies the Monge condition: cost(a, c) + cost(b, d) ≤ cost(a, d) + cost(b, c) for a ≤ b ≤ c ≤ d. Under that condition the optimal split k*(i, j) is monotone in i and j, which lets you restrict the inner loop to k*(i, j-1) ≤ k ≤ k*(i+1, j). The total work telescopes from O(n³) to O(n²).

Used in optimal binary search tree (CLRS 15.5) and a handful of classical problems. Mostly competitive programming. Recognise it; don't memorise the proof.

9 · Divide-and-conquer DP

A different monotone-k optimisation. Applies when dp[i][j] = min over k of (dp[i-1][k] + cost(k, j)) and the optimal k for index j in row i is monotone in j. Computes one row in O(n log n) via divide-and-conquer: split j in the middle, find the optimal k for the midpoint by full scan, then the optimal ks for j's left half lie in [lo, k*] and the right half in [k*, hi]. Recurse.

Compared to Knuth: D&C DP optimises a single row, Knuth optimises an interval recurrence. Same monotonicity intuition, different structure. LC has a couple of these (LC 1478 Allocate Mailboxes is closest), but they're typically solvable with simpler techniques.

10 · Worked hard problem — LC 691 Stickers to Spell Word

Problem. You're given a list of stickers (each a string) and a target word. Each sticker can be used any number of times. Return the minimum number of stickers needed to spell the target (any order), or -1 if impossible.

Constraints. Target length up to 15. 2¹⁵ = 32 768 subsets of the target's letter positions. Bitmask DP fits.

State. dp[mask] = minimum stickers needed to cover the letters of the target at the positions marked in mask. We start from mask = (1 << n) - 1 (all letters needed) and shrink down to 0.

Transition. For each sticker, count how many of each letter it provides. From state mask, "apply" the sticker by walking the target positions still in mask and consuming the sticker's letters greedily — that gives the next mask. dp[mask] = 1 + min(dp[next]) over all stickers.

// LC 691 — Stickers to Spell Word. Bitmask DP with memoisation.
// O(2^n * S * L) where n = len(target), S = #stickers, L = max sticker length.
func minStickers(stickers []string, target string) int {
    n := len(target)
    full := (1 << n) - 1

    // Pre-count each sticker's letters
    counts := make([]map[byte]int, len(stickers))
    for i, s := range stickers {
        c := map[byte]int{}
        for j := 0; j < len(s); j++ {
            c[s[j]]++
        }
        counts[i] = c
    }

    memo := make(map[int]int)

    var solve func(mask int) int
    solve = func(mask int) int {
        if mask == 0 {
            return 0
        }
        if v, ok := memo[mask]; ok {
            return v
        }
        best := -1
        for _, c := range counts {
            // Optimisation: only try stickers that cover the lowest unset bit.
            // (Without this the recursion explores symmetric orderings.)
            lowBit := 0
            for (mask>>lowBit)&1 == 0 {
                lowBit++
            }
            if c[target[lowBit]] == 0 {
                continue
            }
            avail := map[byte]int{}
            for k, v := range c {
                avail[k] = v
            }
            next := mask
            for j := 0; j < n; j++ {
                if next&(1<<j) == 0 {
                    continue
                }
                if avail[target[j]] > 0 {
                    avail[target[j]]--
                    next &^= 1 << j
                }
            }
            if next == mask {
                continue   // sticker contributed nothing
            }
            sub := solve(next)
            if sub == -1 {
                continue
            }
            if best == -1 || sub+1 < best {
                best = sub + 1
            }
        }
        memo[mask] = best
        return best
    }
    return solve(full)
}
The "lowest unset bit" optimisation. Without it, the recursion tries every sticker at every state, generating exponentially many symmetric orderings (sticker A then B reaches the same mask as B then A). Constraining the next sticker to one that covers the lowest still-needed letter cuts the branching dramatically — often the difference between TLE and passing.

11 · How to solve a hard DP step-by-step

  1. Identify the state. What information do you need at decision time? The state must be the minimal sufficient statistic — small enough to fit in memory, complete enough that no relevant context is missing. State definition is 80% of the problem.
  2. Identify transitions. From a state, what are the choices? For each choice, where do you go? The recurrence falls out once you can articulate this in English.
  3. Read the constraint to pick the shape. N ≤ 20 → bitmask. N up to 500 with "split" → interval. Number up to 10¹⁸ → digit DP. "Answer at every node" → re-rooting. Pick before coding.
  4. Think about evaluation order. Bottom-up needs every cell read to be already written. Interval DP iterates by length, knapsack by item then capacity, tree DP by post-order. Get this right or the table is garbage.
  5. Memoisation vs tabulation. Memoisation is easier to write and only computes reachable states. Tabulation is faster (no function-call overhead), avoids stack-depth issues, and enables rolling-array space optimisation.
  6. Space optimisation last. Get it correct first. Once you see that dp[i] only depends on dp[i-1], roll to two arrays. Premature optimisation is debug hell.

12 · Complexity

TechniqueTimeSpaceCanonical LC
Bitmask DPO(2ⁿ · n) typicalO(2ⁿ · n)LC 847, 691, 1255, 526
Bitmask + iterate submasksO(3ⁿ) — sum of (n choose k) · 2ᵏO(2ⁿ)LC 1986 Min Sessions, LC 1986
Interval DPO(n³) usually, O(n²) with KnuthO(n²)LC 312, 1000, 1547, 87
Digit DPO(D · S · base) where D ≈ 18, S = state sizeO(D · S)LC 233, 600, 902, 2376
Tree DP + re-rootingO(n) — two DFS passesO(n)LC 834, 310, 2538
Convex hull trick (monotone)O(n) amortisedO(n)Rare on LC; CF/AtCoder staple
Convex hull trick (Li Chao)O(n log V) per insert/queryO(V)Rare on LC
Knuth optimisationO(n²) instead of O(n³)O(n²)Optimal BST; rare on LC
D&C DP per layerO(n log n) per layerO(n)LC 1478 (close), LC 1335

13 · What breaks

  • Bitmask states explode at N = 21. 2²¹ × 21 ≈ 44 million state-action pairs, and each often touches a map. Once N > 20, look for a different shape — meet-in-the-middle, branch-and-bound, or rethink the state entirely.
  • Interval DP iterated in the wrong order. Iterating by l from 0 to n then r from l+1 to n reads cells that haven't been computed. Always iterate by length first, then by l. If you forget, the table fills with stale zeros and the answer is silently wrong.
  • Digit DP forgetting the leading-zero state. For problems where "0042" should be treated as "42" (e.g., "no two adjacent equal digits"), the leading-zero flag is required. Without it, you double-count or miscount.
  • Digit DP caching tight states. Cache key includes tight, but the tight-true memo is meaningless across different upper bounds. Either don't memoise tight states or include the bound in the key.
  • Re-rooting with wrong subtree handling. When re-rooting, the contribution from the child you're moving to must be subtracted, not the whole subtree. Common bug: subtracting down[v] instead of accounting for the edge-flip correctly.
  • Confusing "first" and "last" in interval DP. Burst Balloons works because "last burst" decouples; "first burst" doesn't. Stones-merging works because "first pile after which to split" decouples. Re-frame until the subproblems are independent.
  • CHT with non-monotone queries. The deque-based CHT assumes monotone query x's. With arbitrary queries you need Li Chao or a balanced BST over slopes. Wrong choice → silently wrong answers.

14 · Further reading

  • Errichto — DP YouTube series. The clearest free explanation of bitmask, digit, and interval DP, with live problem-solving on Codeforces problems. Search "Errichto dynamic programming" on YouTube.
  • CP-Algorithms — Dynamic Programming chapter. Free, comprehensive. The digit DP, profile DP, and CHT pages are the canonical web references. cp-algorithms.com/dynamic_programming.
  • AtCoder Educational DP Contest. 26 problems from Frog (A) to Subtree (V) that systematically walk through every DP shape. The bitmask and tree-DP problems (M–V) are exactly the patterns on this page.
  • CLRS — Chapter 15. Optimal BST construction is the canonical Knuth-optimisation example. Matrix chain multiplication is the textbook interval DP.
  • USACO Guide — Advanced section. Bitmask DP, sum-over-subsets (SOS) DP, and convex hull trick chapters are written for high-school olympiad audiences, which means they're unusually clear.
  • Adjacent: vanilla DP. If state-and-transition isn't second nature yet, fix that first.
  • Adjacent: tree DP. Re-rooting is one technique built on top of post-order tree DP.
Found this useful?