15 / 20 · Tier C
Patterns / 15

Topological sort

A linear ordering of a directed acyclic graph such that for every edge u → v, u appears before v. Two implementations cover almost every problem: Kahn's algorithm (BFS with in-degree counts) and DFS post-order reversed. If the graph has a cycle, no topological order exists, and the cycle check is usually the bug.


1 · Why topological sort matters: the dependency-resolution primitive

Topological sort linearises a DAG into a valid build order. Given a set of items with directed "must-happen-before" relationships, it produces a sequence where every prerequisite shows up before the thing it enables. It is the single primitive that turns "x depends on y" into "do them in this order".

Every dependency-resolution system on Earth runs on this primitive. npm install orders package installs by topological sort over the dependency tree. Bazel, Buck, and Make schedule build targets by topological sort. The Linux kernel boot sequence picks systemd units to start in topological order over the After= / Requires= relation. Airflow, Dagster, and Prefect schedule pipeline tasks the same way. When the ordering is wrong, the build fails; when there is no ordering, something is cyclic and must be broken.

The intuition is almost too simple: if you can do anything that has no unmet prerequisites, do that thing, then update what is now unblocked. Kahn's algorithm formalises this into BFS over a frontier of in-degree-0 nodes. DFS post-order does the same thing from the other end: finish a node only when all its descendants are finished, then prepend it. Two implementations, one underlying truth: a DAG has a linear extension iff it is acyclic, and the algorithm constructs that extension in O(V + E).

Why this is the L4+ graph pattern. Most "easy" graph questions are reachability (BFS / DFS) or shortest path (Dijkstra). The moment a problem says "schedule", "build", "prerequisite", "deduce the order from constraints", or "depends on", the right pattern is topological sort. Recognising the dependency-resolution framing inside a non-obvious problem (Alien Dictionary, Sequence Reconstruction) is the senior signal.

2 · How to recognise it: five distinct shapes

Topological sort is the right pattern when the problem describes dependencies between items and asks for an order that respects them. Five recognisable sub-shapes cover almost every interview problem in this family. Naming the shape early picks the right variant (plain Kahn vs level-based vs DFS three-colour) before you commit to code.

Shape A: tasks with dependencies / explicit prerequisites

  • Direct framing. "Given N tasks and a list of (a, b) pairs meaning a depends on b, output a valid order to complete all N." LC 210 Course Schedule II is the textbook example.
  • The edge direction is given. Either as "a depends on b" (edge b → a) or "to do a, finish b first" (same thing). Read carefully. Half of all bugs are inverted edges.
  • The answer is a list of nodes in some valid order. If multiple orders are valid, returning any of them is usually accepted unless the problem says "lexicographically smallest".

Shape B: course prerequisites / build order / deployment sequence

  • Real-world framing. Course scheduling, package install order, microservice deployment order. The vocabulary changes but the graph doesn't.
  • Common tells: "valid order", "any valid order", "schedule", "build", "install", "deploy". The word "prerequisite" appears verbatim in LC 207 and LC 210.
  • The graph is implicit in the input format. Often given as a list of pairs; you build the adjacency list and in-degree array yourself.

Shape C: minimum number of semesters / rounds (level-based topo)

  • Parallel-execution framing. "Each semester you can take any number of available courses." "Each round you can process every task whose prerequisites are done." LC 1136 Parallel Courses, LC 2050 Parallel Courses III.
  • The answer is a number: the count of layers, not the ordering itself.
  • Implementation tweak: instead of processing one node per loop iteration, process every in-degree-0 node together as one "level", then count levels. The level count is the critical path length.

Shape D: cycle detection (no valid topo order)

  • "Is there a cycle?" framing. LC 207 Course Schedule asks only "can all courses be finished?", equivalent to "is the graph a DAG?".
  • Two clean signals: (1) Kahn's variant: after the loop, if fewer than N nodes were emitted, a cycle exists. (2) DFS three-colour: when you visit a node currently in the recursion stack (grey), you've found a back edge.
  • Goes beyond just yes/no. "Find Eventual Safe States" (LC 802) asks which nodes are guaranteed not to be part of any cycle, solved by topo sort on the reverse graph.

Shape E: longest path in DAG (DP on topo order)

  • Optimisation on a DAG. "Longest increasing path in matrix" (LC 329), "Longest path in DAG": process nodes in topological order, take dp[v] = 1 + max(dp[u] for u in predecessors).
  • Why topo order matters: by the time you compute dp[v], every predecessor already has its final value. No re-computation, O(V + E) total.
  • Generalisation: counting paths, summing values along paths, critical-path scheduling: all DAG DP after a topological pass.
The recognition trick. The word "order" plus a dependency relation = topological sort. The word "rounds" or "semesters" or "parallel" = level-based topo (Shape C). The word "cycle" or "possible" with no required output order = cycle detection (Shape D). "Longest / count / sum along a path" on a DAG = DAG DP on topo order (Shape E). Most problems that look like "plain DFS" but constrain the order of visits collapse to one of these five.

3 · Sister algorithms: the topo-sort neighbourhood

Topological sort sits inside a small family of DAG algorithms. Each neighbour applies when plain topo doesn't quite fit: when ties matter, when timing matters, when parallel execution matters. Knowing the family prevents forcing Kahn's where a richer tool is the right one.

  • Kahn's algorithm (BFS, in-degree based). The default. Build an in-degree array, seed a queue with every zero-in-degree node, pop and decrement. Easy to reason about, easy to extend to lexicographic order (swap queue for min-heap) or level-based scheduling (process the whole queue per "round"). Detects cycles by counting emissions vs node count.
  • DFS-based topological sort (post-order reversed). Recurse from any unvisited node; on finish, prepend the node to the result. With a three-colour scheme (white / grey / black) you also detect cycles. Any edge to a grey node is a back edge. Shorter to write, harder to extend to tie-breaking or parallelism. Tarjan formalised this in 1976.
  • Critical Path Method (CPM). A two-pass DP on the topological order used in project scheduling. Forward pass computes the earliest start of every task; backward pass computes the latest start without delaying the project. Tasks where earliest == latest are on the critical path; delaying any of them delays the whole project. PERT charts are the visual form. Used in every construction-scheduling tool from MS Project to Primavera.
  • Layered topological sort (parallel execution). Group nodes by "level": node v's level is 1 + max(level of any predecessor). The number of levels equals the critical path length; nodes at the same level can run in parallel. Spark stage execution, GPU operator scheduling in TensorFlow / PyTorch, and Airflow's schedule_interval traversal all use this shape.
  • Longest path in a DAG. NP-hard on general graphs, polynomial on DAGs. After a topological pass, dp[v] = max(dp[u] for u in predecessors) + 1. Becomes the critical path with weighted nodes; becomes the LIS reduction when the DAG is "i < j and a[i] < a[j]".
  • Lexicographic topological sort. Replace Kahn's queue with a min-heap (Go's container/heap or Python's heapq). Each step extracts the smallest available node. O((V + E) log V). The standard requirement in LC 269 Alien Dictionary follow-ups and LC 1203 Sort Items by Groups.
Composition rule. Kahn's algorithm composes cleanly with anything that uses a priority queue (lex topo, weighted scheduling, "always pick the cheapest available next task"). DFS post-order composes cleanly with anything that uses the recursion stack as state (cycle path recovery, SCC via Kosaraju's two-pass). When the problem says "step by step" / "level by level", default to Kahn; when it says "find a cycle / find the cycle path", default to DFS three-colour.

4 · Canonical example: Course Schedule II (LC 210)

Problem. Given numCourses and a list of prerequisite pairs [a, b] meaning "to take course a you must first take course b", return any valid order in which to take all courses. If it's impossible (a cycle exists), return an empty list.

Input:  numCourses = 4, prerequisites = [[1,0],[2,0],[3,1],[3,2]]
Output: [0, 1, 2, 3]   (or [0, 2, 1, 3] — any valid order)
Explanation: 0 has no prereqs; 1 and 2 both depend on 0; 3 depends on both.

Kahn's algorithm. Build an adjacency list (edge from prereq to course) and an in-degree array. Push every node with in-degree 0 onto a queue; these are the courses with no prerequisites. Pop one at a time, append to the result, and decrement the in-degree of each neighbour. When a neighbour's in-degree hits 0, push it. If the final result has fewer than numCourses entries, the graph had a cycle and no valid order exists.

5 · Reference implementation in Go

The Go version uses a slice as a queue (cheaper than container/list for small N) and tracks in-degree in a slice indexed by node id.

// FindOrder returns a valid topological order of [0, numCourses),
// or an empty slice if a cycle exists.
//
// prerequisites[i] = [course, prereq] meaning to take 'course' you must
// first take 'prereq'. Edge direction: prereq -> course.
func findOrder(numCourses int, prerequisites [][]int) []int {
    adj := make([][]int, numCourses)
    indeg := make([]int, numCourses)

    for _, p := range prerequisites {
        course, prereq := p[0], p[1]
        adj[prereq] = append(adj[prereq], course)
        indeg[course]++
    }

    // Seed the queue with every node whose in-degree is zero.
    queue := make([]int, 0, numCourses)
    for i := 0; i < numCourses; i++ {
        if indeg[i] == 0 {
            queue = append(queue, i)
        }
    }

    order := make([]int, 0, numCourses)
    for len(queue) > 0 {
        node := queue[0]
        queue = queue[1:]               // pop front; for huge N use a ring buffer
        order = append(order, node)

        for _, nxt := range adj[node] {
            indeg[nxt]--
            if indeg[nxt] == 0 {
                queue = append(queue, nxt)
            }
        }
    }

    if len(order) != numCourses {
        return nil                       // cycle: some node never reached in-degree 0
    }
    return order
}
Always check the final length. The cycle detection doesn't happen during the loop; it happens at the end. If the result has fewer entries than the node count, some subset of nodes formed a cycle and never reached in-degree 0. Skipping this check returns a partial ordering and silently mis-answers any cyclic input.

6 · Variations

The shape of the queue and the bookkeeping changes by problem. Five common variations:

VariationWhat changesExample
Kahn vs DFSDFS does post-order and reverses; uses three colours (white / grey / black) to detect cycles during recursion.LC 210 either way
Lexicographic orderReplace the queue with a min-heap so the smallest available node is processed first.LC 269 Alien Dictionary, LC 1203
All topological ordersDFS with backtracking: at each step try every in-degree-0 node, recurse, undo. Exponential in the worst case.Interview follow-ups
Find one cycleDFS three-colour: when an edge points to a grey (in-progress) node, walk the recursion stack back to recover the cycle.LC 207 Course Schedule
Parallel / level-orderProcess all in-degree-0 nodes together as one "semester"; the number of levels is the minimum schedule length.LC 1136 Parallel Courses

7 · Kahn's algorithm, step by step

Kahn's algorithm is BFS-flavoured topological sort. Start with all zero-in-degree nodes in a queue; pop one, emit it, decrement neighbours, enqueue any neighbour whose in-degree hits zero. Watching it on a small DAG makes the algorithm's correctness clear. Every node enters the queue exactly once, exactly when its prerequisites are satisfied.

Ain:0Bin:0Cin:2→0Din:1→0Ein:1→0Fin:3→0Initial in-degreesA:0 B:0 C:2 D:1 E:1 F:3Queue starts:[A, B] (the two with in:0)Pop A — emit A; decrement C, Dqueue=[B, D] emit=[A]Pop B — emit B; decrement C, Equeue=[D, C, E] emit=[A, B]Pop D — emit D; decrement Fqueue=[C, E] emit=[A, B, D]Pop C, E — emit each; F→0queue=[F] emit=[A, B, D, C, E]Pop F — emit=[A, B, D, C, E, F] ✓
Why this terminates iff the graph is a DAG. Each zero-in-degree node we pop emits one element. We need to pop n nodes total to get a complete order. If at some point the queue is empty but we haven't emitted n nodes, the remaining graph has a cycle: every node in it has in-degree ≥ 1 because their predecessors are themselves cyclic. This is how Kahn's algorithm detects cycles for free: if len(emit) < n: cycle.

8 · Kahn vs DFS post-order

PropertyKahn (BFS)DFS post-order
Output orderTopological, ascending by depthTopological, reverse of finish time
Cycle detectionlen(emit) < n at endThree-colour DFS (back-edge = cycle)
Memory shapeQueue + in-degree arrayRecursion stack + colour array
Order determinismYes (with a sorted queue)Depends on iteration order
ParallelisableYes — all zero-in-degree nodes can run togetherNo — sequential recursion
Best whenYou want "as early as possible" orderingYou want lexicographically smallest via reverse-iteration

9 · Five-problem progression

#LCProblemWhy it's here
1LC 207Course ScheduleThe minimal version — only cycle detection, no order needed. Practise the in-degree mechanics.
2LC 210Course Schedule IIFull ordering. The canonical Kahn's algorithm exercise.
3LC 269Alien DictionaryThe graph is implicit — you build it from adjacent word comparisons. Easy to get the edge direction wrong.
4LC 1136Parallel CoursesLevel-by-level Kahn's: count semesters instead of producing an order. Returns -1 on a cycle.
5LC 2115Find All Possible Recipes from Given SuppliesMulti-source topological sort with seed inputs. The point where the pattern stops looking like a textbook problem.

10 · Detecting cycles: the most common variant

Topological sort exists if and only if the graph is a DAG. Use the same algorithm to detect cycles; the signal differs by approach.

Kahn's variant: after the BFS finishes, count the nodes placed in the order. If the count is less than N, there is a cycle — some nodes had non-zero in-degree throughout and were never enqueued.

DFS variant: classify each node as WHITE (unvisited), GRAY (on current path), or BLACK (fully processed). If DFS encounters a GRAY neighbour, that edge closes a cycle. The path from the GRAY node back to itself on the recursion stack is the cycle.

Kahn's is usually easier for cycle detection; DFS gives you the actual cycle path "for free" via the recursion stack. Course Schedule I (just detect cycle) maps cleanly to Kahn's; Find Eventual Safe States (LC 802) maps cleanly to DFS three-color.

11 · Multiple valid orderings: uniqueness and lexicographic

A topological order is rarely unique. Two common interview wrinkles:

"Return the lexicographically smallest topological order." Use Kahn's but with a min-heap instead of a queue. At each step, dequeue the smallest available source. O((V + E) log V).

"Is there only one valid order?" Run Kahn's; at every step, check that the queue contains exactly one element. If at any point there are two or more zero- in-degree nodes, the order is ambiguous (Sequence Reconstruction, LC 444).

"How many valid topological orders?" Computing this exactly is #P-complete in general. For small N (≤20), bitmask DP works: count[mask] = number of ways to order the prefix-set in mask. O(2^N × N).

12 · DAG-DP: the pattern beyond ordering

Many DAG problems are solved by processing nodes in topological order and doing DP on each. The pattern: order the nodes, then for each node, compute its DP value as a function of its predecessors' already-computed values.

Longest path in a DAG. Topo order, then dp[v] = 1 + max(dp[u] for u in predecessors). O(V + E). NP-hard for general graphs; DAGs are special.

Number of paths from source to destination in a DAG. Topo order, then count[v] = sum(count[u] for u in predecessors), with count[source] = 1.

Critical path in scheduling (PERT charts). Two passes — forward for earliest-start, backward for latest-start. Tasks where they are equal are on the critical path.

Parallel job scheduling. Topo order + level assignment (level[v] = max level of predecessors + 1). All nodes at the same level can run in parallel; the max level is the minimum number of rounds.

Whenever you see "edges between events", "x must happen before y", or "longest sequence of compatible things", probe whether the relation forms a DAG. If yes, topo + DP solves the problem in O(V + E).

13 · Worked problem A: LC 207 Course Schedule (cycle detection only)

Problem. Given numCourses and a list of prerequisite pairs [a, b] meaning "to take a you must first take b", return whether you can finish all courses. No order required: just a boolean.

Input:  numCourses = 2, prerequisites = [[1,0]]
Output: true
Explanation: take 0, then 1.

Input:  numCourses = 2, prerequisites = [[1,0],[0,1]]
Output: false
Explanation: 0 depends on 1, 1 depends on 0 — a cycle.

The cleanest answer is Kahn's algorithm with the count check, no result list kept. If the count of processed nodes is less than numCourses, a cycle exists.

func canFinish(numCourses int, prerequisites [][]int) bool {
    adj := make([][]int, numCourses)
    indeg := make([]int, numCourses)
    for _, p := range prerequisites {
        adj[p[1]] = append(adj[p[1]], p[0])
        indeg[p[0]]++
    }

    queue := make([]int, 0, numCourses)
    for i := 0; i < numCourses; i++ {
        if indeg[i] == 0 {
            queue = append(queue, i)
        }
    }

    processed := 0
    for len(queue) > 0 {
        node := queue[0]
        queue = queue[1:]
        processed++
        for _, nxt := range adj[node] {
            indeg[nxt]--
            if indeg[nxt] == 0 {
                queue = append(queue, nxt)
            }
        }
    }
    return processed == numCourses
}

The DFS three-colour alternative (recurse from each unvisited node, mark grey on entry, black on exit, return false if you hit a grey neighbour) is the same complexity and slightly fewer lines, but Kahn's is the answer to give first because the cycle-detection signal is more visible.

The interview pitch. "I'll use Kahn's algorithm: build the in-degree array, BFS over zero-in-degree nodes, decrement neighbours. If I processed every node, the graph is a DAG. If not, there's a cycle. O(V + E) time and space." Two sentences in, the interviewer already knows you recognised the pattern.

14 · Worked problem B: LC 210 Course Schedule II (return the order)

Problem. Same input as LC 207, but now return a valid order (any one) in which to take all courses. Return an empty list if it's impossible.

This is the canonical Kahn's exercise. The implementation is identical to Section 5 above: collect the popped node into the result slice. Two extension ideas worth practising once:

Extension 1: return all valid orderings (backtracking)

When the interviewer asks "what if you needed every valid topological order?", the answer is backtracking: at each step, try every available zero-in-degree node, recurse, then undo (reincrement the in-degrees you decremented).

// Enumerate all topological orderings. Exponential in the worst case
// (full enumeration of a permutation when no edges exist).
func allOrders(numCourses int, prerequisites [][]int) [][]int {
    adj := make([][]int, numCourses)
    indeg := make([]int, numCourses)
    for _, p := range prerequisites {
        adj[p[1]] = append(adj[p[1]], p[0])
        indeg[p[0]]++
    }

    var (
        out  [][]int
        cur  = make([]int, 0, numCourses)
    )

    var backtrack func()
    backtrack = func() {
        if len(cur) == numCourses {
            cp := make([]int, numCourses)
            copy(cp, cur)
            out = append(out, cp)
            return
        }
        for v := 0; v < numCourses; v++ {
            if indeg[v] == 0 {
                indeg[v] = -1            // mark as used
                for _, n := range adj[v] {
                    indeg[n]--
                }
                cur = append(cur, v)
                backtrack()
                cur = cur[:len(cur)-1]
                for _, n := range adj[v] {
                    indeg[n]++
                }
                indeg[v] = 0
            }
        }
    }
    backtrack()
    return out
}

Extension 2: lexicographically smallest order (min-heap)

Swap the FIFO queue for a min-heap. Each pop picks the smallest-numbered available node. Complexity becomes O((V + E) log V).

import "container/heap"

type intHeap []int
func (h intHeap) Len() int            { return len(h) }
func (h intHeap) Less(i, j int) bool  { return h[i] < h[j] }
func (h intHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *intHeap) Push(x interface{}) { *h = append(*h, x.(int)) }
func (h *intHeap) Pop() interface{}   { old := *h; n := len(old); x := old[n-1]; *h = old[:n-1]; return x }

func findOrderLex(numCourses int, prerequisites [][]int) []int {
    adj := make([][]int, numCourses)
    indeg := make([]int, numCourses)
    for _, p := range prerequisites {
        adj[p[1]] = append(adj[p[1]], p[0])
        indeg[p[0]]++
    }

    h := &intHeap{}
    for i := 0; i < numCourses; i++ {
        if indeg[i] == 0 {
            heap.Push(h, i)
        }
    }

    order := make([]int, 0, numCourses)
    for h.Len() > 0 {
        node := heap.Pop(h).(int)
        order = append(order, node)
        for _, nxt := range adj[node] {
            indeg[nxt]--
            if indeg[nxt] == 0 {
                heap.Push(h, nxt)
            }
        }
    }
    if len(order) != numCourses {
        return nil
    }
    return order
}
Why this lex-smallest extension shows up. Build systems with deterministic outputs (Bazel's --stamp mode, npm with frozen lockfiles) need reproducible ordering. The "any valid order" Kahn's variant produces different output on every run depending on input order; the heap-based variant always produces the same output. The extra log V is the price of determinism.

15 · Worked problem C (HARD): LC 269 Alien Dictionary

Problem. Given a list of words sorted lexicographically in some unknown alien alphabet, return any valid alphabet ordering of the characters that appear in the words. Return an empty string if no valid ordering exists.

Input:  words = ["wrt", "wrf", "er", "ett", "rftt"]
Output: "wertf"
Explanation:
  Compare "wrt" vs "wrf": first differing char is t < f → edge t → f.
  Compare "wrf" vs "er":  first differing char is w < e → edge w → e.
  Compare "er" vs "ett":  r < t → edge r → t.
  Compare "ett" vs "rftt": e < r → edge e → r.
  Topological order: w → e → r → t → f.

This is a Tier-A problem because the harder half is building the graph from the input — not the topological sort itself. Three things to derive:

  • What are the nodes? Every distinct character that appears anywhere in the words. Build the node set from a single pass over all characters.
  • What are the edges? For each adjacent word pair (w1, w2), walk both strings together; at the first differing character, add edge w1[i] → w2[i]. That's the only ordering constraint these two words give you.
  • What's the edge case? If w1 is a proper prefix of w2 with len(w1) < len(w2), that's fine. But if w1 is longer and w2 is a prefix of w1 (like ["abc", "ab"]), the input is invalid: no alphabet ordering can make "abc" come before "ab" because they share all of "ab" and "abc" should come after.
func alienOrder(words []string) string {
    // STEP 1: collect nodes (every unique character).
    indeg := make(map[byte]int)
    for _, w := range words {
        for i := 0; i < len(w); i++ {
            if _, ok := indeg[w[i]]; !ok {
                indeg[w[i]] = 0
            }
        }
    }

    // STEP 2: build edges from adjacent word pairs.
    adj := make(map[byte]map[byte]bool)   // dedupe edges
    for i := 0; i+1 < len(words); i++ {
        w1, w2 := words[i], words[i+1]
        minLen := len(w1)
        if len(w2) < minLen {
            minLen = len(w2)
        }
        // INVALID PREFIX CASE
        if len(w1) > len(w2) && w1[:minLen] == w2[:minLen] {
            return ""
        }
        for j := 0; j < minLen; j++ {
            if w1[j] != w2[j] {
                if adj[w1[j]] == nil {
                    adj[w1[j]] = make(map[byte]bool)
                }
                if !adj[w1[j]][w2[j]] {
                    adj[w1[j]][w2[j]] = true
                    indeg[w2[j]]++
                }
                break                     // only the first diff matters
            }
        }
    }

    // STEP 3: standard Kahn's topological sort.
    queue := make([]byte, 0)
    for ch, d := range indeg {
        if d == 0 {
            queue = append(queue, ch)
        }
    }

    var out []byte
    for len(queue) > 0 {
        ch := queue[0]
        queue = queue[1:]
        out = append(out, ch)
        for nxt := range adj[ch] {
            indeg[nxt]--
            if indeg[nxt] == 0 {
                queue = append(queue, nxt)
            }
        }
    }

    if len(out) != len(indeg) {
        return ""                          // cycle
    }
    return string(out)
}
Why this is the senior-loop topo problem. Three failure modes: (1) wrong edge direction (w1[i] comes before w2[i], so edge is w1[i] → w2[i]; invert it and you topo-sort backwards); (2) missing the invalid-prefix edge case (["abc","ab"] should return ""); (3) double-counting edges (don't increment in-degree twice for the same edge; use a set to dedupe). Get any one wrong and the visible tests still pass while the hidden tests fail.

16 · Parallel courses: layered topological sort visualised

LC 1136 Parallel Courses changes the question from "what order" to "how few semesters". You can take any number of courses in a single semester as long as all their prerequisites were taken in earlier semesters. The answer is the number of levels in the layered topological sort.

Sem 1Sem 2Sem 3Sem 4ABCDEFGH8 nodes,8 edges.4 semesters= longest path= critical path.Sequential wouldtake 8 semesters.

The implementation is one tweak from Section 5: instead of processing one node per outer loop iteration, process the entire current queue as a "semester", then move on. Count semesters.

func minimumSemesters(n int, relations [][]int) int {
    adj := make([][]int, n+1)
    indeg := make([]int, n+1)
    for _, r := range relations {
        adj[r[0]] = append(adj[r[0]], r[1])
        indeg[r[1]]++
    }

    queue := make([]int, 0)
    for i := 1; i <= n; i++ {
        if indeg[i] == 0 {
            queue = append(queue, i)
        }
    }

    semesters := 0
    processed := 0
    for len(queue) > 0 {
        // Snapshot the current queue as "this semester's courses".
        nextQueue := make([]int, 0)
        for _, node := range queue {
            processed++
            for _, nxt := range adj[node] {
                indeg[nxt]--
                if indeg[nxt] == 0 {
                    nextQueue = append(nextQueue, nxt)
                }
            }
        }
        queue = nextQueue
        semesters++
    }

    if processed != n {
        return -1                          // cycle
    }
    return semesters
}
The level count is the critical path length. The number of semesters equals the longest chain of prerequisites, the critical path. Even with infinite parallelism, that chain bounds the schedule. Same insight drives CPM, Spark stage scheduling, and any framework that parallelises DAG execution.

17 · How to solve hard topological sort problems, step-by-step

The hardest topo problems aren't hard because Kahn's algorithm is hard. They're hard because the graph is implicit and you have to derive the nodes, edges, and direction from the input. Five questions that defuse most of them:

  1. What is the node, and what is the edge direction? The most common bug is an inverted edge — Course Schedule pairs are [course, prereq], so the edge from prereq → course points to "what's now enabled". Write the edge meaning out loud: "edge u → v means u must finish before v". If you can't articulate it, you don't have the graph right.
  2. Kahn's or DFS three-colour? Default to Kahn's — easier to reason about, easier to extend. Pick DFS only if (a) you need to recover the actual cycle path, or (b) the problem composes with DFS-based machinery (SCC via Kosaraju, post-order DP on trees that turn out to be DAGs).
  3. Does the problem need tie-breaking? If the answer must be "lexicographically smallest" or "earliest by id", swap the FIFO queue for a min-heap. Costs O(log V) per pop. Don't add this complexity unless asked.
  4. Is this layered (parallel) or strict (sequential)? If the question is "how many semesters / rounds / batches", process the queue level-by-level (Section 16). If the question is "what order", pop one at a time (Section 14). Same data structures, different outer loop.
  5. Does it need cycle detection or just an ordering? If the input is guaranteed a DAG, you can skip the final count check — but it costs nothing to keep it as defensive code. If the input can be cyclic, the count check is the answer to "does a valid ordering exist?". For both Course Schedule and Alien Dictionary, the cycle case is part of the problem statement.
The recipe. Node and edge direction → algorithm choice → tie-breaking → layered or strict → cycle handling. Do those five in order and the rest is bookkeeping. The hardest topo problems hide one or two of these decisions; recognising that you have to make them is the whole skill.

18 · Common pitfalls: what breaks

  • Forgetting the cycle check. A cycle means no topological order exists, but Kahn's loop terminates either way — it just produces a short result. Always compare len(order) against the node count at the end.
  • Incrementing in-degree on processing instead of decrementing. The in-degree counts how many prerequisites a node still has. When a prerequisite finishes, the count goes down; the node is ready when the count reaches 0.
  • Wrong edge direction — the #1 bug. "u depends on v" means the edge is v → u (v must come first). Course Schedule pairs are [course, prereq] in LC's convention; the edge is from p[1] to p[0]. Read carefully. Invert this and the topo sort returns a reversed ordering that silently passes the visible test cases.
  • Not handling multiple sources. Many real DAGs have several independent zero-in-degree nodes (multiple disconnected components, or several "roots"). Seed the queue with all of them in one pass, not just the first one you find. The "find any in-degree-0 node and start there" pattern from naive implementations misses the others entirely.
  • Ambiguity in tie-breaking when multiple orderings are valid. A plain FIFO queue gives one valid order, not the lexicographically smallest. If the problem demands a specific tie-break, use a min-heap or sort the per-level snapshot. Two correct topo orders can both be wrong answers to "the lex-smallest" question.
  • Missing nodes with zero edges (isolated nodes). A node with no prerequisites and no dependents still has in-degree 0 and must appear in the result. Iterate i := 0; i < n; i++ when seeding the queue — don't iterate over the edge list. Half of "wrong answer on small inputs" bugs are missing isolated nodes.
  • Self-loops are cycles of length 1. An edge u → u means u depends on itself — instant cycle. Kahn's handles this naturally (u's in-degree never reaches 0), but it's worth a sanity check: a single self-loop produces a result shorter than n, which the cycle check catches.
  • Duplicate edges inflate in-degree. If the input contains [a, b] twice, naively incrementing in-degree twice over-counts. Use a set per node (map[int]bool in Go) to dedupe edges as you build the adjacency list, or accept the cost and trust the count check.
  • The hidden "invalid input" case in Alien Dictionary. If word w1 has w2 as a prefix and len(w1) > len(w2), no alphabet ordering is consistent — return "". Easy to miss; tests for it specifically.
  • Recursion-depth blowups in DFS post-order. A long chain of 10⁴ nodes blows the default stack in many languages (Python's 1000-frame limit, JVM defaults). Convert to iterative post-order with an explicit stack or use Kahn's instead.

7 · What to say in the interview

  1. "Dependency ordering → topological sort." Name the pattern out loud. The interviewer wants to hear you recognise the shape.
  2. "Kahn's algorithm — BFS with in-degree counts." Be specific. Mention you'll seed the queue with every in-degree-0 node and decrement as you process.
  3. "DFS-based works too — post-order reversed." Show you know both. DFS is shorter; Kahn's extends better to parallel scheduling and lexicographic order.
  4. "Cycle detection falls out naturally." If the result is shorter than the node count, the graph has a cycle. No extra pass needed.
  5. "O(V + E) time, O(V + E) space." Adjacency list plus in-degree array. The queue holds at most V nodes.
  6. Edge cases to mention. Empty graph, single node, disconnected components (still works — all in-degree-0 nodes seed the queue), self-loop (a one-node cycle), duplicate edges (inflate in-degree if unguarded).

8 · Where this gets asked

CompanyCommon framingWhy it fits their stack
GoogleCourse Schedule (LC 207), Course Schedule II (LC 210), Alien Dictionary (LC 269)Alien Dictionary is the Google L4 signature topo problem — build the graph from word pairs, then sort.
MetaBuild a Matrix With Conditions (LC 2392), Sort Items by Groups Respecting Dependencies (LC 1203)Multi-level dependency ordering — applies to news-feed re-ranking and rollout-order constraints.
AmazonCourse Schedule (LC 207), Parallel Courses (LC 1136), Minimum Height Trees (LC 310)Build-order in CI/CD pipelines and package-manager resolution maps directly to topo sort.
MicrosoftCourse Schedule II (LC 210), Find Eventual Safe States (LC 802), Loud and Rich (LC 851)VS Code's task graph + Azure DevOps pipeline ordering — same shape as these problems.
AppleCourse Schedule (LC 207), Alien Dictionary (LC 269)The two canonical topo problems show up in iOS/Xcode build-system rounds at senior loops.
Databricks / SnowflakeQuery-plan DAG ordering, Spark stage execution, Airflow DAG validationTopological sort literally underpins how their schedulers fire tasks. Asked routinely in onsite design rounds.
Pattern of patterns. Three sub-shapes — (1) Kahn's algorithm (BFS): in-degree array + queue of zero-in-degree nodes (cycle detection falls out — if you can't process all N nodes, there's a cycle), (2) DFS post-order: recurse, prepend the node to the result on finish (use a visiting/visited tri-state to catch cycles), (3) build-the-graph-first problems (Alien Dictionary) where the harder half is deriving the edges from the input.

9 · Try it yourself

Course Schedule · LC 207
The cycle-detection use of topo sort. Hint: Kahn's algorithm — count in-degrees, queue zero-in-degree nodes, process by decrementing neighbours. If you processed fewer than N nodes, there's a cycle → return false.
Course Schedule II · LC 210
Return the actual ordering. Hint: same as LC 207, but collect the processed order. The list of nodes as they leave the queue IS a valid topo order. If it has fewer than N items, cycle → return empty list.
Alien Dictionary · LC 269
Build the graph from word pairs. Hint: for each adjacent pair, find the first differing character — that's the edge. Watch the edge case where word1 is a strict prefix of word2 but word1 is LONGER → invalid input, return "". Then run standard topo sort.
Minimum Height Trees · LC 310
Topo-sort-flavoured leaf trimming. Hint: not strictly topo sort, but the same in-degree mechanic. Repeatedly remove leaves (degree-1 nodes) until ≤ 2 nodes remain. Those are the centroids.
Stretch — Sort Items by Groups · LC 1203
Two-level topo sort. Hint: build a group-level graph AND an item-within-group graph. Topo-sort the groups, then sort items within each group, then concatenate. Items in group −1 get their own singleton group.
How to practise. Always write down what each node represents and what each EDGE represents BEFORE you start. "Edge u → v means u must come before v." If you can't say it cleanly, your graph orientation is probably backwards — and a topo sort on a reversed graph gives a reversed answer. Half of Alien Dictionary bugs are edge-direction errors.

Further reading

  • Kleinberg & Tardos — Algorithm Design, Section 3.6. Clean treatment of DAG ordering with the proof that any DAG has a topological order.
  • CLRS — Section 22.4. The DFS-based version with the colour-marking cycle detection. Worth reading once for the formal proof.
  • Adjacent: BFS. Kahn's algorithm is BFS with a different bookkeeping rule. The queue mechanics carry over directly.
  • Adjacent: DFS. The post-order-reversed variant lives here. Same three-colour scheme also detects cycles in general directed graphs.
  • Adjacent: Graph algorithms. Where shortest path, strongly connected components, and the rest of the graph toolbox live.
  • Semicolony: Algorithm visualizer. Step through Kahn's algorithm on small graphs and watch the in-degree array drain.
Found this useful?