06 / 20 · Tier B
Patterns / 06

BFS: breadth-first search

Queue plus visited set. Explore by distance: every node at distance k from the source is visited before any node at distance k + 1. This property is what makes BFS the right tool for unweighted shortest path and level-order traversal. The visited-set bug is the single most common reason it fails.


1 · Why BFS matters: the intuition

BFS is the algorithm shape behind every "shortest" question on an unweighted graph. The reason is structural, not clever. The queue invariant (pop in the order you pushed) forces exploration to expand layer by layer. Every node at distance k from the source is discovered before any node at distance k + 1. So the first time you reach a node, the distance you reached it at is the shortest.

DFS does not have this property. Trust DFS for "any path" or "all components"; trust BFS for "fewest moves" or "minimum steps". The difference is one line of code (queue vs stack) and one line of complexity reasoning (the visited-on-enqueue invariant), but the downstream guarantees are very different.

There is a second reason BFS is worth knowing cold. Most "graph" problems on LeetCode are not given as graphs. They are given as grids, strings, locks, puzzles, words. The graph is implicit. Nodes are states, edges are valid transitions. BFS is the tool that turns a state-space problem into a shortest-path problem in three lines of Go.

The queue invariant in one sentence. If everything in the queue has distance d or d + 1, and you only enqueue children of d-distance nodes, the queue stays monotone in distance. This is the proof that BFS finds shortest paths. Lose the invariant (by allowing re-enqueues, or by mixing edge weights) and BFS silently becomes wrong, not slow.

2 · How to recognise it

BFS is the right pattern when any of the following hold. The first two are the strongest signals; the rest support. Each row is a sub-pattern worth recognising on its own.

  • Shortest path in an unweighted graph. The defining use case. If edge weights are uniform (or absent), BFS gives the shortest path in O(V + E). No priority queue, no Dijkstra overhead.
  • Level-order traversal of a tree or graph. Anything that asks "what's at depth k" or "process nodes layer by layer" is BFS by design. Snapshot the queue size at the start of each iteration to get level boundaries.
  • Flood fill, spread, connected components on a grid. Number-of-islands, flood-fill, paint-bucket. The implicit graph is the grid with 4- or 8-directional edges and uniform weight.
  • Multi-source BFS, the rotting-oranges shape. When the question is "minimum distance from any source to every cell", seed every source onto the queue at distance 0. The first time BFS reaches a cell, the distance recorded is the minimum across all sources. Cheaper than running BFS from each source separately.
  • BFS with state: key states, time states, parity. The node is no longer just a grid cell. It's (cell, set_of_keys_collected), or (cell, time_mod_k), or (cell, bitmask_of_visited). The graph blows up in size but BFS still works as long as the state captures everything that affects future moves.
  • 0-1 BFS with a deque. When edges have only two weights (0 or 1), Dijkstra is overkill. Use a deque: zero-weight edges go to the front, one-weight to the back. The queue stays monotone; total cost is O(V + E). Faster and simpler than Dijkstra with a heap.
  • "Minimum number of steps" or "fewest moves" framed as a puzzle. Word ladders, lock combinations, sliding puzzles. The graph is implicit; nodes are puzzle states; BFS finds the shortest sequence of moves.
BFS vs DFS in one line. BFS finds the closest match; DFS finds a match. If the question is "shortest" or "fewest", reach for BFS. If the question is "does any path exist" or "find all of X", DFS is usually simpler. When in doubt, name the goal out loud: "shortest" is BFS, "any" is DFS.

3 · Sister algorithms

BFS is the base. A handful of close relatives extend it to slightly different problem shapes. Knowing the boundary saves you from reaching for the wrong tool.

AlgorithmWhen to reach for itCost
Plain BFSUnweighted graph, shortest path or level-order.O(V + E)
Bidirectional BFSYou know both endpoints. High branching factor (word-ladder, social graph). Searches from both ends, meets in the middle.~O(bd/2)
0-1 BFS (deque)Edge weights are exactly 0 or 1. Push zero-weight edges to the front, one-weight to the back.O(V + E)
DijkstraArbitrary non-negative edge weights. BFS generalised — the queue is a priority queue keyed by distance.O((V + E) log V)
A* searchYou have a heuristic that estimates remaining distance to the target. Dijkstra biased toward the goal.O(E) best, O(bd) worst
Multi-source BFSMultiple sources, you want the distance from the nearest source to each node. Seed every source at distance 0.O(V + E)
The hierarchy. Multi-source BFS is plain BFS with a different seed. 0-1 BFS is plain BFS with a deque. Dijkstra is plain BFS with a priority queue. A* is Dijkstra with a heuristic. Each step up the ladder costs you either complexity or implementation effort. Don't take the step until you actually need it.

4 · Canonical example: Number of Islands (LC 200)

Problem. Given a 2D grid of '1' (land) and '0' (water), count the number of distinct islands. An island is a maximal connected group of land cells, 4-directionally adjacent.

Input:  [["1","1","0","0","0"],
         ["1","1","0","0","0"],
         ["0","0","1","0","0"],
         ["0","0","0","1","1"]]
Output: 3
Explanation: top-left 2×2 block, the centre cell, the bottom-right pair.

Walk the grid. Every time you hit a land cell that hasn't been visited, that's a new island. BFS from there to mark every land cell reachable as visited, then continue scanning. Each cell is visited at most once across the entire algorithm; the cost is O(rows × cols).

5 · Reference implementation

Two versions of the same algorithm: Go and Python. The Go version is the one to write in interviews; the Python version is faster to read. Both implement the same Number of Islands logic: walk the grid, BFS from each unvisited land cell, count the BFS launches.

Go: interview-ready

func numIslands(grid [][]byte) int {
    if len(grid) == 0 {
        return 0
    }
    rows, cols := len(grid), len(grid[0])
    seen := make([][]bool, rows)
    for r := range seen {
        seen[r] = make([]bool, cols)
    }
    dirs := [4][2]int{{-1, 0}, {1, 0}, {0, -1}, {0, 1}}
    count := 0

    bfs := func(r0, c0 int) {
        type cell struct{ r, c int }
        q := []cell{{r0, c0}}
        seen[r0][c0] = true
        for len(q) > 0 {
            cur := q[0]
            q = q[1:]
            for _, d := range dirs {
                nr, nc := cur.r+d[0], cur.c+d[1]
                if nr < 0 || nr >= rows || nc < 0 || nc >= cols {
                    continue
                }
                if seen[nr][nc] || grid[nr][nc] != '1' {
                    continue
                }
                seen[nr][nc] = true        // mark visited ON ENQUEUE
                q = append(q, cell{nr, nc})
            }
        }
    }

    for r := 0; r < rows; r++ {
        for c := 0; c < cols; c++ {
            if grid[r][c] == '1' && !seen[r][c] {
                count++
                bfs(r, c)
            }
        }
    }
    return count
}

Python: for reading and reasoning

from collections import deque

def num_islands(grid: list[list[str]]) -> int:
    if not grid: return 0
    rows, cols = len(grid), len(grid[0])
    seen = set()
    count = 0
    DIRS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

    def bfs(r0, c0):
        q = deque([(r0, c0)])
        seen.add((r0, c0))
        while q:
            r, c = q.popleft()
            for dr, dc in DIRS:
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and grid[nr][nc] == '1'
                        and (nr, nc) not in seen):
                    seen.add((nr, nc))      # mark visited ON ENQUEUE, not dequeue
                    q.append((nr, nc))

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '1' and (r, c) not in seen:
                count += 1
                bfs(r, c)
    return count
Mark visited on enqueue, not on dequeue. The single most common BFS bug. If you mark when popping from the queue, the same node gets enqueued many times (once for each neighbour that visits it before it's dequeued) and the queue blows up exponentially on dense graphs. Mark the moment you add to the queue.

6 · Variations

The shape of the queue and what you track changes by problem type. Six common variations:

VariationWhat changesExample
Level-order treeProcess the queue in level batches: snapshot the queue size at the start of each iteration.LC 102 Binary Tree Level Order
Shortest distance gridTrack distance per cell in a parallel grid or as a tuple in the queue.LC 994 Rotting Oranges
Multi-source BFSPush every source onto the queue at the start. Distance from the nearest source falls out for free.LC 994 Rotting Oranges, LC 286 Walls and Gates
0-1 BFSUse a deque: zero-weight edges go to the front, one-weight to the back. O(V+E) shortest path with two edge weights.LC 1368 Min Cost to Make Path
Bidirectional BFSBFS from both ends simultaneously; meet in the middle. Roughly halves the explored frontier.LC 127 Word Ladder optimisation
State-space BFSNodes are states (tuples, strings, bitmasks), not coordinates. Generate neighbours on demand.LC 752 Open the Lock, LC 773 Sliding Puzzle

7 · Worked example: 01 Matrix (LC 542)

Problem. Given a binary matrix of 0s and 1s, return a matrix where each cell holds the distance to the nearest 0. Distance is measured in 4-directional steps.

The naive approach (BFS from every 1 cell) is O((rows × cols)2). The trick: multi-source BFS. Seed the queue with every 0 at distance 0, then BFS outward. Each cell is dequeued exactly once, and the first time it's reached, the distance recorded is the shortest distance to some zero, which is the answer.

func updateMatrix(mat [][]int) [][]int {
    rows, cols := len(mat), len(mat[0])
    dist := make([][]int, rows)
    for r := range dist {
        dist[r] = make([]int, cols)
    }

    type cell struct{ r, c int }
    queue := []cell{}

    // SEED: every zero starts at distance 0; every one starts as "unknown" (-1)
    for r := 0; r < rows; r++ {
        for c := 0; c < cols; c++ {
            if mat[r][c] == 0 {
                queue = append(queue, cell{r, c})
            } else {
                dist[r][c] = -1
            }
        }
    }

    dirs := [4][2]int{{-1, 0}, {1, 0}, {0, -1}, {0, 1}}
    for len(queue) > 0 {
        cur := queue[0]
        queue = queue[1:]
        for _, d := range dirs {
            nr, nc := cur.r+d[0], cur.c+d[1]
            if nr < 0 || nr >= rows || nc < 0 || nc >= cols {
                continue
            }
            if dist[nr][nc] != -1 {
                continue            // already finalised
            }
            dist[nr][nc] = dist[cur.r][cur.c] + 1
            queue = append(queue, cell{nr, nc})
        }
    }
    return dist
}
Why multi-source works as one BFS. All sources are at distance 0 at the same time. The queue invariant (everything in the queue is at distance d or d + 1) holds across sources just as it does within a single source. Whichever source happens to reach a target cell first wins, and BFS guarantees the first reach is the shortest.

8 · Worked example: Rotting Oranges (LC 994)

Problem. A grid contains 0 (empty), 1 (fresh orange), 2 (rotten orange). Every minute, every fresh orange adjacent to a rotten one becomes rotten. Return the minimum minutes until no fresh oranges remain, or -1 if some fresh orange can never be reached.

This is multi-source BFS with level tracking. Seed the queue with every rotten orange. Process one level per minute. After BFS completes, scan the grid. If any fresh orange remains, return -1.

func orangesRotting(grid [][]int) int {
    rows, cols := len(grid), len(grid[0])
    type cell struct{ r, c int }
    queue := []cell{}
    fresh := 0

    for r := 0; r < rows; r++ {
        for c := 0; c < cols; c++ {
            switch grid[r][c] {
            case 2:
                queue = append(queue, cell{r, c})
            case 1:
                fresh++
            }
        }
    }

    if fresh == 0 {
        return 0                   // nothing to rot — already done
    }

    dirs := [4][2]int{{-1, 0}, {1, 0}, {0, -1}, {0, 1}}
    minutes := 0

    for len(queue) > 0 && fresh > 0 {
        // LEVEL BOUNDARY — snapshot the size of THIS minute's frontier
        size := len(queue)
        for i := 0; i < size; i++ {
            cur := queue[0]
            queue = queue[1:]
            for _, d := range dirs {
                nr, nc := cur.r+d[0], cur.c+d[1]
                if nr < 0 || nr >= rows || nc < 0 || nc >= cols {
                    continue
                }
                if grid[nr][nc] != 1 {
                    continue
                }
                grid[nr][nc] = 2   // mark rotten on enqueue
                fresh--
                queue = append(queue, cell{nr, nc})
            }
        }
        minutes++
    }

    if fresh > 0 {
        return -1                  // unreachable fresh oranges
    }
    return minutes
}
Snapshot the queue size, then loop that many times. The level-tracking idiom. Without the snapshot, the inner loop would mix cells from minute m with cells just enqueued at minute m + 1, and the timer would lose meaning. The pattern generalises to every "BFS by level" problem: for size := len(q); size > 0; size-- on the outer loop, then advance the counter.

9 · Hard worked example: Open the Lock (LC 752)

Problem. You have a 4-wheel lock; each wheel shows a digit 0-9. From "0000", reach target in the fewest moves. Each move rotates one wheel up or down by one. Some combinations are "deadends"; you cannot pass through them.

This is BFS in state space. There is no grid. Nodes are strings like "5310"; neighbours are the 8 strings reachable by rotating one wheel by ±1. The deadends are forbidden nodes. BFS finds the shortest sequence of moves.

func openLock(deadends []string, target string) int {
    dead := make(map[string]bool, len(deadends))
    for _, d := range deadends {
        dead[d] = true
    }
    if dead["0000"] {
        return -1                  // can't even start
    }
    if target == "0000" {
        return 0
    }

    seen := map[string]bool{"0000": true}
    queue := []string{"0000"}
    steps := 0

    for len(queue) > 0 {
        steps++
        size := len(queue)
        for i := 0; i < size; i++ {
            cur := queue[0]
            queue = queue[1:]
            // 8 neighbours: each of 4 wheels rotated up or down
            for w := 0; w < 4; w++ {
                for _, delta := range []int{-1, 1} {
                    nb := rotate(cur, w, delta)
                    if dead[nb] || seen[nb] {
                        continue
                    }
                    if nb == target {
                        return steps
                    }
                    seen[nb] = true
                    queue = append(queue, nb)
                }
            }
        }
    }
    return -1
}

func rotate(s string, w, delta int) string {
    b := []byte(s)
    d := int(b[w]-'0') + delta
    if d < 0 {
        d = 9
    } else if d > 9 {
        d = 0
    }
    b[w] = byte('0' + d)
    return string(b)
}

The state space has 10⁴ = 10,000 nodes, each with 8 neighbours, small enough that plain BFS runs instantly. For larger lock problems (more wheels, more digits) the state space explodes; that's when you reach for bidirectional BFS or pruning by symmetry.

The state-space recipe. (1) Encode the state (string, tuple, bitmask), anything hashable. (2) Generate neighbours on demand: a function from state to slice-of-state. (3) Track visited by state, not by some derived attribute. (4) BFS on this implicit graph exactly as you would on an explicit one. Word Ladder (LC 127), Sliding Puzzle (LC 773), and Bus Routes (LC 815) are all the same shape.

10 · Five-problem progression

#LCProblemWhy it's here
1LC 102Binary Tree Level Order TraversalThe minimal BFS. Snapshot queue size to get level boundaries.
2LC 200Number of IslandsGrid BFS plus connected components. The canonical "mark visited on enqueue" exercise.
3LC 994Rotting OrangesMulti-source BFS. Push all sources at t=0; track minutes elapsed.
4LC 127Word LadderState-space BFS. Pattern-bucket neighbour generation; bidirectional BFS as the optimisation.
5LC 773Sliding PuzzleBFS over board states encoded as strings. The point where state-space BFS earns its keep.

11 · How to solve a hard BFS problem, step by step

Hard BFS problems are usually not hard because the BFS is hard. They are hard because the modelling is. Five questions that, answered correctly, collapse the hard problem to the easy one.

  1. What is a node? A raw grid cell? A tuple of (cell, keys_collected)? A string for a puzzle state? A bitmask of visited cities (TSP-on-small-graph)? The most senior part of the problem is picking the smallest state that still captures everything needed for the next decision.
  2. When do you mark a node visited: before pushing or after popping? Always on enqueue. If you mark on dequeue, the same node can be pushed many times by different parents and the queue blows up. This is the single most common BFS bug.
  3. Do you need to track levels? If the answer is "fewest steps" or "minimum time", yes: snapshot the queue size at the top of each outer-loop iteration, then loop exactly that many times. If the answer is "reachable" or "shortest path with explicit distance per cell", you can carry distance in the queue entry instead.
  4. Should this be multi-source? If the question is "minimum distance from any source to every node", seed every source at distance 0 in one BFS instead of running BFS from each source separately. Comes up constantly: Rotting Oranges, 01 Matrix, Walls and Gates, Shortest Bridge.
  5. Should this be 0-1 BFS or bidirectional? Two weights only (0 and 1), reach for a deque, not Dijkstra. Both endpoints known and branching factor large, bidirectional BFS halves the explored frontier. Otherwise plain BFS is fine.
The order matters. Get the node right first; only then write the BFS skeleton. Working out the BFS code before nailing the state is the fastest way to waste 20 minutes of an interview.

12 · What breaks: the deep list

  • Marking visited on dequeue, not enqueue. The classic. Same node enqueued multiple times by different parents; queue grows quadratic or worse. Correctness usually still holds (BFS still finds the shortest reach), but the performance gap is enormous on dense graphs.
  • Adding to the queue twice without checking visited. A variant of the same bug. After enqueueing a neighbour, mark it visited immediately in the same step, not after popping. Otherwise a third parent can re-enqueue it before its own pop.
  • Off-by-one in level count. Incrementing the level counter inside the inner loop instead of after the level boundary. Or starting at 1 when the problem expects 0. Or returning steps when the problem wants steps + 1. Always trace through with the smallest input (single source, single target adjacent) and check the answer matches.
  • Multi-source forgetting to seed all sources at once. If you BFS from one source, complete it, then BFS from the next, you've lost the multi-source efficiency. Push every source onto the queue before the first pop. Distances naturally interleave; the first reach wins.
  • State space too large to fit in memory. If the state is a 25-character puzzle string, you have 25! possible states, 15 quintillion. Plain BFS will OOM long before completing. Either prune symmetric states, use bidirectional BFS, or switch to IDA*.
  • Forgetting the visited set on a cyclic graph. BFS without visited runs forever. The grid version often hides this because reusing visited cells looks like inefficiency until you realise it's an infinite loop.
  • Using a slice instead of a real deque. Go's queue = queue[1:] is O(1) amortised (slice header is small), but it leaks the front of the backing array. For very long-running BFS use a ring buffer or container/list. Python's list.pop(0) is O(n): always collections.deque.
  • Confusing level boundaries. Snapshot the queue size at the start of each iteration, then loop that many times. Otherwise levels bleed together and the distance count goes wrong.
  • Using BFS on a weighted graph. BFS only finds shortest path when edges have uniform weight. Mixed weights call for Dijkstra; that's a different pattern.
  • Building the full path inefficiently. If you need the path (not just the distance), track parent pointers per node and reconstruct at the end. Storing the path in each queue entry copies O(V²) data.

13 · Complexity, side by side

BFS and DFS share asymptotic time but diverge in shape and space. The right pick depends on what the problem actually asks.

AspectBFSDFS
TimeO(V + E)O(V + E)
SpaceO(w) — max frontier widthO(h) — max recursion depth
Shortest path (unweighted)Yes — first reach = shortestNo — finds a path, not the shortest
Topological orderYes — Kahn's algorithmYes — reverse post-order
Pathological caseWide graph — frontier explosionDeep chain — stack overflow at ~10k
Cache localityWorse — fans out across nodesBetter — stays on one subtree
Iterative natural fitYes — explicit queueRecursive (or explicit stack)
The non-obvious rule. Use BFS when the problem says "shortest" or "minimum steps" or "fewest moves." Use DFS for everything else. Matching the prompt to the right one is the first thing the interviewer is listening for.

14 · The wavefront, made visual

BFS expands in concentric layers from the source. Every node is visited exactly once, and the moment it's reached, you know the shortest distance from the source: the layer number it landed in.

15 · Three BFS variants worth knowing cold

Plain BFS gets you most points, but three small variants come up often enough that interviewers expect you to recognise them by name. Each one is BFS with a tweak to the queue or the state.

A · Reconstruct the path (not just the distance)

Track a parent pointer per visited node; walk it back from the target at the end. Costs O(V) extra memory; doesn't change asymptotic time.

from collections import deque

def shortest_path(adj, src, dst):
    parent = {src: None}
    q = deque([src])
    while q:
        u = q.popleft()
        if u == dst: break
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                q.append(v)
    if dst not in parent: return None
    # Walk back
    path = []
    cur = dst
    while cur is not None:
        path.append(cur)
        cur = parent[cur]
    return path[::-1]

Why not store the path in each queue entry? Because each queue entry would carry an O(V) list and the total memory becomes O(V²). Parent pointers are O(V) total.

B · 0-1 BFS: two-weight shortest path in O(V + E)

When edges have weight 0 or 1 (no other values), Dijkstra is overkill; a deque does the job. Zero-weight edges go to the front; one-weight edges go to the back. The queue stays monotonic in distance.

from collections import deque

def zero_one_bfs(adj, src, n):
    INF = float('inf')
    dist = [INF] * n
    dist[src] = 0
    q = deque([src])
    while q:
        u = q.popleft()
        for v, w in adj[u]:    # w is 0 or 1
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                if w == 0:
                    q.appendleft(v)
                else:
                    q.append(v)
    return dist

Comes up in problems like "minimum cost to flip cells to make a valid path" (LC 1368) or any grid where some moves are free and others cost one step. Don't reach for Dijkstra unless weights are arbitrary.

C · Bidirectional BFS: meet in the middle

Search from both source and target simultaneously, alternating which frontier you expand. Termination is when the two frontiers intersect. Roughly halves the explored area for problems where the branching factor is high (Word Ladder is the canonical case).

def bidirectional_bfs(adj, src, dst):
    if src == dst: return 0
    front, back = {src}, {dst}
    seen = {src, dst}
    steps = 0
    while front and back:
        steps += 1
        if len(front) > len(back):           # always expand the smaller frontier
            front, back = back, front
        next_front = set()
        for u in front:
            for v in adj[u]:
                if v in back: return steps   # frontiers met
                if v not in seen:
                    seen.add(v)
                    next_front.add(v)
        front = next_front
    return -1                                 # disconnected

The win: if BFS would have explored bd nodes (branching factor b, depth d), bidirectional explores ~2 × bd/2. For Word Ladder with ~26-ish branching, that's the difference between "times out" and "passes."

16 · Real-world BFS, in production code

Network routing — IS-IS, OSPF.
Link-state routing protocols flood topology updates via BFS-style waves. Each router receives, marks, and re-floods. The "shortest path tree" each router computes is a Dijkstra extension of BFS for weighted graphs.
Web crawlers.
Google's first crawler was BFS by design — start with a seed URL, fetch it, enqueue every link, repeat. Better breadth and less hostile to sites than DFS. Modern crawlers add priority queues but the skeleton is BFS.
Social-graph "degrees of separation."
LinkedIn's "1st / 2nd / 3rd connection" labels are literally BFS distance, run server-side per query against the friend graph in a sharded key-value store.
Game AI — shortest-path movement.
Grid-based games (Civilization, Pokémon, RTS games) use BFS for unit movement reach. A* extends it with a heuristic when the grid is huge.
Word-ladder + spell-check.
The "transform CAT into DOG by single-letter changes" canon is BFS in word-space. The same machinery underlies suggestion systems walking an edit-distance graph.
Stat worth knowing. Twitter's "who follows whom" BFS at L1+L2 is the canonical hot query on the timeline service. Average user has ~700 followers, so L2 hits ~500k nodes. Production uses bounded-depth BFS plus a Bloom filter for de-duplication.

17 · Where this gets asked

CompanyCommon framingWhy it fits their stack
GoogleShortest Path in Binary Matrix (LC 1091), Word Ladder (LC 127), Open the Lock (LC 752)Maps routing and indexing.
MetaNumber of Islands BFS (LC 200), Walls and Gates (LC 286), Rotting Oranges (LC 994)Social graph distance queries.
Amazon01 Matrix (LC 542), Bus Routes (LC 815), Shortest Bridge (LC 934)Warehouse pathfinding, last-mile routing.
MicrosoftWord Ladder II (LC 126), Shortest Path Visiting All Nodes (LC 847)Office collab + spelling suggestions in word-space.
Uber / Lyft / DoorDashShortest path on the road graph, time-bounded reachabilityRouting engines start with BFS, layer on Dijkstra.
LinkedInDistance between two users, bounded BFSThe "2nd / 3rd connection" feature.
Pattern of patterns. Three sub-shapes: (1) plain queue + distance map, (2) multi-source BFS (seed with all sources at dist 0, e.g. Walls and Gates), (3) BFS with state encoding (state is more than the node, e.g. Open the Lock). Master those three and most BFS questions reduce to adaptations.

18 · Try it yourself

Binary Tree Level Order · LC 102
The canonical BFS warm-up. Queue of nodes, process layer-by-layer. Hint: when you pop a level, snapshot len(queue) first: that's the count for this layer.
Rotting Oranges · LC 994
Multi-source BFS. Seed the queue with every initially-rotten cell. Hint: count fresh oranges up front; if any remain unreachable, return -1.
Word Ladder · LC 127
BFS in word-space; neighbours are words one letter different. Hint: precompute a pattern map ({"h*t": ["hot","hat"]}) so neighbour lookup is O(L) not O(N · L).
Stretch — Shortest Path Visiting All Nodes · LC 847
BFS with state = (current node, bitmask of visited). Hint: works because N ≤ 12 lets you bitmask; this is the "TSP on small graph" pattern.
How to practise. First pass — write it from scratch. Second pass — explain it aloud while you write. Third pass — rewrite from memory. By the third pass the queue/visited machinery is in your fingers, and the interview question becomes "what is the state?" rather than "how do I BFS?"

19 · State design: what counts as a "node"

The single hardest part of state-space BFS is deciding what to put in the node. Pick too little and the BFS loops (because two different situations look like the same state). Pick too much and the state space explodes. Six rules of thumb from canonical problems:

Problem familyWhat the node isWhy
Grid BFS, no constraints (LC 200)(row, col)The grid cell IS the state; no additional context affects future moves.
Grid BFS with keys / locks (LC 864)(row, col, bitmask_of_keys)The same cell visited with different keys is a different node — different doors are passable.
Grid BFS with limited "jump" budget (LC 1293)(row, col, k_remaining)Reaching the same cell with more budget left is strictly better.
Word ladder (LC 127)The word string itselfTwo words = two nodes; transitions are single-character substitutions.
Open the Lock (LC 752)The 4-digit stringThe whole lock state is the node.
Sliding Puzzle (LC 773)String encoding of the board, plus blank positionState must capture every tile; the blank-position field accelerates neighbour generation.
TSP-style "visit all" (LC 847)(current_node, bitmask_of_visited)The same node visited with different remaining-to-visit sets is a different problem.
The rule of thumb. The state must include everything that affects future transitions and nothing else. A useful test: "if I have two nodes with the same state but different histories, do they lead to identical futures from here?" If yes, the state is right. If no, you're missing a field. If you can prove some field doesn't affect futures, drop it — state-space size is exponential in the number of fields.

20 · BFS, Dijkstra, A*: the unified family

BFS, Dijkstra, and A* are the same algorithm with progressively richer queue semantics. Knowing the family makes it easy to generalise when the constraints change mid-interview.

AlgorithmQueue typeEdge weight modelHeuristicShortest path?
BFSFIFO queueUniform (unweighted)NoneYes — by layer
0-1 BFSDeque (push-front for 0, push-back for 1)Exactly 0 or 1NoneYes
DijkstraMin-priority queue keyed by distanceNon-negative arbitraryNoneYes
A*Min-priority queue keyed by g(n) + h(n)Non-negative arbitraryAdmissible (never overestimates remaining cost)Yes (if h is admissible)
Bellman-FordRepeated relaxation, V−1 iterationsCan be negative; detects negative cyclesNoneYes
Floyd-WarshallTriply-nested loop over all node pairsCan be negative; no negative cyclesNoneAll-pairs shortest path
The mental rule. Edge weights are all the same → BFS. Two distinct weights → 0-1 BFS. Arbitrary non-negative weights → Dijkstra. Plus a heuristic on top → A*. Negative weights → Bellman-Ford. The first three share 90% of their code; the difference is what data structure backs the queue.

21 · Bidirectional BFS: when to reach for it

Bidirectional BFS searches from both endpoints simultaneously, alternating which frontier you expand. Termination is when the two frontiers intersect. For a problem with branching factor b and shortest-path length d, ordinary BFS explores roughly bd nodes; bidirectional explores roughly 2 × bd/2 — exponentially fewer.

Three preconditions. (1) Both endpoints must be known up front; you can't search backward from "any target". (2) The graph must be explorable in reverse: for undirected graphs this is trivial; for directed graphs you need the reverse adjacency list. (3) The branching factor needs to be large enough that the bd/2 savings outweigh the bookkeeping of two queues plus two visited sets.

func bidirectionalBFS(adj map[string][]string, src, dst string) int {
    if src == dst {
        return 0
    }
    front := map[string]bool{src: true}
    back := map[string]bool{dst: true}
    seen := map[string]bool{src: true, dst: true}
    steps := 0

    for len(front) > 0 && len(back) > 0 {
        steps++
        // Always expand the SMALLER frontier — keeps total work down
        if len(front) > len(back) {
            front, back = back, front
        }
        nextFront := map[string]bool{}
        for u := range front {
            for _, v := range adj[u] {
                if back[v] {
                    return steps           // frontiers met — done
                }
                if !seen[v] {
                    seen[v] = true
                    nextFront[v] = true
                }
            }
        }
        front = nextFront
    }
    return -1                              // disconnected
}
Word Ladder is the canon. For a 5-letter word ladder over a dictionary of 10,000 words, the branching factor at each step is ~25 (replace one letter with one of 25 others). Plain BFS for a 6-step ladder explores ~256 = ~244M candidates; bidirectional explores ~2 × 253 = ~31k. The difference between "times out" and "passes in 50ms".

22 · What to say in the interview

  1. "Unweighted shortest path → BFS." Name the pattern out loud. The interviewer wants to hear you recognise the shape.
  2. "Queue plus visited." State the two pieces. Then specify: mark visited on enqueue.
  3. "Each node is enqueued at most once." Justifies the O(V + E) time bound.
  4. "Space is O(V) for visited plus O(V) worst-case queue." State both. Worst case is when the graph fans out wide (a star with V−1 leaves all enqueued at once).
  5. "For levels, snapshot the queue size before each iteration." Show you know the level-order idiom.
  6. Edge cases to mention. Empty input, single node, disconnected graph (multiple BFS calls needed), self-loops, large grids (memory for visited).

Further reading

  • CLRS — Chapter 22. The textbook treatment. Worth reading once for the proof of the O(V + E) bound.
  • Sedgewick & Wayne — Algorithms, Chapter 4. Gentler treatment with cleaner code examples.
  • Adjacent: DFS. The other half of graph traversal. When the question is "any path" rather than "shortest", DFS is usually simpler.
  • Adjacent: Topological sort. Kahn's algorithm is BFS with in-degree counts.
  • Adjacent: Graph algorithms. Where Dijkstra, Bellman-Ford, and the weighted shortest-path algorithms live.
  • Semicolony: Pathfinding simulator. Visualises BFS, Dijkstra, A*. Worth a few minutes to see BFS expand in concentric waves.
Found this useful?