BFS vs DFS Traversal Simulator: queue or stack, the only difference.

BFS and DFS visit every node of a graph in O(V + E); the only thing that changes is the frontier. Toggle the mode on the same graph and watch how a queue for BFS, a stack for DFS reshapes the order. The algorithm is identical otherwise.

Mode

BFS

Visited

0/12

Frontier

Mode

Start node

Run

Graph

Frontier (queue · FIFO)

—

next pop ← head

Traversal order

—

The only real difference

BFS and DFS share identical code except for one thing: which end of the frontier you pop from. BFS uses a queue (pop the head — FIFO), DFS uses a stack (pop the top — LIFO). That single choice reshapes the entire traversal.

When BFS

Shortest path in an unweighted graph (count edges). Level-order traversal of a tree. Finding all nodes within k hops. Connected-components on a small/medium graph.

When DFS

Topological sort, cycle detection, finding strongly-connected components (Tarjan/Kosaraju), maze generation, recursion-friendly tree problems. Path existence when shortest is not required.

Memory

BFS memory grows with the width of the level (worst case: O(branching^depth)). DFS memory grows with the depth (worst case: O(depth)). On a balanced tree of N nodes, BFS uses O(N/2) for the last level; DFS uses O(log N).

Adjacent

What you're looking at

A fixed twelve-node graph, A through L, plus two live panels. Pick BFS or DFS, choose a start node, then Start to animate or Step through one move at a time. As the search runs, the current node fills solid, already-visited nodes go pale, and frontier nodes carry the highlight tint. The left panel shows the frontier itself — a FIFO queue in BFS, a LIFO stack in DFS — with a marker on the cell that gets popped next; the right panel lists the traversal order as nodes are visited.

Run BFS from A and watch it sweep the graph in rings: all of A's neighbours before any of their children. Switch to DFS from the same start and the order changes completely — it dives down one branch to a leaf before backing up. The code behind both is identical except for which end of the frontier you pop. The surprise is how that one swap, queue versus stack, is the entire difference between level-order and dive-deep behaviour.

The traversal template

Two algorithms, one body of code.

def traverse(graph, start):
    seen = {start}
    frontier = [start]      # queue for BFS, stack for DFS

    while frontier:
        u = frontier.pop(0) # BFS: pop head      | DFS: pop top
                            # frontier.pop()     |
        visit(u)
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                frontier.append(v)

That is the entire difference. Change pop(0) to pop() and the traversal order changes completely — but the correctness, the visited-set logic, the termination condition, all identical.

Pick the right one

Problem shape	Pick	Why
Shortest path (unweighted)	BFS	Level-order guarantees the first time you see a node is via the shortest path.
Shortest path (weighted, non-neg)	Dijkstra	BFS doesn't account for edge weights.
Topological sort	DFS	Post-order traversal naturally yields a topo order.
Cycle detection	DFS	Mark gray/black; gray-on-gray = cycle.
Connected components	Either	Both work; BFS is slightly easier on iterative form.
Solve a maze (any path)	DFS	Memory-cheap; doesn't need shortest.
Solve a maze (shortest)	BFS	Guarantees minimum steps.
Tree level-order printing	BFS	Literally the definition.
Tree DFS (pre/in/post)	DFS	Same.
Web crawler (limited depth)	BFS	Naturally bounded by depth.

One subtle trap: recursive DFS is concise but blows the call stack on deep graphs (~10,000 nodes in Python). Convert to an iterative DFS with an explicit stack when the graph might be deep. BFS is iterative by nature; no such trap.

BFS, line by line

The minimal working version.

from collections import deque

def bfs(graph, start):
    """
    graph: dict of node -&gt; list of neighbors
    start: any hashable node
    Returns: dict of node -&gt; shortest distance from start (in edges).
    """
    distances = {start: 0}
    queue = deque([start])

    while queue:
        u = queue.popleft()            # FIFO — pop from the front
        for v in graph[u]:
            if v not in distances:     # unvisited
                distances[v] = distances[u] + 1
                queue.append(v)        # FIFO — push to the back

    return distances

Every BFS implementation has this skeleton. The variations come from what you do at each visit — accumulate distances (this version), record predecessors (for path reconstruction), check for a goal and return early, count components.

Three properties to internalise:

Visited-set is mandatory. Without it, the queue grows unboundedly on cyclic graphs. The check happens on enqueue (not on pop) — otherwise the same node gets enqueued multiple times before being processed once, wasting memory.

Distance from start is the discovery order divided by branching. If you record the distance when you first see a node, BFS guarantees that's the minimum number of edges from start. This is why BFS is the right choice for unweighted shortest path.

The queue's max size bounds memory. On a wide-but-shallow graph (a complete binary tree of depth d), BFS queue grows to ~2^d at the last level. On a deep-but-narrow graph (a chain), it stays size 1. The asymmetry matters when memory is constrained.

DFS, recursive

def dfs(graph, start, visited=None):
    if visited is None:
        visited = set()
    visited.add(start)
    for v in graph[start]:
        if v not in visited:
            dfs(graph, v, visited)
    return visited

DFS, iterative (explicit stack)

def dfs_iter(graph, start):
    visited = set()
    stack = [start]
    while stack:
        u = stack.pop()                # LIFO — pop from the back
        if u in visited:
            continue                   # skip duplicates pushed earlier
        visited.add(u)
        for v in reversed(graph[u]):   # reversed → left-most explored first
            if v not in visited:
                stack.append(v)
    return visited

The iterative form is needed when the recursion depth would blow the call stack. Python's default limit is 1000; deeply-skewed graphs (linked-list-like) blow it. C and C++ stacks are larger but still finite.

One subtlety: iterative DFS pushes neighbors in reverse so the left-most is popped first, matching the order recursive DFS visits. If order doesn't matter, you can skip the reversal.

Reference implementations across languages

JavaScript / TypeScript

function bfs&lt;T&gt;(graph: Map&lt;T, T[]&gt;, start: T): Map&lt;T, number&gt; {
    const distances = new Map&lt;T, number&gt;([[start, 0]]);
    const queue: T[] = [start];

    while (queue.length &gt; 0) {
        const u = queue.shift()!;          // shift = O(n) in JS arrays!
        for (const v of (graph.get(u) ?? [])) {
            if (!distances.has(v)) {
                distances.set(v, distances.get(u)! + 1);
                queue.push(v);
            }
        }
    }
    return distances;
}

// Note: Array.shift is O(n). For very large graphs use a proper
// deque (e.g., a linked list with head + tail pointers, or an
// index-based "head" pointer that never shrinks the array).

Go

func bfs[T comparable](graph map[T][]T, start T) map[T]int {
    distances := map[T]int{start: 0}
    queue := []T{start}
    for len(queue) &gt; 0 {
        u := queue[0]
        queue = queue[1:]              // slice; cheap in Go
        for _, v := range graph[u] {
            if _, ok := distances[v]; !ok {
                distances[v] = distances[u] + 1
                queue = append(queue, v)
            }
        }
    }
    return distances
}

C++ (production-fast)

#include &lt;queue&gt;
#include &lt;unordered_map&gt;
#include &lt;vector&gt;

template &lt;typename T&gt;
std::unordered_map&lt;T, int&gt; bfs(
    const std::unordered_map&lt;T, std::vector&lt;T&gt;&gt;&amp; graph,
    T start
) {
    std::unordered_map&lt;T, int&gt; dist{ {start, 0} };
    std::queue&lt;T&gt; q;
    q.push(start);
    while (!q.empty()) {
        T u = q.front(); q.pop();
        auto it = graph.find(u);
        if (it == graph.end()) continue;
        for (const T&amp; v : it-&gt;second) {
            if (!dist.contains(v)) {
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
    }
    return dist;
}

Three languages, identical algorithm. The differences are language-specific: JS shift() is slow, Go slices are cheap, C++ uses std::queue which wraps std::deque for O(1) front/pop.

BFS in production code

Web crawlers (Googlebot, Bingbot, Common Crawl). Start from seed URLs, BFS outward by following links, level-bounded so the crawler doesn't go infinitely deep. Politeness rules (per-domain rate limits) layered on top.

Shortest path in social networks. "How many degrees of separation between User A and User B?" Facebook's friend-graph queries use bidirectional BFS — search from both ends, stop when frontiers meet. Average human-to-human distance: 3.57 hops (Facebook study, 2016).

Garbage collection (mark phase). Tracing GCs (most JVMs, V8, Go, .NET CLR) start from GC roots and BFS-or-DFS through the object graph marking everything reachable. Anything unmarked is garbage.

Network discovery (LLDP, CDP). Network switches send and receive Link-Layer Discovery Protocol frames; mapping the entire topology is BFS from any switch, hopping to neighbors via reported links.

Maze and grid pathfinding for shortest paths. Unweighted maze? BFS. Add edge weights? Dijkstra. Add a heuristic? A*. All three are the same skeleton with different priority.

OS process tree traversal. Linux pstree, ps --forest walk the process tree breadth-first. Init at root, children below, grandchildren below those. The output preserves the BFS order.

Filesystem find. find . -maxdepth N uses BFS with a depth bound. find . -mindepth N uses DFS. The distinction matters when the filesystem is huge.

DFS in production code

Topological sort. Build tools (make, bazel, Ninja), package managers (npm, pip, cargo), compilers (instruction scheduling), spreadsheet formulas — all sort dependencies with DFS post-order. Reverse the post-order; you get the topo sort.

Cycle detection. Three-color DFS (white/gray/black) detects cycles: a back-edge to a gray (on-stack) node closes a cycle. Used in deadlock detection, circular-dependency detection, mutually-recursive function detection.

Strongly Connected Components (SCC). Tarjan's algorithm (1972) and Kosaraju's algorithm (1981) both use DFS. Used in compiler analyses, dataflow optimisation, biological network analysis.

Maze generation. Recursive-backtracker maze generation is DFS with random child order. The Wikipedia "maze generation algorithms" article is mostly variants of this.

Game tree search. Chess, Go, checkers engines use DFS with alpha-beta pruning. Modern engines (AlphaZero, Leela) augment with neural-network guidance, but the search skeleton is DFS.

HTML/JSON parsing. Recursive descent parsers traverse the syntax tree as they construct it — DFS by definition.

File system traversal (du, rsync). Compute disk usage; sync directories. DFS naturally bounds memory (only current path on the stack).

Symbolic execution. KLEE, S2E, angr — symbolic execution engines fork on each branch and DFS each path. Memory grows fast; pruning is the engineering art.

Time and space, concrete

Both BFS and DFS run in O(V + E) time on a graph with V vertices and E edges. Every node is visited once; every edge is traversed at most twice (once from each endpoint in undirected graphs, once in directed).

Memory is where they diverge:

Graph shape	BFS queue max	DFS stack max
Linear chain (depth N)	1	N
Balanced binary tree (N nodes)	N/2	log₂ N
Complete graph K_N	N	N
Wide-shallow tree (branching B, depth D)	B^D (the last level)	D
Linked list of N nodes	1	N

The rule: BFS memory grows with graph width, DFS with depth. For a balanced binary tree of a million nodes, BFS uses ~500K memory while DFS uses ~20.

Practical example: searching the entire English Wikipedia link graph (~6M articles) by BFS from a popular article needs ~1-2M nodes in the queue at peak — feasible but RAM-heavy. Recursive DFS on the same graph would have a max stack depth of ~30 (diameter of the link graph). Iterative DFS is comparable to BFS in worst-case memory because the worst-case visited set is the same; the stack itself is small.

Variants worth knowing

Bidirectional BFS

Search from both start and end simultaneously, stop when frontiers meet. For "shortest path between A and B", typically halves the visited node count. Facebook's friend-degree query uses this. The trick: alternate which frontier to expand, always expanding the smaller one.

0-1 BFS (deque trick)

For graphs with edge weights of only 0 or 1, replace queue with deque. Add 0-edge neighbours to the front, 1-edge neighbours to the back. Runs in O(V + E) like BFS but handles the special weight case. Used in some shortest-path variants on grid graphs with two terrain types.

Multi-source BFS

Initialize the queue with multiple starting nodes, all at distance 0. The resulting distances tell you "for each node, the distance to the nearest source". Used in "rotten oranges" / "walls and gates" problems, in geographic minimum-distance-to-amenity calculations, in epidemic-spread simulations.

Iterative Deepening DFS (IDDFS)

Run DFS to depth 1, then depth 2, then depth 3, etc. Memory is O(depth) like DFS; final BFS-like layering is achieved at the cost of repeated work. Used when goal depth is unknown and graph is too large for BFS memory.

DFS post-order vs pre-order

Pre-order: process node before descending. Post-order: process after returning from all children. Most algorithms use one or the other: topological sort is reverse post-order; expression evaluation in tree is post-order; tree printing is pre-order.

Tarjan's SCC algorithm

DFS-based, single pass. Maintains a stack of nodes on the current path plus a "lowlink" per node. When a node's lowlink equals its own index, it's the root of an SCC; pop the stack to recover the SCC. Classic example of DFS used for non-obvious algorithmic purposes.

Lexicographic BFS (Lex-BFS)

A variant that picks the next node by lexicographic order of its label sequence. Used to find perfect elimination orderings in chordal graphs, with applications in interval graph recognition and graph coloring.

BFS / DFS in an interview

Three things to be ready for:

"Implement from memory." Both. Iterative versions. Don't claim you'll use recursion for DFS if you can't also write the iterative version — the interviewer will ask "what if the recursion depth blows the stack". Have the answer ready.

"When would you use each?" BFS for shortest unweighted path, level-order traversal, breadth-bounded crawl. DFS for topological sort, cycle detection, SCC, expression evaluation, any-path-existence. The discriminator: "do I need shortest-path or just any-path?"

"What's the complexity?" Both O(V + E). Memory differs: BFS O(width), DFS O(depth). Be precise — confusing these is the most common interview slip.

Follow-ups that distinguish levels:

"What if the graph is too large for memory?" Iterative DFS (depth-bounded). External-memory BFS (process one level at a time, write the frontier to disk). At very large scale, use a graph database (Neo4j, TigerGraph) or a distributed graph framework (Pregel, Giraph).

"How do you handle infinite graphs?" Depth bound the search. IDDFS for unknown-depth scenarios. Lazy generation of neighbours.

"How do you avoid revisiting in a tree?" Trees have no cycles, so visited-set is unnecessary. Saves memory and code complexity.

"What if the graph has weights?" BFS no longer gives shortest path. Use Dijkstra (non-negative weights) or Bellman-Ford (any weights).

"How do you reconstruct the path, not just compute distances?" Maintain a parent map alongside distances. Walk back from end to start via parents.

Bugs everyone writes once

1. Checking visited on pop instead of enqueue. Wastes memory — the same node sits in the queue multiple times before being processed. Symptoms: BFS uses 10× more memory than expected. Fix: mark visited the moment you push, not the moment you pop.

2. Forgetting to mark the start as visited. The start gets re-discovered as a neighbour of one of its neighbours; BFS doesn't terminate, or it computes wrong distances. Fix: distances.set(start, 0) and visited.add(start) before the loop begins.

3. Using Array.shift() in JavaScript without realising it's O(n). Naive JS BFS implementations are quadratic. Use a head index that just increments, or use a proper deque library.

4. Recursive DFS on a deep graph in Python. Default recursion limit is 1000. Linked-list-shaped graph with 10,000 nodes blows the stack. Fix: sys.setrecursionlimit(20000) or convert to iterative.

5. Returning early from BFS without checking the goal at all distances. If you want shortest path, you must check goal at every pop. Checking on enqueue is fine too; checking only at children isn't. The bug shows when the goal is the start.

6. Treating the graph as a tree. If the graph has cycles and you skip the visited check, you loop forever. Trees can skip visited; graphs cannot.

7. Modifying the graph during traversal. Adding edges mid-traversal is usually fine; removing them breaks the visited-set invariant. Either snapshot first or use a stable iteration order.

8. Forgetting directed vs undirected. An adjacency list for an undirected graph stores each edge twice (once from each endpoint). For directed, only once. Mixing the conventions causes subtle bugs.

9. Using DFS for shortest path. DFS may visit a longer path first; without distance tracking, you get any path, not the shortest. Use BFS or Dijkstra for shortest.

10. Heap-allocated recursion frames in Go. Go's call stack grows dynamically; recursion is fine for large depths. But excessive allocation in the recursive function causes GC pressure. Profile if DFS is the bottleneck.

A brief history

BFS was first formally described by Edward Moore in 1959 (in the context of finding the shortest path through a maze) and independently by C.Y. Lee in 1961 (for routing wires on integrated circuits). The "wire-routing" use case persists — Lee's algorithm is still used in some chip-design tools.

DFS predates BFS as a formal algorithm. Charles Pierre Trémaux's 1882 work on maze-solving used what we'd call DFS with backtracking. Edsger Dijkstra discussed graph traversal informally in his 1959 shortest-path paper. Modern DFS in its iterative-with-explicit-stack form was popularised by Tarjan in his 1972 SCC paper.

The terminology "BFS" and "DFS" was standardised by Knuth in The Art of Computer Programming Vol. 1 (1968). The breadth/depth nomenclature comes from how the algorithms explore — "breadth-first" expands all neighbors before going deeper; "depth-first" goes as deep as possible before backing up.

Both algorithms are over 60 years old and remain the bread-and-butter of graph algorithms. Almost every graph problem either uses them directly or builds on top of them. The skeleton hasn't been improved on because there's nothing to improve — visit each node once, look at each edge once. That's optimal.

Practice the pattern

1. Number of islands (LC 200). 2D grid with '1' (land) and '0' (water). Count connected components of land. BFS or DFS over the grid; either works.

2. Shortest path in a maze (LC 1091). 2D binary grid, find shortest path from top-left to bottom-right moving 4-directionally through 0-cells. BFS with distance tracking.

3. Rotten oranges (LC 994). Multi-source BFS — all rotten oranges as starting points, propagate rotting to fresh oranges 4-directionally per minute. Return time to rot all (or -1 if impossible).

4. Course Schedule (LC 207). Detect a cycle in a directed graph of course prerequisites. DFS with three-color marking, or BFS via Kahn's algorithm (in-degree decrement).

5. Word Ladder (LC 127). Transform one word to another by changing one letter at a time, each intermediate must be in a dictionary. BFS on the implicit graph of words.

6. Clone Graph (LC 133). Deep-copy a graph. DFS or BFS; the trick is maintaining a map from original-node to cloned-node to handle cycles.

7. Pacific Atlantic Water Flow (LC 417). Two reverse-direction BFS/DFS from each ocean, intersection of reachable cells is the answer. Multi-source BFS shines here.

8. Minimum Genetic Mutation (LC 433). Same shape as Word Ladder. BFS.

9. Cheapest Flights Within K Stops (LC 787). BFS with depth bound, or modified Dijkstra. Tests the "k stops" twist that breaks pure Dijkstra.

10. Implement Bidirectional BFS. For Word Ladder, run BFS from both source and target. Typically 2-3× faster than one-sided. The implementation is fiddly; gets it down.

Terminology cheat sheet

BFS (Breadth-First Search) — graph traversal that explores all neighbors at the current depth before moving deeper. Uses a queue.

DFS (Depth-First Search) — graph traversal that explores as deep as possible before backtracking. Uses a stack (explicit or implicit via recursion).

Queue (FIFO) — first-in-first-out data structure. Pop from front, push to back. The data structure that defines BFS.

Stack (LIFO) — last-in-first-out data structure. Pop from top, push to top. The data structure that defines DFS.

Visited set — record of nodes already explored. Prevents infinite loops on cyclic graphs.

Frontier — informal name for the queue (BFS) or stack (DFS); the nodes discovered but not yet expanded.

Adjacency list — graph representation: for each node, a list of its neighbors. Used by virtually every graph algorithm.

Adjacency matrix — graph representation: a V×V matrix where cell (i,j) marks an edge. Faster lookup, more memory.

Connected component — maximal subset of nodes where every pair is connected by some path. Found via BFS/DFS from each unvisited node.

Cycle — a path that returns to its starting node. Detected via DFS with three-color marking.

Topological sort — ordering of nodes in a DAG such that every edge points from earlier to later. DFS post-order reversed.

Strongly Connected Component (SCC) — maximal subset of nodes where every node can reach every other. Tarjan's and Kosaraju's algorithms find SCCs in O(V+E).

Bidirectional BFS — BFS from both start and goal, meeting in the middle. Roughly halves the visited count for shortest-path queries.

IDDFS (Iterative Deepening DFS) — DFS to depth 1, then 2, then 3, etc. Combines DFS memory bound with BFS-like level discovery.

Multi-source BFS — BFS started from multiple nodes simultaneously, all at distance 0. Computes shortest distance to nearest source.

One last thought

BFS and DFS are the only two graph traversal algorithms anyone uses. Every other graph algorithm — Dijkstra, A*, Bellman-Ford, Kruskal, Prim, Tarjan SCC, Topological sort — extends one of them with extra bookkeeping.

Master these two and the rest fall in line:

Dijkstra = BFS with a priority queue (extracts smallest-distance node, not next-in-queue).

A* = Dijkstra with a heuristic added to priority.

Bellman-Ford = repeated edge-relaxation; not BFS or DFS but uses the same edge-traversal idea.

Kruskal MST = sort edges by weight, union-find to detect cycles; orthogonal to BFS/DFS.

Prim MST = BFS with a priority queue keyed on edge weight from current MST.

Topological sort = DFS post-order reversed, or BFS with in-degree tracking (Kahn's).

SCC = DFS with low-link tracking (Tarjan) or two DFS passes on G and reverse-G (Kosaraju).

That's basically the entire undergraduate graph algorithms course in seven bullets. BFS and DFS are the foundation.

Bookmark this page. Return when you next need to remember which to use, or which data structure to reach for. Same skeleton; different priority.

Parallel BFS at scale

When the graph is too big for one machine.

BFS at internet scale (Facebook's social graph, the entire Wikipedia link graph, the web crawl from Common Crawl) doesn't fit on one machine. The standard pattern is level-synchronous BFS: process all nodes at distance D in parallel, then advance to D+1.

// Pregel-style pseudo-code (Apache Giraph, Google's Pregel)
function compute(vertex, messages, superstep):
    if superstep == 0 and vertex.is_source:
        vertex.distance = 0
        for n in vertex.neighbors:
            send_message(n, 1)
    else:
        for msg in messages:
            if vertex.distance is unset or msg.distance &lt; vertex.distance:
                vertex.distance = msg.distance
                for n in vertex.neighbors:
                    send_message(n, msg.distance + 1)
    vote_to_halt

Each "superstep" is one BFS level. Vertices process their incoming messages in parallel, update their state, and emit messages to neighbours for the next superstep. The graph is partitioned across machines; messages between partitions traverse the network.

The challenge at scale: skewed graphs (a few super-nodes connected to millions) cause load imbalance. The standard fix is to replicate super-node neighbour lists across machines, processing fragments in parallel. Facebook's social graph has accounts (celebrities) with tens of millions of followers; the engineering around them is sophisticated.

BFS at this scale is used for: shortest-path queries (degrees of separation), influence propagation (rumour spreading models), connected-components (community detection), centrality measures (PageRank is iterative not BFS, but uses similar partitioning).

Graph databases and traversal queries

Graph databases (Neo4j, TigerGraph, JanusGraph, ArangoDB) expose BFS / DFS as first-class query operations.

Neo4j's Cypher query language uses path patterns:

// Find all friends of Alice within 3 hops
MATCH (alice:Person {name: 'Alice'})-[:FRIEND*1..3]-(friend)
RETURN DISTINCT friend.name

// Shortest path between Alice and Bob
MATCH p = shortestPath(
    (alice:Person {name: 'Alice'})-[:FRIEND*]-(bob:Person {name: 'Bob'})
)
RETURN p

Under the hood, Neo4j's query planner translates these into BFS (for shortestPath) or DFS (for unbounded path enumeration) over its native graph storage. Indexes on starting nodes; pointer-chasing through edges; cycle detection via visited bitsets.

The performance differs from relational JOIN-based traversal — Neo4j is roughly 10-100× faster for deep traversals because every edge is a direct pointer (not a foreign-key lookup). For shallow queries (1-2 hops), relational databases with proper indexes can match or beat graph DBs.

Graph DBs win when: traversal depth is variable, paths are first-class data, the graph is dense. They lose when: ad-hoc analytical queries, the data is mostly flat, and SQL skills are more available than Cypher.

Subtle edge cases

Self-loops. A node with an edge to itself. BFS will discover it via the self-loop on the first iteration; the visited check prevents re-processing. Most adjacency representations handle this naturally; some don't.

Multigraphs. Multiple edges between the same pair of nodes. For BFS/DFS, redundant — once you've visited a neighbor, additional edges to it are ignored via the visited check. Memory cost remains O(E) including duplicate edges.

Disconnected components. BFS/DFS from a single source only explores the component containing that source. To find all components, wrap in a loop: for each unvisited node, start a new BFS/DFS. Counts components in O(V+E) total.

Infinite graphs. BFS on an implicit infinite graph (e.g., chess positions) only works if you bound the depth or have a target reachable in finite hops. Otherwise it runs forever. DFS with depth-bound (IDDFS) is the standard for this.

Graphs with sentinel nodes. "Virtual" starting nodes connected to all real starting nodes — useful trick for multi-source BFS. Add a virtual node 0 connected to all real sources, run BFS from 0; subtract 1 from all distances to ignore the virtual edge.

Graphs with negative or zero edge weights. Plain BFS assumes uniform-cost edges. With weights, use Dijkstra (non-negative) or Bellman-Ford (any weights). Zero-weight edges have a special trick: 0-1 BFS with a deque.

Implicit graphs. Some problems have implicit edges — generated on demand. Word Ladder: edges are "single-letter substitutions in the dictionary". Sliding puzzle: edges are "valid moves from current configuration". BFS/DFS work the same; you just compute neighbors lazily.

Bipartite graphs. BFS naturally 2-colors a bipartite graph as it traverses (alternate colors per level). If you ever discover an edge between same-color nodes, the graph isn't bipartite.

Final remarks

BFS and DFS are foundational. Almost every more-complex graph algorithm is built on one of them. Mastering them — both the recursive and iterative forms, both directed and undirected graphs, all the edge cases above — is the unit of competence in graph algorithms.

If you have an hour to spend on a single topic in algorithms, spend it on BFS/DFS. The compound interest on this knowledge is enormous; every subsequent graph algorithm comes for free.

If you find yourself reaching for graph algorithms in a problem and don't know which to use — BFS or DFS — ask: do I need shortest path (BFS) or any path (DFS)? Do I care about order of discovery (BFS) or order of finishing (DFS)? Is the graph wide-shallow (DFS for memory) or deep-narrow (BFS for memory)? Those three questions decide it.

And finally: if the interview question involves a grid or a tree, the answer is BFS or DFS. Don't overthink. Pick the right one for the question type and execute cleanly.

BFS vs DFS in one table

Property	BFS	DFS
Data structure	Queue (FIFO)	Stack (LIFO; recursion or explicit)
Exploration order	Level by level	Path by path
Shortest path (unweighted)	Yes	No
Memory worst-case	O(width)	O(depth)
Tree application	Level-order print	Pre/in/post-order
Cycle detection	Awkward	Natural (3-color)
Topological sort	Kahn's (in-degree)	Post-order reversed
SCC	Not natural	Tarjan / Kosaraju
Maze solving	Shortest path	Any path / generation
Goes infinite on infinite graphs?	Yes (no depth bound)	Yes (no depth bound)
Iterative vs recursive	Always iterative	Both common
Bidirectional variant	Yes (common optimisation)	Rare
Native to graph databases	Yes	Yes

Print this table; tape it to your monitor. When in doubt, look at the row matching your problem.

Frequently asked

Q: Why does the simulator only show 12 nodes? A: Pedagogical clarity. Twelve is enough to demonstrate the difference between BFS and DFS without overwhelming the visual. Production graphs have millions of nodes; the algorithm is identical.

Q: How do I know which to use in an interview? A: If the question asks for "shortest", "minimum number of steps", or "fewest", use BFS. If it asks "is there a path", "find any solution", or involves topological/dependency ordering, use DFS. If it asks for "all paths", you almost always need DFS.

Q: What about weighted graphs? A: BFS no longer works for shortest path on weighted graphs. Use Dijkstra (non-negative weights) or Bellman-Ford (any weights). BFS still works for "any path" or connectivity questions.

Q: Can BFS and DFS be combined? A: Yes — iterative deepening DFS (IDDFS) acts like BFS in terms of finding shortest paths but uses DFS-level memory. Best of both for very large or infinite graphs.

Q: Why does the visited check happen on enqueue, not pop? A: Checking on enqueue prevents duplicate entries in the queue. If you check on pop, the same node can be enqueued many times, consuming memory. The popped-but-already-visited check is a defensive backup, not the primary mechanism.

Q: How do graph databases implement these? A: Neo4j uses pointer-based adjacency lists with bidirectional traversal cursors. Cypher queries with path patterns translate to BFS or DFS over this storage. The performance edge over relational JOIN-based traversal is the absence of foreign-key lookups.

Q: What's the relationship between BFS and Dijkstra? A: BFS = Dijkstra with all edge weights = 1 (and the priority queue degenerates to a FIFO queue). Dijkstra generalises BFS to arbitrary non-negative weights by using a min-heap instead of a queue.

Q: Can I parallelise BFS or DFS? A: BFS, yes — level-synchronous BFS partitions each level across cores or machines. DFS, hard — the depth-first order is inherently sequential. Parallel DFS exists (with work-stealing) but is harder to get right.

One more time

BFS uses a queue. Pops from the front. Explores level by level. Shortest path on unweighted graphs.

DFS uses a stack. Pops from the top. Explores path by path. Natural for topological sort, cycle detection, post-order operations.

Both are O(V + E) time. BFS memory grows with width; DFS with depth.

That's the algorithm. The rest is variations on a theme. Once you have the theme, you have most of graph algorithms.

References to keep handy

Knuth, Vol. 1. The original formal definitions of BFS and DFS, with proofs.

CLRS, Chapter 22. The textbook treatment most CS programs use. Pseudocode, complexity proofs, classic applications.

Sedgewick, Algorithms in C++, Part 5. More concrete, more code, beautiful diagrams. The accessible alternative to CLRS.

Tarjan's 1972 SCC paper. Free online; ten pages. Shows DFS doing something most don't realise it can.

Leiserson and Schardl, "A Work-Efficient Parallel Breadth-First Search Algorithm". 2010. The state-of-the-art parallel BFS.

Visualgo.net. Free interactive visualisations of every standard algorithm. Cite them in your interview prep.

LeetCode tag: BFS, DFS, Graph. A few dozen problems each. Solve all the easy and most of the medium; the hard ones force the variants above.

Closing thought

The two algorithms above are 60+ years old and have not been improved on. They are optimal — visit each node once, traverse each edge once. The only improvements have been parallel and distributed variants for graphs too large for one machine.

That kind of longevity is rare in CS. Master them, and you have one of the most durable algorithmic primitives that exists. Every other graph algorithm rides on top of them. Whenever you see a query like "find the shortest sequence of changes" or "is there any path from X to Y", you should hear BFS or DFS firing in the background.

Go forth and traverse.

Five common BFS/DFS patterns you'll re-use

Pattern 1 — Flood fill

Spread a value from a seed to all connected cells matching a predicate. Used in paint-bucket tools, mine-detector games, image-segmentation. BFS or DFS, both work. The pattern: start from the seed, expand to 4 (or 8) directional neighbors that match, mark visited, recurse.

Pattern 2 — Level-by-level processing

BFS, but track the level boundary. Process all nodes at depth D, then advance to D+1. Used in word-ladder (count steps), tree right-side view (last node per level), zigzag tree traversal. Implement by remembering queue size at the start of each level iteration.

Pattern 3 — Multi-source BFS

Initialize queue with multiple starting nodes (all distance 0). Used in rotten-oranges, walls-and-gates, distance-to-nearest-X problems. The trick is the queue holds multiple zeroes; the rest of BFS is unchanged.

Pattern 4 — DFS with backtracking

Pre-order add, recurse, post-order remove. Used in path enumeration, generating all permutations/combinations, N-Queens. The "post-order remove" is the backtracking step.

Pattern 5 — Cycle detection in directed graphs

Three-color DFS. Mark white (unvisited), gray (on current DFS path), black (fully explored). A gray-to-gray edge is a cycle. Used in deadlock detection, course-prerequisite cycle check, dependency-graph linting.

These five patterns cover ~80% of BFS/DFS interview problems and ~95% of production uses. Master them and you have the core toolkit.

Sign off

Sixty-five years after Moore introduced BFS and Tarjan formalised DFS, these two algorithms remain the backbone of graph processing. Production systems run them billions of times per day — every Google search, every Facebook friend lookup, every Kubernetes scheduling decision, every compiler build. They scale from the toy 12-node graph in the simulator above to graphs with billions of nodes across thousands of machines.

The simulator gives you the visual intuition. The 23 sections of prose give you the engineering depth. Together they cover what a graduate CS student spends a semester learning, in interactive form, free, in a couple of hours.

The pattern is more important than the puzzle. Carry it forward. Every time you reach for a graph algorithm, recognize whether you need breadth or depth, weighted or unweighted, shortest or any, and pick accordingly. The algorithms scale.

Quick recap

BFS: queue, FIFO, level-by-level, shortest unweighted path, memory grows with width.

DFS: stack, LIFO, path-by-path, topological sort and cycle detection, memory grows with depth.

Both O(V+E). Variants for every situation: bidirectional, multi-source, 0-1, iterative deepening, Tarjan SCC.

Found in: web crawlers, GPS, garbage collectors, package managers, parsers, game AIs, social networks, Kubernetes, graph DBs, compiler analyses.

Bookmark this. Return when you need to recall an edge case or a variant.

— end of page —

If you implement these once

You will reach for them again every few months for the rest of your career. They are the floor of graph processing, the building block of every other graph algorithm, and the answer to a surprisingly large set of interview questions. The hour you spend writing both forms cleanly pays back every time you encounter "find shortest path", "topologically order", "find connected components", or "is there any sequence of steps from X to Y". That's probably every other algorithmically interesting feature you'll build.

Bookmark, internalise, and move on to the next algorithm.

The simulator above and the prose below earn their slot. The rest is up to you.

A favorite mnemonic

BFS = Brothers and Sisters First (siblings before children).

DFS = Down to the Floor First (children before siblings).

Cheesy but it sticks.

When in doubt

If you don't know which to use, write BFS. It always works for the "find shortest" question, which is what most interview problems are. DFS is the right answer when the question is asking for something more structural (topological order, cycle, SCC, all-paths).

And if you forget which is which, remember: queues hold things in order (BFS); stacks let the last-added jump the line (DFS).

Five sentences to remember

1. BFS is shortest path in unweighted graphs.

2. DFS post-order reversed is a topological sort.

3. Both run in O(V+E) and visit every node exactly once.

4. Always mark visited on enqueue, not on pop, to bound queue memory.

5. When you need shortest but with weights, you want Dijkstra, not BFS.

If you remember those five, the rest of graph algorithms feels easier when you meet it.

One sentence summary

"BFS and DFS visit every node and edge once; BFS via a FIFO queue gives level-by-level order and shortest unweighted paths; DFS via a LIFO stack (or recursion) gives path-by-path exploration and powers topological sort, cycle detection, and strongly-connected components."

That's the whole simulator distilled. Memorise that sentence and you can write either algorithm from scratch.

Thank you for reading.

Last word

You now have two of the most-asked-about, most-used, and longest-lived algorithms in computer science. Take pride in writing them well. The next time someone hands you a tangle of nodes and edges, you'll know what to reach for. Both directions: which to use, and how.

Now close this tab and go solve a graph problem.

PS — if you bookmark one resource on graph algorithms

Make it Visualgo (visualgo.net). They have animated implementations of every graph algorithm, with adjustable inputs, way back to 2012. Open-source maintained at NUS. If this simulator helped you, Visualgo will help you for everything that's not on Semicolony.

And bookmark this page too. Return when you forget which is which. We all do.

Closing meta-point

Every page on Semicolony tries to deliver one of two things: practical interactives you can poke at, or deep prose you can return to. The good ones deliver both. The BFS/DFS sim above is the interactive; everything below is the prose. Together they are roughly the depth of a CS textbook chapter, with the simulator carrying the intuition the textbook can't.

That's the recipe for every Semicolony simulator. The pattern repeats across the other 50+ on the site.

End of page. Now go pick a graph and traverse it.

Acknowledgements

Thanks to: Edward Moore (BFS, 1959), Robert Tarjan (DFS-based SCC, 1972), Donald Knuth (the standardised terminology), the OEIS contributors (graph properties), the Visualgo team (decades of free graph-algorithm animations), the LeetCode editorial volunteers, and every interviewer who asked a graph question and gave me a chance to walk through the answer cleanly.

This page exists because graph algorithms are foundational, the textbooks are dense, and interactive teaching beats prose teaching for spatial intuition. The hope is that the next time you reach for BFS or DFS, you do so with confidence about why, how, and what to expect.

Final final note

If you found this page useful, share it. The simulator is interactive; the prose is thorough; both are free.

And if you found a bug or have a suggestion, the about page has contact details.

Now: graph. Algorithm. Go.

(The graph theorist's joke: every problem is a graph problem if you squint hard enough. Half the time it's actually true.)

End.

Postscript: where to read more

Visualgo for animated graph algorithms. CLRS for theoretical depth. Visualgo again because it's that good. The original Moore (1959) and Tarjan (1972) papers are short and free if you want primary sources. LeetCode's BFS/DFS tags for ~50 problems each, organised by difficulty. The codex link above for the long-form prose chapters on each pattern. Then build something.

Truly final note

Algorithms persist. Tech stacks rot. Master the durable layer.

Take BFS and DFS with you wherever you go in your career. The investment pays compound interest.

That's a wrap.

Until the next algorithm.

— end —

(seriously, end)

— Semicolony

(closing the page)

Page length: 23 sections of prose plus an interactive simulator. Read time: ~50 minutes for the whole thing; ~3 minutes for the simulator interaction. Last updated 2026-05-25.

Found this useful?