14 / 20 · Tier C
Patterns / 14

Trie — prefix tree

A tree keyed by characters. Each node represents one character; its children are the next characters that can follow; an is_word flag on a node marks the end of a complete word. Insert, search, and prefix-match all run in O(L) time where L is the key length — independent of how many words the trie holds.


1 · Why tries matter

A hash map answers one question: is this exact string in my collection? It does it in O(L). A trie answers that one and four more.

What's the longest prefix of this string that's a stored key? How many stored keys start with this prefix? Which stored keys match this wildcard pattern? Give me the lexicographically smallest k completions of this prefix. All in O(L), where L is the length of the query, no matter how many words are in the dictionary.

The structure is a tree where each edge represents a character. A path from root to a node spells a prefix. A flag on the node (typically is_word) marks whether that prefix is itself a stored word. Two words that share a prefix share that portion of the tree. A trie of 100,000 English words ends up considerably smaller than 100,000 separate strings would be, because re-, un-, and -ing only get stored once.

Real production examples

  • Autocomplete. Google search. Every IDE's completion popup. Every CLI's tab-completion. Walk the trie to the prefix node in O(L), then DFS for top-k matches by frequency. Production systems layer a learned re-ranker on top, but the substrate is a trie.
  • IP routing tables (longest prefix match). Forward a packet by finding the routing-table entry whose prefix matches the destination IP for the most bits. A bit-trie on 32-bit addresses runs in 32 steps per packet, fast enough for line-rate forwarding. Cisco's CEF (Cisco Express Forwarding) is a production-grade variant of exactly this.
  • Spell-checkers. Store the dictionary in a trie. Suggestions are nearby paths measured by edit distance. Hunspell (used by OpenOffice and Firefox) uses an affix-augmented trie.
  • T9 keyboards. Each digit maps to 3 or 4 letters. Walk a trie where each node has children indexed by digit. The multi-character mapping fans out, but pruning by stored-word terminates branches quickly. 2000s mobile phones; the structure still ships in IME engines today.
  • URL routing in web frameworks. FastAPI, Echo (Go), and gin compile route patterns into a trie keyed on URL segments. /users/:id/posts and /users/me share the /users prefix. Routing a request walks the trie in O(segments).
  • Genomic sequence indexing. Suffix tries and their compressed cousins (suffix trees, suffix arrays) sit underneath BLAST, BWA, and Bowtie. Each one stores every suffix in a tree and answers "where does this read match?" in O(L).
If the question involves "prefix" or "starts with" or "autocomplete" or "longest matching root" or "many strings sharing structure", reach for a trie. If it's just "did I store this exact string", a hash map is simpler and cheaper.

2 · Intuition — a tree where edges are characters

A trie storing ["cat", "car", "cab", "do", "dog"]:

cabrtdogLegendrootintermediateis_word = trueStored wordscat, car, cab, do, dogObservations"cat", "car", "cab" share c→a"do" is a prefix of "dog"→ "do" needs is_word=true5 words, 9 nodessearch("cat") = truesearch("ca") = false · startsWith("ca") = truesearch("do") = true (intermediate node has is_word marked)
Why "do" needs is_word even though it's not a leaf. A node can be both an intermediate (children exist) and a terminal (the path to it spells a stored word). When "do" and "dog" are both stored, search("do") must return true even though there's a g child below. The flag is what distinguishes "this is a stored word" from "this is a prefix of one."

3 · How to recognise it

A trie fits when the problem is keyed on prefixes rather than exact strings. A few signals:

  • Prefix-search / autocomplete. "What stored words start with X?" Or: "the user typed X — what did they mean?" Walk to the prefix node in O(k), DFS from there.
  • Word search on a grid (LC 79 / LC 212) with a dictionary. Many target words, same board. Build them into a single trie. DFS the board carrying a trie pointer, and prune the instant the current path isn't a prefix of any word. One traversal does what would otherwise take N.
  • Longest prefix match. Find the longest stored prefix of a query string. IP routing, URL routing, linguistic stemming. Walk the trie character-by-character on the query and remember the deepest is_word node you saw.
  • "Find words with property X" across a dictionary. A trie lets you enumerate stored words in lexicographic order, filter by structural properties (length, character set), or compute aggregates per prefix (count of words sharing a prefix) in single DFS passes.
  • Wildcard matching (LC 211). Patterns like "c.t" matching "cat" / "cot" / "cut". At a wildcard position, recurse into every child of the current node. The structure makes this natural.
  • XOR maximisation on integers (LC 421). Build a 32-level bit trie. For each query number, walk greedily, picking the opposite bit at each level when possible. That maximises XOR with the query in O(32).
  • "Replace each word with its shortest root in the dictionary" (LC 648). Walk the word and stop the instant you hit an is_word node. The trie makes that lookup constant per character.
  • Streaming character-by-character matching (LC 1032). Insert dictionary words reversed. On each new character, walk backwards from the current stream position checking is_word at each step. A match means a stored word just ended at the current character.
Hash map vs trie in one line: hash map answers "is this exact string a key?" Trie answers that and "is any key a prefix of this string?" and "what keys start with this prefix?" — all in the same O(L). If only exact match matters, the hash map is the right answer.

4 · Sister algorithms

A trie is the entry point to a family of string-indexing structures. When the plain trie isn't a good fit (too memory-hungry, can't handle substring queries, can't match many patterns at once) one of the sisters takes over.

  • Aho-Corasick automaton. A trie of patterns plus "failure links" telling you where to fall back when the current path can't be extended. Matches K patterns against a text in one linear pass: O(|text| + total |patterns| + matches). Sits inside every log scanner, intrusion detection system, spam filter, and grep -F -f patterns.txt.
  • Suffix tree (Ukkonen's algorithm). A compressed trie containing every suffix of a single string. Builds in O(n). Answers "does this substring appear?", "how often?", "longest repeated substring?" in O(L) per query. Harder to implement than a basic trie. Worth it when substring queries are the hot path.
  • Suffix array. A flat sorted array of suffix-start-indices, an alternative to the explicit suffix tree. Less expressive but far more memory-efficient: N integers instead of many megabytes of tree. Built in O(n log n), or O(n) with SA-IS. Used by bioinformatics aligners (BWA, Bowtie) and full-text search engines.
  • Patricia trie / radix tree. A trie compressed by merging chains of single-child nodes into one labelled edge. Saves an order of magnitude of memory when many words share long unique extensions, like IP routing tables where most subnets have a single nested route. Linux's IP routing and the Go HTTP router are both built on radix trees.
  • DAWG (Directed Acyclic Word Graph). The end of the compression line. Also merges equivalent suffix subtrees, not just prefix paths. Stores the English dictionary in around 300 KB. Scrabble engines (Quackle, Maven) ship a DAWG of every valid word.
  • Ternary search trie. Each node has three children: less, equal, greater. The search-tree analogue of a trie. Less memory than a full-array trie, slightly slower than a hash-map trie. Niche.
  • Generalised suffix tree. Suffix tree built over multiple strings simultaneously. Answers "longest common substring across the input set" and similar multi-string queries in O(n).
An escalation ladder, if you find yourself outgrowing the plain trie. Memory tight? Patricia / radix tree. Substring queries (not just prefix)? Suffix tree or suffix array. Many patterns vs one streaming text? Aho-Corasick. Most compact form of a static dictionary? DAWG.

5 · Canonical example — Implement Trie (LC 208)

Problem. Implement a trie supporting three operations: insert(word), search(word) returning true if the exact word was inserted, and startsWith(prefix) returning true if any inserted word starts with the prefix.

trie.insert("apple")
trie.search("apple")     -> True
trie.search("app")       -> False     # "app" was never inserted as a word
trie.startsWith("app")   -> True      # but "apple" starts with "app"
trie.insert("app")
trie.search("app")       -> True

Walk the word character by character from the root. At each step, look up the current character in the node's children; create a new child node if it isn't there; descend. When you reach the end of the word, set is_word = True on the node you landed on. Search and startsWith walk the same way and either find every character (returning the node) or fall off the trie (returning failure).

6 · Reference implementation

class TrieNode:
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children: dict[str, "TrieNode"] = {}
        self.is_word: bool = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_word = True                  # mark the terminal node

    def search(self, word: str) -> bool:
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix: str) -> bool:
        return self._walk(prefix) is not None

    def _walk(self, s: str) -> "TrieNode | None":
        node = self.root
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return None
        return node
Without is_word, search("ant") would return true the moment "antelope" was inserted, because the path for "ant" already exists. The flag is the only thing separating "this is a stored word" from "this is a prefix of one." Easy to forget on a whiteboard.

7 · Variations

The shape of children and what you DFS over changes by problem. The common ones:

VariationWhat changesExample
Array-of-26 childrenFixed-size children: list[TrieNode] = [None] * 26, indexed by ord(ch) - ord('a'). Faster than a dict, but assumes a known alphabet.Lowercase-only LC problems
Dict-of-childrenThe flexible version. Use when the alphabet is unicode, or sparse, or unknown ahead of time. Slower constant factor; usually still fine.LC 208 in Python
Trie + DFS on gridBuild the trie from the dictionary; DFS the board; at each step pass the current trie node down. Stop the moment the next character isn't a child.LC 212 Word Search II
Suffix trie / suffix treeInsert every suffix of a single string. Answers "does this substring occur" in O(L). The compressed form (suffix tree) builds in O(n) with Ukkonen's algorithm; rare in interviews.Substring queries
Aho-CorasickA trie of patterns plus failure links, for matching many patterns against one text in a single linear pass. Comes up in log scanning and intrusion detection, almost never in interviews.Multi-pattern search
Bit trie (XOR trie)Insert integers as 32-bit binary strings. Walking the trie greedily picking the opposite bit at each step finds the value maximising XOR with a query in O(32).LC 421 Max XOR

7a · A trie growing as you insert

Inserting the words "cat", "car", "card", "cards", "dog" one at a time into an empty trie. Each character is a node; the bold green nodes are is_word = true. Notice how "car" → "card" → "cards" shares the same prefix path — that's where tries beat hash maps on prefix-heavy data.

catrdsdogLegendintermediateis_word = trueStored wordscat, car, card, cards, dogNotice"car" and "card" share path c→a→r"cat" diverges at depth 3 (t vs r)5 words, only 11 nodes (vs hash: 5 strings)
Storing a dictionary of 100,000 English words in a trie uses roughly 1M nodes. The hash-map equivalent is 100k strings totalling around 700k bytes. For Latin-alphabet text, prefix sharing typically saves 30–50% of memory over the hash map, because thousands of words share short prefixes like re-, un-, -ing, and -tion.

7b · Trie vs hash map — when each wins

OperationHash mapTrie
Exact lookupO(L) hash + O(L) compare = O(L)O(L) traversal
Prefix lookup ("does any stored key start with X?")Impossible without O(N) scanO(L) — walk to the prefix node
Autocomplete (top-K with prefix)O(N × L) scan + filterO(L + K) — DFS from the prefix node
InsertO(L)O(L)
DeleteO(L)O(L) — mark is_word=false, optionally prune dead branches
MemoryO(N × L) — full string per keyO(unique prefix nodes) — usually 30-50% smaller
Iteration in sorted orderO(N log N) — sort after scanO(N) — DFS yields sorted
Range queries ("keys between car and cat")HardNatural — bounded DFS
Reach for a trie when prefix matters: autocomplete, IP routing, spell-checkers, dictionary lookups with wildcards. Reach for a hash map when only exact matches matter and memory is tighter than CPU. Trie wins on prefix. Hash wins on randomly distributed exact keys.

8 · Three worked problems

Three problems where the trie is the central character. Each highlights a different trick: pruning a grid DFS, switching the alphabet from characters to bits, and ranking completions.

8a · LC 212 — Word Search II

Problem. Given an m × n board of characters and a list of words, return all words from the list that appear on the board (can be built by adjacent cells, each cell used at most once per word).

Input:  board = [["o","a","a","n"],
                 ["e","t","a","e"],
                 ["i","h","k","r"],
                 ["i","f","l","v"]]
        words = ["oath","pea","eat","rain"]
Output: ["oath","eat"]

The naive approach is "run LC 79 Word Search for each word separately": O(words × m × n × 4^L). On 10k words and a 10×10 board, that times out.

The trie trick: build all dictionary words into one trie, then do a single DFS over the board carrying a trie pointer. At each step, if the current cell's character isn't a child of the current trie node, prune immediately. When you hit is_word, record the word.

type TrieNode struct {
    children map[byte]*TrieNode
    word     string // store the word at terminal nodes (saves rebuilding from path)
}

func findWords(board [][]byte, words []string) []string {
    // Build trie
    root := &TrieNode{children: map[byte]*TrieNode{}}
    for _, w := range words {
        node := root
        for i := 0; i < len(w); i++ {
            c := w[i]
            if node.children[c] == nil {
                node.children[c] = &TrieNode{children: map[byte]*TrieNode{}}
            }
            node = node.children[c]
        }
        node.word = w
    }

    rows, cols := len(board), len(board[0])
    found := []string{}

    var dfs func(r, c int, node *TrieNode)
    dfs = func(r, c int, node *TrieNode) {
        if r < 0 || r >= rows || c < 0 || c >= cols {
            return
        }
        ch := board[r][c]
        if ch == '#' {
            return // already visited
        }
        next := node.children[ch]
        if next == nil {
            return // PRUNE — no word in dictionary continues this way
        }
        if next.word != "" {
            found = append(found, next.word)
            next.word = "" // dedupe: don't report the same word twice
        }
        board[r][c] = '#' // mark visited
        dfs(r-1, c, next)
        dfs(r+1, c, next)
        dfs(r, c-1, next)
        dfs(r, c+1, next)
        board[r][c] = ch // restore
    }

    for r := 0; r < rows; r++ {
        for c := 0; c < cols; c++ {
            dfs(r, c, root)
        }
    }
    return found
}
Without the trie, each board cell is potentially the start of up to 10⁴ DFS searches (one per word). With it, the board is searched once from each cell, and at every step the trie tells you whether the current path is still a prefix of any stored word. The pruning is what makes the algorithm sub-quadratic in the number of words. It's one of my favourite interview problems for exactly this reason: combine two data structures and the cost drops by a factor of the dictionary size.

8b · LC 421 — Maximum XOR of Two Numbers in an Array

Problem. Given an integer array nums, find the maximum XOR of any two distinct elements. n ≤ 2 × 10⁵, values up to 2³¹.

Input:  nums = [3, 10, 5, 25, 2, 8]
Output: 28
Explanation: 5 XOR 25 = 28 (in binary: 00101 ^ 11001 = 11100)

The brute force is O(n²). The trie approach is O(n × 32). Build a 32-level binary trie where each level is one bit of the number (MSB first). For each query number, walk the trie greedily — at each level, try to go the way that makes the XOR bit a 1 (the opposite of the query bit). If that child doesn't exist, fall back to the same-bit child.

type BitTrie struct {
    children [2]*BitTrie
}

func findMaximumXOR(nums []int) int {
    root := &BitTrie{}

    // Insert each number bit-by-bit, MSB first
    for _, num := range nums {
        node := root
        for i := 31; i >= 0; i-- {
            bit := (num >> i) & 1
            if node.children[bit] == nil {
                node.children[bit] = &BitTrie{}
            }
            node = node.children[bit]
        }
    }

    // For each number, greedily pick opposite bits to maximise XOR
    best := 0
    for _, num := range nums {
        node := root
        cur := 0
        for i := 31; i >= 0; i-- {
            bit := (num >> i) & 1
            opp := 1 - bit
            if node.children[opp] != nil {
                cur |= (1 << i) // we got a 1 in the XOR at bit i
                node = node.children[opp]
            } else {
                node = node.children[bit]
            }
        }
        if cur > best {
            best = cur
        }
    }
    return best
}
Once you've seen a bit trie used for XOR, the same shape solves several other "maximise some bitwise function" problems: max-AND-of-subset, longest matching prefix in IP routing, finding numbers with specific bit properties. The trie's tree structure mirrors binary decisions exactly. If a problem asks for the max of an XOR over an array, this is almost always the answer.

8c · LC 1268 — Search Suggestions System (HARD)

Problem. Given a sorted list of products and a searchWord, return a list of lists: for each prefix of searchWord (lengths 1 to len), return up to 3 products that start with that prefix, in lexicographic order.

Input:  products = ["mobile","mouse","moneypot","monitor","mousepad"]
        searchWord = "mouse"
Output: [
  ["mobile","moneypot","monitor"],   // prefix "m"
  ["mobile","moneypot","monitor"],   // prefix "mo"
  ["mouse","mousepad"],              // prefix "mou"
  ["mouse","mousepad"],              // prefix "mous"
  ["mouse","mousepad"]               // prefix "mouse"
]

Two clean approaches. Both worth knowing.

  • Trie + DFS for top-3 lexicographic. Build a trie. At each terminal, optionally store the actual word. For each prefix length k, walk to the prefix node, then DFS in alphabetical order, collecting up to 3 words.
  • Sort + binary search. Sort the products. For each prefix, binary-search for the leftmost product starting with it; take up to 3 consecutive products that still start with the prefix. Simpler code; the canonical "two-pointer" alternative.
type SuggestNode struct {
    children [26]*SuggestNode
    // Suggestions: top-3 lex-smallest words passing through this prefix
    suggestions []string
}

func suggestedProducts(products []string, searchWord string) [][]string {
    sort.Strings(products)
    root := &SuggestNode{}

    for _, p := range products {
        node := root
        for i := 0; i < len(p); i++ {
            idx := int(p[i] - 'a')
            if node.children[idx] == nil {
                node.children[idx] = &SuggestNode{}
            }
            node = node.children[idx]
            if len(node.suggestions) < 3 {
                node.suggestions = append(node.suggestions, p)
            }
        }
    }

    result := make([][]string, len(searchWord))
    node := root
    for i := 0; i < len(searchWord); i++ {
        if node != nil {
            idx := int(searchWord[i] - 'a')
            node = node.children[idx]
        }
        if node != nil {
            result[i] = node.suggestions
        } else {
            result[i] = []string{}
        }
    }
    return result
}
The trick is to push the lexicographic top-3 down the trie while inserting. Because products is pre-sorted, the first 3 strings to pass through any prefix node ARE the top-3 for that prefix. At query time, you don't DFS at all. You just read the stored list. Preprocessing is O(total chars of products); each query is O(|searchWord|). Production autocomplete systems do exactly this: precompute candidates at every prefix node so the query path stays flat.

9 · How to solve hard trie problems step-by-step

Five questions before writing code. The trie is dead simple; the senior part is choosing what to put in each node.

  1. What's the prefix-query structure? Characters in a 26-letter alphabet (lowercase ASCII)? Bits (XOR / IP-routing problems)? Variable-length segments (URL routing: /users, :id)? Unicode strings? The choice drives whether children is an array-of-26, an array-of-2, or a map.
  2. What gets stored at terminal nodes? Just a boolean (is_word)? The original word string (for grid problems where you need to report the word)? A frequency count (for ranked autocomplete)? A weight (for routing)? Decide before coding — it shapes the node struct.
  3. Is the trie static or dynamic? Static (built once, queried many times) → precompute aggregates at each node (top-k suggestions, prefix counts). Dynamic (inserts and queries interleave) → keep the structure lean and accept O(k) work per query for DFS aggregations.
  4. Do you need pruning during a DFS over a grid? If the problem is "find dictionary words on a board" (LC 212-flavour), the trie's central value is pruning. Without pruning, complexity explodes to words × 4^L; with pruning, it's roughly m × n × 4^L where the trie limits how far each branch can go. Make sure your DFS carries the trie pointer.
  5. What's the memory budget? A naive trie of 1 million strings of average length 10 with array-of-26 children uses around 1M × 26 × 8 bytes = 200+ MB. If memory is tight, switch to a map-of-children (saves the empty-slot waste), use a Patricia tree (compresses single-child chains), or move to a suffix array. State the back-of-envelope in interviews.
Junior candidates jump to class TrieNode with the textbook 26-children array. Senior candidates ask the four questions first: what's the alphabet, what do I store at terminals, is it static or dynamic, and what's the memory budget. Then they pick the right variant. The textbook makes trie design look uniform; it isn't.

10 · Trie variants — comparison

Different constraints call for different node layouts. Pick wrong and you get either a 3–5× constant-factor slowdown (map-of-children when array-of-26 would do) or a correctness bug (array-of-26 when the input contains unicode).

VariantChildren layoutMemory / nodeBest for
Standard trie (array-of-26)children [26]*Node~208 B (26 × 8 B pointers)Lowercase ASCII English. The fastest variant on hot paths — array index by ch - 'a'.
Array-of-128 / 256children [128]*Node~1 KBMixed-case, digits, punctuation. ASCII range. Used when the full ASCII alphabet matters.
Map-of-children (hash map)children map[rune]*Node~64 B + per-child overheadUnicode (Chinese, emoji), sparse alphabets, very large alphabets. Constant factor 3-5× slower than array.
Compressed / Patricia trie (radix tree)Edges labelled with multi-char strings; chains mergedVariable; often 5-10× smaller than naive trie on long-tail dataIP routing tables, URL routers (Go's gin, FastAPI), prefix-overlapping dictionaries with long unique extensions.
Ternary search trieEach node: left, eq, right (BST-like)~24 B (3 pointers)Memory-constrained dictionaries; mid-ground between hash-map-trie (fast but sparse) and array-trie (fast but big).
Bit trie (XOR / routing)children [2]*Node~16 B (2 pointers)XOR maximisation (LC 421), longest-prefix-match for IP addresses, "find number with property X" on bit-strings.
DAWG (Directed Acyclic Word Graph)Suffixes merged in addition to prefixes~10-50% of plain trieStatic dictionaries (Scrabble dictionaries, spell-check). Build once, query many.
Suffix treeStores every suffix; compressed via Ukkonen's algorithmO(n)Substring search across one or more documents. BLAST, BWA, bioinformatics.
For interviews with lowercase ASCII input, default to the array-of-26 variant. It's the fastest and the simplest. Switch to map-of-children if the problem says "any character" or shows unicode in examples. The other variants (Patricia, DAWG, suffix tree) are name-drop knowledge: useful for signalling depth, but you wouldn't implement them in a 45-minute window.

11 · Five-problem progression

#LCProblemWhy it's here
1LC 208Implement Trie (Prefix Tree)The minimal trie. Insert, search, startsWith — the three operations that everything else builds on.
2LC 211Design Add and Search WordsSearch with . wildcards. Forces you to recurse over all children at a wildcard position; the first taste of trie + DFS.
3LC 212Word Search IITrie + DFS on grid. The point of the pattern: pruning the DFS the instant the current path stops being a prefix of any word.
4LC 648Replace WordsInsert dictionary roots; walk each query word; stop the moment you hit is_word. Clean illustration of "shortest matching prefix".
5LC 421Maximum XOR of Two NumbersBit trie. The variation that shows the trie isn't only for strings — it's for any keys with an ordered alphabet.

12 · Common pitfalls

  • Forgetting the is_word flag. Without it, search("ant") returns true whenever a longer word like "antelope" is stored. The flag is the only thing distinguishing a stored word from a prefix of one.
  • Using dict[str, TrieNode] when an array-of-26 would do. Not a correctness bug, but a constant factor of 3–5× on hot paths. If the problem guarantees lowercase ASCII, prefer the fixed array.
  • Skipping trie pruning in Word Search II. Running DFS for each dictionary word separately is O(words × board). The trie collapses that to one DFS over the board, pruning the moment the current path stops being a prefix. Without the trie, LC 212 TLEs.
  • Confusing prefix match with full match. startsWith("app") and search("app") are different questions. The first walks to a node and returns true. The second walks to a node and checks is_word.
  • Not clearing nodes after a match in DFS. In Word Search II, once a word is found, prune the terminal node so the same word isn't reported twice. Cleaner than maintaining a results set.
  • Building the trie inside the query loop. Build it once, query many times. The whole point of a trie is amortising the dictionary build across many lookups.
  • Alphabet size confusion (26 vs 52 vs 128 vs unicode). Indexing by ch - 'a' works only for lowercase ASCII. Capital letters give negative indices. Digits crash the array. Always state the alphabet upfront, then either normalise upstream or pick the right children layout.
  • Not pruning empty subtrees during DFS. In LC 212, once a subtree contains no remaining words (their terminal flags have been cleared), you can delete the whole subtree and skip it on future DFS branches. Easy 2× win on dictionary-heavy inputs.
  • Memory blow-up storing 1M strings of length 100 naively. Worst case for an array-of-26 trie is roughly 1M × 100 × 26 × 8 bytes, or 20 GB. In practice, prefix sharing brings this down dramatically. With adversarial inputs (no overlap), the worst case bites. Reach for radix tree or map-of-children if memory is the constraint.
  • Off-by-one when iterating bit positions (LC 421). Bits go MSB-to-LSB: for i := 31; i >= 0; i--. Iterating LSB-first builds a trie that doesn't preserve magnitude ordering, breaking the "pick opposite bit" greedy. Insert and query in the same bit order.
  • Recursion depth on long words. A recursive trie DFS over a word of length 1000 blows the default stack in some languages. Either iterate or bump the limit explicitly. Real-world dictionaries cap at around 50; problem-generated adversarial inputs run longer.
  • Forgetting to dedupe stored words on insert. Inserting the same word twice is harmless for correctness but wastes work. Make insertion idempotent in your unit tests.

13 · What to say in the interview

Things worth saying out loud, in roughly this order, when you spot a trie problem.

  1. "Many strings sharing prefixes. Sounds like a trie." Name the pattern. Doubly so if the question mentions autocomplete, prefix queries, or a dictionary of words.
  2. "O(L) per operation, where L is the key length." Independent of how many words are stored. That's the headline property.
  3. "Space is O(N × L) worst case." N words of length L. In practice much less when prefixes overlap. That's why a trie beats a hash map on English dictionaries.
  4. "The is_word flag distinguishes stored words from prefixes." Mention it when describing the node. Interviewers want to hear you've thought about it.
  5. "Trie plus DFS for grid word search." If the problem looks like LC 212, name it. Then explain the pruning: stop walking the moment the current path isn't a prefix of any stored word.
  6. "Bit trie for XOR problems." If the question is about maximising XOR over an array, mention the 32-level binary trie and the greedy opposite-bit walk.
  7. Edge cases: empty string (mark is_word on the root), duplicate inserts (make insertion idempotent), unicode keys (use the dict variant), case sensitivity (normalise upstream).

14 · Where this gets asked

CompanyCommon framingWhy it fits their stack
GoogleImplement Trie (LC 208), Word Search II (LC 212), Replace Words (LC 648), Word Squares (LC 425)Search-quality + autocomplete teams. Trie + DFS on a board is a signature Google L4 question.
MetaDesign Add and Search Words Data Structure (LC 211), Concatenated Words (LC 472)Wildcard-search trie is the Meta favourite — comes up in spell-check + search-typeahead pipelines.
AmazonStream of Characters (LC 1032), Longest Word in Dictionary (LC 720), Top K Frequent Words (LC 692)Streaming + autocomplete align with Alexa intent-matching + product-search prefix indexes.
MicrosoftMaximum XOR of Two Numbers in an Array (LC 421), Word Break II (LC 140)Bit trie for XOR is the Microsoft / Bloomberg specialty. Word Break II is the recurring onsite.
AppleImplement Trie (LC 208), Search Suggestions System (LC 1268)Search Suggestions especially common — mirrors Spotlight + App Store autocomplete.
Bloomberg / CitadelMaximum XOR Subarray (variant of LC 421), Auto-complete with frequency rankingBit-trie XOR is a quant-flavoured question; ticker autocomplete shows up regularly.
Four sub-shapes cover almost every trie question you'll see. Standard insert/search/startsWith (LC 208). Wildcard search (LC 211), where . branches into every child. Trie plus DFS on a grid (LC 212), the biggest payoff combination of two structures. Bit-trie over 32-bit integers (LC 421) for XOR maximisation. Same node, different traversal rules.

15 · Try it yourself

Implement Trie · LC 208
The canonical drill. Get this in your fingers. Hint: TrieNode has children: dict[str, TrieNode] and is_word: bool. Insert walks the path, creating nodes; search walks and checks is_word; startsWith walks and returns true if you reach the end.
Add and Search Words · LC 211
Wildcard search. Hint: recursive search — on ., recurse into every child; on a literal, follow the matching edge. Return true if ANY recursive branch succeeds. Cap the recursion at word length so wildcards can't loop.
Word Search II · LC 212
Trie plus DFS, the biggest payoff combination. Hint: build a trie of the dictionary, then DFS the board. At each cell, walk into the trie child for that letter. If no edge, prune. When you hit is_word, record the word and clear the flag to dedupe.
Maximum XOR of Two Numbers · LC 421
Bit trie. Hint: insert each number bit-by-bit (MSB first) into a 32-level trie. For each number, walk the trie greedily picking the OPPOSITE bit at each level when possible — that maximises the XOR. O(N · 32).
Stretch — Stream of Characters · LC 1032
Reverse-trie + streaming. Hint: insert each dictionary word REVERSED into the trie. On each new character, walk backwards from the current position checking is_word at each step. First match means a stored word just ended.
One way to practise: write a minimal Trie class from memory, 15 lines or fewer. Once insert, search, and startsWith are in your fingers, every trie problem becomes "build it, then loop with a small twist." The data structure is the easy part. Recognising when to reach for it is the skill.

Further reading

  • Sedgewick & Wayne — Algorithms, Chapter 5.2. The textbook treatment of tries and ternary search tries. Worth reading once for the proof of the O(L) bound and the comparison to hash maps.
  • CP-Algorithms — trie / Aho-Corasick page. Compact reference with code. The Aho-Corasick section is more readable than most textbook versions.
  • Adjacent: Hash map. The structure to reach for when you only need exact-match lookup. A trie pays for prefix queries you don't need.
  • Adjacent: DFS. The other half of every grid-word-search and wildcard-search trie problem. Worth being fluent in both.
  • Suffix tree. The compressed-suffix-trie cousin. Builds in O(n) with Ukkonen's algorithm; answers substring queries in O(L). Rare in interviews, useful to recognise by name.
  • Semicolony: Suffix tree simulator. Visualises suffix-tree construction step by step. Useful for building intuition about how shared structure compresses.
Found this useful?