Binary search
Two halves of one pattern. The first. Search a sorted array, is taught
on day one. The second. Search the answer space, when the answer is
monotonic in some predicate. Is the senior-interview lever. Every
LeetCode problem with constraint n ≤ 10⁹ is asking for one of
these two. Recognise which.
Try it. Search a sorted array
Drag through the canonical search. Each step halves the search range; the whole thing terminates in ⌈log₂ n⌉ steps.
n, after
k steps the range is n / 2ᵏ; the search ends when
this drops below 1, so k ≤ log₂(n). For n = 10⁹,
that's ≤ 30 steps. The interview corollary: if the constraint is
n ≤ 10⁹, you need an O(log n) approach — binary search is
usually the one.1 · How to recognise it: five distinct shapes
Binary search has five recognisable shapes in interview problems. The first two (sorted-array, answer-space) are the foundations; the next three (rotated, 2D, unimodal) are common variants that look different but reduce to "find the half that contains the answer". Naming the shape early prevents the wrong template.
Shape A: search a sorted structure
- The input is sorted (strictly or non-strictly).
- You're looking for a specific value, or the smallest/largest element matching some property.
- Constraint
n ≤ 10⁵–10⁹with O(log n) the only viable complexity. - "Find the position to insert", "find the first/last occurrence", "find the floor/ceiling".
Shape B: search the answer space
- The answer is bounded. There's a natural
[lo, hi]range it must lie in. - You can write a predicate
can(x)that returns true ifxis feasible. The predicate is monotone: once true, stays true. - "What is the smallest
ksuch thatP(k)holds?" The "minimise the maximum" / "maximise the minimum" framing. - Brute-force checking each candidate is too slow. Binary-searching candidates in O(log range × predicate cost) wins.
Shape C: rotated sorted array
- The array was sorted, then rotated by an unknown pivot. The strict ordering is broken at one point.
- The structural invariant still holds: at any
mid, at least one of the two halves[lo..mid]or[mid..hi]is fully sorted. Detect which, check whether the target falls inside it, recurse on the appropriate side. - Two sub-shapes: (1) search for a target (LC 33, LC 81), (2) find the minimum / rotation point (LC 153, LC 154). The minimum is "the unique point where
nums[i] > nums[i+1]". - Duplicates degrade the worst case to O(n), because
nums[lo] == nums[mid]hides which side is sorted (LC 81, LC 154). Skip the duplicate withlo++and continue.
Shape D: 2D matrix / sorted-rows
- Each row is sorted, each column is sorted, or the whole matrix flattened is sorted.
- LC 74 (matrix-flattened-is-sorted) → treat
m × nas a flat sorted array;mid = lo + (hi-lo)/2, convert to(mid/n, mid%n), do a normal Shape A search. - LC 240 (rows and columns each sorted, but not flattened-sorted) → start at the top-right corner; if the cell > target, move left; if < target, move down. O(m+n), not O(log mn) — but it's still the "binary-search-flavoured" approach.
- Tell: the input is presented as a matrix and contains the word "sorted" in any direction.
Shape E: peak / unimodal arrays
- The array is "bitonic". It increases then decreases (one peak), or decreases then increases (one valley). Like a unimodal function sampled at integer points.
- The "direction signal" at
mid: comparenums[mid]withnums[mid+1]. If ascending, the peak is to the right (or atmid); if descending, to the left. - LC 162 (find peak element), LC 852 (peak in mountain array), LC 1095 (find in mountain). All variants of the same shape.
- Generalises to ternary search on real-valued unimodal functions (golden-section search for smoother convergence).
2 · Sister algorithms
Binary search sits in a small family of "halve the search space" or "trisect the search space" algorithms. Each one applies when classic binary search doesn't quite fit. Knowing the neighbours prevents you from forcing binary search where ternary or exponential would be cleaner.
- Exponential search (also "galloping search"). When the array size is unknown, say you're reading from an infinite stream or a sorted iterator that only exposes
get(i). Start withi = 1, double untilget(i) > targetor out-of-bounds, then binary-search inside[i/2, i]. Total work:O(log p)wherepis the answer position — notlog nwherenis the unknown stream size. Used in LC 702 "Search in a Sorted Array of Unknown Size". - Ternary search. For unimodal functions (one peak / one valley). Compare values at the two trisection points
m1 = lo + (hi-lo)/3andm2 = hi - (hi-lo)/3; the trisection containing the lower value can be discarded.log₃of the search space. Slightly fewer iterations than binary search, but each iteration costs two function evaluations. The integer version: LC 162 "Find Peak Element" (though Shape E binary search also solves it). - Parametric search (Megiddo, 1983). The theoretical formalisation of "binary search on the answer". When the predicate
can(x)can be evaluated by simulating a parallel algorithm that runs inO(P)time usingO(W)processors, the optimisation problem can be solved inO(P · W · log W). Rarely implemented as written; the practical name for what falls out is just "binary search on the answer with a clever predicate". - Fractional cascading. Speeds up
ksimultaneous binary searches overksorted arrays fromO(k log n)toO(log n + k). Each array is augmented with pointers into the next; one search threads through all of them. Niche but elegant. Used in computational geometry (Chazelle's persistent search trees). - Golden section search. Ternary search's continuous-domain cousin for unimodal real-valued functions. Uses the golden ratio to reduce the search interval, so one of the two evaluation points is reused next iteration. Standard in numerical optimisation libraries (SciPy's
brentfalls back to it).
3 · Why binary search on the answer keeps coming up in product interviews
Shape B (binary search on the answer) is the most recognisable "L5-and-above" pattern in problem-solving rounds. Not because it's conceptually hard (the loop is six lines), but because writing the predicate exposes whether the candidate can frame an optimisation problem as a feasibility problem. That framing is the entire skill behind capacity planning, SLA design, and resource allocation.
Three reasons it dominates senior interviews:
- "Minimise the maximum / maximise the minimum" is the universal product framing. Every operational decision is "what's the smallest
Xthat still keepsYwithin bounds?": smallest replica count that holds p99 under SLA, smallest cache size that gives 90% hit rate, smallest worker pool that drains the queue. The shape is identical across domains. - It composes cleanly with greedy and DP. The outer binary search picks a candidate; the inner predicate uses any verification technique. LC 410 "Split Array Largest Sum" composes binary search with greedy splitting. LC 1631 "Path with Minimum Effort" composes binary search with BFS reachability. Composition is a senior-engineer signal.
- It turns intractable-looking constraints into log-time problems. When the constraint is
n ≤ 10⁹, the only viable complexity isO(log range × cheap_check). Recognising the predicate framing is what unlocks the constraint.
Real product examples
- Log timestamp lookup. Find the log entry closest to a given timestamp in a sorted (or near-sorted) log file. Shape A binary search on the timestamp index. Every observability tool (Datadog, Honeycomb, Loki) does this on every range query. The constant-factor tricks are big (skip lists, sparse indices), but the algorithm is binary search.
- Video seek. Given a target timestamp in a video, find the nearest keyframe (I-frame) to start decoding. The keyframe index is sorted by timestamp; lower_bound returns the start position. Every video player on the planet does this hundreds of times per session.
- Autoscaling thresholds. "Find the smallest number of workers such that 95% of requests finish in < 200ms." The predicate is "does N workers satisfy the SLA?", evaluated by simulating the queueing model. Binary search over
N ∈ [1, max]. Real-world systems (Kubernetes HPA, AWS Auto Scaling Groups) approximate this with linear feedback loops, but the offline planning version is literally Shape B. - Rate-limit capacity. "Find the largest steady request rate the service can sustain without a single 5xx in 10 minutes." Predicate: simulate at rate X, check error count. Binary search the rate. Used by every load-testing tool (Vegeta, k6, wrk) in some form.
- Build cache TTL tuning. "Find the smallest TTL that keeps cache hit rate above 80%." Predicate: replay last week's traffic against a simulated cache, measure hit rate. Binary search over TTL values. Exactly the "minimise X subject to Y" framing.
- Capacity-to-ship planning. LC 1011 isn't a contrived problem — it's the literal scheduling question Amazon's Operations Research team solves daily. Smallest container size that fits the shipment in N days, given variable package sizes.
4 · Canonical templates
Three templates cover ~95% of binary-search problems. They are pattern primitives, not problem solutions. Internalise the shapes; the rest is choosing which to specialise.
Template A: find target (closed interval, equality)
// Find the index of target in a sorted array, or -1 if absent.
func search(nums []int, target int) int {
lo, hi := 0, len(nums)-1 // closed interval [lo, hi]
for lo <= hi {
mid := lo + (hi-lo)/2 // avoid (lo+hi)/2 overflow
switch {
case nums[mid] == target:
return mid
case nums[mid] < target:
lo = mid + 1
case nums[mid] > target:
hi = mid - 1
}
}
return -1
}Template B: leftmost / lower_bound (half-open)
// Smallest index i such that nums[i] >= target.
// Returns len(nums) if no such i exists.
func lowerBound(nums []int, target int) int {
lo, hi := 0, len(nums) // half-open interval [lo, hi)
for lo < hi { // note: < not <=
mid := lo + (hi-lo)/2
if nums[mid] < target {
lo = mid + 1
} else {
hi = mid // <-- crucial: hi = mid, not mid - 1
}
}
return lo
}
// Mirror: upperBound returns smallest i with nums[i] > target.
// Count of target in nums = upperBound(nums, t) - lowerBound(nums, t).Template C. Search the answer space
// Smallest x in [lo, hi] such that can(x) is true,
// assuming can is monotone: false ... false true ... true.
func smallestFeasible(lo, hi int, can func(int) bool) int {
for lo < hi {
mid := lo + (hi-lo)/2
if can(mid) {
hi = mid // mid is feasible — might be the answer
} else {
lo = mid + 1 // mid too small
}
}
return lo // invariant: can(lo) is true at termination
}
// To use: define can(x), bound [lo, hi] correctly, call smallestFeasible.
// Time: O(log(hi - lo) × cost(can))Closed-closed vs half-open invariants, side-by-side
Most binary-search bugs trace back to mixing the two invariant styles. Either pick closed-closed everywhere or half-open everywhere; don't mix. Here are the two consistent styles, line by line:
| Aspect | Closed-closed [lo, hi] | Half-open [lo, hi) |
|---|---|---|
| Initial bounds | lo, hi := 0, len(nums)-1 | lo, hi := 0, len(nums) |
| Loop condition | for lo <= hi | for lo < hi |
| Shrink to left (mid not answer) | hi = mid - 1 | hi = mid |
| Shrink to right (mid not answer) | lo = mid + 1 | lo = mid + 1 |
| Result after loop | lo > hi (range empty); no implicit "answer" | lo == hi; lo is often the answer |
| Best for | Equality search (return mid on match) | Boundary search ("smallest i such that...") |
lo <= hi, then the answer (if any) is found
inside the loop and you return mid on match.
If your loop uses lo < hi, then the answer is at
lo (or lo - 1, depending on what you're
searching for) after the loop. Mixing the two (say,
lo < hi with hi = mid - 1) produces
infinite loops or off-by-ones. Pick one and never mix.5 · Five-problem progression
| Step | Problem | Difficulty | What's new |
|---|---|---|---|
| 1 | LC 704 · Binary Search | Easy | The Template A baseline. |
| 2 | LC 34 · First & Last Position | Medium | Lower_bound + upper_bound. Off-by-ones in the half-open variant. |
| 3 | LC 33 · Rotated Sorted Array | Medium | Sorted-but-rotated. Decide which half is sorted at each step. |
| 4 | LC 875 · Koko Eating Bananas | Medium | Binary search on the answer. The cleanest "is K feasible?" template. |
| 5 | LC 4 · Median of Two Sorted Arrays | Hard | Binary search on a partition position. The hardest classic. |
6 · Worked example. Koko (LC 875), live
Problem. Koko has piles of bananas of various sizes. She
eats k bananas per hour (any pile she picks). She has
h hours total. Find the smallest k such that
she finishes all piles in time.
Why binary search the answer. If k works,
any larger k also works. The predicate
can(k) = (hours_needed(k) ≤ h) is monotone. So binary
search over k ∈ [1, max(piles)].
can(k) = (Σ⌈piles[i] / k⌉ ≤ h) [false, false, ..., false, true, true, ..., true] — and
binary search on a sorted boolean array is what we just learned.Build the intuition. Three things to derive
- What's the answer space? The smallest possible
kis 1; the largest meaningful ismax(piles)(any larger doesn't help: Koko can finish a single pile per hour at that speed). Sok ∈ [1, max(piles)]. - What's the predicate?
can(k)= "does eating at speedkfinish all piles in ≤hhours?". For each pilep, hours = ⌈p/k⌉. - Why is
canmonotone? A higherknever increases hours. So oncecan(k)is true, every largerkalso is. The boolean array looks likeF F F ... F T T T ... T. Exactly what binary search wants.
Go scaffold. Combine Template C with your predicate
func minEatingSpeed(piles []int, h int) int {
// STEP 1 — bound the answer space
lo := 1
hi := 0
for _, p := range piles {
if p > hi {
hi = p
}
}
// STEP 2 — write the predicate
canFinish := func(k int) bool {
hours := 0
for _, p := range piles {
// TODO: ceil(p / k) without floats: (p + k - 1) / k
// TODO: accumulate into hours; early-exit if hours > h
}
return hours <= h
}
// STEP 3 — Template C: smallest feasible
for lo < hi {
mid := lo + (hi-lo)/2
if canFinish(mid) {
hi = mid
} else {
lo = mid + 1
}
}
return lo
}canFinish, once
hours exceeds h you know k is
infeasible. Return false immediately. Skipping the rest
of the piles can be a 5–10× speed-up on adversarial inputs. Worth
mentioning aloud in interviews even if not asked.The recipe. Answer-space binary search in three steps
- Identify the answer space. Bounds
[lo, hi]that contain the answer. For Koko: 1 (slowest) and max(piles) (eats biggest pile in 1 hour). - Write the predicate.
can(k)returns true iffkis feasible. For Koko:sum(⌈p / k⌉) ≤ h. - Confirm monotonicity. Verify (mentally) that
can(k)is false-then-true, not interleaved. For Koko: monotone because largerkonly reduces hours.
Other answer-space classics
- LC 1011: Capacity to Ship Packages in D Days. Same shape; predicate is "does this capacity finish in D days?".
- LC 410. Split Array Largest Sum. "Minimise the maximum subarray sum across k splits"; predicate is "can we split into ≤ k chunks each summing ≤ X?".
- LC 1631. Path with Minimum Effort. Binary search on the maximum edge weight; predicate is BFS reachability.
- LC 2616: Minimise Maximum of Pairs. Greedy + binary search; predicate counts feasible pairings.
- LC 1283. Find Smallest Divisor. Direct Koko variant.
7 · Walkthrough. Search in Rotated Sorted Array (LC 33)
LC 33 is the classic "binary search with a twist": the array is sorted
but rotated by some unknown pivot. The trick is that at every mid,
at least one half. Left or right of mid. Is fully sorted. Decide
which half is sorted; check whether the target falls in it; recurse on
the appropriate side.
Input: nums = [4, 5, 6, 7, 0, 1, 2], target = 0
Output: 4
Explanation: 0 is at index 4.The decision rule
At mid, compare nums[lo] with
nums[mid]:
- If
nums[lo] ≤ nums[mid]: the left half[lo..mid]is sorted. Check iftargetis in[nums[lo], nums[mid]]; if so, search left, else search right. - Else: the right half
[mid..hi]is sorted. Check iftargetis in[nums[mid], nums[hi]]; if so, search right, else search left.
Go scaffold. Fill in the four branches
func searchRotated(nums []int, target int) int {
lo, hi := 0, len(nums)-1
for lo <= hi {
mid := lo + (hi-lo)/2
if nums[mid] == target {
return mid
}
// KEY DECISION: which half is sorted?
if nums[lo] <= nums[mid] {
// LEFT half [lo..mid] is sorted
// TODO: is target inside [nums[lo], nums[mid]) ?
// if yes: hi = mid - 1; else: lo = mid + 1
} else {
// RIGHT half [mid..hi] is sorted
// TODO: is target inside (nums[mid], nums[hi]] ?
// if yes: lo = mid + 1; else: hi = mid - 1
}
}
return -1
}[nums[lo], nums[mid])
closed on the left, open on the right, because we already handled
nums[mid] == target above. Symmetrically for the right
half: (nums[mid], nums[hi]]. Forgetting the strict
inequality on one side gives wrong answers on edge cases like
[3, 1] with target 1.Trace through [4, 5, 6, 7, 0, 1, 2], target = 0
| lo | hi | mid | nums[mid] | Sorted half | Target in? | Action |
|---|---|---|---|---|---|---|
| 0 | 6 | 3 | 7 | left [4,7] | 0 not in [4,7] | lo = 4 |
| 4 | 6 | 5 | 1 | right [1,2] | 0 not in (1,2] | hi = 4 |
| 4 | 4 | 4 | 0 | found | return 4 |
nums[lo] == nums[mid] is ambiguous. You can't
tell which side is sorted. The fix: skip the duplicates with
lo += 1 and continue. Worst case degrades to O(n) when the
array is mostly the same value, but average case stays O(log n).8 · Two more worked problems: rotated minimum and peak
Two cousins of Shape C and Shape E. Both are short, but the loop- invariant choices are exactly where Shape A's habits will lead you astray.
8a · LC 153: Find Minimum in Rotated Sorted Array
Problem. A sorted array was rotated by an unknown pivot. Find the minimum element. The array contains no duplicates. O(log n).
Input: nums = [4, 5, 6, 7, 0, 1, 2]
Output: 0
Explanation: rotated from [0, 1, 2, 4, 5, 6, 7].The trick: at every mid, compare nums[mid]
with nums[hi]. If nums[mid] > nums[hi],
the rotation point is to the right of mid (so the minimum
is too); otherwise it's at mid or to the left. This shape
uses the half-open [lo, hi) invariant — different from
LC 33 (which uses [lo, hi]) — because we're not searching
for equality, we're searching for a position.
func findMin(nums []int) int {
lo, hi := 0, len(nums)-1
for lo < hi {
mid := lo + (hi-lo)/2
if nums[mid] > nums[hi] {
// minimum is strictly to the right of mid
lo = mid + 1
} else {
// minimum is at mid or to the left
hi = mid
}
}
return nums[lo]
}nums[hi], not nums[lo].
Comparing with lo works for unrotated arrays but breaks on
the rotated case where the left half is fully sorted. nums[hi]
is the canonical reference: if nums[mid] > nums[hi], the
rotation point must be in (mid, hi]; otherwise the array
from lo to mid contains the minimum. The
analogue in LC 33 (searching for a target) flips this — there we compare
with nums[lo] to decide which half is sorted, because we
need to know more than just "minimum lives where".8b · LC 162: Find Peak Element
Problem. An array where nums[i] != nums[i+1]
and the boundary is implicitly -∞. A peak is an element
strictly greater than both neighbours. Return the index of any peak in
O(log n).
Input: nums = [1, 2, 1, 3, 5, 6, 4]
Output: 1 or 5
Explanation: index 1 (value 2) is a peak; index 5 (value 6) is also a peak.
Either is acceptable.This is Shape E — unimodal-flavoured binary search. The direction
signal is "is the slope at mid ascending or descending?".
If nums[mid] < nums[mid+1], the slope is up at
mid — a peak must exist somewhere in (mid, hi]
(walk uphill long enough and you'll hit one, because nums[hi+1]
is effectively -∞). If nums[mid] > nums[mid+1],
the slope is down — a peak exists in [lo, mid].
func findPeakElement(nums []int) int {
lo, hi := 0, len(nums)-1
for lo < hi {
mid := lo + (hi-lo)/2
if nums[mid] < nums[mid+1] {
// ascending — peak is to the right
lo = mid + 1
} else {
// descending or equal — peak is at mid or to the left
hi = mid
}
}
return lo // lo == hi at termination
}[lo, hi]. At mid, the
slope tells you which half is guaranteed to contain a peak; the other
half is discarded. When lo == hi, the range has shrunk to
a single index — which must be the peak. The key insight is that the
invariant is preserved by both branches: ascending → peak in
(mid, hi]; descending → peak in [lo, mid].
Equality between neighbours doesn't happen (the problem guarantees
strict inequality).A rotated-array intuition diagram
Why "compare nums[mid] with nums[hi]" works
for rotated minimum:
8c · How to solve hard binary-search problems, step-by-step
- Classify the shape. Sorted (Shape A)? Sorted-but-rotated (Shape C)? Answer-space (Shape B)? 2D matrix (Shape D)? Unimodal/peak (Shape E)? Get this wrong and every subsequent decision is wrong. The right shape determines the loop invariant and the bounds.
- Identify the monotone predicate or the sorting axis. For Shape A: which axis is sorted? For Shape B: what's
can(x), and is it monotone? Write the predicate as a one-liner and convince yourself the boolean array looks likeF F F ... F T T T ... T(or all-true-then-all-false). If it'sF T F T, binary search is the wrong tool. - Set the bounds
[lo, hi]. For Shape A:[0, n-1]if you're using closed-closed;[0, n]if half-open. For Shape B:[min possible answer, max possible answer]. Bound mistakes here usually show up as "off by one at the boundary" or "predicate evaluated outside its defined range". - Pick the loop invariant. Two choices: (a)
lo <= hiwith closed-closed bounds andmid - 1/mid + 1updates (Template A — for equality search). (b)lo < hiwith half-open bounds andhi = mid/lo = mid + 1updates (Template B / C — for boundary search). Mix them and you get an infinite loop or an off-by-one. Pick one per problem and stick with it. - Write the loop, walk through n=1 and n=2 in your head. Single-element and two-element inputs catch most off-by-ones. If
lo < hiand the range has size 1, the loop should exit; if it doesn't, you have an infinite loop. If size 2, exactly one iteration should run.
mid
is included on the next iteration's range. Do those three in order and
the rest is bookkeeping. The hardest binary-search problems are hard
only because the shape is non-obvious (Shape D 2D matrix problems often
look like an O(mn) BFS problem; LC 1011 looks like simulation, not
binary search). Practice classifying.8d · Trace tables — see the loops execute step by step
Tracing on paper is the highest-use debugging skill for binary
search. Two traces follow — LC 153 on [4, 5, 6, 7, 0, 1, 2]
and LC 162 on [1, 2, 1, 3, 5, 6, 4].
LC 153 — rotated minimum trace.
| iter | lo | hi | mid | nums[mid] | nums[hi] | comparison | action |
|---|---|---|---|---|---|---|---|
| 1 | 0 | 6 | 3 | 7 | 2 | 7 > 2 | lo = 4 (min is right of mid) |
| 2 | 4 | 6 | 5 | 1 | 2 | 1 < 2 | hi = 5 (min at or left of mid) |
| 3 | 4 | 5 | 4 | 0 | 1 | 0 < 1 | hi = 4 (min at or left of mid) |
| 4 | 4 | 4 | — | — | — | lo == hi | exit; return nums[4] = 0 |
Three iterations on a length-7 input → log₂(7) ≈ 3 steps. The
invariant "the minimum is in [lo, hi]" is preserved on
both branches. The exit condition is lo == hi, which
can only happen if the single remaining cell is the minimum.
LC 162 — find peak element trace.
| iter | lo | hi | mid | nums[mid] | nums[mid+1] | direction | action |
|---|---|---|---|---|---|---|---|
| 1 | 0 | 6 | 3 | 3 | 5 | ascending | lo = 4 (peak in (mid, hi]) |
| 2 | 4 | 6 | 5 | 6 | 4 | descending | hi = 5 (peak at or left of mid) |
| 3 | 4 | 5 | 4 | 5 | 6 | ascending | lo = 5 (peak in (mid, hi]) |
| 4 | 5 | 5 | — | — | — | lo == hi | exit; return 5 (value 6, which is the peak) |
Three iterations again. The invariant "a peak exists in
[lo, hi]" is preserved by the slope-direction check.
Note: the array also has a peak at index 1 (value 2), but
the algorithm only needs to return any peak — index 5 is
equally valid.
8e · Why the loop terminates — invariant arithmetic
Every binary-search loop terminates because the range
hi - lo strictly decreases each iteration. The proof is
short but worth internalising:
- Half-open with
hi = mid: after the assignment,hi - loshrinks by(hi - mid) = (hi - lo)/2(or 1, whichever is larger). Always non-zero progress whenhi - lo >= 2; whenhi - lo == 1,mid == lo, andhibecomeslo— loop exits. - Half-open with
lo = mid + 1: shrinks by(mid + 1 - lo) = (hi - lo)/2 + 1. Always strictly positive. - Closed-closed with
lo = mid + 1orhi = mid - 1: each iteration removesmidfrom the range, shrinking by at least 1. Loop terminates whenlo > hi.
lo < hi with
lo = mid instead of lo = mid + 1. When
hi - lo == 1, mid == lo, and
lo = mid == lo — no progress, loop forever. (The
famous Joshua Bloch Arrays.binarySearch bug in Java,
fixed in JDK 6, was a different one: the integer overflow in
(low + high) / 2, corrected to
low + (high - low) / 2.)9 · Walkthrough — Median of Two Sorted Arrays (LC 4, HARD)
LC 4 is the hardest classic. Given two sorted arrays A and
B, find the median of their union. In O(log min(m,
n)). The naive merge-and-find-middle is O(m + n)
and easy; the logarithmic solution is the interview lever.
The partition idea
Imagine the merged-and-sorted union of A and B.
A median splits it into two halves: a left half of size
(m + n + 1) / 2 and a right half of the rest. We don't
need to actually merge. We just need to find a partition position
i in A and the corresponding
j in B such that
i + j = (m + n + 1) / 2 and the largest in the left halves
is ≤ the smallest in the right halves.
Binary-search on i in [0, m]; j
follows. The invariants to satisfy:
A[i−1] ≤ B[j] and B[j−1] ≤ A[i].
Go scaffold — partition-based median
import "math"
func findMedianSortedArrays(A, B []int) float64 {
// Binary-search on the SHORTER array — keeps complexity at O(log min)
if len(A) > len(B) {
A, B = B, A
}
m, n := len(A), len(B)
half := (m + n + 1) / 2
lo, hi := 0, m
for lo <= hi {
i := (lo + hi) / 2 // partition position in A
j := half - i // matching partition position in B
// GUARDS for the four boundary values
Aleft := math.MinInt
if i > 0 { Aleft = A[i-1] }
Aright := math.MaxInt
if i < m { Aright = A[i] }
Bleft := math.MinInt
if j > 0 { Bleft = B[j-1] }
Bright := math.MaxInt
if j < n { Bright = B[j] }
// CORRECT PARTITION CONDITION
// TODO: when do we know we've found the right partition?
// (Hint: every value on the left must be <= every value on the right)
if /* TODO: condition */ false {
// TODO: handle even vs odd total length to compute median
}
// OTHERWISE: which way to move i?
// TODO: if Aleft > Bright, we went too far right in A — narrow
// TODO: else, we went too far left in A — narrow other way
}
return 0 // unreachable for valid inputs
}Aleft ≤ Bright AND Bleft ≤ Aright.
Together they guarantee that everything in the left halves is ≤
everything in the right halves. So the median lives at the boundary.
Get one wrong and the binary search converges to garbage.O(m + n) version with confidence.
Either is a credible answer.9a · LC 4 — step-by-step partition derivation
LC 4 deserves an extra walkthrough because the partition trick is the kind of insight you either see immediately or chase for an hour. Here's the derivation in five short steps.
- Reframe the median as a partition. The median of a sorted array of length
nis the element at index(n-1)/2if odd, or the average of indicesn/2 - 1andn/2if even. Generalise: imagine cutting the merged array at positionk = (n+1)/2so the left half haskelements. The median ismax(left)if odd;(max(left) + min(right)) / 2if even. - Don't actually merge. A partition of the merged array can be expressed as a partition of A (at position
i) and a partition of B (at positionj = k - i). The four boundary values areA[i-1],A[i],B[j-1],B[j](using sentinels-∞/+∞for out-of-bounds). - State the correctness conditions. A valid partition has every left-half value ≤ every right-half value. Equivalently,
A[i-1] ≤ B[j]ANDB[j-1] ≤ A[i]. If both hold, the partition is valid; the median is computable from the four boundary values. - Binary search on
i. The bounds:i ∈ [0, m](m = len(A)). IfA[i-1] > B[j],iis too large — narrow right edge. IfB[j-1] > A[i],iis too small — narrow left edge. Otherwise, found the partition. Always binary-search on the shorter array to keep complexity atO(log min(m, n)). - Compute the median from the four boundary values. If
(m + n)is odd, the median ismax(A[i-1], B[j-1]). If even, it's(max(A[i-1], B[j-1]) + min(A[i], B[j])) / 2. The sentinel±∞handles the edges (i = 0,i = m, etc.) cleanly.
O(log min(m, n)) time and constant per-step work. Both
achieve the same complexity, but the partition version is what
interviewers expect because it generalises cleanly to "find the kth
element of the merger" — LC 4 is the special case k = (m+n)/2.10 · Floating-point binary search
The lo < hi termination rule fails on floats. They
don't form a strict total ordering with finite precision. Two strategies:
- Fixed iterations. Loop ~100 times for double precision (or ~50 for float). Each iteration halves the range; 100 iterations gives 2⁻¹⁰⁰ precision. Way past anything double-precision IEEE 754 represents.
- Tolerance threshold. Loop until
hi - lo < 1e-9(or whatever the problem demands). Risky if the function isn't well-behaved near the root. Can spin forever on plateaus.
Go skeleton — fixed-iteration float binary search
func sqrt(x float64) float64 {
lo, hi := 0.0, x
if hi < 1.0 {
hi = 1.0
}
for i := 0; i < 100; i++ { // 100 halvings -> 2^-100 precision
mid := (lo + hi) / 2
// TODO: if mid*mid <= x, the answer is in [mid, hi]
// TODO: else, the answer is in [lo, mid]
}
return lo
}LeetCode floating-point problems (LC 69 sqrt, LC 162 peak when relaxed to floats) all use the fixed-iteration pattern. Don't try to be clever with epsilons; 100 iterations is fast and bulletproof.
LeetCode floating-point problems (LC 69 sqrt, LC 162 peak, LC 153 minimum in rotated sorted with floats. Niche) all use the fixed-iteration pattern. Don't try to be clever with epsilons; 100 iterations is fast and bulletproof.
11 · Predicate-construction. The senior skill
The hardest part of binary-search-on-the-answer isn't the binary search it's writing the predicate. Three failure modes:
- Non-monotone predicate. If
can(k)can be true, false, then true again, binary search returns garbage. Re-frame: change what you're searching for, or use a different algorithm. - Predicate too expensive. If
can(k)is O(n²), the total is O(n² log range). Optimise the predicate first; binary search the answer can compose with greedy / DP / two-pointers. - Wrong polarity. "Minimise the maximum" vs "maximise the minimum" flip the search direction. Swap
hi = midandlo = mid + 1to switch sides.
| Problem family | Predicate template |
|---|---|
| "Smallest k that completes in time" | can(k) = (work_at(k) ≤ budget) |
| "Largest weight that fits" | can(w) = (combined ≤ capacity). Search for largest true |
| "Minimise max subarray sum, k splits" | can(s) = (greedy_split_count(s) ≤ k) |
| "Minimise max edge in path" | can(e) = reachable(s, t, only edges ≤ e) |
| "Find peak / unimodal" | can(i) = (a[i] > a[i+1]). Direction signal |
Three predicate-design techniques that come up repeatedly
Beyond the templates above, three concrete techniques recur in senior-interview Shape B problems. Each one is the difference between a predicate that works and one that's quietly O(n²).
- Greedy verification. For most "split / partition / pack" problems, the predicate is "can we achieve goal X with constraint Y?", and the verification is greedy: walk the input left-to-right, accumulating into chunks until the constraint forces a new chunk, count chunks at the end. LC 410 (Split Array Largest Sum) and LC 1011 (Capacity to Ship Packages) both use this. The greedy verification is O(n); the outer binary search is O(log range); total is O(n log range).
- Reachability verification (BFS / DFS). For path-flavoured problems like LC 1631 (Path with Minimum Effort), the predicate is "can we get from start to goal using only edges of cost ≤ E?". Verification is BFS or DFS restricted to those edges — O(V + E) per predicate call. Total: O((V + E) log range). Cheaper than running full Dijkstra in many cases.
- Counting verification (sorted-input two-pointers). For problems like LC 1283 (Smallest Divisor Given a Threshold), the predicate is "does dividing by k give a sum ≤ threshold?". Verification is a single linear pass over the input. When the input is sorted, a two-pointer or running-sum technique can verify in O(n) without re-sorting per binary-search step.
P of one predicate
evaluation? What's the binary-search range size R?
Total complexity is O(P · log R). If that exceeds the
problem's allowed budget (e.g. n · log n for
n ≤ 10⁵), you need to either shrink the range or
optimise the predicate before writing any code. Reaching this
analysis aloud during an interview is a clear senior signal.12 · Common pitfalls
- Integer overflow on
(lo + hi) / 2. Ifloandhiare near INT_MAX, the sum overflows. Uselo + (hi - lo) / 2. The classic Java bug. See Joshua Bloch's "Extra, extra — read all about it: Nearly all binary searches and mergesorts are broken" (2006). - Inclusive vs exclusive
hi. Template A useshi = n - 1with<=; Template B useshi = nwith<. Mixing them is half of all binary-search bugs. - Infinite loop. If you forget
lo = mid + 1(and instead dolo = mid) and the condition islo < hi, you'll loop forever whenloandhiare adjacent. - Predicate isn't monotone. Binary search on the answer requires the predicate to be monotone. If it isn't, you're searching in the wrong space. Re-frame the problem.
- Off-by-one when shrinking. "Smallest matching" uses
hi = mid(mid could be the answer). "Largest matching" useslo = mid + 1with a different return. Easy to flip. - Wrong range bounds. The
[lo, hi]range must contain the answer. Start withhi = max(possible), not arbitrary. Common bug:hi = len(nums)when the answer can be a value, not an index. - Floating-point binary search. Don't use
lo < hi. It'll never converge. Loop a fixed number of iterations (~100 for double precision) instead. - Rotated-array bug. Decide which half is sorted by comparing
nums[lo]withnums[mid]— notnums[mid]withnums[hi]. Both work but mixing them confuses corner cases. - Integer overflow on mid, the deeper version. Even
lo + (hi - lo) / 2overflows if you're searching the answer space andhican exceed2^62(e.g. binary search over a value range derived from a product of two large inputs). Solutions: use a wider type (Go'sintis 64-bit on most platforms, but Python ints are unbounded), or useuint64arithmetic with a sentinel. Worth thinking about when constraints saynums[i] <= 10^18. - Infinite loop from
lo < hioff-by-one. Whenhi - lo == 1,mid = lo. If your "go right" branch islo = mid, you make zero progress — the loop spins. The fix islo = mid + 1when shrinking to the right half. Symmetric: with closed-closed bounds andlo <= hi, you needhi = mid - 1on shrink-left, or you'll loop whenlo == hi. - Predicate evaluated at a boundary that's out of the defined range. If your predicate
can(x)assumesx >= 1but your initiallois 0 (a common mistake when copying the LC 875 template), it crashes with division-by-zero or a domain error. Always state the predicate's domain alongside the predicate itself; align the bounds to it. - Predicate boundary mishandling — the "exactly at the threshold" case. When
can(mid)is exactly on the boundary between feasible and not, the wrong branch picks the wrong half. Be explicit: for "smallest feasible", treatcan(mid) == trueas "mid is a candidate, search left for something smaller" →hi = mid. For "largest feasible", treat true as "mid works, search right" →lo = mid(and bias mid up withmid = lo + (hi-lo+1)/2to avoid infinite loop). Get the bias direction wrong and the loop never terminates. - Returning the wrong endpoint. At loop exit,
lo == hi(Template B/C) orlo > hi(Template A). For Template B/C,lois the answer; for "largest feasible" variants,himay be. Reread your invariant comment before deciding what to return — easy to confuse them.
13 · Complexity
| Variant | Time | Space | Notes |
|---|---|---|---|
| Search sorted array | O(log n) | O(1) | The baseline. |
| Search rotated sorted | O(log n) | O(1) | Same complexity, more cases. |
| Lower_bound / upper_bound | O(log n) | O(1) | Building blocks for "count occurrences" etc. |
Binary search on answer (predicate cost P) | O(log range × P) | Range is the answer space, often O(max input). | |
| 2D matrix sorted both ways | O(log(m × n)) | O(1) | Treat as flattened sorted array. |
| Median of two sorted arrays | O(log min(m, n)) | O(1) | Binary search on the partition. |
14 · Variants & related patterns
- Two pointers. Same precondition (sorted), different pattern. Use two pointers when the answer involves pairs; binary search when it's a single position.
- Exponential search. When the input is unbounded — find the right
hiby doubling, then binary search. Used for searching in streams. - Fractional cascading. Binary search across multiple sorted arrays in O(log n + k). Niche but elegant.
- Ternary search. For unimodal functions (one peak). Three-way partition; search the half without the peak.
- Adjacent: binary-search simulator. The site's interactive tool. Use it as the visual companion.
- Adjacent: DP. Some "minimise the max" problems can also be solved by DP. Binary search is usually faster but DP is the fallback when the predicate isn't easy.
15 · Where this gets asked
| Company | Common framing | Why it fits their stack |
|---|---|---|
| Search in Rotated Sorted Array (LC 33), Median of Two Sorted Arrays (LC 4), Split Array Largest Sum (LC 410) | Predicate-on-the-answer is a Google L5 signature. Median-of-two-arrays especially asked for hard onsites. | |
| Meta | Search Insert Position (LC 35), Find First and Last Position (LC 34), Capacity to Ship Packages (LC 1011) | Bounds-finding shows up in feed-ranking + slot-allocation. Ship Packages is a recurring Meta onsite. |
| Amazon | Koko Eating Bananas (LC 875), Capacity to Ship Packages Within D Days (LC 1011), Find Peak Element (LC 162) | "Minimise the maximum" frames as operational throughput at AWS. Koko is the textbook example. |
| Microsoft | Sqrt(x) (LC 69), Find Minimum in Rotated Sorted Array (LC 153), Search a 2D Matrix (LC 74) | Bounds-checking + rotated-array logic. Calibrated to test edge-case discipline. |
| Apple | First Bad Version (LC 278), Valid Perfect Square (LC 367) | API-bounded binary search. When you can only query, not iterate. Common Apple onsite pattern. |
| Uber / Lyft / Stripe | Allocate Books / Minimum capacity allocation, Time-based key-value store (LC 981) | Resource-allocation by binary-searching the answer is a fleet/dispatch staple. Time-based KV mirrors event-log queries. |
16 · Try it yourself
- Binary Search · LC 704
- The minimal template. Get this in your fingers exactly. Hint:
lo, hi = 0, len(nums) - 1; whilelo <= hi;mid = lo + (hi - lo) // 2; compare, narrow. Write it from memory until you can do it in 60 seconds. - Find First and Last Position · LC 34
- The bounds-finding drill. Hint: write two helpers.
lower_bound(leftmost index where target could go) andupper_bound(rightmost). Uselo < hiwithhi = midon match for lower;lo = mid + 1on no-match. Always converges. - Search in Rotated Sorted Array · LC 33
- Rotated-array logic. Hint: at each step, one HALF is guaranteed sorted. Determine which half (compare
nums[lo]tonums[mid]), check if target is in the sorted half's range, narrow accordingly. - Koko Eating Bananas · LC 875
- Binary-search-on-the-answer. Hint: the answer (eating speed) lies between 1 and
max(piles). Predicate: at speedk, total hours = sum ofceil(pile_i / k); feasible iff ≤h. Binary-search for the minimum feasiblek. - Stretch. Median of Two Sorted Arrays · LC 4
- The senior-loop hard. Hint: binary-search the partition of the SHORTER array. For a partition
ion the short array, the corresponding partition on the longer is(total_len + 1) / 2 - i. Valid iffmax(left) ≤ min(right)across both. O(log min(m, n)).
Arrays.binarySearch. Pick ONE template and stick with it. Mine is lo <= hi with explicit mid + 1/mid - 1 updates. Drill it on LC 704 until you can write it in your sleep, then specialise for boundary-finding and answer-space variants.Further reading
- Joshua Bloch. "Nearly all binary searches and mergesorts are broken" (Google Research, 2006). The original write-up of the (lo + hi) / 2 overflow bug. Required reading.
- CLRS. Introduction to Algorithms, Chapter 12. Binary search in trees; the same logic, applied to balanced BSTs.
- Sedgewick & Wayne. Algorithms (4th ed). Section 1.1 has the classic walkthrough of the algorithm.
- Codeforces blog. "Binary search on the answer". The cleanest free explanation of the answer-space technique. Search "binary search on answer Codeforces".
- Bentley. Programming Pearls, Column 4. An entire essay on the difficulty of writing binary search correctly. Required reading.
- Adjacent: sliding window. The other "monotonic structure" pattern.
- Adjacent: how binary search works. The site's mechanism explainer.