11 / 20 · Tier C
Patterns / 11

Monotonic stack & deque

Maintain a stack or deque whose contents are always in monotone order. Pop while the invariant breaks; push the new element. The pattern with the funny name. Once you recognise it, the implementation is short, and the amortised O(n) argument is the same every time: each element is pushed and popped at most once.


1 · Why monotonic stack matters: the intuition

Most "for each element, find the nearest one that satisfies P" problems look quadratic on first read. The naive loop nests one scan inside another and pays O(n²). Monotonic stack collapses that to O(n) by exploiting a single observation: once an element has been "shadowed" by a later element that beats it on the property we care about, it can never be the answer for anything further to the right. We're allowed to throw it away.

The stack stores "candidates that might still answer a future query". The monotone invariant (strictly increasing or strictly decreasing from bottom to top) is exactly the condition under which no candidate is dominated by any other candidate currently on the stack. When a new element arrives that invalidates some prefix of the candidates, we pop them off (each one finds its answer at pop time) and push the new arrival. Every element is pushed once, popped at most once. Total work: O(n) amortised, regardless of how adversarial the input looks.

The intuition in one sentence. The stack is a running set of "candidates for next-greater (or smaller) lookup", and the monotone invariant guarantees the candidate set is exactly the relevant ones. Every dominated candidate has already been evicted. Amortised O(1) per push / pop because each element transits the stack exactly once.

2 · How to recognise it: six distinct shapes

Monotonic stack tends to be the right tool when any of the following hold. The first two are the strongest signals; the rest support. Naming the shape early decides the invariant direction.

Shape A: next greater / smaller / previous greater / smaller

  • For each index i, find the nearest j with a value relationship. The defining shape. Direction (next vs previous) decides which way you scan; greater vs smaller decides the monotone direction.
  • Output shape: an array of indices (or distances, or values) the same length as the input. Default to -1 or 0 for indices with no answer: the elements that remain on the stack at the end.
  • Examples: LC 496 Next Greater Element I, LC 739 Daily Temperatures, LC 503 Next Greater Element II (circular), LC 1019 Next Greater Node in Linked List.

Shape B: histogram / largest rectangle

  • Maximise rectangle area in a bar chart. The width is bounded by the nearest shorter bar on each side; a monotonically increasing stack of indices gives you both boundaries on the same pass.
  • The 2D extension: build a row-by-row histogram (consecutive 1s above each cell) and run the 1D histogram algorithm per row. O(rows × cols) total.
  • Examples: LC 84 Largest Rectangle in Histogram, LC 85 Maximal Rectangle, LC 1504 Count Submatrices With All Ones (binary-matrix histogram variant).

Shape C: trapping rain water

  • Water trapped between two taller walls. Each "puddle" is bounded by a taller bar on the left and a taller bar on the right; the trapped depth is min(left_max, right_max) - h[i].
  • Two solutions: two-pointers (O(n) time, O(1) space) or monotonic decreasing stack. Pop a bar when a taller one arrives, compute the bounded water in between. Both are O(n); the stack version reuses the width-at-pop math from histogram.
  • Examples: LC 42 Trapping Rain Water, LC 407 Trapping Rain Water II (3D, uses a priority queue instead).

Shape D: stock span / streaming previous-greater

  • Online (streaming) variant. Elements arrive one at a time; for each, return the number of consecutive previous elements that are ≤ the new arrival.
  • Stack of (value, span) pairs. On a new value, pop all entries with smaller values, summing their spans into the new entry's span; push.
  • Examples: LC 901 Online Stock Span, LC 962 Maximum Width Ramp (offline variant).

Shape E: sliding window max / min (deque variant)

  • Fixed-size window slides one step at a time. You need the max (or min) inside the window per step in amortised O(1).
  • The deque variant. Push to back, pop from back while smaller (invariant), pop from front when the front index falls out of the window.
  • Examples: LC 239 Sliding Window Maximum, LC 1438 Longest Continuous Subarray With Absolute Diff (two deques: one for max, one for min).

Shape F: sum of subarray minimums (contribution counting)

  • "Sum over all subarrays of the min element". The trick is to count contribution per element: how many subarrays have arr[i] as their min? That count is (i - prev_smaller(i)) × (next_smaller(i) - i).
  • Why monotonic stack: the two boundaries (previous smaller, next smaller) come from monotonic stack passes. Total contribution per element is value × left-count × right-count.
  • Examples: LC 907 Sum of Subarray Minimums, LC 2104 Sum of Subarray Ranges, LC 828 Count Unique Characters of All Substrings.
Stack vs deque in one line. Stack handles "next / previous greater / smaller" and "histogram / sum-of-mins": questions where you only ever pop from one end. Deque handles sliding window max / min, where you need to evict from the front as the window advances. Both keep contents monotone; the difference is which end you pop from.

3 · Sister algorithms

Monotonic stack sits in a family of "linear-time order statistic" tools. Each one extends the basic idea in a different direction.

  • Monotonic deque. The two-ended cousin used for sliding window max / min. Same invariant, two pop sites. Beats a heap (O(n log k)) for fixed-size windows because each index is pushed once and popped at most twice (once from each end), giving O(n) amortised.
  • Cartesian tree. A binary tree built from an array such that an in-order traversal recovers the array and the root is the minimum (or maximum). Constructible in O(n) using a monotonic stack: scan left-to-right, popping while the top exceeds the new value, then attaching. The Cartesian tree's structure encodes the answer to "for each index, who's the next smaller on each side": exactly what monotonic stack computes per element.
  • Sum-of-all-subarray-min / max via contribution counting. The pattern from Shape F: compute previous-smaller and next-smaller for every index (two monotonic stack passes), then contribution = value × left-span × right-span. The same machinery solves LC 2104 (Sum of Subarray Ranges) as max-contribution minus min-contribution.
  • Sparse table for static RMQ. When the array doesn't change and you have many "range min" queries, preprocess in O(n log n) and answer each query in O(1) by overlapping two power-of-two intervals. Monotonic stack is O(n) per query in the worst case; sparse table dominates for >O(log n) queries on the same array. For RMQ with point updates, segment tree is the answer instead.
  • The "next-greater plus geometry" combo. Some convex-hull algorithms (Graham scan, Andrew's monotone chain) maintain a stack with a monotone cross-product invariant, popping while the top three points form a non-left turn. Same amortised argument: each point pushed once, popped at most once.
The composition rule. If the problem needs "next-greater plus a range query", precompute next-greater with monotonic stack, then use the precomputed array with a sparse table or segment tree. If it needs "next-greater on a streaming input", the stack-of-(value, span) form (Shape D) is the answer. It collapses repeated work into the span field. Convex hull and Graham scan are the geometric cousins that share the monotone-pop machinery.

4 · Canonical example: Daily Temperatures (LC 739)

Problem. Given an array temperatures of daily temperatures, return an array answer such that answer[i] is the number of days you have to wait after day i to get a warmer temperature. If no such day exists, put 0.

Input:  [73, 74, 75, 71, 69, 72, 76, 73]
Output: [ 1,  1,  4,  2,  1,  1,  0,  0]
Explanation: day 0 (73°) → next warmer is day 1 (74°), wait 1 day.
             day 2 (75°) → next warmer is day 6 (76°), wait 4 days.
             days 6 and 7 → no warmer day after, 0.

Walk left to right with a stack of indices whose temperatures are monotonically decreasing from bottom to top. When today's temperature beats the temperature at the top of the stack, today is the answer for that index. Pop it and record the distance. Keep popping while the invariant breaks, then push today. Each index is pushed once and popped at most once, so the whole pass is O(n).

4a · Walk the stack, frame by frame

Daily Temperatures on the input [73, 74, 75, 71, 69, 72, 76, 73]. The stack holds indices; the value next to each index is temps[i] for reference. Bold red entries are about to be popped because today's temperature beat them.

i=0 t=73i=1 t=74i=2 t=75i=3 t=71i=4 t=69i=5 t=72i=6 t=76i=7 t=73073174275275371275371469275572676676773push 0pop 0,ans[0]=1pop 1,ans[1]=1push 3push 4pop 4,3ans[4]=1ans[3]=2pop 5,2ans[5]=1ans[2]=4push 7answer →[ 1, 1, 4, 2, 1, 1, 0, 0 ]two unpopped indices (6 & 7) stay 0 — no warmer day after them.

Read across left-to-right. Each frame is the stack after processing index i. Notice that indices 6 and 7 never get popped. They're the trailing pair with no warmer future day, so their answer stays at the initial 0.

5 · Reference implementation

def daily_temperatures(temps: list[int]) -> list[int]:
    n = len(temps)
    answer = [0] * n
    stack = []   # indices; temps[stack] is monotonically decreasing top-down

    for i, t in enumerate(temps):
        # Invariant: every index left in the stack has a temperature
        # strictly greater than every later index above it.
        while stack and temps[stack[-1]] < t:
            j = stack.pop()
            answer[j] = i - j        # distance from j to its next-warmer day
        stack.append(i)

    # Anything still on the stack has no warmer day after it; answer stays 0.
    return answer
The amortised argument. The while inside the for looks like it could be O(n²), but each index is pushed exactly once and popped at most once across the entire run. Total work is bounded by 2n stack operations, so the algorithm is O(n) time and O(n) space. State this out loud; it's the part interviewers want to hear.

6 · Variations

The direction you scan and the direction of monotonicity change by question. Six common variations:

VariationWhat changesExample
Next greater elementScan left to right, stack monotonically decreasing by value. Pop when current beats top.LC 496 Next Greater Element I
Previous greater elementScan left to right, stack monotonically decreasing. The element below the top after pushing is the previous greater.LC 901 Online Stock Span
Next smaller elementSame shape, stack monotonically increasing by value. Pop when current is below top.LC 84 (used as a subroutine)
Sliding window maximumMonotonic deque, not stack. Push to the back, pop from the back while smaller, pop from the front when the index falls out of the window.LC 239 Sliding Window Maximum
Largest rectangle in histogramMonotonic increasing stack of indices. Append a sentinel 0 at the end so everything flushes.LC 84 Largest Rectangle in Histogram
Stock spanPrevious greater-or-equal; the span at index i is i - prev_greater(i). Online variant keeps a stack of (price, span) pairs.LC 901 Online Stock Span

7 · Five-problem progression

#LCProblemWhy it's here
1LC 739Daily TemperaturesThe minimal monotonic stack. Indices on the stack, decreasing temperatures, distance on pop.
2LC 496Next Greater Element ISame shape with a hash-map lookup on top. Locks in the "stack of indices, monotone by value" mental model.
3LC 84Largest Rectangle in HistogramThe hard one. Increasing stack plus sentinel; width is computed from indices at pop time. Read this twice.
4LC 239Sliding Window MaximumWhere the deque variant earns its keep. Pop from both ends; one for the invariant, one for the window.
5LC 901Online Stock SpanOnline / streaming version. Stack of (value, span) pairs collapses repeated work.

7a · The deque variant: step it yourself

Sliding window maximum (LC 239) is the same invariant with a second pop site. The deque holds indices whose values are decreasing from front to back; the front is always the current window's max. Push to the back, pop from the back while the new value beats the tail, pop from the front when the index falls out of the window. Step through the animator below; edit the array or the window size k to see the deque reorganise itself.

Interactive · Monotonic deque · sliding window max deque holds indices, values strictly decreasing front→back
array
0
4
1
1
2
7
3
3
4
5
5
9
6
2
7
6
8
8
deque (front → back)
i=0 4
r=0, value=4 · push idx 0
emitted maxes
(none yet — window not full)
step 1 / 9
Why O(n). Each index is pushed exactly once and popped at most once — 2n operations total across the whole walk. Inside the inner pop-back loop you might pop several elements at once, but the amortised work per outer step is O(1). The front of the deque is always the index of the current window's maximum, so reading it is O(1).
Why the deque, not a heap? A max-heap also solves sliding window max — at O(n log k). The monotonic deque is O(n) amortised because each index is pushed once and popped at most twice (once from each end). On a 100,000-element window, the heap costs ~10× more operations. The interview answer for LC 239 is the deque; the heap is acceptable but earns a follow-up "can you do better?".

7b · Reference in three languages

Same Daily Temperatures algorithm in Python, Go, and TypeScript. The shape of the code is identical; the array + stack primitives are the bits that vary. Worth knowing the Go version specifically because interview rooms that use Go tend to ask specifically for the slice-based stack idiom.

# Python — list-as-stack, .append() and .pop()
def daily_temperatures(temps: list[int]) -> list[int]:
    n = len(temps)
    answer = [0] * n
    stack: list[int] = []
    for i, t in enumerate(temps):
        while stack and temps[stack[-1]] < t:
            j = stack.pop()
            answer[j] = i - j
        stack.append(i)
    return answer
// Go — slice-as-stack, idiomatic [:len-1] for pop
func dailyTemperatures(temps []int) []int {
    n := len(temps)
    answer := make([]int, n)
    stack := make([]int, 0, n) // preallocate cap; amortised O(1) append

    for i, t := range temps {
        for len(stack) > 0 && temps[stack[len(stack)-1]] < t {
            j := stack[len(stack)-1]
            stack = stack[:len(stack)-1]
            answer[j] = i - j
        }
        stack = append(stack, i)
    }
    return answer
}
// TypeScript — Array.prototype.push / .pop, identical shape
function dailyTemperatures(temps: number[]): number[] {
    const n = temps.length;
    const answer = new Array(n).fill(0);
    const stack: number[] = [];
    for (let i = 0; i < n; i++) {
        while (stack.length && temps[stack[stack.length - 1]] < temps[i]) {
            const j = stack.pop()!;
            answer[j] = i - j;
        }
        stack.push(i);
    }
    return answer;
}
The one Go-specific tip. Preallocate the stack with make([]int, 0, n). Without it, every append past the small initial capacity triggers a slice grow + copy, turning the amortised O(1) into O(log n) per push. Visible on inputs over ~1000 elements in benchmarks.

7c · Worked example: Largest Rectangle in Histogram

LC 84 is the hard canonical. Heights [2, 1, 5, 6, 2, 3]. The stack holds indices whose heights are monotonically increasing. When a shorter bar arrives, we pop and compute the maximum rectangle that used the popped bar as its shortest height. The width is bounded by the new shorter bar on the right and the index below the popped one on the stack on the left.

heights:  [2, 1, 5, 6, 2, 3]
indices:   0  1  2  3  4  5

i=0 (h=2): stack [0]                         — push, 2 is the first bar
i=1 (h=1): 1 < heights[0]=2 → pop 0
             width = i = 1, area = 2*1 = 2
           stack [1]                         — push, h=1 is the new floor
i=2 (h=5): push, stack [1, 2]
i=3 (h=6): push, stack [1, 2, 3]
i=4 (h=2): 2 < 6 → pop 3
             width = i - stack.top() - 1 = 4 - 2 - 1 = 1
             area = 6 * 1 = 6
           2 < 5 → pop 2
             width = 4 - 1 - 1 = 2, area = 5 * 2 = 10 ← MAX
           stack [1, 4]                       — push h=2
i=5 (h=3): push, stack [1, 4, 5]

End: flush remaining (treat sentinel as h=0)
   pop 5: width = 6 - 4 - 1 = 1, area = 3
   pop 4: width = 6 - 1 - 1 = 4, area = 2*4 = 8
   pop 1: width = 6, area = 1*6 = 6

Maximum: 10
The width formula. When you pop index j: the right boundary is i (the bar that triggered the pop, or n at end of array). The left boundary is the index now on top of the stack (or -1 if the stack is empty). Width = right - left - 1. That subtraction is the bug everyone makes the first time.

A sentinel trick saves the end-of-array flush: append a 0 bar to the input. Every remaining index will pop because 0 is smaller than any real height. Same final answer, no duplicated post-loop block. Worth using consistently; it's the kind of small tidiness interviewers notice.

7d · Wrong solutions, and why they look right

The monotonic-stack family attracts plausible-looking wrong answers because the brute-force is short and reads correctly to a fresh eye. The four most common dead-ends, ranked by how often they show up in real interview transcripts:

Attempted solutionWhy it looks rightWhy it isn't
Nested for loops scanning forward One-liner per index; obviously correct. O(n²). Fails on n=10⁵. Acceptable as a warmup before stating the better idea, not as a final answer.
Sort the array, look up positions Feels like it should work since we want "next bigger". Sorting loses positional information. "Next greater after i" depends on order in the original array, not the sorted order.
Max-heap holding (value, index) and pop while top.index < i Heap gives O(log n) per insert; feels efficient. O(n log n) at best, and the indexing logic is tricky. Beaten by the deque-based O(n) for sliding window max, and unnecessary for next-greater (the stack already gives O(n) amortised).
Pushing values onto the stack instead of indices Cleaner-looking code; the value is "what matters". Most monotonic-stack questions need a distance or a width. You can't recover those from values alone. Store indices and look up the value via arr[i].
The interview tell. When you state "this looks O(n²) but each element is pushed and popped at most once, so the total is O(n) amortised", you're saying the magic phrase. Without it, even a correct implementation gets graded as "got the answer but didn't justify the complexity": the same outcome as not solving it for the strict graders.

7e · Where this shows up in production

Monotonic stack + deque are quietly load-bearing in several high-throughput systems. Recognising the pattern in the wild makes the interview question feel less abstract.

  • Bloomberg / Citadel, stock span. "How many consecutive previous bars had a lower close?" is monotonic stack in its purest form (LC 901). Bloomberg's tickerplant runs this on every symbol per tick; the per-symbol stack is bounded by ~30 entries in practice.
  • Time-series databases, max-by-window queries. InfluxDB's MAX() over a sliding window and Prometheus's max_over_time both reduce to monotonic deque for non-trivial windows. Without it, every query is O(window-size) per point; with it, O(1) amortised.
  • Image processing, morphological dilation. The 1-D sliding-window-maximum is a primitive in fast morphological filters. OpenCV's dilate() uses van Herk / Gil-Werman algorithm which is monotonic deque applied row-by-row.
  • Audio processing, peak detectors. "Recent peak value" on a buffer is the same problem. Real-time DSP libraries (JUCE, Pure Data externals) ship this as a primitive.
  • Network telemetry, burst detection. "Largest sustained rate in the last K seconds" reduces to maximum subarray with monotonic deque. Used by CDN edge nodes to detect DDoS pulse patterns.
  • Job schedulers, span-of-priority queries. Kubernetes scheduler uses span-style queries to decide which pending pod has been waiting longest relative to a priority threshold.

7f · Monotonic stack vs related patterns

The hard part of the family is recognising which sub-pattern applies. This table is the decision tree, condensed.

SymptomReach forWhy
"Next/previous greater/smaller element for each i"Monotonic stackDefining shape. O(n) amortised.
"Max/min in a sliding window of fixed size k"Monotonic dequeSame invariant, two pop sites. O(n) amortised.
"Max/min in a sliding window of variable size (sum ≤ target)"Sliding window + monotonic dequeWindow expands/contracts; deque tracks the extreme inside.
"Top-k largest in a stream"Min-heap of size kHeap, not stack: you need ordered access to ALL k, not just the extreme.
"Largest rectangle in histogram / max-area sub-matrix"Monotonic increasing stack + sentinelThe width-at-pop trick. O(n) per row; O(rows × cols) for 2-D.
"Subarray sum constraints (≤ k, exact k)"Prefix-sum + hash / two pointersDifferent shape. Monotonic stack is the wrong tool: sums aren't monotone like values are.
"Trapping rain water"Monotonic stack OR two pointersLC 42 has both solutions. The stack version uses the same width-at-pop math as histogram.
"Range minimum query (offline, no updates)"Sparse table or segment treeMonotonic stack is O(n) per query; a sparse table is O(1) after O(n log n) preprocessing.

8 · Worked problem: LC 503 Next Greater Element II (circular)

Problem. Given a circular integer array nums, return, for each element, the next greater element in the circular order (or -1 if none exists). Circular means the search wraps from the end back to the start.

Input:  nums = [1, 2, 1]
Output: [2, -1, 2]
Explanation:
  index 0 (1): next greater is 2 (at index 1) → 2
  index 1 (2): no greater anywhere; -1
  index 2 (1): wrap to index 0, find 2 (at index 1) → 2

The trick. Walk the array twice (modulo n). On the first pass we may not resolve every index; values near the end might find their answer at the start of the array. On the second pass we only need to pop remaining stack entries when a wrap-around value beats them; we do not push anything new (each index belongs to exactly one output slot).

Go scaffold: fill in the two passes

func nextGreaterElements(nums []int) []int {
    n := len(nums)
    ans := make([]int, n)
    for i := range ans {
        ans[i] = -1
    }
    stack := make([]int, 0, n) // indices, monotonically decreasing by value

    // Walk 2*n iterations, index = i % n
    for i := 0; i < 2*n; i++ {
        idx := i % n
        for len(stack) > 0 && nums[stack[len(stack)-1]] < nums[idx] {
            j := stack[len(stack)-1]
            stack = stack[:len(stack)-1]
            ans[j] = nums[idx]
        }
        // On the FIRST pass (i < n), push the index.
        // On the SECOND pass, only resolve; don't push again.
        if i < n {
            stack = append(stack, idx)
        }
    }
    return ans
}
// Time: O(n) — each index pushed once, popped at most once
// Space: O(n) for the stack + output
Why only one push. If we pushed on both passes, the stack could hold an index twice and we'd overwrite ans[j] with the wrong value (or fail to overwrite when expected). Pushing only on the first pass keeps the invariant "every stack entry corresponds to a still- unresolved output slot". The second pass is purely a resolver.

Alternative: duplicate the array

A cleaner-reading variant: concatenate nums with itself, run a normal monotonic stack pass over the doubled array, then take the first n answers. Same complexity, more memory, fewer index gymnastics. Use whichever your fingers remember.

9 · Worked problem: LC 84 Largest Rectangle in Histogram (the hard canonical)

Problem. Given an array heights of bar heights (uniform width 1), return the area of the largest rectangle that fits entirely inside the histogram. The rectangle can span multiple bars but its height is bounded by the shortest bar inside its span.

Input:  heights = [2, 1, 5, 6, 2, 3]
Output: 10
Explanation: the 5-and-6 bars together form a 2×5 = 10 rectangle.

The width-at-pop trick

Walk left-to-right with a stack of indices whose heights are monotonically increasing from bottom to top. When a bar arrives that is shorter than the top of the stack, pop the top and compute the rectangle that uses the popped bar as its limiting height. The rectangle's right boundary is the bar that triggered the pop (exclusive); the left boundary is the index now on top of the stack (exclusive); width = right - left - 1.

Go implementation: with the sentinel trick

func largestRectangleArea(heights []int) int {
    // SENTINEL — append a 0 bar so the final flush happens inside the main loop
    heights = append(heights, 0)
    n := len(heights)

    stack := make([]int, 0, n)
    best := 0

    for i := 0; i < n; i++ {
        for len(stack) > 0 && heights[stack[len(stack)-1]] > heights[i] {
            top := stack[len(stack)-1]
            stack = stack[:len(stack)-1]

            // Width: bounded by the new shorter bar (right) and the
            // index now on top of stack (left). If the stack is empty,
            // the popped bar extends all the way to the left edge.
            var leftBound int
            if len(stack) == 0 {
                leftBound = -1
            } else {
                leftBound = stack[len(stack)-1]
            }
            width := i - leftBound - 1
            area := heights[top] * width
            if area > best {
                best = area
            }
        }
        stack = append(stack, i)
    }
    return best
}
// Time: O(n) — each bar pushed once, popped once
// Space: O(n) for the stack

Stack state diagram for [2, 1, 5, 6, 2, 3, 0]

Why this is the hard canonical. The width-at-pop formula (right - left - 1) is the bug everyone makes the first time. The sentinel trick (append a 0) eliminates a duplicated end-of-loop block. LC 85 (Maximal Rectangle) is the 2D extension: for each row of a binary matrix, compute the histogram of consecutive 1s above, then run LC 84 on that row's histogram. Total O(rows × cols).

10 · Worked problem: LC 907 Sum of Subarray Minimums (contribution counting)

Problem. Given an array arr, return the sum over every contiguous subarray of min(subarray), modulo 10⁹ + 7.

Input:  arr = [3, 1, 2, 4]
Output: 17
Explanation: subarray mins are
  [3]→3, [1]→1, [2]→2, [4]→4,
  [3,1]→1, [1,2]→1, [2,4]→2,
  [3,1,2]→1, [1,2,4]→1,
  [3,1,2,4]→1
  Sum = 3+1+2+4+1+1+2+1+1+1 = 17

The reframe: count contributions, not subarrays

Iterating all O(n²) subarrays is too slow for n ≤ 30000. Reframe: instead of summing mins per subarray, ask "how many subarrays have arr[i] as their minimum?" Then contribution is arr[i] × count(i).

A subarray has arr[i] as its min when it includes i and contains no strictly smaller element. Let L(i) = distance to previous smaller (exclusive), R(i) = distance to next smaller (exclusive). Then the number of valid subarrays containing i is L(i) × R(i), and contribution is arr[i] × L(i) × R(i).

Why strict on one side and non-strict on the other. If the array has duplicates and you use strict-less on both sides, two equal values double-count the subarrays that contain both. Convention: strict on one side, non-strict on the other (e.g. previous smaller-or-equal, next strictly smaller). That picks each duplicate group's "minimum representative" exactly once. The contribution formula is the same; only the comparator changes.

Go scaffold: two monotonic stack passes

func sumSubarrayMins(arr []int) int {
    const MOD = 1_000_000_007
    n := len(arr)

    // L[i] = i - prev_smaller_or_equal(i), R[i] = next_smaller(i) - i
    L := make([]int, n)
    R := make([]int, n)

    // PASS 1 — previous smaller-or-equal
    stack := []int{}
    for i := 0; i < n; i++ {
        // pop while top is STRICTLY GREATER than arr[i]
        for len(stack) > 0 && arr[stack[len(stack)-1]] > arr[i] {
            stack = stack[:len(stack)-1]
        }
        if len(stack) == 0 {
            L[i] = i + 1            // no previous smaller; extends to start
        } else {
            L[i] = i - stack[len(stack)-1]
        }
        stack = append(stack, i)
    }

    // PASS 2 — next strictly smaller
    stack = stack[:0]
    for i := n - 1; i >= 0; i-- {
        for len(stack) > 0 && arr[stack[len(stack)-1]] >= arr[i] {
            stack = stack[:len(stack)-1]
        }
        if len(stack) == 0 {
            R[i] = n - i
        } else {
            R[i] = stack[len(stack)-1] - i
        }
        stack = append(stack, i)
    }

    // ACCUMULATE — contribution of each index
    total := 0
    for i := 0; i < n; i++ {
        total = (total + arr[i]*L[i]*R[i]) % MOD
    }
    return total
}
// Time: O(n) — two stack passes plus an accumulation
// Space: O(n)

Same machinery solves LC 2104 (Sum of Subarray Ranges): split into max-contribution and min-contribution and subtract. The contribution- counting trick generalises whenever an "aggregate over all subarrays" can be re-expressed as an "aggregate over per-element contributions".

10a · How to solve hard monotonic-stack problems: step by step

  1. Decide the stack's monotone direction. "Next greater" / "previous greater" → strictly decreasing stack (pop while smaller). "Next smaller" / "previous smaller" → strictly increasing stack (pop while bigger). Write this down before touching code; mixing it up produces a plausible algorithm that returns nonsense.
  2. Decide what's stored on the stack. Index (most common, gives you width / distance for free), value (only when distance doesn't matter), or pair (index + accumulated count for the stock-span / contribution shape). Default to index; switch to pair only when you need to collapse "the next smaller plus how many we walked past" into one O(1) lookup.
  3. Decide when to pop. "Incoming breaks the invariant": write the comparator out loud. Strict vs non-strict matters when duplicates exist (see Shape F discussion). For "next greater" with strict-greater desired, pop while top < current. For "next greater-or-equal", pop while top <= current.
  4. Decide what's recorded at pop time. Distance (current_index - popped_index), area (popped_height × width), the value itself (arr[current_index]), or a contribution accumulation. Most monotonic-stack problems do exactly one thing at pop time; if you need to do several, you may be conflating two passes.
  5. Decide how to handle the leftover stack. Indices that never get popped have no answer in the input: default to -1 / 0 / left unset. For histogram-style problems, append a sentinel (0) so the flush happens inside the main loop; for next-greater problems, leftover indices map to "no greater element exists".
  6. Sanity-check on small inputs. Run the algorithm by hand on n=1 (no pops), n=2 with both increasing and decreasing inputs, and an all-equal input. The all-equal case is where strict vs non-strict bugs surface; the increasing input is where "never pops" can mask a missing flush.
The mental shortcut. When stuck, restate the problem as "for each index, what's the nearest X on which side?". If you can fill in X (greater value, smaller value, etc.) and which side (left or right), the algorithm is a mechanical specialisation of the six steps above. If you can't phrase the problem that way, monotonic stack might not be the right tool — sliding window or two-pointers usually is.

11 · Common pitfalls

  • Pushing values instead of indices. Most of the questions in this family want a distance or a width, both of which need position. Push indices, look up the value with arr[stack[-1]].
  • Wrong direction of monotonicity. "Next greater" needs a decreasing stack; "next smaller" needs an increasing one. Flipping it gives a plausible-looking algorithm that returns nonsense. State the invariant in words before you write the comparator.
  • Forgetting the sentinel for histogram. Largest rectangle in histogram leaves indices on the stack at the end whose rectangles never get computed. Either run a flush loop after the main pass, or append a 0 bar that forces everything to pop.
  • Mistaking the deque pattern for the stack pattern. Sliding window max is not a monotonic stack — it's a monotonic deque, because you also evict from the front as the window slides. Same invariant, two pop sites.
  • Claiming the wrong complexity. The nested while looks like O(n²); the amortised O(n) argument is "each element is pushed and popped at most once". Without that sentence, the analysis isn't complete.
  • Strict vs non-strict comparisons. < vs <= changes which of equal-valued elements get popped. Usually doesn't change the answer, sometimes does; worth a moment of thought on duplicates.
  • Forgetting the sentinel, and the off-by-one on width. For histogram, the formula width = right - left - 1 assumes both bounds are exclusive. If you use the popped index as the left bound (instead of the index now on top of the stack), you'll silently undercount by 1 on every pop. When the stack is empty after a pop, the "left bound" is -1 (one past the imaginary bar at the array's left edge), not 0. Test on a strictly-increasing input like [1, 2, 3, 4] where the answer is 6 (the rectangle 4×1 + 3×2 + 2×3 = 6); getting that wrong means the left-edge sentinel is wrong.
  • Duplicate handling: the "which equal element wins" question. For LC 907 (Sum of Subarray Minimums), if you use strict-less on both sides, equal values double-count the subarrays containing both. The convention is strict on one side, non-strict on the other: pick previous strictly-smaller AND next smaller-or-equal, or vice versa. The same trap exists in LC 84 with duplicate heights: equal bars should NOT pop each other, or the width calculation gets confused.
  • Pushing before popping. The invariant must hold for the stack before the new element is pushed. Push first and you'll have a temporary violation; if the loop body reads the stack expecting the invariant, you'll get garbage. Sequence: pop while broken, THEN push.
  • Storing too little at pop time. If the problem needs both the popped index AND the index below it (e.g. histogram width), you must read the new top of stack after popping. Reversing the order (reading the new top before the pop) gives the wrong left bound.
  • Allocating the stack inside a hot loop. In Go, stack := []int{} at the top of the function is fine, but inside a per-row loop in LC 85 it's a fresh allocation every row. Hoist it out and reset with stack = stack[:0]: same code, ~3-5× faster on a 200-row grid.

12 · What to say in the interview

  1. "Next or previous greater or smaller → monotonic stack." Name the pattern out loud the moment you see the shape. Interviewers are listening for the recognition.
  2. State the invariant. "The stack holds indices whose values are monotonically decreasing from bottom to top." Saying it aloud catches direction bugs before you type them.
  3. "I'll push indices, not values." Mentions why: distance and width need positions.
  4. "Amortised O(n) — each index is pushed once and popped at most once." The whole reason this beats the obvious O(n²). Don't skip it.
  5. "Sliding window max is the deque variant." Shows you know stack and deque are the same idea with two pop sites instead of one.
  6. Edge cases to mention. Empty input, single element, all-equal values (strict vs non-strict comparator), strictly decreasing input (stack never pops until the end), the sentinel for histogram.

13 · Where this gets asked

CompanyCommon framingWhy it fits their stack
GoogleDaily Temperatures (LC 739), Largest Rectangle in Histogram (LC 84), Trapping Rain Water (LC 42)Indexing + temporal sequence problems.
MetaNext Greater Element I/II (LC 496/503), Remove K Digits (LC 402)Range queries on feed time-series.
AmazonSum of Subarray Minimums (LC 907), Online Stock Span (LC 901)Order-book / pricing tail aggregations.
MicrosoftMaximal Rectangle (LC 85 — histogram extension), Asteroid Collision (LC 735)Office grid layout + games engine teams.
BloombergStock Span (LC 901), Sliding Window Maximum (LC 239 — deque variant)Financial signal pipelines run these patterns constantly.
Uber / AirbnbDaily Temperatures variants for pricing-window questionsTime-bucket aggregation pipelines.
Pattern of patterns. Three sub-shapes: (1) next-greater (or next-smaller) element via decreasing/increasing stack, (2) largest rectangle in histogram (the canonical area-problem), (3) monotonic deque for sliding-window max/min. Master these three and most monotonic-stack questions are adaptations.

14 · Try it yourself

Daily Temperatures · LC 739
The canonical next-greater. Push indices onto a decreasing stack; pop and record the gap when a larger temperature arrives. Hint: store indices, not values — you need indices for the gap calculation.
Next Greater Element II · LC 503
Circular variant. Walk the array twice, modulo the length. Hint: only push on the first pass; on the second pass, only pop. This keeps the stack the right shape across the wraparound.
Largest Rectangle in Histogram · LC 84
The hard canonical. Stack of increasing heights; when a smaller bar arrives, pop and compute the area. Hint: append a 0 sentinel to flush the stack at the end — saves a duplicated post-loop block.
Stretch — Maximal Rectangle · LC 85
2D extension. For each row, compute the histogram of consecutive 1s above; run LC 84 on it. Hint: this is the most-asked "you've solved LC 84, now combine it with another idea" interview escalation. Worth recognising on sight.
How to practise. Three passes: scratch, aloud, memory. The hard part of monotonic-stack is not the code; it's recognising "I need next-greater or next-smaller" from the problem statement. Reading 5 problem statements and naming the trigger phrase out loud is half the work.

Further reading

  • CP-Algorithms, Stack with the minimum / maximum. The clearest short writeup of the invariant and the amortised argument. Worth a single read.
  • Competitive Programmer's Handbook (Laaksonen), Chapter 8. Sliding window minimum via deque, with the histogram problem one section over.
  • Adjacent: Two pointers. The other "amortised O(n) by walking once" pattern. Different shape, same complexity argument.
  • Adjacent: Sliding window. The sliding window max problem sits on the boundary between these two patterns; the deque is what makes it O(n).
  • Adjacent: Heap. The obvious-but-slower alternative for sliding window max — O(n log k) with a heap, O(n) with a monotonic deque.
  • Semicolony: Algorithm visualiser. Stepping through Daily Temperatures with the stack drawn out makes the invariant click faster than reading does.
Found this useful?