Performance

Hedged requests

Send a duplicate request to a second server if the first is slow; take whichever returns first.


In plain terms

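Introduced by Dean & Barroso in 2013. Hedging cuts tail latency at modest extra cost. In the tied-request variant, both copies are sent up front and whichever server starts working first cancels the other.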

Origin

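Dean & Barroso, "The Tail at Scale" (CACM 2013). Wait until the first request has been outstanding longer than the p95 latency, then send a duplicate to another replica; take whichever returns first and cancel the loser. Because only about 5% of requests are ever duplicated, this cuts p99 dramatically for a small increase in load.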
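
The mechanism can be sketched in a few lines of Python asyncio. This is a minimal illustration rather than the paper's implementation: hedged_call, p95, and the idea of passing replicas as zero-argument async callables are all assumptions made for the example, and error handling is left out (an exception from the first finisher simply propagates).

```python
import asyncio
import statistics
from typing import Awaitable, Callable, Sequence, TypeVar

T = TypeVar("T")


def p95(latencies: Sequence[float]) -> float:
    """95th percentile of observed latencies, usable as the hedge delay."""
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return statistics.quantiles(latencies, n=20)[-1]


async def hedged_call(
    replicas: Sequence[Callable[[], Awaitable[T]]],
    hedge_delay: float,
) -> T:
    """Call replicas[0]; if it is still pending after hedge_delay seconds,
    also call replicas[1]. Return the first result and cancel the loser."""
    primary = asyncio.ensure_future(replicas[0]())

    # Give the primary a head start of hedge_delay seconds.
    done, _ = await asyncio.wait({primary}, timeout=hedge_delay)
    if done:
        return primary.result()  # fast path: no duplicate was sent

    # Primary is slow: send the duplicate (the hedge) to a second replica.
    backup = asyncio.ensure_future(replicas[1]())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # cancel the loser so it stops consuming resources
    return done.pop().result()
```

In use, hedge_delay would typically come from recent measurements, for example hedge_delay=p95(recent_latencies), so that only the slowest few percent of calls ever trigger a duplicate.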

Where it shows up in production
  • Google web search: in-paper benchmarks show hedged requests cutting tail latency by 30-50% at a few percent extra cost.
  • gRPC: native support since 2018. Set hedgingPolicy in a methodConfig entry of the service config; it is an alternative to retryPolicy, not a field nested inside it.
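
As a rough illustration of the gRPC option, the snippet below builds a service config containing a hedging policy and serializes it to JSON. The field names (methodConfig, hedgingPolicy, maxAttempts, hedgingDelay, nonFatalStatusCodes) follow gRPC's service config schema; the service name search.SearchService and the specific values are placeholders chosen for the example. Whether a given gRPC implementation honours hedgingPolicy is worth checking before relying on it.

```python
import json

# "search.SearchService" is a hypothetical service name used for illustration.
service_config = {
    "methodConfig": [
        {
            "name": [{"service": "search.SearchService"}],
            # hedgingPolicy sits directly in the method config; a method
            # config may carry either retryPolicy or hedgingPolicy, not both.
            "hedgingPolicy": {
                "maxAttempts": 2,         # the original call plus one hedge
                "hedgingDelay": "0.05s",  # wait this long before hedging
                "nonFatalStatusCodes": ["UNAVAILABLE"],
            },
        }
    ]
}

# JSON string form, which is how a service config is usually supplied.
service_config_json = json.dumps(service_config, indent=2)
```

A hedgingDelay close to the method's observed p95 latency mirrors the deferral described in the paper; a delay of "0s" would send all attempts immediately.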