Hedged requests
Send a duplicate request to a second server if the first is slow; take whichever returns first.
Origin
Dean & Barroso, "The Tail at Scale" (CACM 2013). Send a duplicate at p95; take whichever returns first; cancel the loser. Cuts p99 dramatically.
Where it shows up in production
- Google web search In-paper benchmarks: hedged requests cut tail latency by 30-50% at a few percent extra cost.
- gRPC Native support since 2018 — set retryPolicy.hedgingPolicy in the service config.
On Semicolony
Sources & further reading
Found this useful?