Performance

Hedged requests

Send a duplicate request to a second server if the first is slow; take whichever returns first.

In plain terms

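Most requests come back quickly, but a few stragglers are slow. Rather than wait on a straggler, send a backup copy to another replica after a short delay and use whichever answer arrives first. Described by Dean & Barroso (2013), the technique cuts tail latency at modest extra load. Tied requests are a variant: both copies are enqueued up front, and the server that starts executing first tells the other to cancel its copy.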

Origin

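Dean & Barroso, "The Tail at Scale" (CACM 2013). Send the duplicate only once the first request has been outstanding longer than the p95 latency for that class of request, which caps the extra load at roughly 5%; take whichever returns first and cancel the loser. This substantially shortens the latency tail (p99 and beyond).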
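The send, race, and cancel steps can be sketched with asyncio. This is a minimal illustration, not the paper's implementation: `hedged_call`, `fake_replica`, and the latencies are hypothetical names and values chosen for the demo, and a real client would derive `hedge_delay` from a measured p95 and handle errors from the winning attempt.

```python
import asyncio
from typing import Awaitable, Callable, Sequence, TypeVar

T = TypeVar("T")


async def hedged_call(
    backends: Sequence[Callable[[], Awaitable[T]]],
    hedge_delay: float,
) -> T:
    """Call backends[0]; if it is still running after hedge_delay seconds,
    also call backends[1]. Return the first result and cancel the loser."""
    primary = asyncio.ensure_future(backends[0]())
    # Give the primary a head start of hedge_delay seconds.
    done, _ = await asyncio.wait({primary}, timeout=hedge_delay)
    if done:
        return primary.result()

    # The primary is slow: send the hedge and race the two attempts.
    backup = asyncio.ensure_future(backends[1]())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    # Cancel the loser and wait for the cancellation to take effect.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    # If the first attempt to finish failed, its exception propagates here.
    return done.pop().result()


async def fake_replica(name: str, latency: float) -> str:
    """Stand-in for a network call (hypothetical): sleep, then return the name."""
    await asyncio.sleep(latency)
    return name


async def main() -> str:
    return await hedged_call(
        [
            lambda: fake_replica("slow-primary", 0.5),
            lambda: fake_replica("fast-backup", 0.01),
        ],
        hedge_delay=0.05,
    )


winner = asyncio.run(main())
print(winner)
```

Because the hedge fires only when the primary exceeds `hedge_delay`, setting that delay near the p95 latency means about one request in twenty sends a duplicate.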

Where it shows up in production

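Google: In the paper's Bigtable benchmark, a hedge sent after 10 ms cut 99.9th-percentile latency from 1,800 ms to 74 ms while sending about 2% more requests.
gRPC: Native support since 2018; set hedgingPolicy (an alternative to retryPolicy, not nested inside it) on a methodConfig entry in the service config.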
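As an illustration of the gRPC setting, the sketch below builds a service config with a hedging policy and prints it as JSON. In the gRPC service config schema, hedgingPolicy sits directly on a methodConfig entry, and an entry carries either retryPolicy or hedgingPolicy, not both. The service name `example.SearchService` and the specific values are assumptions made for the example. How the JSON is handed to a channel, and whether hedging is honored at all, depends on the gRPC implementation in use, so check the client library's documentation.

```python
import json

# Field names follow the gRPC service config schema; the service name and
# the numeric values are hypothetical.
service_config = {
    "methodConfig": [
        {
            # Apply this entry to every method of one service.
            "name": [{"service": "example.SearchService"}],
            "hedgingPolicy": {
                # Total sends, counting the original request.
                "maxAttempts": 2,
                # Wait this long before sending the hedge (Duration string).
                "hedgingDelay": "0.05s",
                # Statuses that do not stop remaining hedges from being sent.
                "nonFatalStatusCodes": ["UNAVAILABLE"],
            },
        }
    ]
}

service_config_json = json.dumps(service_config, indent=2)
print(service_config_json)
```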
