Performance

p99 latency

99th percentile response time.

In plain terms

For fan-out queries, system latency is dominated by the slowest. p99 of one becomes p50 of 100. Hedged requests cut the tail at modest cost.

Origin

Practitioner term that crystallised around 2010 alongside Dean & Barroso's "The Tail at Scale" (2013). The exact percentile (99 vs 99.9 vs 99.99) depends on the service's fanout.

Where it shows up in production

Amazon retail Customer-facing services target p99 under 100ms — anything slower starts hurting conversion.
Hedged requests Send a duplicate after p95 latency elapses; cuts the tail dramatically at modest extra cost.

On Semicolony

Paper papers

Sources & further reading

Paper Dean & Barroso — Tail at Scale (CACM 2013)
Paper Tail at Scale annotated (Semicolony)

Found this useful?