Performance

Rate limiter

Caps the request rate per key, such as an account, an API token, or an IP address.


In plain terms

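The main algorithms are token bucket, leaky bucket, fixed window, and sliding window. Each tolerates bursts differently: a token bucket permits bursts up to its capacity, a leaky bucket drains at a constant rate, a fixed window can admit up to twice the limit across a window boundary, and a sliding window closes that gap.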
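To make the burst behaviour concrete, here is a minimal token-bucket sketch with one bucket per key. The class names (TokenBucket, KeyedLimiter) and the parameters rate, capacity, and clock are illustrative assumptions, not taken from any particular library, and the sketch ignores thread safety and eviction of idle keys.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Refills at `rate` tokens per second and holds at most `capacity` tokens."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity          # start full, so an initial burst is allowed
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Credit tokens for the elapsed time, never exceeding capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class KeyedLimiter:
    """Keeps an independent bucket per key (account, token, IP, ...)."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, key: str) -> bool:
        bucket = self._buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, self.clock)
            self._buckets[key] = bucket
        return bucket.allow()
```

With rate=2 and capacity=3, a key can send three requests back to back, after which it is held to roughly two per second. The capacity is the burst allowance; the rate is the sustained limit.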

Origin

Token bucket originated in ATM networking (1990s). The algorithm is now universal; only the operational choices differ: which window to use (sliding or fixed) and what to key on (an API key or a client IP).
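The fixed-versus-sliding choice is easiest to see side by side. The sketch below is an illustration only: the class names FixedWindow and SlidingWindowLog are hypothetical, timestamps are passed in explicitly to keep it deterministic, and old fixed-window counters are never evicted.

```python
from collections import defaultdict, deque


class FixedWindow:
    """Counts requests per key in aligned windows of `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self.counts = defaultdict(int)   # (key, window index) -> request count

    def allow(self, key: str, now: float) -> bool:
        slot = (key, int(now // self.window))
        if self.counts[slot] >= self.limit:
            return False
        self.counts[slot] += 1
        return True


class SlidingWindowLog:
    """Keeps the timestamps of accepted requests from the last `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self.log = defaultdict(deque)    # key -> timestamps of accepted requests

    def allow(self, key: str, now: float) -> bool:
        q = self.log[key]
        # Drop timestamps that have aged out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True


def count_allowed(limiter, key: str, times) -> int:
    """Number of requests accepted for `key` at the given timestamps."""
    return sum(1 for t in times if limiter.allow(key, t))


# Five requests just before a window boundary and five just after it.
burst = [0.75] * 5 + [1.25] * 5
fixed_total = count_allowed(FixedWindow(5, 1.0), "k", burst)
sliding_total = count_allowed(SlidingWindowLog(5, 1.0), "k", burst)
```

With a limit of 5 per second, the fixed window accepts all ten requests in the burst because they fall in two different windows, while the sliding log accepts only the first five. The price is memory: the log stores one timestamp per accepted request per key.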

Where it shows up in production
  • Stripe API: 100 read req/s and 100 write req/s per account, token bucket. Documented in their engineering blog.
  • GitHub API: 5,000 requests per hour per token; rate-limit headers in every response.
  • Cloudflare rate limiting: per-IP, per-URL, and per-cookie rules at the edge.
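On the client side, rate-limit headers like GitHub's can drive a simple back-off. The sketch below assumes the x-ratelimit-remaining and x-ratelimit-reset headers (the latter a UTC epoch timestamp in seconds) and takes the headers as a plain mapping; the function name wait_seconds is hypothetical, and missing headers are treated as "no limit known".

```python
import time
from typing import Mapping, Optional


def wait_seconds(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to pause before the next request, based on rate-limit headers."""
    # Header names are case-insensitive, so normalise them first.
    h = {k.lower(): v for k, v in headers.items()}
    remaining = int(h.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0                       # quota left: no need to wait
    reset_at = float(h.get("x-ratelimit-reset", "0"))
    if now is None:
        now = time.time()
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - now)
```

A caller would sleep for the returned number of seconds before retrying, rather than retrying immediately against an exhausted quota.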