Idempotence at scale
"Exactly-once" is a marketing term. The network gives you at-most-once or at-least-once, never both. Real systems pick at-least-once delivery, give every request an idempotency key, and let the receiver dedupe. Stripe, Twilio, Kafka, and Temporal all run versions of the same trick. This page covers how to fake exactly-once well enough that nobody notices.
Why exactly-once is a fiction
TCP, RPC, and every message bus in the world give you a choice: at-most-once or at-least-once. Not both. The sender ships a message and waits for an ack. If the ack doesn't arrive, the sender can't tell whether the message was lost, the ack was lost, or the receiver crashed mid-handling. The only two options are "give up" (at-most-once, and you lose messages) or "retry" (at-least-once, and the receiver sees duplicates).
No clever protocol closes this gap on its own. The Two Generals problem shows that no finite exchange of messages over a lossy channel can make both sides certain the last message arrived. What looks like "exactly-once delivery" in marketing material is always one of two things: at-least-once delivery plus idempotent processing, or a closed system where the producer, the broker, and the consumer all coordinate (Kafka EOS, for example). The honest framing is that exactly-once is a property of the application layer, not the wire.
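A toy simulation makes the lost-ack ambiguity concrete. This is a sketch, not a real transport: `Receiver`, `send_at_least_once`, and `send_at_most_once` are hypothetical names, and the "network" is just a set of attempt numbers whose ack gets dropped.

```python
class Receiver:
    """Records every message it handles, duplicates included."""

    def __init__(self):
        self.handled = []

    def handle(self, msg):
        self.handled.append(msg)
        return "ack"


def send_at_least_once(receiver, msg, ack_lost_on_attempts, max_attempts=5):
    """Retry until an ack arrives.

    `ack_lost_on_attempts` is the set of attempt numbers whose ack the
    simulated network drops after the receiver has already done the work.
    """
    for attempt in range(1, max_attempts + 1):
        ack = receiver.handle(msg)       # the message itself arrives
        if attempt in ack_lost_on_attempts:
            ack = None                   # ...but the sender never learns that
        if ack == "ack":
            return attempt
    raise TimeoutError("gave up after max_attempts")


def send_at_most_once(receiver, msg, message_lost):
    """Fire once and never retry. A lost message is simply gone."""
    if not message_lost:
        receiver.handle(msg)
```

With one dropped ack, the at-least-once sender returns on attempt 2 and the receiver has handled the message twice. The at-most-once sender never duplicates, but a single lost message leaves the receiver with nothing.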
Idempotency keys
The standard answer. Each request carries a unique key, usually a UUIDv4 the
client generates before the first attempt. The server keeps a table mapping
(key → response). The first request with a given key runs the operation and
records the response. Any duplicate request with the same key skips the work and returns
the cached response.
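The (key → response) table can be sketched in a few lines. This is a minimal single-process illustration: `IdempotentServer` and its `charge` method are hypothetical, the table is an in-memory dict, and a real service would need an atomic claim on the key so two concurrent duplicates cannot both run.

```python
class IdempotentServer:
    def __init__(self):
        self._responses = {}   # idempotency key -> cached response
        self.charges = []      # side effects, kept to show how often work ran

    def charge(self, idempotency_key, amount):
        # Duplicate: skip the work and replay the recorded response.
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]

        # First time this key is seen: do the work and record the result.
        charge_id = len(self.charges) + 1
        self.charges.append(amount)
        response = {"charge_id": charge_id, "amount": amount}
        self._responses[idempotency_key] = response
        return response
```

Calling `charge` twice with the same key returns the same response and leaves exactly one entry in `charges`.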
Stripe's API is the textbook version: every POST accepts an
Idempotency-Key header, and the docs are explicit that clients should send
one on every retryable request. GitHub applies the same idea in the other
direction: each webhook delivery carries an X-GitHub-Delivery GUID that
receivers can dedupe on. AWS calls them "idempotency tokens" and accepts them
as client tokens on many mutating APIs, such as ClientToken on EC2's
RunInstances.
POST /v1/charges HTTP/1.1
Host: api.stripe.com
Idempotency-Key: 4b9d8f2a-1c3e-4f5a-b2d7-8e1c9a4b6f30
Content-Type: application/x-www-form-urlencoded

amount=2000&currency=usd&source=tok_visa&description=Order+1834

The key has to be unique per logical operation, not per attempt. A client that generates a fresh UUID on every retry has defeated the whole mechanism, and that's the most common incident pattern. Generate the key once when the user clicks the button, keep it across retries, and throw it away only when the request finally succeeds or is explicitly abandoned.
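On the client side, "one key per logical operation" comes down to where the UUID is generated. A minimal sketch, assuming a hypothetical `send(payload, idempotency_key=...)` callable that raises `TimeoutError` when no response arrives; backoff between attempts is left out to keep it short.

```python
import uuid


def call_with_retries(send, payload, max_attempts=4):
    # Generated once, before the first attempt, and reused on every retry.
    key = str(uuid.uuid4())
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(payload, idempotency_key=key)
        except TimeoutError as exc:
            # Outcome unknown: the request may or may not have been applied.
            # Retrying with the same key lets the server dedupe.
            last_error = exc
    raise last_error
```

Moving the `uuid.uuid4()` call inside the loop is the bug described above: every retry would look like a brand-new operation to the server.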
The dedup window
How long do you keep the (key → response) table? Too short and a duplicate
that arrives after the window expires runs the operation again. Too long and the table
balloons. The rule of thumb: longer than the longest plausible retry
budget.
Stripe keeps idempotency keys for 24 hours. Twilio keeps them for 7 days. Cloudflare's Workers API keeps them for 30 minutes. The right answer depends on how long your clients keep retrying. A mobile app that retries on next launch needs days; an internal RPC with a 30-second budget needs minutes. Pick a number, write it in the docs, and make sure clients give up before that window closes.
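The window is just a TTL on the dedup table. Here is a sketch with an injected clock so expiry is easy to reason about; `DedupWindow` is a hypothetical name, and a production store would more likely lean on its database's own expiry mechanism.

```python
class DedupWindow:
    def __init__(self, ttl_seconds, clock):
        self.ttl = ttl_seconds
        self.clock = clock          # callable returning the current time in seconds
        self._entries = {}          # key -> (stored_at, response)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if self.clock() - stored_at >= self.ttl:
            # Window has closed: forget the key. A duplicate arriving now
            # would run the operation again.
            del self._entries[key]
            return None
        return response

    def put(self, key, response):
        self._entries[key] = (self.clock(), response)
```

With a 60-second TTL, a duplicate at second 59 gets the cached response and a duplicate at second 60 is treated as new. That is why the client's retry budget has to end before the window does.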
The outbox pattern
Idempotency keys cover the inbound side. The harder problem is the outbound side: a local database write that has to trigger an external message, like "charge succeeded → send receipt email" or "order placed → publish to Kafka". You now have two writes in two systems that must either both happen or both not happen, which is a distributed transaction, and those are exactly the thing the rest of this site warns against.
The outbox pattern sidesteps it. Inside the local transaction, you write the message into a same-database table called the outbox. The transaction commits atomically: both the business state and the outbox row are saved together, or neither is. A separate worker process then polls the outbox, publishes each row to the external system with an idempotency key, and marks the row as published once the destination acks.
Sagas
A saga is a long-running workflow built from a sequence of idempotent steps, each with an explicit compensation action that undoes its effect. "Book flight, book hotel, charge card." If the card charge fails, the saga runs "cancel hotel" then "cancel flight" instead of trying to roll back a distributed transaction that was never really atomic in the first place.
The term comes from Garcia-Molina and Salem's 1987 paper Sagas; Pat Helland's Building on Quicksand is the canonical essay on why the steps have to be idempotent. The practical implementations are Temporal (and its predecessor Cadence at Uber), AWS Step Functions, and Netflix Conductor. They all give you the same shape: define each step as an idempotent activity, define the compensation, and let the orchestrator handle retries, timeouts, and failure recovery. The orchestrator's job is to remember where it was and resume safely after a crash, which it can only do because every step is idempotent.
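The control flow of a saga fits in a short function. This sketch is in-memory only: `run_saga` is a hypothetical helper, and it leaves out what the real orchestrators add, namely durable state, retries, and timeouts.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    On the first failure, run the compensations of the steps that already
    completed, in reverse order. Returns (succeeded, log).
    """
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated:{done_name}")
            return False, log
        log.append(f"done:{name}")
        completed.append((name, compensate))
    return True, log
```

For the flight, hotel, and card example, a failing card charge produces the log done:flight, done:hotel, failed:charge, compensated:hotel, compensated:flight. The compensations themselves must also be idempotent, because a crashed orchestrator may run them again on resume.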
Effectively-once
The honest term for what the industry actually delivers. At-least-once delivery plus idempotent processing equals exactly-once-looking semantics at the application layer. Kafka Streams uses this exact framing. Kafka's "exactly-once semantics" feature, added in 0.11 (KIP-98), is the canonical example: it combines a transactional producer that dedupes by producer ID and sequence number with transactional consumer offsets, so a read-process-write loop inside Kafka behaves as if each input were processed exactly once.
The fine print is that Kafka EOS only holds within Kafka. The moment you read from Kafka and write to Postgres, you're back to at-least-once and need idempotency keys again. The protocol gives you exactly-once within a closed system; the application has to carry it to the edges.
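The producer-side half of that closed system, dedup by producer ID and sequence number, can be sketched as follows. This is a deliberate simplification rather than Kafka's implementation: the `Broker` class is hypothetical, and it tracks one sequence per producer and ignores partitions, epochs, and transactions.

```python
class Broker:
    def __init__(self):
        self.log = []
        self._last_seq = {}   # producer_id -> highest sequence number appended

    def append(self, producer_id, seq, record):
        last = self._last_seq.get(producer_id, -1)
        if seq <= last:
            # A retry of something already in the log: acknowledge, don't append.
            return "duplicate"
        if seq != last + 1:
            # A gap means an earlier record is missing; refuse rather than reorder.
            raise ValueError("out-of-order sequence number")
        self.log.append(record)
        self._last_seq[producer_id] = seq
        return "appended"
```

The application never sees any of this, which is the appeal. It is also the limit: the sequence numbers mean nothing to Postgres, so the write on the far side of the consumer still needs its own key.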
Where to put the idempotency key
| Location | Example | Trade-off |
|---|---|---|
| HTTP header | Stripe's Idempotency-Key, GitHub's X-GitHub-Delivery | Clean separation from payload, no schema change. Easy for clients to forget. |
| Request body | Deterministic ID derived from content hash | No client cooperation required. Mistakes (wrong hash inputs) silently re-execute. |
| URL path | PUT /orders/{order_id} | Naturally idempotent at the HTTP level. Best when the resource ID is client-known. |
| Producer-assigned ID | Kafka producer ID + sequence number | Invisible to the application; broker handles dedup. Closed-system only. |
Most public APIs go with the header pattern because it lets clients adopt idempotency
gradually without breaking the existing payload schema. Internal services often
prefer URL paths with client-generated resource IDs (ULIDs work well) because
PUT is idempotent by definition and needs no extra machinery.
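The content-hash row in the table deserves a closer look, because the choice of hash inputs is where it goes wrong. A sketch, assuming a hypothetical `content_key` helper and JSON-serializable payloads:

```python
import hashlib
import json


def content_key(payload, fields):
    """Derive a deterministic idempotency key from selected payload fields."""
    subset = {name: payload[name] for name in fields}
    # Canonical form: sorted keys and fixed separators, so dict ordering
    # and whitespace cannot change the hash.
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Hashing only the fields that define the operation (say, customer and amount) gives a retry the same key as the original. Include a field that changes per attempt, such as a send timestamp, and every retry hashes to a new key and silently re-executes. The opposite mistake is hashing too little, which makes two genuinely distinct operations collide.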
Real-world failures
- Duplicate Stripe charges. Client retries a POST /charges without an idempotency key after a timeout. The first request actually succeeded; the second creates a second charge. Every payment company has hit this at some point. The fix is mandatory idempotency keys on the client SDK with a timeout-and-retry wrapper that reuses the same key.
- Email duplicate-send. A transactional email worker crashes after calling the SMTP provider but before marking the job done. On restart the job re-runs and the user gets two copies. The standard fix is a dedup table keyed on (recipient, message_hash, day) at the send layer.
- Job-queue double-processing. A consumer pulls a job, starts processing, then crashes before acking. The broker redelivers, a second worker picks it up, and the work runs twice. Fix: a per-job idempotency key plus a status table the worker checks before doing anything externally visible.
- Webhook replay. Webhook senders redeliver after a failure: Stripe retries for up to three days when it does not get a 2xx response, Shopify retries on its own schedule, and GitHub deliveries can be redelivered after a failed attempt. Receivers that don't dedupe by the event or delivery ID will process the same event more than once. Every webhook handler should start with "have I seen this event ID before".
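The "have I seen this event ID before" check from the last item might look like the sketch below. `WebhookReceiver` is hypothetical and keeps its seen-set in memory; a real receiver would persist it, ideally in the same transaction as the handler's own writes.

```python
class WebhookReceiver:
    def __init__(self, handler):
        self._seen = set()        # event IDs that were processed successfully
        self._handler = handler

    def receive(self, event_id, event):
        """Return the HTTP status code to send back to the webhook sender."""
        if event_id in self._seen:
            # Already processed: still answer 200 so the sender stops retrying.
            return 200
        try:
            self._handler(event)
        except Exception:
            # Not marked as seen, so the sender's retry gets a real second try.
            return 500
        self._seen.add(event_id)
        return 200
```

Two details carry the weight: duplicates are acknowledged rather than rejected, and an event is recorded as seen only after the handler succeeds.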
Common patterns
The vocabulary that shows up in code reviews:
- PUT for idempotent, POST for non-idempotent. The REST convention. A PUT /orders/{id} with the same body should be safe to retry; a POST /orders creates a new order on every call unless you add an idempotency key.
- Unique constraint + insert-or-skip. In Postgres, a unique constraint on the idempotency key column plus INSERT ... ON CONFLICT DO NOTHING makes the database the dedup oracle. MySQL has INSERT IGNORE.
- Conditional writes. DynamoDB's ConditionExpression, Spanner's Mutation.insert (vs insertOrUpdate), and any CAS primitive let you say "only write if the row doesn't already exist", which gives you per-key idempotency without a separate table.
-- Postgres: dedup on idempotency key with a unique constraint
CREATE TABLE charge_attempts (
idempotency_key UUID PRIMARY KEY,
charge_id BIGINT NOT NULL,
response_body JSONB NOT NULL,
created_at TIMESTAMPTZ DEFAULT now()
);
INSERT INTO charge_attempts (idempotency_key, charge_id, response_body)
VALUES ($1, $2, $3)
ON CONFLICT (idempotency_key) DO NOTHING
RETURNING charge_id;
-- If RETURNING is empty, the row already existed:
-- look it up and return the cached response instead of re-charging.

What the big systems do
| System | Mechanism | Window |
|---|---|---|
| Stripe | Idempotency-Key header, server caches response | 24 hours |
| Twilio | Per-message MessagingServiceSid + dedup ID | 7 days |
| GitHub | X-GitHub-Delivery on retried webhooks | 3 days (delivery window) |
| AWS DynamoDB | ConditionExpression for CAS, ClientRequestToken on transactional writes | 10 minutes (token TTL) |
| Kafka EOS | Producer ID + sequence number, transactional offsets | Within a Kafka transaction |
| Temporal | Workflow ID dedup + activity-level idempotency keys | Workflow retention period |
The honest rule
Every external-facing API with any real-world cost (payments, emails, SMS, push notifications, fulfilment, shipping labels) needs an idempotency key. No exceptions. Designing it in from day one is cheap; retrofitting it after the first duplicate-charge incident is expensive and costs you reputation as well as engineering time.
Internal APIs benefit too. A microservice call that can't be safely retried is a
latent outage waiting for the next network hiccup. The cheap version is "use
PUT with a client-known ID where you can"; the proper version is "every
mutating call carries an idempotency key". Either way, the receiver should handle
the same request twice without harm.
Further reading
- Pat Helland — Building on Quicksand — the foundational essay on retries, idempotence, and why exactly-once is a property of the application, not the protocol.
- Stripe — Idempotent Requests — the reference docs for the canonical implementation.
- AWS Builders' Library — Making retries safe with idempotent APIs — Amazon's writeup on idempotency tokens and how they handle retries internally.
- Temporal — Activities and idempotency — how Temporal expects activity code to behave under retry, with the workflow ID as the outer dedup key.
- Kafka KIP-98 — Exactly Once Delivery and Transactional Messaging — the proposal that introduced Kafka's transactional producer and the basis for "effectively-once" stream processing.
- Jay Kreps — The Log: What every software engineer should know about real-time data — the essay later expanded into the book I Heart Logs. The piece that frames every event-driven system as a replicated log read by idempotent consumers.
- Chris Richardson — Transactional Outbox pattern — the canonical writeup of the outbox pattern with implementation sketches.