API design best practices
Most of the decisions that determine how an API ages are not about the protocol you pick. They are the cross-cutting choices every endpoint shares: how resources are named, which HTTP method does what, what an error looks like on the wire, how lists paginate, how a client retries safely, and how rate limits surface. None of this is hard. A small amount of deliberate design here saves years of cleanup later, because a public API is the one part of a system you cannot quietly refactor once people depend on it. This page walks the whole set with the depth a senior engineer needs to make the calls and defend them in review.
Resource naming and URL design
A URL is the most permanent thing you publish. Long after you have rewritten the handler,
swapped the database, and changed teams twice, the path is still in someone's code. So spend
the design effort here. The rule that travels furthest: model your API as a set of nouns
(resources) and let HTTP verbs do the acting. A charge, a customer, an invoice — those are
resources, each with a stable identity and a predictable address. The verb you reach for is
the HTTP method, not a word in the path. POST /charges creates a charge;
GET /charges/ch_001 reads one. You should almost never need a path like
/createCharge or /getChargeById, because the method already carries
that meaning.
A few conventions are worth holding the line on, because mixing them inside one API is the fastest way to make it feel cheap:
- Plural collection nouns. /charges, not /charge. The collection is a list; an item is addressed by id underneath it. Pick plural and never deviate.
- Lowercase, hyphenated, no file extensions. /payment-methods, not /PaymentMethods or /payment_methods.json. The response format belongs in the Accept header, not the path.
- Nest only to show ownership, and only one level. /customers/cus_01/payment-methods reads well. Three or four levels of nesting become impossible to route and impossible to remember; once a child has its own stable id, give it a top-level path too.
- Identifiers are opaque. Treat ids as strings the client never parses. Prefixed ids (ch_, cus_) are a small kindness: a log line tells you what kind of thing went wrong without a schema lookup.
The same idea extends to the few cases that are not CRUD. A "refund this charge"
action is not a field you PATCH; it is a sub-resource you create:
POST /charges/ch_001/refunds. Treating the action as a thing that gets created
keeps the model consistent and, as a bonus, gives the action its own id, its own audit trail,
and a natural place to attach an idempotency key. Reserve true RPC-style verb endpoints for
the handful of operations that resist the noun model, and name them honestly when you do.
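The naming conventions above are mechanical enough to lint. The following is a minimal sketch, not a complete linter: `path_problems` is a hypothetical helper, it assumes path segments alternate collection, id, collection, and its plural and verb checks are crude heuristics (a trailing "s", a short list of verb prefixes).

```python
import re

COLLECTION = re.compile(r"^[a-z]+(-[a-z]+)*$")
VERB_PREFIXES = ("create", "get", "update", "delete")  # illustrative, not exhaustive

def path_problems(path: str) -> list[str]:
    """Return the naming-convention violations found in a resource path."""
    segments = [s for s in path.split("/") if s]
    # Assumption: segments alternate collection, id, collection, id, ...
    collections = segments[0::2]
    problems = []
    for name in collections:
        if "." in name:
            problems.append(f"{name}: file extension in path")
        elif not COLLECTION.match(name):
            problems.append(f"{name}: not lowercase-hyphenated")
        elif not name.endswith("s"):
            problems.append(f"{name}: collection should be plural")
        if name.lower().startswith(VERB_PREFIXES):
            problems.append(f"{name}: verb in path")
    if len(collections) > 2:
        problems.append("nested more than one level")
    return problems
```

Run against the paths in this section, /charges/ch_001/refunds and /customers/cus_01/payment-methods come back clean, while /createCharge is flagged twice: once for casing and once for the verb.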
HTTP methods and status codes
HTTP already encodes a great deal of intent. Using it correctly means clients, proxies, caches, and your own retry logic all behave the way they were built to without extra configuration. The methods split along two axes that matter for correctness: whether they leave server state unchanged (safe), and whether running them twice has the same effect as running them once (idempotent).
| Method | Use for | Safe | Idempotent |
|---|---|---|---|
| GET | Read a resource or collection. Never mutate. | yes | yes |
| POST | Create a resource, or trigger an action | no | no |
| PUT | Replace a resource wholesale at a known id | no | yes |
| PATCH | Apply a partial update | no | no* |
| DELETE | Remove a resource | no | yes |
Those properties are a contract, not trivia. A GET must be free of side effects, because browsers prefetch them, proxies cache them, and crawlers follow them. A PUT that replaces a resource at a known id is idempotent, so a client that times out can simply send it again. A naive POST that creates a resource is not idempotent, which is the whole reason idempotency keys exist further down this page. (PATCH is idempotent only if the patch itself is, such as "set status to closed"; a relative patch like "add 5 to the balance" is not, which is a good reason to prefer absolute updates.)
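The PATCH caveat is easy to see in a few lines. This sketch uses two hypothetical patch functions on a plain dict: an absolute update ("set status to closed") and a relative one ("add 5 to the balance"), each applied once and then again as a retry would.

```python
def apply_absolute(account: dict) -> dict:
    """Absolute patch: set status to closed. Repeating it changes nothing."""
    return {**account, "status": "closed"}

def apply_relative(account: dict) -> dict:
    """Relative patch: add 5 to the balance. Every repeat moves the state again."""
    return {**account, "balance": account["balance"] + 5}

account = {"status": "open", "balance": 100}

# A retried absolute patch lands on the same state as a single application.
absolute_is_idempotent = apply_absolute(apply_absolute(account)) == apply_absolute(account)

# A retried relative patch does not: the balance moves twice.
relative_is_idempotent = apply_relative(apply_relative(account)) == apply_relative(account)
```

The first comparison holds and the second does not, which is the practical argument for absolute updates: a client that times out can resend them without thinking.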
Status codes deserve the same care. Clients branch on them, alerting systems count them, and retry libraries decide whether to try again based on the class. The full list is large; the set you actually reach for is small.
| Code | Meaning | When |
|---|---|---|
| 200 / 201 / 204 | OK / Created / No Content | Success. 201 on create, 204 when there is nothing to return |
| 400 | Bad Request | The request is malformed or fails validation |
| 401 / 403 | Unauthorized / Forbidden | Not authenticated, versus authenticated but not allowed |
| 404 | Not Found | No such resource, or you are hiding its existence on purpose |
| 409 | Conflict | State clash: duplicate, version mismatch, idempotency-key reuse |
| 422 | Unprocessable Entity | Well-formed but semantically invalid |
| 429 | Too Many Requests | Rate limited; pair with Retry-After |
| 500 / 503 | Server Error / Unavailable | Your fault. 503 when it is temporary and worth a retry |
The two distinctions people most often get wrong are 401 versus 403 (who you are versus what you may do) and 400 versus 422 (the bytes are wrong versus the meaning is wrong). Getting them right lets a client write one error handler instead of a special case per endpoint. And one rule that prevents a class of outages: a 4xx other than 429 means "do not retry this as-is," while a 5xx or a 429 means "you may retry, ideally with backoff." If your server returns 400 for a transient internal failure, well-behaved clients will give up on something they should have retried.
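On the client side that rule collapses into a very small function. A minimal sketch, assuming a hypothetical `should_retry` helper and the simple policy stated above (real clients often also cap the number of attempts and only retry non-idempotent requests when they carry an idempotency key):

```python
def should_retry(status: int) -> bool:
    """Decide whether a response status is worth retrying with backoff."""
    if status == 429:
        return True   # rate limited: waiting is expected to help
    if 500 <= status <= 599:
        return True   # server-side failure: may be transient
    return False      # success, redirect, or a 4xx the client must fix first
```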
Pagination — cursors over offsets
Offset pagination (?page=5&per_page=20) feels natural but breaks under
load. The database has to skip the first N rows for every page, which gets
expensive past a few hundred thousand. And if rows are inserted or deleted between
page loads, items shift across pages — users see duplicates and missing entries.
Cursor pagination uses an opaque token that encodes "where I left off". The server decodes it, queries past that position, and returns a new cursor for the next page. The difference is not a micro-optimisation; it is the line between a list endpoint that stays fast at the millionth row and one that gets slower the deeper anyone scrolls.
# request
GET /charges?limit=20
# response
{
"data": [ { "id": "ch_001" }, ..., { "id": "ch_020" } ],
"next_cursor": "eyJpZCI6ImNoXzAyMCJ9",
"has_more": true
}
# next page
GET /charges?limit=20&cursor=eyJpZCI6ImNoXzAyMCJ9
The cursor is just {"id": "ch_020"} base64-encoded, but treat it as
opaque to the client so you can change the encoding later. The server-side query
becomes WHERE id > 'ch_020' ORDER BY id LIMIT 20, which is index-cheap
at any depth. Two details make a cursor scheme correct rather than merely fast. First, the
sort must be on a unique, monotonic column, or a column plus a unique tiebreaker (sort by
created_at, break ties on id); otherwise two rows with the same
timestamp can straddle a page boundary and one gets skipped or repeated. Second, encode
everything the query needs into the cursor — the sort field, the direction, the last seen
values — so the server is stateless and you never have to remember a client's position on
your side.
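Both details fit in a short sketch. The code below is an in-memory illustration, not a database implementation: `encode_cursor`, `decode_cursor`, and `list_page` are hypothetical names, rows are plain dicts, and the sort is on created_at with id as the tiebreaker. In SQL the filter would become a comparison on the (created_at, id) pair, where the database supports row values.

```python
import base64
import json

def encode_cursor(created_at: str, row_id: str) -> str:
    """Pack the last seen sort values into an opaque token."""
    raw = json.dumps({"created_at": created_at, "id": row_id}, separators=(",", ":"))
    return base64.urlsafe_b64encode(raw.encode()).decode()

def decode_cursor(cursor: str) -> tuple[str, str]:
    data = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return data["created_at"], data["id"]

def list_page(rows: list[dict], limit: int, cursor=None) -> dict:
    """Return one page sorted by (created_at, id), plus explicit paging info."""
    ordered = sorted(rows, key=lambda r: (r["created_at"], r["id"]))
    if cursor is not None:
        last = decode_cursor(cursor)
        # Tuple comparison: the id breaks ties between equal timestamps.
        ordered = [r for r in ordered if (r["created_at"], r["id"]) > last]
    # Fetch one extra row to learn whether another page exists.
    window = ordered[: limit + 1]
    page = window[:limit]
    has_more = len(window) > limit
    next_cursor = encode_cursor(page[-1]["created_at"], page[-1]["id"]) if has_more else None
    return {"data": page, "next_cursor": next_cursor, "has_more": has_more}
```

Fetching limit + 1 rows is one common way to compute has_more without a second count query; the extra row is never returned to the client.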
Offset is not always wrong. If a dataset is small and bounded, or you need to jump to "page 47" of a fixed report, offset is simpler and fine. The failure mode is reaching for offset by default on a collection that grows without limit. The honest summary: cursors for anything that scales, offset only where the total is small and the "jump to page" affordance is worth the cost.
Return next_cursor and has_more explicitly rather than making clients infer the end
from a short page. A final page that happens to contain exactly limit items is
indistinguishable from a full one without an explicit flag.
Errors — RFC 9457 problem details
RFC 9457
(originally RFC 7807) defines an error envelope for HTTP APIs. Adopt it. Every error
your API emits should follow this shape, with the type URI as a stable
machine-readable code:
HTTP/1.1 422 Unprocessable Entity
Content-Type: application/problem+json
{
"type": "https://api.example.com/problems/invalid-currency",
"title": "Unsupported currency",
"status": 422,
"detail": "The requested currency 'XYZ' is not supported.",
"instance": "/charges/ch_001",
"field": "currency",
"request_id": "req_5f9a..."
}
Code branches on type; the human reads title and detail; support reads request_id.
A good error has four properties:
- Stable machine-readable code: type as a URI. Clients switch on this; never on title or detail.
- Human-readable explanation: what happened and why, in plain English, safe to surface to a user or developer.
- Actionable detail: which field, what value, what to fix.
- A request ID, so the developer can ask you about it. See below.
The point of a stable type is that it is a promise. Once a client ships code
that branches on invalid-currency, that string is now part of your contract as
surely as any field name. You can improve the title and detail
wording freely — they are for humans and nobody should parse them — but the type
is forever. Keep a registry of your error types the same way you keep a list of endpoints,
and treat removing or renaming one as a breaking change that belongs on the
versioning roadmap, not a quiet patch.
For validation failures, return all the problems at once, not the first one. A form with
four bad fields should come back with four entries (RFC 9457 permits extension members, and
its own validation example carries an array under an errors member), so the client can
highlight every field in one round trip instead of playing whack-a-mole. And never leak internals in the
detail: stack traces, SQL, and internal hostnames belong in your logs keyed by
the request id, not in a response a stranger can read.
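A sketch of what "all the problems at once" can look like on the server. Everything here is illustrative: `validation_problem` is a hypothetical helper, the validation-failed type URI is made up for the example, and the shape of each entry under errors is an assumption (RFC 9457 leaves extension members to you).

```python
import json

def validation_problem(request_id: str, field_errors: dict[str, str]) -> tuple[int, dict, str]:
    """Build a 422 problem-details response listing every failed field."""
    body = {
        "type": "https://api.example.com/problems/validation-failed",  # hypothetical URI
        "title": "Validation failed",
        "status": 422,
        "detail": f"{len(field_errors)} field(s) failed validation.",
        "request_id": request_id,
        # One entry per bad field, so the client can highlight all of them.
        "errors": [{"field": name, "detail": why} for name, why in field_errors.items()],
    }
    headers = {"Content-Type": "application/problem+json"}
    return 422, headers, json.dumps(body)
```

Note what is absent: no stack trace, no SQL, no hostname. The request_id is the only bridge back to the logs where those live.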
Rate limits
If your API is public, it has rate limits whether you advertise them or not: at minimum, those imposed by your CDN, your reverse proxy, and your origin's capacity. Make them explicit. Send the current state on every response:
HTTP/1.1 200 OK
RateLimit-Limit: 100
RateLimit-Remaining: 87
RateLimit-Reset: 12 # seconds until the bucket refills
When a client is over the limit, return 429 Too Many Requests with a
Retry-After header indicating how long to wait. The IETF
RateLimit headers draft
is converging on a standard form; until then, GitHub's
X-RateLimit-* shape is the most widely-implemented and a reasonable choice.
The 429 plus Retry-After pair is a contract, and both sides have to keep it. The
server promises that waiting the stated time will help; the client promises to actually wait
rather than hammering the endpoint. A client that retries a 429 immediately is just turning a
soft limit into a hard outage. The right client behaviour is to honour Retry-After
when present, and otherwise back off exponentially with jitter so a fleet of clients does not
all wake up and retry in lockstep. If you want to feel the difference between a token bucket
that absorbs bursts and a fixed window that drops them, the
rate limiter simulator lets you turn the
knobs and watch requests pass or fail in real time.
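The client half of that contract can be sketched in a dozen lines. `retry_delay` is a hypothetical helper; the base and cap values are assumptions, and only the delta-seconds form of Retry-After is handled (the header may also carry an HTTP date, which this sketch ignores and treats as absent).

```python
import random
from typing import Optional

def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 0.5, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            seconds = int(retry_after)
            if seconds >= 0:
                return float(seconds)  # the server told us how long to wait
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    # Exponential backoff with full jitter: a random wait up to a growing ceiling,
    # so a fleet of clients does not retry in lockstep.
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)
```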
A couple of design choices matter more than the exact header names. Rate limit per principal (the API key or user), not per IP, or one customer behind a shared gateway can starve another. And decide whether you are shaping bursts or enforcing a hard ceiling: a token bucket lets a client spend a saved-up burst and is friendlier to bursty real workloads, while a fixed window is simpler but creates a stampede at the boundary of each window. Most public APIs land on a token bucket for exactly that reason.
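For reference, a token bucket is small enough to read in one sitting. This is a single-process sketch under stated assumptions: the `TokenBucket` class and its parameters are illustrative, the clock is passed in explicitly so the behaviour is deterministic, and a real deployment would keep one bucket per principal in shared storage.

```python
class TokenBucket:
    """Allow bursts up to `capacity`, refilling at a steady rate."""

    def __init__(self, capacity: float, refill_per_second: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = capacity      # start full, so an idle client can burst
        self.updated_at = now

    def allow(self, now: float) -> bool:
        """Spend one token if available; `now` is a timestamp in seconds."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = max(self.updated_at, now)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per principal (API key or user), never per IP.
buckets: dict[str, TokenBucket] = {}

def allow_request(api_key: str, now: float) -> bool:
    bucket = buckets.setdefault(api_key, TokenBucket(capacity=3, refill_per_second=1.0, now=now))
    return bucket.allow(now)
```

With a capacity of 3 and a refill of one token per second, a client can send three requests at once, is refused the fourth, and gets one more through a second later.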
Idempotency keys and safe retries
Networks fail in the worst possible way: the request arrives, the server does the work, and
the response is lost on the way back. The client cannot tell that apart from a request that
never landed, so it retries — and now you have charged the card twice. The fix is an
Idempotency-Key header. The client generates a unique key per logical operation
and sends it with every attempt. The server stores the result of the first request under that
key for some window (24 hours is typical) and replays the same response on any retry. This is
the same principle that makes distributed systems survivable; the deeper treatment lives on
the idempotence page,
and on the REST page in API terms.
Getting this right means thinking about the gaps between steps. The "reserve, do work, store" sequence has to be atomic enough that two retries racing each other cannot both slip through. The usual pattern is to insert the key into a unique-constrained table before doing the work: the first request wins the insert and proceeds; a concurrent duplicate hits the constraint and waits for, then replays, the stored result. Done this way the database, not your application code, enforces the at-most-once guarantee.
Two non-obvious rules:
- Hash the request body and store the hash with the cached response. If a later call uses the same key with a different body, return 409: that's a client bug masquerading as a retry, and replaying the old response would hide it.
- Document the dedup window. Clients that retry beyond it (e.g. resuming a job after 48 hours) need to know to generate a fresh key, and that an old key may no longer be deduped.
Scope the key correctly too. An idempotency key is meaningful within one authenticated account and one endpoint; the same random string from two different customers must never collide. The practical key is therefore "account id plus endpoint plus the client's key," even though the client only sends the last part. And only require keys where a duplicate does harm — creating a charge, sending mail, provisioning a resource. A GET needs no key because it changes nothing, and a PUT to a known id is already idempotent by construction.
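The pieces of this section (reserve the key first, hash the body, scope by account and endpoint) can be sketched against an in-memory SQLite database. This is a simplified illustration, not a production design: the table layout and the `handle` function are assumptions, expiry of the dedup window is not modelled, and a duplicate that arrives while the original is still running gets a 409 here instead of waiting for the stored result.

```python
import hashlib
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE idempotency ("
    " account_id TEXT, endpoint TEXT, client_key TEXT,"
    " body_hash TEXT, response TEXT,"
    " PRIMARY KEY (account_id, endpoint, client_key))"
)

def handle(account_id: str, endpoint: str, client_key: str, body: dict, do_work):
    """Run do_work(body) at most once per (account, endpoint, key)."""
    scope = (account_id, endpoint, client_key)
    body_hash = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    try:
        with db:  # commits on success, rolls back on error
            # Reserve the key before doing any work; the constraint picks one winner.
            db.execute("INSERT INTO idempotency VALUES (?, ?, ?, ?, NULL)", scope + (body_hash,))
    except sqlite3.IntegrityError:
        stored_hash, stored = db.execute(
            "SELECT body_hash, response FROM idempotency"
            " WHERE account_id = ? AND endpoint = ? AND client_key = ?", scope
        ).fetchone()
        if stored_hash != body_hash:
            return 409, {"title": "Idempotency key reused with a different body"}
        if stored is None:
            return 409, {"title": "Original request still in progress"}
        status, payload = json.loads(stored)
        return status, payload  # replay the first response
    status, payload = 201, do_work(body)
    with db:
        db.execute(
            "UPDATE idempotency SET response = ?"
            " WHERE account_id = ? AND endpoint = ? AND client_key = ?",
            (json.dumps([status, payload]),) + scope,
        )
    return status, payload
```

The primary key is the composite of account, endpoint, and client key, so the same string sent by two customers lands in two different rows.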
Request IDs
Every request gets a unique ID. The server logs it. The server returns it on every response (header and error envelope). The client can quote it when reporting an issue. Internal services pass the same ID through their own logs, so a single string lets you follow a single request through every system it touched.
# client may supply one; if not, server generates
Request-ID: req_5f9a4c8e7d3b2a1f8e7d6c5b4a39281
# response always echoes
HTTP/1.1 200 OK
Request-ID: req_5f9a4c8e7d3b2a1f8e7d6c5b4a39281
If you're already using
W3C Trace Context,
use that instead — traceparent and tracestate headers cover
the same need plus distributed tracing semantics. The traceparent trace ID
is itself a perfectly good request ID for log correlation.
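A sketch of the resolution order this implies: prefer the trace ID from a valid traceparent header, fall back to a client-supplied Request-ID, and generate one otherwise. The `request_id_for` helper is hypothetical, and a real server would also bound the length and character set of any client-supplied value before logging it.

```python
import re
import uuid

# traceparent layout: version "-" trace-id (32 hex) "-" parent-id (16 hex) "-" flags
TRACEPARENT = re.compile(r"^[0-9a-f]{2}-([0-9a-f]{32})-[0-9a-f]{16}-[0-9a-f]{2}$")

def request_id_for(headers: dict[str, str]) -> str:
    """Pick the ID to log and echo for this request."""
    lowered = {name.lower(): value for name, value in headers.items()}
    match = TRACEPARENT.match(lowered.get("traceparent", ""))
    if match and match.group(1) != "0" * 32:  # an all-zero trace ID is invalid
        return match.group(1)
    supplied = lowered.get("request-id")
    if supplied:
        return supplied
    return "req_" + uuid.uuid4().hex  # server-generated fallback
```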
Filtering, sorting, partial responses
Three smaller patterns worth standardising up front:
- Filtering. Query parameters that match field names: ?status=open&currency=USD. For ranges, suffixes: ?created_after=...&created_before=.... Avoid clever DSLs; every client has to learn them.
- Sorting. A single ?sort=-created_at,id parameter. Hyphen prefix for descending. Stable: a tiebreaker like id keeps page boundaries deterministic.
- Partial responses. A ?fields=id,amount,status parameter that whitelists fields (a sparse fieldset). Saves bandwidth on large objects and lets you deprecate fields gracefully: clients that never asked for a field do not notice when it goes away.
The common thread is restraint. Each of these is a small, guessable surface that a client can learn once and apply to every collection. The moment you invent a bespoke query language — nested boolean operators, a custom filter grammar in a string parameter — you have built a second API inside your API that nobody can use without reading a manual, and that you now have to parse safely. If you need rich querying, that is a sign the use case wants a real query endpoint or a different protocol, not a clever string parameter bolted onto a list.
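Part of the appeal of the small surface is how little code it takes to parse safely. A minimal sketch of the sort and fields parameters, assuming hypothetical helpers `parse_sort` and `select_fields` and an explicit allow-list of sortable columns:

```python
from typing import Optional

def parse_sort(value: str, allowed: set[str]) -> list[tuple[str, str]]:
    """Turn '-created_at,id' into [('created_at', 'desc'), ('id', 'asc')]."""
    order = []
    for term in value.split(","):
        descending = term.startswith("-")
        name = term[1:] if descending else term
        if name not in allowed:
            # Reject unknown columns instead of passing them to the query layer.
            raise ValueError(f"cannot sort by {name!r}")
        order.append((name, "desc" if descending else "asc"))
    return order

def select_fields(resource: dict, fields: Optional[str]) -> dict:
    """Apply a sparse fieldset; with no fields parameter, return everything."""
    if not fields:
        return resource
    wanted = fields.split(",")
    return {name: resource[name] for name in wanted if name in resource}
```

The allow-list matters as much as the parsing: it is what keeps a sort parameter from becoming a way to name arbitrary columns.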
Good defaults, nullability, consistency
The fastest way to make an API feel professional is to be boringly consistent. Pick one casing
for field names (snake_case or camelCase) and never mix them. Use one
format for timestamps everywhere — RFC 3339 / ISO 8601 in UTC,
2026-06-07T14:30:00Z — not a Unix integer here and a date string there. Represent
money as an integer count of minor units (cents) plus a currency code, never a float, because
floating-point cents quietly lose money. These are not matters of taste once you have picked
them; the value is that a client who has parsed one of your responses can parse all of them.
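Two of these rules are quick to demonstrate. The sketch below shows the float drift that integer minor units avoid, and one way to emit the timestamp format above. `format_money` and `to_rfc3339` are hypothetical helpers; `format_money` assumes a currency with two decimal places, which does not hold for every currency.

```python
from datetime import datetime, timedelta, timezone

# Three charges of ten cents each: floats drift, integers do not.
float_total = 0.1 + 0.1 + 0.1   # not exactly 0.3
minor_total = 10 + 10 + 10      # exactly 30 cents

def format_money(amount_minor: int, currency: str) -> str:
    """Render minor units for display. Assumes a two-decimal currency."""
    sign = "-" if amount_minor < 0 else ""
    major, minor = divmod(abs(amount_minor), 100)
    return f"{sign}{major}.{minor:02d} {currency}"

def to_rfc3339(moment: datetime) -> str:
    """Format an aware datetime as RFC 3339 in UTC with a trailing Z."""
    if moment.tzinfo is None:
        raise ValueError("refusing to guess the zone of a naive datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```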
Nullability deserves an explicit policy. Decide, per field, whether absent and
null mean the same thing, and write it down. The cleanest rule: omit a field that
has no value rather than sending null, and reserve null for "this
field exists and is deliberately empty." For collections, return an empty array, never
null — a client iterating a list should never have to null-check it first. Small
as it sounds, inconsistent nullability is one of the most common sources of client crashes.
Defaults are part of the contract too. If limit defaults to 20, document it and
cap it, so a client cannot ask for a million rows and take the service down. If a new field is
added later, give it a default that preserves the old behaviour, so existing callers see no
change. The guiding idea is to favour conservative choices: when in doubt, the default should be
the safe, small, backward-compatible one, and new capability should be opt-in.
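The limit rule is a few lines in practice. A sketch with assumed values: the default of 20 comes from the text, the cap of 100 is an arbitrary choice for the example, and `resolve_limit` is a hypothetical helper. Clamping to the cap rather than rejecting is itself a design choice; either is defensible as long as it is documented.

```python
from typing import Optional

DEFAULT_LIMIT = 20
MAX_LIMIT = 100  # assumed cap for this example

def resolve_limit(raw: Optional[str]) -> int:
    """Apply the documented default and cap to a limit query parameter."""
    if raw is None or raw == "":
        return DEFAULT_LIMIT
    try:
        value = int(raw)
    except ValueError:
        raise ValueError("limit must be an integer") from None
    if value < 1:
        raise ValueError("limit must be at least 1")
    return min(value, MAX_LIMIT)  # a request for a million rows gets the cap
```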
Versioning and backward compatibility
You will change the API. The only question is whether your changes break the people who depend on it. The discipline that keeps an API stable is knowing which changes are safe and which are not. Adding an optional field, adding a new endpoint, adding a new enum value the client can ignore, adding a new optional query parameter — these are additive and safe. Removing a field, renaming one, tightening validation, changing a type, changing the meaning of an existing value, or making an optional field required — these break callers and need a new version.
The corollary on the client side is just as important, and worth telling your consumers plainly: read tolerantly. A client that rejects a response because it contains a field it did not expect turns every additive, safe change you make into a breakage on their end. Ignore unknown fields, treat unknown enum values as a documented "other," and you can both keep moving. The mechanics of how to express versions — in the path, a header, or a date stamp — and how to run two versions side by side, are the whole subject of the versioning page; the practice that matters here is to make additive changes the default and reserve a version bump for the rare change that cannot be additive.
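What reading tolerantly means in client code, as a minimal sketch. The `Charge` type, the `parse_charge` function, and the set of known status values are all hypothetical; the point is only the two behaviours: unknown fields are ignored, and an unknown enum value maps to a documented "other".

```python
from dataclasses import dataclass

KNOWN_STATUSES = {"open", "closed", "refunded"}  # hypothetical enum values

@dataclass
class Charge:
    id: str
    amount: int
    status: str

def parse_charge(payload: dict) -> Charge:
    """Read only the fields this client knows; never fail on extras."""
    status = payload["status"]
    if status not in KNOWN_STATUSES:
        status = "other"  # a value added after this client shipped
    # Fields not named here are simply ignored.
    return Charge(id=payload["id"], amount=payload["amount"], status=status)
```

A strict parser that rejected the extra field or the new status would turn an additive server change into a client outage.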
OpenAPI from day one
Whatever framework or style you choose for an HTTP/JSON API, write an OpenAPI document for it from the start. Three things become possible:
- Generated SDKs. Tools like Speakeasy, Fern, and the open-source OpenAPI Generator produce typed clients in every major language without you writing a single SDK.
- Documentation that doesn't drift. The schema is the source of truth; the docs site reads from it. Adding an endpoint and forgetting to document it stops being a possible failure mode.
- Server validation. Many frameworks can read the OpenAPI document and reject invalid requests at the edge before they reach your handlers.
The deeper win is treating the document as the contract rather than as documentation generated after the fact. When the OpenAPI file is the source of truth, you can diff it in code review and see, mechanically, whether a change is additive or breaking — which turns the backward-compatibility rules above from a habit you hope people remember into a check a tool runs. Whether you write the spec first and generate handlers from it, or generate the spec from annotated code, the test is the same: if the spec and the running server ever disagree, that is a bug, and it should fail a build, not surprise a customer.
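To make "a check a tool runs" concrete, here is a deliberately tiny sketch of such a check. It compares two versions of a single object schema (the properties and required members, as they appear in an OpenAPI schema object) and applies three of the rules from the versioning section. `breaking_changes` is a hypothetical function; a real diff tool covers far more, including paths, parameters, enums, and the difference between request and response schemas.

```python
def breaking_changes(old: dict, new: dict) -> list[str]:
    """List removed fields, changed types, and newly required fields."""
    problems = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    for name, schema in old_props.items():
        if name not in new_props:
            problems.append(f"removed field: {name}")
        elif schema.get("type") != new_props[name].get("type"):
            problems.append(f"changed type: {name}")
    newly_required = set(new.get("required", [])) - set(old.get("required", []))
    for name in sorted(newly_required):
        problems.append(f"newly required: {name}")
    return problems
```

An empty list means the change is additive by these three rules; anything else is a candidate for failing the build.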
Security basics that are not optional
Two security rules cut across every endpoint, and skipping either is the kind of mistake that
ends up in an incident review. The first: authorise on every endpoint, every time, on the
server, against the authenticated principal. Authentication answers "who are you"; authorisation
answers "may you do this to this specific object." A staggering share of real breaches are not
broken crypto but a missing object-level check — an endpoint that confirms you are logged in but
never confirms that charge ch_001 belongs to you before returning it. The
defence is to make the ownership check part of the query, not an afterthought: fetch the object
scoped to the caller (WHERE id = ? AND account_id = ?) so an object you do not own
simply does not exist as far as you are concerned, and you return a clean 404.
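The scoped query is short enough to show whole. A sketch against an in-memory SQLite database, with a hypothetical charges table and `get_charge` handler; the caller's account id is assumed to come from the authenticated session, never from the request body.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE charges (id TEXT PRIMARY KEY, account_id TEXT, amount INTEGER)")
db.execute("INSERT INTO charges VALUES ('ch_001', 'acct_1', 500)")
db.commit()

def get_charge(caller_account_id: str, charge_id: str):
    """Fetch a charge scoped to the caller; anything else is a 404."""
    row = db.execute(
        "SELECT id, amount FROM charges WHERE id = ? AND account_id = ?",
        (charge_id, caller_account_id),
    ).fetchone()
    if row is None:
        # Same answer for "does not exist" and "belongs to someone else".
        return 404, {"title": "Not Found"}
    return 200, {"id": row[0], "amount": row[1]}
```

Because ownership is part of the WHERE clause, there is no code path that loads another account's object and then has to remember to reject it.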
The second: keep secrets out of URLs. API keys, tokens, and session ids do not belong in the
path or query string, because URLs are logged everywhere — in access logs, proxy logs, browser
history, the Referer header sent to third parties. Credentials go in the
Authorization header, over TLS, and nothing else. While you are at it: never reflect
a secret back in an error message, never put personally identifying data in a cacheable GET URL,
and apply input limits (max body size, max array length, max string length) so a single request
cannot exhaust memory. None of this is exotic. It is the floor, and the audience for these notes
is exactly the engineer who is expected to know it without being told.
A summary checklist
- Plural, lowercase, hyphenated resource paths; nouns in the path, verbs as HTTP methods.
- Correct method per intent; one status code per outcome, and never 200 on a failure.
- 4xx (other than 429) means do not retry as-is; 5xx and 429 mean retry with backoff.
- Cursor pagination, never offset for collections that grow.
- Errors as RFC 9457 problem details, with stable type URIs and no leaked internals.
- Rate-limit headers on every response; Retry-After on 429.
- Idempotency keys on every state-mutating POST that would do harm if duplicated.
- Request IDs (or W3C Trace Context) on every request and response.
- Standard filter/sort/fields parameters; document them once, reuse everywhere.
- Consistent casing, UTC timestamps, money as integer minor units; empty arrays, never null.
- OpenAPI document checked in, updated with every change, generating SDKs and docs.
- Additive changes by default; version bump only when a change cannot be additive (see versioning).
- Authorise per object on every endpoint; secrets in headers, never in URLs.
Further reading
- RFC 9457 — Problem Details for HTTP APIs
- Draft — RateLimit headers
- W3C — Trace Context
- Google API Improvement Proposals — particularly AIP-158 (pagination), AIP-160 (filtering), AIP-193 (errors).
- Stripe API reference — the de facto reference for clean HTTP API conventions.
- Microsoft REST API Guidelines — the most thorough public guidelines document.