REST
REST is a set of architectural constraints that, when applied to HTTP, give you a uniform interface for addressable resources. Most APIs that call themselves RESTful only follow some of the constraints. In day-to-day usage the term has drifted to mean "HTTP plus JSON, with URLs that look like nouns," which is a perfectly reasonable way to ship a product. This page walks both versions: the style as Roy Fielding defined it, and the working definition most engineers actually build against.
Where REST comes from
REST (Representational State Transfer) comes from Roy Fielding's 2000 dissertation, chapter 5. It describes six architectural constraints, five required and one optional. Strictly speaking, an API only earns the "REST" label if it meets all of the required ones:
- Client–server separation. The two evolve independently.
- Stateless. Each request carries everything the server needs to handle it. No session memory.
- Cacheable. Responses must declare whether they can be cached.
- Uniform interface. Resources are identified by URIs; representations carry their own metadata.
- Layered system. A client can't tell whether it's talking to the origin server or a CDN/proxy.
- Code on demand (optional). The server can ship runnable code to the client.
In practice, almost no production API satisfies all six. The word stuck anyway. When someone says "REST" today they usually mean: HTTP, JSON bodies, resource-shaped URLs, and verbs that match HTTP methods. That's the working definition this page uses.
The reason these constraints are worth knowing even when you ignore half of them is that each one buys a concrete property. Statelessness is what lets you put a load balancer in front of ten identical servers and route any request to any of them, because no server is holding a half-finished conversation in memory. Cacheability is what lets a CDN answer a read without ever touching your origin. The layered system is what lets you slip an API gateway, a proxy, or a cache into the path without rewriting either end. The uniform interface is the one that makes all of this composable: because every resource is reached the same way, the same generic machinery (caches, proxies, auth, logging) works against all of them. You are not buying into a religion when you follow REST. You are buying back a stack of infrastructure that already understands HTTP.
The uniform interface, drawn out
The constraint that gives REST its shape is the uniform interface, and it is worth slowing down on because the other constraints lean on it. The idea is that a client manipulates a resource through a representation of it, using a fixed, small set of operations that mean the same thing everywhere. A resource is the concept (the order, the user, the invoice); a representation is one concrete serialisation of that resource at a point in time (the JSON you get back, or the same order rendered as XML, or as a PDF). The resource is stable; representations are negotiable.
That split is what content negotiation runs on. The client sends an
Accept: application/json header to say which representation it wants, and the
server picks the best match it can produce and stamps it with Content-Type.
The same mechanism handles Accept-Language for localised text and
Accept-Encoding for gzip or brotli compression. Most APIs only ever speak JSON,
and that is fine, but the model is bigger than JSON: the URI names a thing, and the bytes
you get back are just one view of it.
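The negotiation step described above can be sketched in a few lines. This is a simplified illustration, not a full implementation of the HTTP rules: it honours q-values and wildcards but ignores the specificity ordering and media-type parameters. The function names `parse_accept` and `negotiate` are hypothetical.

```python
from __future__ import annotations


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        media_type, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # a missing q parameter means full preference
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs.append((media_type.lower(), q))
    # sorted() is stable, so entries with equal q keep the client's order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept_header: str, available: list[str]) -> str | None:
    """Return the first offered type the client accepts, or None (reply 406)."""
    for wanted, q in parse_accept(accept_header):
        if q == 0:
            continue  # q=0 explicitly rules a type out
        for offered in available:
            exact_or_any = wanted in ("*/*", offered)
            subtype_wildcard = wanted.endswith("/*") and offered.startswith(wanted[:-1])
            if exact_or_any or subtype_wildcard:
                return offered
    return None
```

The value returned by `negotiate` is what the server would stamp into Content-Type; a `None` result corresponds to a 406 Not Acceptable reply.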
The other half of the uniform interface is that the operations are fixed. You do not invent
a new verb per resource. You reuse the small HTTP method set, and that reuse is exactly what
lets a cache know that a GET is safe to store and a DELETE is not.
Self-descriptive messages round it out: each response carries enough metadata (status code,
content type, cache directives) that an intermediary can act on it without understanding
your application at all.
Resources, not actions
The single biggest design lever is to model your domain as resources (nouns) rather than actions (verbs). A resource is anything you can give a URI: a user, an order, an invoice line, a search result.
Verb-style (avoid):
POST /createUser
POST /getUser?id=42
POST /updateUserEmail
POST /deleteUser
Resource-style (prefer):
POST /users
GET /users/42
PATCH /users/42
DELETE /users/42
Once your domain is a graph of resources, idempotency, caching, and pagination fall out for free. Path templates compose. URL hierarchies survive contact with reality. Verb-style URLs grow combinatorially and force every client into RPC-like coupling.
The deeper reason this works is that there are only so many things you ever do to a piece of
data: create it, read it, replace it, change part of it, delete it, and list a collection of
it. HTTP already has a verb for each. When your URLs are nouns and your verbs are HTTP's, a
new endpoint is just a new noun, and the operations come for free. When your URLs are verbs,
you have to invent and document and version every operation by hand, and the surface area
grows with the product of resources and actions instead of the sum. A team that ships
/createUser, /getUser, and /disableUser will keep
adding paths forever; a team that ships /users and lets the method carry the
intent rarely adds a path at all.
Hierarchy is the other lever. Nesting expresses ownership: /users/42/orders
reads as "the orders belonging to user 42," and that relationship is legible without a
schema. Keep the nesting shallow, though. Two levels is usually plenty;
/users/42/orders/7/lines/3/notes is a sign the deep thing deserves its own
top-level collection (/notes/981) with a reference back up. Deep paths are
brittle: they bake a navigation route into the URL, and they break the moment the
relationship changes.
HTTP verbs at a glance
| Verb | Use for | Idempotent | Safe |
|---|---|---|---|
| GET | Read | Yes | Yes |
| HEAD | Read headers only | Yes | Yes |
| OPTIONS | Discover allowed verbs / CORS preflight | Yes | Yes |
| POST | Create / non-idempotent action | No | No |
| PUT | Replace whole resource | Yes | No |
| PATCH | Partial update | Sometimes | No |
| DELETE | Remove | Yes | No |
Idempotent means calling N times has the same effect as calling once.
Safe means the call doesn't change server state. These two distinctions
drive retry logic everywhere. Clients can safely retry idempotent calls on a network
hiccup, but must be careful with POST.
Those two words are doing more work than they look. Think about what happens when a request
times out: the client has no way to know whether the server processed it. The TCP
connection dropped after the request went out but before the response came back, and from
the client's seat the two outcomes (server got it, server missed it) are indistinguishable.
If the call was a GET or a PUT or a DELETE, the safe
move is just to fire it again, because doing the operation twice lands in the same state as
doing it once. If it was a POST that creates an order or charges a card, a blind
retry can double-charge. This is why "is this verb idempotent" is not pedantry. It is the
single property that decides whether your retry layer is safe or dangerous.
One subtle case sits in the table: PATCH is "sometimes" idempotent. A patch that
sets a field to an absolute value (status = "shipped") is idempotent, because
applying it twice leaves the field at the same value. A patch expressed as a delta
(increment balance by 10) is not, because two applications add twenty. If you
design PATCH bodies as absolute assignments rather than relative operations,
you keep idempotency and you keep retries simple. It is one of those choices that costs
nothing at design time and saves an incident later.
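The difference between the two patch styles is easy to see by replaying each one. The sketch below uses plain dictionaries as stand-ins for a resource; `apply_absolute` and `apply_delta` are hypothetical helper names, not part of any framework.

```python
def apply_absolute(resource: dict, patch: dict) -> dict:
    """Absolute assignment: each key in the patch overwrites the field."""
    return {**resource, **patch}


def apply_delta(resource: dict, deltas: dict) -> dict:
    """Relative operation: each value in the patch is added to the field."""
    updated = dict(resource)
    for field, amount in deltas.items():
        updated[field] = updated.get(field, 0) + amount
    return updated


order = {"status": "pending", "balance": 100}

# Replaying an absolute patch lands in the same state as applying it once.
shipped_once = apply_absolute(order, {"status": "shipped"})
shipped_twice = apply_absolute(shipped_once, {"status": "shipped"})

# Replaying a delta patch applies the change a second time.
credited_once = apply_delta(order, {"balance": 10})
credited_twice = apply_delta(credited_once, {"balance": 10})
```

Here `shipped_once` and `shipped_twice` are equal, while `credited_twice` carries a balance of 120 rather than 110, which is exactly the double-apply a blind retry would cause.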
Status codes worth knowing
HTTP defines about sixty status codes. About fifteen of them carry their weight in practice. The rest are either historical, redirect-specific, or tied to long-deprecated protocols. The codes group by their first digit, and that grouping is the part to keep in your head: 1xx is informational and rarely reaches application code, 2xx means it worked, 3xx means look elsewhere, 4xx means the client got something wrong and should not retry unchanged, 5xx means the server got something wrong and a retry might help. A client that only understands those five buckets already handles most of what it will meet.
Within those buckets, a handful of specific codes earn their keep. Memorize these:
- 200 OK: Generic success.
- 201 Created: Resource created. Include a Location header pointing at it.
- 202 Accepted: Work queued. Reply with a polling URL.
- 204 No Content: Success, nothing to return (DELETE, idempotent PUT).
- 301 / 308: Permanent redirect. 308 preserves the verb.
- 302 / 307: Temporary redirect. 307 preserves the verb.
- 400 Bad Request: Malformed request.
- 401 Unauthorized: Not authenticated. Include WWW-Authenticate.
- 403 Forbidden: Authenticated, not authorized.
- 404 Not Found: Resource doesn't exist (or you don't have permission to know).
- 409 Conflict: Write conflicts. Optimistic concurrency, version mismatch.
- 410 Gone: Permanently removed. Tells caches to drop it.
- 422 Unprocessable Entity: Well-formed JSON, semantically invalid (failed validation).
- 429 Too Many Requests: Rate limited. Include Retry-After.
- 500 Internal Server Error: Your fault. Don't ever return 5xx for input the client can fix.
- 503 Service Unavailable: Temporary; client should back off and retry.
The original sin: 200 OK with {"error":"..."} in the body.
Tooling, intermediaries, retries, and clients all assume the status code is the truth.
An API that returns 200 for everything cannot be cached, cannot be retried sensibly, and
cannot be load-balanced by health checks. Use real codes; put structured detail in the
body.
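Because the status code is the truth, a client can decide whether to retry from the method and the status alone. The sketch below is one conservative policy under stated assumptions, not a standard: it treats a timeout as "outcome unknown", assumes a 429 means the request was rejected before processing, and leaves PATCH out of the replay-safe set because its idempotency depends on the body. The names `status_class` and `should_retry` are hypothetical.

```python
from __future__ import annotations

# Methods whose repeated application lands in the same state (see the verb table).
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "PUT", "DELETE"}


def status_class(status: int) -> str:
    """Map a status code to its first-digit bucket."""
    buckets = {
        1: "informational",
        2: "success",
        3: "redirect",
        4: "client_error",
        5: "server_error",
    }
    return buckets.get(status // 100, "unknown")


def should_retry(method: str, status: int | None, has_idempotency_key: bool = False) -> bool:
    """Decide whether replaying the request is safe and worthwhile.

    status is None when the request timed out and no response arrived.
    """
    replay_safe = method.upper() in IDEMPOTENT_METHODS or has_idempotency_key
    if status is None:
        return replay_safe  # outcome unknown: only replay if a duplicate is harmless
    if status == 429:
        return True  # assumed rejected before processing; wait, then retry
    if status_class(status) == "server_error":
        return replay_safe  # the server may have done part of the work
    return False  # 2xx/3xx need no retry; other 4xx will fail the same way again
```

An API that returns 200 for every outcome gives this function nothing to work with, which is the practical cost of hiding errors in the body.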
Idempotency keys on POST
POST isn't idempotent — but you can make it idempotent with an
Idempotency-Key header. Stripe popularised the pattern. The server stores
the result of the first request keyed by that header; subsequent retries return the
cached response.
POST /v1/payments
Authorization: Bearer sk_live_...
Idempotency-Key: 8d2b1c0a-7e4f-4c12-9f6e-...
Content-Type: application/json

{"amount": 4200, "currency": "usd", "source": "tok_..."}

Server-side, the typical implementation is:
# pseudocode
key = request.header['Idempotency-Key']
hash = sha256(canonical(request.body))
if rec := storage.get(key):
    if rec.body_hash != hash:
        return 409              # same key, different body: client bug
    return rec.cached_response  # replay
response = handle(request)
storage.put(key, body_hash=hash, response=response, ttl=24h)
return response

Without idempotency keys, network retries on payments either lose transactions or charge twice. Build this in from day one for any state-mutating endpoint with money, storage, or external systems on the line.
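The pseudocode above can be made concrete with an in-memory store. This is a minimal sketch, not a production design: `IdempotencyStore`, `canonical_hash`, and `handle_with_idempotency` are hypothetical names, the dictionary stands in for a shared database or cache, and nothing here guards against two requests with the same key racing each other (a real store would need an atomic "insert if absent").

```python
import hashlib
import json
import time


class IdempotencyStore:
    """In-memory stand-in for a shared store with per-record expiry."""

    def __init__(self, ttl_seconds: float = 24 * 3600):
        self._records = {}
        self._ttl = ttl_seconds

    def get(self, key):
        record = self._records.get(key)
        if record is None:
            return None
        if record["expires_at"] <= time.monotonic():
            del self._records[key]  # expired: behave as if never seen
            return None
        return record

    def put(self, key, body_hash, response):
        self._records[key] = {
            "body_hash": body_hash,
            "response": response,
            "expires_at": time.monotonic() + self._ttl,
        }


def canonical_hash(body: dict) -> str:
    """Hash a JSON body so that key order and whitespace do not matter."""
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


def handle_with_idempotency(store, key, body, handler):
    """Run handler(body) at most once per key; replay the stored response after that."""
    body_hash = canonical_hash(body)
    record = store.get(key)
    if record is not None:
        if record["body_hash"] != body_hash:
            # Same key, different body: almost certainly a client bug.
            return 409, {"title": "Idempotency key reused with a different body"}
        return record["response"]  # replay without re-running the handler
    response = handler(body)
    store.put(key, body_hash, response)
    return response
```

A retried request with the same key and body gets the first response back and the handler runs only once, which is the property that makes a payment POST safe to replay.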
Conditional requests with ETag and If-Match
Optimistic concurrency without a database transaction. The server stamps every resource
with an ETag — usually a hash of the contents, or a version number.
The client echoes it back on writes via If-Match; the server rejects with
412 Precondition Failed if the resource has moved on since the read.
# read
GET /orders/42
→ 200 OK
ETag: "v17"
{...order...}
# write
PATCH /orders/42
If-Match: "v17"
{"status": "shipped"}
→ 200 OK if ETag still v17
→ 412 Precondition Failed if someone else wrote v18 first
ETags also do double duty for caching. If-None-Match on a GET
lets the server reply 304 Not Modified without re-sending the body. CDNs
use this; reverse proxies use this; well-built clients use this. It's free bandwidth.
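Both uses of the ETag can be shown against a small in-memory store. The sketch below is illustrative only: `OrderStore` and `etag_for` are hypothetical names, the tag is a truncated content hash, and the comparison handles a single tag, not the `*` value or comma-separated lists that the real headers also allow.

```python
import hashlib
import json


def etag_for(resource: dict) -> str:
    """Derive a quoted ETag from the resource contents."""
    digest = hashlib.sha256(json.dumps(resource, sort_keys=True).encode()).hexdigest()
    return '"%s"' % digest[:16]


class OrderStore:
    """Returns (status, etag, body) tuples shaped like HTTP responses."""

    def __init__(self):
        self._orders = {}

    def create(self, order_id, order):
        self._orders[order_id] = dict(order)

    def get(self, order_id, if_none_match=None):
        order = self._orders[order_id]
        tag = etag_for(order)
        if if_none_match == tag:
            return 304, tag, None  # client copy is current: send no body
        return 200, tag, dict(order)

    def patch(self, order_id, changes, if_match):
        order = self._orders[order_id]
        current = etag_for(order)
        if if_match != current:
            return 412, current, None  # someone else wrote first: reject
        order.update(changes)
        return 200, etag_for(order), dict(order)
```

A client that receives the 412 would typically re-read the resource, reapply its change to the fresh copy, and try again with the new tag.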
Versioning without breaking clients
Every API that lives long enough has to change in a way that would break existing clients, and the moment you have callers you do not control, you cannot just edit the response shape. There are three common places to put the version, and the choice is more about taste and tooling than correctness.
| Approach | Looks like | Trade-off |
|---|---|---|
| URI path | /v2/orders/42 | Obvious, cache-friendly, easy to route. Purists dislike that the URI for "order 42" now changes per version. |
| Header / media type | Accept: application/vnd.acme.v2+json | Keeps URIs stable and is the most "correct" by the spec, but is invisible in a browser and easy to forget. |
| Query parameter | /orders/42?version=2 | Simple, but pollutes caching and is easy to drop by accident. |
The pragmatic default for a public API is the version in the path. It is visible, it routes
cleanly through gateways and CDNs, and a developer can paste a URL into a browser and see
what they get. Whichever you pick, the discipline that actually matters is keeping changes
additive within a version. Adding a field is safe, because well-built
clients ignore fields they do not know. Removing a field, renaming one, changing a type, or
tightening validation is a breaking change and needs a new version. Date-based versions
(Stripe's 2023-10-16 style, pinned per account) are a clever middle path: the
URL stays the same, and the client opts into a snapshot of behaviour that never shifts under
it.
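The claim that adding a field is safe depends on clients being tolerant readers. One way a client might do that is sketched below; the `Order` shape and `parse_order` helper are hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass, fields


@dataclass
class Order:
    id: int
    status: str


def parse_order(payload: dict) -> Order:
    """Build an Order from a response body, ignoring fields this client does not know."""
    known = {f.name for f in fields(Order)}
    return Order(**{name: value for name, value in payload.items() if name in known})


# A newer server added "gift_wrap"; this older client keeps working.
order = parse_order({"id": 42, "status": "open", "gift_wrap": True})
```

The same parser fails loudly with a `TypeError` if `status` is removed or renamed, which is why those changes need a new version while additions do not.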
HATEOAS and the Richardson maturity model
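Leonard Richardson described a four-rung ladder that measures how much of REST an API actually uses. It is a useful map even though almost nobody climbs to the top, because each rung names a real design decision. Level 0 tunnels everything through a single endpoint; Level 1 introduces individual resources; Level 2 uses HTTP verbs and status codes as intended; Level 3 adds hypermedia controls.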
Level 3 is HATEOAS (Hypermedia As The Engine Of Application State), the constraint Fielding cared about most. The server tells the client what's possible next, via links in the response:
{
"id": 42,
"status": "draft",
"_links": {
"self": { "href": "/articles/42" },
"publish": { "href": "/articles/42/publish", "method": "POST" },
"delete": { "href": "/articles/42", "method": "DELETE" }
}
}
In theory this lets the server move endpoints freely; clients only follow links.
In practice almost nobody does this, because clients hard-code URL templates anyway.
HATEOAS shows up most cleanly in PayPal's API and in Spring HATEOAS — and arguably in
the way the web itself works (the browser doesn't hard-code URLs; it follows
<a> tags).
The pragmatic version: include links in responses where they help. Pagination cursors, related resources, action endpoints. Don't bet your client design on the server moving them.
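A client that uses links where they help might look them up by relation name instead of building the URL itself. The sketch below assumes the `_links` shape from the example response above, which is one convention among several; `link_for` and `available_actions` are hypothetical helper names.

```python
from __future__ import annotations


def link_for(resource: dict, rel: str) -> tuple[str, str] | None:
    """Return (method, href) for a link relation, or None if the server did not offer it."""
    link = resource.get("_links", {}).get(rel)
    if link is None:
        return None
    return link.get("method", "GET"), link["href"]  # assume GET when no method is given


def available_actions(resource: dict) -> set[str]:
    """List the relations the server offered, excluding the self link."""
    return set(resource.get("_links", {})) - {"self"}


article = {
    "id": 42,
    "status": "draft",
    "_links": {
        "self": {"href": "/articles/42"},
        "publish": {"href": "/articles/42/publish", "method": "POST"},
        "delete": {"href": "/articles/42", "method": "DELETE"},
    },
}
```

A missing relation doubles as a permission signal: if `publish` is absent, the client can hide the button without knowing why the server withheld it.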
Where REST stops being a good fit
- Mobile / over-fetching. A "user profile" screen ends up making five round-trips and discarding half of what each one returns. That's GraphQL's whole pitch.
- Streaming. Long-lived bidirectional streams don't fit into the request/response model. Use WebSockets, SSE, or gRPC.
- Internal microservices. Between services you control on both ends, gRPC's binary framing and strict schemas usually win on latency and CPU.
- Action-shaped operations. "Refund this charge" doesn't model cleanly as a resource. Many large APIs (Stripe, GitHub) just accept "actions as POSTs to nouns" and move on: POST /charges/{id}/refund.
- RPC with state. If your API is fundamentally a remote procedure call ("sum these numbers", "compile this code"), wrapping it in resources is ceremony. JSON-RPC or gRPC may fit better.
REST vs gRPC vs GraphQL
The three protocols you reach for most are not really competitors so much as answers to different pressures. REST optimises for reach and cacheability over plain HTTP. gRPC optimises for throughput and tight contracts between services you control. GraphQL optimises for clients that want to ask for exactly the data they need in one round trip. Picking well is mostly a matter of naming which pressure you actually have.
| | REST | gRPC | GraphQL |
|---|---|---|---|
| Transport | HTTP/1.1 or 2, text JSON | HTTP/2, binary Protobuf | Usually HTTP, single POST endpoint |
| Contract | OpenAPI (optional) | Protobuf schema (required) | GraphQL schema (required) |
| Fetching | Fixed per endpoint; can over/under-fetch | Fixed per method | Client picks fields exactly |
| HTTP caching | Built in, free | None (it's all POST-like) | Hard (one POST URL) |
| Streaming | Weak (SSE bolt-on) | First-class, bidirectional | Subscriptions, awkward |
| Best at | Public APIs, browser clients, edge caching | Internal service-to-service, low latency | Aggregating many sources for a UI |
The honest rule of thumb: expose REST at the edge where the open web, browsers, and third parties live and where HTTP caching pays off; use gRPC behind the gateway between your own services where you control both ends and care about latency and CPU; reach for GraphQL when a single screen needs to stitch together data from several backends and you are tired of either making five REST calls or shipping a bespoke aggregate endpoint per screen. Plenty of real systems run all three, with a GraphQL or REST gateway out front fanning out to gRPC services behind it.
The REST-versus-GraphQL decision is the one most teams actually agonise over, because both can sit at the edge. The deciding question is usually who controls the clients and how varied their data needs are. The REST vs GraphQL comparison walks the full trade space, including the operational costs (query depth limits, caching, N+1 resolvers) that the headline pitch tends to skip.
Pagination — cursor vs offset
Offset pagination (?page=12&size=25) is the obvious approach
and almost always the wrong one at scale. The query
SELECT ... LIMIT 25 OFFSET 5000 still has to scan 5000 rows before
it returns anything; latency grows linearly with page number. Worse, an insert
between requests shifts every page boundary, so the same row can appear in two
pages or none.
Cursor pagination encodes the position in an opaque token — usually base64 of
(sort key, primary key) — and the client passes it back as ?after=eyJp....
The server translates that into a WHERE (created_at, id) > (?, ?)
predicate that uses the same index every other query uses, with constant cost
regardless of page depth.
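The cursor mechanics can be sketched without a database. In the example below a sorted in-memory list and a binary search stand in for the index and the `WHERE (created_at, id) > (?, ?)` predicate; the function names and the JSON-in-base64 cursor layout are assumptions for illustration, and rebuilding the key list on every call is a shortcut a real index would not need.

```python
from __future__ import annotations

import base64
import json
from bisect import bisect_right


def encode_cursor(created_at: int, row_id: int) -> str:
    """Pack the (sort key, primary key) position into an opaque token."""
    raw = json.dumps([created_at, row_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> tuple:
    return tuple(json.loads(base64.urlsafe_b64decode(cursor)))


def page(rows: list[dict], limit: int, after: str | None = None) -> dict:
    """Return one page of rows; rows must be sorted by (created_at, id) ascending."""
    keys = [(row["created_at"], row["id"]) for row in rows]
    # Seek straight to the first row strictly greater than the cursor position.
    start = bisect_right(keys, decode_cursor(after)) if after else 0
    chunk = rows[start:start + limit]
    has_more = start + limit < len(rows)
    next_cursor = None
    if chunk and has_more:
        last = chunk[-1]
        next_cursor = encode_cursor(last["created_at"], last["id"])
    return {"data": chunk, "has_more": has_more, "next_cursor": next_cursor}
```

Because the cursor names a position in the sort order and not a row count, a row inserted earlier in the list between two requests does not shift the next page.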
| Choice | Latency at page N | Stable under writes? | Can jump to arbitrary page? |
|---|---|---|---|
| Offset / limit | O(N) | No | Yes |
| Cursor (keyset) | O(1) per page | Yes | No — only next/prev |
| Time-based cursor | O(1) per page | Yes | By time, not by row |
| Snapshot tokens (with a revision id) | O(1) | Yes, isolated | No — fixed snapshot |
The right default is cursor pagination. Use offset only when the use case explicitly needs "jump to page 47" — e.g. an admin UI for a small table. Twitter, Slack, Stripe, GitHub all expose cursor APIs. The standard envelope shape:
GET /v1/messages?limit=20
200 OK
{
"data": [ ... ],
"has_more": true,
"next_cursor": "Y3Vyc29yOnYxOjE2ODY3MjA="
}
Caching headers, the part most APIs skip
HTTP has a sophisticated caching model that almost no API actually uses. That's
a missed performance win: a single Cache-Control header can turn
a request that does ~50 ms of database work into a cache hit that never reaches
your origin, and an ETag can turn a repeat read into a 304 Not Modified with no body.
| Header | What it does | When to set |
|---|---|---|
| Cache-Control: public, max-age=300 | Allow shared caches to serve this response for 5 minutes. | List endpoints, public reference data. |
| Cache-Control: private, no-store | Per-user data; never cache. | Account-scoped resources, sensitive data. |
| Cache-Control: max-age=0, must-revalidate | Force conditional revalidation on every request. | Mostly-static but mutable resources (product catalog). |
| ETag: "v3-abc12..." | Strong revision tag; client sends back as If-None-Match. | All GETs of significant resources. |
| Last-Modified + If-Modified-Since | Weaker, time-based variant of ETag. | When you don't have a content hash. |
| Vary: Authorization, Accept-Language | Tell caches that the response differs by these headers. | Whenever the response varies on something other than URL. |
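What the Vary row means for a shared cache can be sketched as a cache-key function. This is a simplified model of how a cache might key its entries, not the behaviour of any particular CDN; `cache_key` is a hypothetical name and the tokens are placeholders.

```python
def cache_key(method: str, url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key from the URL plus the value of every header named in Vary."""
    lowered = {name.lower(): value for name, value in request_headers.items()}
    varied = tuple(
        (name.strip().lower(), lowered.get(name.strip().lower(), ""))
        for name in vary.split(",")
        if name.strip()
    )
    return (method.upper(), url, varied)


alice = {"Authorization": "Bearer token-for-alice"}
bob = {"Authorization": "Bearer token-for-bob"}

# Without Vary, both users map to the same cache entry.
shared = cache_key("GET", "/me", alice, vary="") == cache_key("GET", "/me", bob, vary="")

# With Vary: Authorization, each user gets a separate entry.
separate = cache_key("GET", "/me", alice, vary="Authorization") != cache_key(
    "GET", "/me", bob, vary="Authorization"
)
```

In this model `shared` is true, meaning the two users collide on one entry, and `separate` is true once the header is listed in Vary.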
Vary trap. If you serve different bodies based
on Authorization and forget Vary: Authorization, a CDN
or shared cache will serve user A's response to user B. Several real-world
incidents have been traced to this. Whenever a header influences the response,
it goes in Vary, or use private + no-store
to opt out entirely.
Rate limits, Retry-After, and the 429
Production APIs need rate limits. The convention is to return
429 Too Many Requests when a client exceeds its budget, along with
these headers that tell the client what to do:
429 Too Many Requests
Retry-After: 12
RateLimit-Limit: 1000
RateLimit-Remaining: 0
RateLimit-Reset: 1700000000
Retry-After is the only one most clients respect by default.
Anywhere you can name the budget exactly (e.g. "1000 req/min, you've spent
1000, the bucket resets at this unix time"), spend the bytes — it lets the
client back off intelligently instead of polling tighter and tighter.
The standardisation effort here is the IETF RateLimit-* header
draft. Until it lands, the de-facto pattern is X-RateLimit-Limit,
X-RateLimit-Remaining, X-RateLimit-Reset. Use both
forms — clients in the wild expect either.
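On the client side, honouring Retry-After mostly means parsing it, since the header may carry either a number of seconds or an HTTP date. The sketch below shows one way a client might compute its wait; `retry_delay` and its one-second default are assumptions, not part of any library.

```python
from __future__ import annotations

from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay(headers: dict, now: datetime | None = None, default: float = 1.0) -> float:
    """Return how many seconds to wait before retrying, based on Retry-After."""
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("retry-after")
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form, e.g. "12"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparsable header: fall back rather than fail
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A caller would sleep for the returned number of seconds before resending, ideally with some jitter added so that many throttled clients do not all return at the same instant.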
Error responses — RFC 9457 problem+json
The single biggest interoperability win on REST in the last decade is the problem-details format (RFC 9457, formerly 7807). Every error has the same shape, carries a stable machine-readable type, and is friendly to both humans and SDK code:
HTTP/1.1 422 Unprocessable Content
Content-Type: application/problem+json
{
"type": "https://api.example.com/errors/insufficient-balance",
"title": "Insufficient balance",
"status": 422,
"detail": "Account 12345 has 0.50 USD; the requested charge is 1.00 USD.",
"instance": "/v1/payments/pi_abc123",
"balance_usd": 0.50,
"minimum_required_usd": 1.00
}
The type is a URL that uniquely identifies the error kind; clients
switch on it like an enum. Any number of additional fields can be added
(balance_usd, etc.) to give the application enough context to act.
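Stripe, GitHub, and most modern public APIs return structured, machine-readable error bodies in this spirit, though not all of them use the problem+json media type itself.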
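Both halves of the pattern, building the body on the server and switching on `type` in the client, fit in a short sketch. The helper names `problem` and `client_action`, the action strings, and the example.com error URL are placeholders; the member names and the `about:blank` default come from RFC 9457.

```python
INSUFFICIENT_BALANCE = "https://api.example.com/errors/insufficient-balance"


def problem(type_url, title, status, detail=None, instance=None, **extensions):
    """Build a (status, headers, body) triple in the problem-details shape."""
    body = dict(extensions)  # extension members first, so they cannot clobber the standard ones
    body.update({"type": type_url, "title": title, "status": status})
    if detail is not None:
        body["detail"] = detail
    if instance is not None:
        body["instance"] = instance
    return status, {"Content-Type": "application/problem+json"}, body


def client_action(problem_body: dict) -> str:
    """Pick a client reaction from the stable type URL, never from the message text."""
    handlers = {INSUFFICIENT_BALANCE: "prompt_top_up"}
    # RFC 9457 treats a missing type as "about:blank".
    return handlers.get(problem_body.get("type", "about:blank"), "show_generic_error")


status, headers, body = problem(
    INSUFFICIENT_BALANCE,
    "Insufficient balance",
    422,
    detail="Account 12345 has 0.50 USD; the requested charge is 1.00 USD.",
    balance_usd=0.50,
    minimum_required_usd=1.00,
)
```

Unknown types fall through to a generic handler, so the server can introduce new error kinds without breaking older clients.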
type matters. Error messages get
rewritten as the product evolves; type URLs don't. Code that does
if err.type === '.../insufficient-balance' works across years of
copy edits. Code that does if err.message.includes('insufficient')
breaks the moment someone tweaks the phrasing.
A pragmatic checklist
- ✅ Use plural nouns: /users, not /user.
- ✅ Lowercase, hyphenated paths. /order-line-items.
- ✅ Filter via query string: GET /orders?status=open&limit=50.
- ✅ Wrap collections: { "data": [...], "next_cursor": "..." }.
- ✅ Use If-Match + ETag for optimistic concurrency.
- ✅ Always include Content-Type and a stable X-Request-Id.
- ✅ Document with OpenAPI from day one; generate clients, don't write them.
- ✅ Idempotency keys on every state-mutating POST.
- ✅ Real status codes. RFC 9457 application/problem+json error envelopes.
- 🚫 Don't return 200 with {"error":"..."}.
- 🚫 Don't tunnel everything through POST.
- 🚫 Don't bake auth tokens into URLs (they end up in logs).
- 🚫 Don't reuse paths for different resource types: once /users/42 is a user, it stays a user forever.
Further reading
- Roy Fielding: Architectural Styles and the Design of Network-based Software Architectures (Ch. 5). The dissertation. Read chapter 5 for the constraints, then chapters 6–7 for what they cost.
- RFC 9110: HTTP Semantics. The unified normative HTTP semantics. §9 (methods) and §15 (status codes) are the canonical reference.
- Google API Improvement Proposals. AIP-122 (resource names), AIP-132 (standard methods), AIP-158 (pagination), AIP-200 (errors). Read these before designing your own conventions.
- Stripe: Idempotency keys. The reference implementation of POST idempotency. Eight paragraphs; saves you from a class of production incidents.
- RFC 9457: Problem Details for HTTP APIs. The right error envelope. Every API should ship this shape and a stable error code.