REST

REST is a set of architectural constraints that, when applied to HTTP, give you a uniform interface for addressable resources. Most APIs that call themselves RESTful only follow some of the constraints. In day-to-day usage the term has drifted to mean "HTTP plus JSON, with URLs that look like nouns," which is a perfectly reasonable way to ship a product. This page walks both versions: the style as Roy Fielding defined it, and the working definition every engineer actually builds against.

Where REST comes from

REST (Representational State Transfer) comes from chapter 5 of Roy Fielding's 2000 dissertation. It describes six architectural constraints, five required and one optional. Strictly speaking, an API only earns the "REST" label if it meets all of the required ones:

Client–server separation. The two evolve independently.
Stateless. Each request carries everything the server needs to handle it. No session memory.
Cacheable. Responses must declare whether they can be cached.
Uniform interface. Resources are identified by URIs; representations carry their own metadata.
Layered system. A client can't tell whether it's talking to the origin server or a CDN/proxy.
Code on demand (optional). The server can ship runnable code to the client.

In practice, almost no production API satisfies all six. The word stuck anyway. When someone says "REST" today they usually mean: HTTP, JSON bodies, resource-shaped URLs, and verbs that match HTTP methods. That's the working definition this page uses.

The reason these constraints are worth knowing even when you ignore half of them is that each one buys a concrete property. Statelessness is what lets you put a load balancer in front of ten identical servers and route any request to any of them, because no server is holding a half-finished conversation in memory. Cacheability is what lets a CDN answer a read without ever touching your origin. The layered system is what lets you slip an API gateway, a proxy, or a cache into the path without rewriting either end. The uniform interface is the one that makes all of this composable: because every resource is reached the same way, the same generic machinery (caches, proxies, auth, logging) works against all of them. You are not buying into a religion when you follow REST. You are buying back a stack of infrastructure that already understands HTTP.

The uniform interface, drawn out

The constraint that gives REST its shape is the uniform interface, and it is worth slowing down on because the other constraints lean on it. The idea is that a client manipulates a resource through a representation of it, using a fixed, small set of operations that mean the same thing everywhere. A resource is the concept (the order, the user, the invoice); a representation is one concrete serialisation of that resource at a point in time (the JSON you get back, or the same order rendered as XML, or as a PDF). The resource is stable; representations are negotiable.

Content negotiation: the client asks for a media type with Accept, the server replies with one representation and labels it in Content-Type.

That split is what content negotiation runs on. The client sends an Accept: application/json header to say which representation it wants, and the server picks the best match it can produce and stamps it with Content-Type. The same mechanism handles Accept-Language for localised text and Accept-Encoding for gzip or brotli compression. Most APIs only ever speak JSON, and that is fine, but the model is bigger than JSON: the URI names a thing, and the bytes you get back are just one view of it.
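As a rough illustration of the server side of that exchange, the sketch below picks a representation from an Accept header. The function names (parse_accept, negotiate) are made up for this example, and the matching is deliberately simplified: it honours q-values and wildcards but ignores the specificity rules and media-type parameters a full implementation would handle.

```python
from typing import Optional


def parse_accept(header: str) -> list:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        media_type, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # a missing q parameter means full preference
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type.lower(), q))
    # sorted() is stable, so entries with equal q keep the client's order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def negotiate(accept_header: str, available: list) -> Optional[str]:
    """Return the Content-Type to serve, or None (a server would answer 406)."""
    for wanted, q in parse_accept(accept_header or "*/*"):
        if q <= 0:
            continue  # q=0 means "explicitly not acceptable"
        for offered in available:
            if wanted in ("*/*", offered) or (
                wanted.endswith("/*") and offered.startswith(wanted[:-1])
            ):
                return offered
    return None
```

A server that can produce both JSON and XML would call negotiate(request_accept, ["application/json", "application/xml"]) and stamp the result into Content-Type.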

The other half of the uniform interface is that the operations are fixed. You do not invent a new verb per resource. You reuse the small HTTP method set, and that reuse is exactly what lets a cache know that a GET is safe to store and a DELETE is not. Self-descriptive messages round it out: each response carries enough metadata (status code, content type, cache directives) that an intermediary can act on it without understanding your application at all.

Resources, not actions

The single biggest design lever is to model your domain as resources (nouns) rather than actions (verbs). A resource is anything you can give a URI: a user, an order, an invoice line, a search result.

Verb-style — avoid

POST /createUser
POST /getUser?id=42
POST /updateUserEmail
POST /deleteUser

Resource-style — prefer

POST   /users
GET    /users/42
PATCH  /users/42
DELETE /users/42

Once your domain is a graph of resources, idempotency, caching, and pagination fall out for free. Path templates compose. URL hierarchies survive contact with reality. Verb-style URLs grow combinatorially and force every client into RPC-like coupling.

The deeper reason this works is that there are only so many things you ever do to a piece of data: create it, read it, replace it, change part of it, delete it, and list a collection of it. HTTP already has a verb for each. When your URLs are nouns and your verbs are HTTP's, a new endpoint is just a new noun, and the operations come for free. When your URLs are verbs, you have to invent and document and version every operation by hand, and the surface area grows with the product of resources and actions instead of the sum. A team that ships /createUser, /getUser, and /disableUser will keep adding paths forever; a team that ships /users and lets the method carry the intent rarely adds a path at all.

Hierarchy is the other lever. Nesting expresses ownership: /users/42/orders reads as "the orders belonging to user 42," and that relationship is legible without a schema. Keep the nesting shallow, though. Two levels is usually plenty; /users/42/orders/7/lines/3/notes is a sign the deep thing deserves its own top-level collection (/notes/981) with a reference back up. Deep paths are brittle: they bake a navigation route into the URL, and they break the moment the relationship changes.
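To make "the method carries the intent" concrete, here is a toy router, not modelled on any particular framework. The route decorator, dispatch function, and in-memory USERS store are all invented for the illustration. One noun path serves several operations, and an unsupported method on a known path gets 405 rather than a new URL.

```python
import re

ROUTES = []  # (method, compiled path pattern, handler)
USERS = {}   # in-memory stand-in for a database table


def route(method, template):
    """Register a handler for METHOD plus a path template like /users/{user_id}."""
    regex = "^" + re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template) + "$"
    pattern = re.compile(regex)

    def register(handler):
        ROUTES.append((method, pattern, handler))
        return handler

    return register


@route("POST", "/users")
def create_user(body):
    user_id = str(len(USERS) + 1)
    USERS[user_id] = dict(body)
    return 201, {"id": user_id, **USERS[user_id]}


@route("GET", "/users/{user_id}")
def read_user(user_id, body=None):
    return (200, USERS[user_id]) if user_id in USERS else (404, None)


@route("PATCH", "/users/{user_id}")
def update_user(user_id, body):
    if user_id not in USERS:
        return 404, None
    USERS[user_id].update(body)
    return 200, USERS[user_id]


@route("DELETE", "/users/{user_id}")
def delete_user(user_id, body=None):
    USERS.pop(user_id, None)
    return 204, None


def dispatch(method, path, body=None):
    """Find the handler for (method, path); 405 if only the path matches."""
    path_known = False
    for registered_method, pattern, handler in ROUTES:
        match = pattern.match(path)
        if not match:
            continue
        path_known = True
        if registered_method == method:
            return handler(body=body, **match.groupdict())
    return (405 if path_known else 404), None
```

Adding a second resource under this scheme means registering the same handful of methods against a new noun, not inventing new verbs.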

HTTP verbs at a glance

Verb	Use for	Idempotent	Safe
`GET`	Read	Yes	Yes
`HEAD`	Read headers only	Yes	Yes
`OPTIONS`	Discover allowed verbs / CORS preflight	Yes	Yes
`POST`	Create / non-idempotent action	No	No
`PUT`	Replace whole resource	Yes	No
`PATCH`	Partial update	Sometimes	No
`DELETE`	Remove	Yes	No

Idempotent means calling N times has the same effect as calling once. Safe means the call doesn't change server state. These two distinctions drive retry logic everywhere. Clients can safely retry idempotent calls on a network hiccup, but must be careful with POST.

Those two words are doing more work than they look. Think about what happens when a request times out: the client has no way to know whether the server processed it. The TCP connection dropped after the request went out but before the response came back, and from the client's seat the two outcomes (server got it, server missed it) are indistinguishable. If the call was a GET or a PUT or a DELETE, the safe move is just to fire it again, because doing the operation twice lands in the same state as doing it once. If it was a POST that creates an order or charges a card, a blind retry can double-charge. This is why "is this verb idempotent" is not pedantry. It is the single property that decides whether your retry layer is safe or dangerous.

Why the verb's idempotency decides retry safety. A timed-out PUT can be replayed blindly; a POST needs help.
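A retry layer built on that rule might look like the following sketch. The Timeout exception and the send callable are placeholders for whatever your HTTP client provides. PATCH is treated conservatively as non-idempotent, and a POST is only replayed when the caller attached an Idempotency-Key header.

```python
from typing import Optional

IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "PUT", "DELETE"})


class Timeout(Exception):
    """Stand-in for a network timeout: the outcome on the server is unknown."""


def can_retry_blindly(method: str, headers: Optional[dict] = None) -> bool:
    """True when replaying the request cannot change the final server state."""
    if method.upper() in IDEMPOTENT_METHODS:
        return True
    # Non-idempotent methods are only replayable with an idempotency key.
    return "Idempotency-Key" in (headers or {})


def send_with_retry(send, method, url, headers=None, attempts=3):
    """Call send(method, url, headers), retrying timeouts only when it is safe."""
    for attempt in range(1, attempts + 1):
        try:
            return send(method, url, headers or {})
        except Timeout:
            if attempt == attempts or not can_retry_blindly(method, headers):
                raise
```

A production version would add backoff between attempts; it is left out here to keep the idempotency decision visible.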

One subtle case sits in the table: PATCH is "sometimes" idempotent. A patch that sets a field to an absolute value (status = "shipped") is idempotent, because applying it twice leaves the field at the same value. A patch expressed as a delta (increment balance by 10) is not, because two applications add twenty. If you design PATCH bodies as absolute assignments rather than relative operations, you keep idempotency and you keep retries simple. It is one of those choices that costs nothing at design time and saves an incident later.
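The difference is easy to see with two toy patch functions. The names apply_set and apply_increment are illustrative only, and the resource is a plain dict.

```python
def apply_set(resource: dict, patch: dict) -> dict:
    """Absolute assignment: each key in the patch overwrites the field."""
    return {**resource, **patch}


def apply_increment(resource: dict, deltas: dict) -> dict:
    """Relative operation: each value in the patch is added to the field."""
    updated = dict(resource)
    for field, delta in deltas.items():
        updated[field] = updated.get(field, 0) + delta
    return updated


order = {"status": "paid", "balance": 100}

# Replaying an absolute patch lands in the same state as applying it once.
shipped_once = apply_set(order, {"status": "shipped"})
shipped_twice = apply_set(shipped_once, {"status": "shipped"})

# Replaying a delta patch does not: the second application adds again.
credited_once = apply_increment(order, {"balance": 10})
credited_twice = apply_increment(credited_once, {"balance": 10})
```

Here shipped_once and shipped_twice are equal, while credited_once and credited_twice differ by 10, which is exactly what a blind retry would do to a real balance.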

Status codes worth knowing

HTTP defines about sixty status codes. About fifteen of them carry their weight in practice. The rest are either historical, redirect-specific, or tied to long-deprecated protocols. The codes group by their first digit, and that grouping is the part to keep in your head: 1xx means hold on, more is coming (application code rarely sees these), 2xx means it worked, 3xx means look elsewhere, 4xx means the client got something wrong and should not retry unchanged, 5xx means the server got something wrong and a retry might help. A client that only understands those five buckets already handles most of what it will meet.

One request, one response, and the five buckets every client should understand. The status code is the truth; the body is the detail.
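A client can encode the bucket rule in a few lines. This is a minimal sketch with invented function names; treating 429 as retryable (after waiting) is an addition based on the rate-limiting convention rather than on the bucket rule itself.

```python
BUCKETS = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client_error",
    5: "server_error",
}


def classify(status: int) -> str:
    """Map a status code to its bucket using only the first digit."""
    if not 100 <= status <= 599:
        raise ValueError(f"not an HTTP status code: {status}")
    return BUCKETS[status // 100]


def should_retry(status: int) -> bool:
    """5xx may succeed on retry; 4xx should not be resent unchanged.

    429 is the exception: the request was fine, the client was just too fast.
    """
    return classify(status) == "server_error" or status == 429
```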

Within those buckets, a handful of specific codes earn their keep. Memorize these:

200 OK: Generic success.
201 Created: Resource created. Include a Location header pointing at it.
202 Accepted: Work queued. Reply with a polling URL.
204 No Content: Success, nothing to return (DELETE, or a PUT with no body worth echoing).
301 / 308: Permanent redirect. 308 preserves the verb.
302 / 307: Temporary redirect. 307 preserves the verb.
400 Bad Request: Malformed request.
401 Unauthorized: Not authenticated. Include WWW-Authenticate.
403 Forbidden: Authenticated, not authorized.
404 Not Found: Resource doesn't exist (or you don't have permission to know).
409 Conflict: Write conflicts. Optimistic concurrency, version mismatch.
410 Gone: Permanently removed. Tells caches to drop it.
422 Unprocessable Content (formerly Unprocessable Entity): Well-formed JSON, semantically invalid (failed validation).
429 Too Many Requests: Rate limited. Include Retry-After.
500 Internal Server Error: Your fault. Don't ever return 5xx for input the client can fix.
503 Service Unavailable: Temporary; client should back off and retry.

The original sin: 200 OK with {"error":"..."} in the body. Tooling, intermediaries, retries, and clients all assume the status code is the truth. An API that returns 200 for everything cannot be cached, cannot be retried sensibly, and cannot be load-balanced by health checks. Use real codes; put structured detail in the body.

Idempotency keys on POST

POST isn't idempotent — but you can make it idempotent with an Idempotency-Key header. Stripe popularised the pattern. The server stores the result of the first request keyed by that header; subsequent retries return the cached response.

POST /v1/payments
Authorization: Bearer sk_live_...
Idempotency-Key: 8d2b1c0a-7e4f-4c12-9f6e-...
Content-Type: application/json

{"amount": 4200, "currency": "usd", "source": "tok_..."}

Server-side, the typical implementation is:

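# Python sketch. `request`, `storage`, and `handle` are placeholders for your framework's
# request object, a key-value store with get/put, and the real business logic.
import hashlib
import json

IDEMPOTENCY_TTL_SECONDS = 24 * 60 * 60

def handle_idempotent(request, storage, handle):
    key = request.headers["Idempotency-Key"]
    # Hash a canonical form of the body so key reuse with different data is detectable.
    canonical = json.dumps(request.json, sort_keys=True, separators=(",", ":"))
    body_hash = hashlib.sha256(canonical.encode()).hexdigest()

    rec = storage.get(key)
    if rec is not None:
        if rec["body_hash"] != body_hash:
            return 409  # same key, different body: client bug
        return rec["response"]  # replay the stored response

    response = handle(request)
    storage.put(key, {"body_hash": body_hash, "response": response}, ttl=IDEMPOTENCY_TTL_SECONDS)
    return response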

Without idempotency keys, network retries on payments either lose transactions or charge twice. Build this in from day one for any state-mutating endpoint with money, storage, or external systems on the line.

The 409 on body mismatch is not optional. If a client reuses the same key with a different body, that is a client bug — usually a buggy retry that picked up fresh data. Returning 409 surfaces the bug at staging time. Returning 200 silently masks it until production.

Conditional requests with ETag and If-Match

Optimistic concurrency without a database transaction. The server stamps every resource with an ETag — usually a hash of the contents, or a version number. The client echoes it back on writes via If-Match; the server rejects with 412 Precondition Failed if the resource has moved on since the read.

# read
GET /orders/42
→ 200 OK
  ETag: "v17"
  {...order...}

# write
PATCH /orders/42
If-Match: "v17"
{"status": "shipped"}

→ 200 OK                   if ETag still v17
→ 412 Precondition Failed  if someone else wrote v18 first

ETags also do double duty for caching. If-None-Match on a GET lets the server reply 304 Not Modified without re-sending the body. CDNs use this; reverse proxies use this; well-built clients use this. It's free bandwidth.
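Both uses of the ETag fit in one small in-memory sketch. OrderStore and its methods are hypothetical; the ETag here is a truncated content hash, one of the two options mentioned above. A missing If-Match is treated the same as a stale one, which is a simplification.

```python
import hashlib
import json


class OrderStore:
    """Toy resource store returning (status, etag, body) tuples."""

    def __init__(self):
        self._orders = {}

    @staticmethod
    def etag_of(order: dict) -> str:
        # Content hash over a canonical serialisation, quoted like a real ETag.
        canonical = json.dumps(order, sort_keys=True).encode()
        return '"' + hashlib.sha256(canonical).hexdigest()[:16] + '"'

    def put(self, order_id, order):
        self._orders[order_id] = dict(order)

    def get(self, order_id, if_none_match=None):
        order = self._orders[order_id]
        etag = self.etag_of(order)
        if if_none_match == etag:
            return 304, etag, None  # client copy is current, send no body
        return 200, etag, dict(order)

    def patch(self, order_id, changes, if_match=None):
        order = self._orders[order_id]
        current = self.etag_of(order)
        if if_match != current:
            return 412, current, None  # resource moved on since the read
        order.update(changes)
        return 200, self.etag_of(order), dict(order)
```

Two writers that both read the same ETag will see one 200 and one 412; the loser re-reads, merges, and tries again.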

Versioning without breaking clients

Every API that lives long enough has to change in a way that would break existing clients, and the moment you have callers you do not control, you cannot just edit the response shape. There are three common places to put the version, and the choice is more about taste and tooling than correctness.

Approach	Looks like	Trade-off
URI path	`/v2/orders/42`	Obvious, cache-friendly, easy to route. Purists dislike that the URI for "order 42" now changes per version.
Header / media type	`Accept: application/vnd.acme.v2+json`	Keeps URIs stable and is the most "correct" by the spec, but is invisible in a browser and easy to forget.
Query parameter	`/orders/42?version=2`	Simple, but pollutes caching and is easy to drop by accident.

The pragmatic default for a public API is the version in the path. It is visible, it routes cleanly through gateways and CDNs, and a developer can paste a URL into a browser and see what they get. Whichever you pick, the discipline that actually matters is keeping changes additive within a version. Adding a field is safe, because well-built clients ignore fields they do not know. Removing a field, renaming one, changing a type, or tightening validation is a breaking change and needs a new version. Date-based versions (Stripe's 2023-10-16 style, pinned per account) are a clever middle path: the URL stays the same, and the client opts into a snapshot of behaviour that never shifts under it.
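The "clients ignore fields they do not know" behaviour is sometimes called a tolerant reader. A minimal client-side sketch, using a hypothetical Order type and parse_order helper, shows why adding a field is safe and removing one is not:

```python
from dataclasses import dataclass, fields


@dataclass
class Order:
    id: int
    status: str


def parse_order(payload: dict) -> Order:
    """Build an Order from a response body, dropping fields this client predates."""
    known = {f.name for f in fields(Order)}
    return Order(**{k: v for k, v in payload.items() if k in known})


# The server later adds a field: the old client keeps working unchanged.
old_shape = parse_order({"id": 42, "status": "open"})
new_shape = parse_order({"id": 42, "status": "open", "gift_note": "hi"})
```

If the server instead removed status, parse_order would raise a TypeError for the missing argument, which is the breaking change the paragraph above warns about.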

Cheapest versioning is no versioning. Most "we need v2" moments are really "we need one more field," which is additive and needs no version bump at all. Reserve a true version change for the rare day you must remove or reshape something. Versions are expensive: every one you ship is a code path you maintain for years.

HATEOAS and the Richardson maturity model

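Leonard Richardson described a four-rung ladder that measures how much of REST an API actually uses: level 0 tunnels everything through a single endpoint, level 1 introduces individual resources, level 2 uses HTTP verbs and status codes as intended, and level 3 adds hypermedia controls. It is a useful map even though almost nobody climbs to the top, because each rung names a real design decision.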

Richardson's ladder. Level 2 (real resources, real verbs, real status codes) is the practical target; level 3 is the part of REST almost no one ships.

Level 3 is HATEOAS (Hypermedia As The Engine Of Application State), the constraint Fielding cared about most. The server tells the client what's possible next, via links in the response:

{
  "id": 42,
  "status": "draft",
  "_links": {
    "self":    { "href": "/articles/42" },
    "publish": { "href": "/articles/42/publish", "method": "POST" },
    "delete":  { "href": "/articles/42", "method": "DELETE" }
  }
}

In theory this lets the server move endpoints freely; clients only follow links. In practice almost nobody does this, because clients hard-code URL templates anyway. HATEOAS shows up most cleanly in PayPal's API and in Spring HATEOAS — and arguably in the way the web itself works (the browser doesn't hard-code URLs; it follows <a> tags).

The pragmatic version: include links in responses where they help. Pagination cursors, related resources, action endpoints. Don't bet your client design on the server moving them.
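On the client, "follow links where they help" can be as small as the helper below. It assumes the _links shape from the example response above, which is one convention among several, and next_request is a made-up name. A missing link is read as "this action is not available in the current state".

```python
def next_request(resource: dict, rel: str):
    """Return (method, href) for a link relation, or None if it is not offered."""
    link = resource.get("_links", {}).get(rel)
    if link is None:
        return None
    # Links without an explicit method are plain navigations.
    return link.get("method", "GET"), link["href"]


article = {
    "id": 42,
    "status": "draft",
    "_links": {
        "self": {"href": "/articles/42"},
        "publish": {"href": "/articles/42/publish", "method": "POST"},
        "delete": {"href": "/articles/42", "method": "DELETE"},
    },
}
```

A UI can use the same lookup to decide whether to render a Publish button at all, instead of re-implementing the server's state rules.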

Where REST stops being a good fit

Mobile / over-fetching. A "user profile" page ends up making 5 round-trips for half the data each. That's GraphQL's whole pitch.
Streaming. Long-lived bidirectional streams don't fit into the request/response model. Use WebSockets, SSE, or gRPC.
Internal microservices. Inside your own network perimeter, gRPC's binary framing and strict schemas usually win on latency and CPU.
Action-shaped operations. "Refund this charge" doesn't model cleanly as a resource. Many large APIs (Stripe, GitHub) just accept "actions as POSTs to nouns" and move on: POST /charges/{id}/refund.
Pure RPC. If your API is fundamentally a remote procedure call ("sum these numbers", "compile this code"), wrapping it in resources is ceremony. JSON-RPC or gRPC may fit better.

REST vs gRPC vs GraphQL

The three protocols you reach for most are not really competitors so much as answers to different pressures. REST optimises for reach and cacheability over plain HTTP. gRPC optimises for throughput and tight contracts between services you control. GraphQL optimises for clients that want to ask for exactly the data they need in one round trip. Picking well is mostly a matter of naming which pressure you actually have.

	REST	gRPC	GraphQL
Transport	HTTP/1.1 or 2, text JSON	HTTP/2, binary Protobuf	Usually HTTP, single POST endpoint
Contract	OpenAPI (optional)	Protobuf schema (required)	GraphQL schema (required)
Fetching	Fixed per endpoint; can over/under-fetch	Fixed per method	Client picks fields exactly
HTTP caching	Built in, free	None (it's all POST-like)	Hard (one POST URL)
Streaming	Weak (SSE bolt-on)	First-class, bidirectional	Subscriptions, awkward
Best at	Public APIs, browser clients, edge caching	Internal service-to-service, low latency	Aggregating many sources for a UI

The honest rule of thumb: expose REST at the edge where the open web, browsers, and third parties live and where HTTP caching pays off; use gRPC behind the gateway between your own services where you control both ends and care about latency and CPU; reach for GraphQL when a single screen needs to stitch together data from several backends and you are tired of either making five REST calls or shipping a bespoke aggregate endpoint per screen. Plenty of real systems run all three, with a GraphQL or REST gateway out front fanning out to gRPC services behind it.

The REST-versus-GraphQL decision is the one most teams actually agonise over, because both can sit at the edge. The deciding question is usually who controls the clients and how varied their data needs are. The REST vs GraphQL comparison walks the full trade space, including the operational costs (query depth limits, caching, N+1 resolvers) that the headline pitch tends to skip.

Pagination — cursor vs offset

Offset pagination (?page=12&size=25) is the obvious approach and almost always the wrong one at scale. By page 201 the query has become SELECT ... LIMIT 25 OFFSET 5000, which still has to scan and discard 5000 rows before it returns anything; latency grows linearly with page number. Worse, an insert or delete between requests shifts every page boundary, so the same row can appear on two pages or on none.

Cursor pagination encodes the position in an opaque token — usually base64 of (sort key, primary key) — and the client passes it back as ?after=eyJp.... The server translates that into a WHERE (created_at, id) > (?, ?) predicate that uses the same index every other query uses, with constant cost regardless of page depth.

Choice	Latency at page N	Stable under writes?	Can jump to arbitrary page?
Offset / limit	O(N)	No	Yes
Cursor (keyset)	O(1) per page	Yes	No — only next/prev
Time-based cursor	O(1) per page	Yes	By time, not by row
Snapshot tokens (with a revision id)	O(1)	Yes, isolated	No — fixed snapshot

The right default is cursor pagination. Use offset only when the use case explicitly needs "jump to page 47" — e.g. an admin UI for a small table. Twitter, Slack, Stripe, GitHub all expose cursor APIs. The standard envelope shape:

GET /v1/messages?limit=20

200 OK
{
  "data": [ ... ],
  "has_more": true,
  "next_cursor": "Y3Vyc29yOnYxOjE2ODY3MjA="
}
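A compact sketch of the cursor mechanics follows. It uses a sorted in-memory list and bisect as a stand-in for the indexed WHERE (created_at, id) > (?, ?) lookup, so the cost model is only illustrative. The cursor encoding (base64 of a JSON pair) and the function names are assumptions for the example, not a standard.

```python
import base64
import json
from bisect import bisect_right


def encode_cursor(created_at: int, row_id: int) -> str:
    """Pack the last row's (sort key, primary key) into an opaque token."""
    raw = json.dumps([created_at, row_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> tuple:
    created_at, row_id = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return created_at, row_id


def list_page(rows: list, limit: int = 20, after: str = None) -> dict:
    """Return one page of rows, which must be sorted by (created_at, id)."""
    # A database would seek in its index; building this key list is the toy part.
    keys = [(r["created_at"], r["id"]) for r in rows]
    start = bisect_right(keys, decode_cursor(after)) if after else 0
    page = rows[start:start + limit]
    has_more = start + limit < len(rows)
    next_cursor = None
    if page and has_more:
        next_cursor = encode_cursor(page[-1]["created_at"], page[-1]["id"])
    return {"data": page, "has_more": has_more, "next_cursor": next_cursor}
```

Because the cursor names a position in the sort order rather than a row count, a row inserted before that position does not shift the next page.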

Caching headers, the part most APIs skip

HTTP has a sophisticated caching model that almost no API actually uses. That's a missed performance win: a single Cache-Control header lets a cache answer a read that would otherwise cost ~50 ms of database work without the request ever reaching your origin, and an ETag lets the origin answer a revalidation with a bodiless 304 Not Modified.

Header	What it does	When to set
`Cache-Control: public, max-age=300`	Allow shared caches to serve this response for 5 minutes.	List endpoints, public reference data.
`Cache-Control: private, no-store`	Per-user data; never cache.	Account-scoped resources, sensitive data.
`Cache-Control: max-age=0, must-revalidate`	Force conditional revalidation on every request.	Mostly-static but mutable resources (product catalog).
`ETag: "v3-abc12..."`	Strong revision tag; client sends back as `If-None-Match`.	All GETs of significant resources.
`Last-Modified` + `If-Modified-Since`	Weaker, time-based variant of ETag.	When you don't have a content hash.
`Vary: Authorization, Accept-Language`	Tell caches that the response differs by these headers.	Whenever the response varies on something other than URL.

The Vary trap. If you serve different bodies based on Authorization, mark them cacheable, and forget Vary: Authorization, a CDN or shared cache can serve user A's response to user B. Several real-world incidents have been traced to this. Whenever a header influences the response, it goes in Vary, or use private or no-store to opt out of shared caching entirely.
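The mechanism behind the trap is the cache key. The sketch below is a simplified model of how a shared cache might build one; cache_key is a hypothetical helper and real caches differ in detail. Headers listed in Vary become part of the key, and anything not listed is invisible to the cache.

```python
def cache_key(method: str, url: str, request_headers: dict, vary: str = "") -> tuple:
    """Build a lookup key from the URL plus every request header named in Vary."""
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    varied = tuple((name, lowered.get(name, "")) for name in names)
    return (method.upper(), url, varied)


alice = {"Authorization": "Bearer alice"}
bob = {"Authorization": "Bearer bob"}

# Without Vary the two users share one cache entry: the leak.
leaky_collision = cache_key("GET", "/me", alice) == cache_key("GET", "/me", bob)

# With Vary: Authorization each user gets a separate entry.
safe_collision = (
    cache_key("GET", "/me", alice, "Authorization")
    == cache_key("GET", "/me", bob, "Authorization")
)
```

In this model leaky_collision is True and safe_collision is False, which is the whole difference between an incident and a working cache.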

Rate limits, Retry-After, and the 429

Production APIs need rate limits. The convention is to return 429 Too Many Requests when a client exceeds its budget, along with these headers that tell the client what to do:

429 Too Many Requests
Retry-After: 12
RateLimit-Limit: 1000
RateLimit-Remaining: 0
RateLimit-Reset: 12

Retry-After is the only one most clients respect by default. Anywhere you can name the budget exactly (e.g. "1000 req/min, you've spent 1000, the bucket refills in 12 seconds"), spend the bytes: it lets the client back off intelligently instead of polling tighter and tighter.

The standardisation effort here is the IETF RateLimit header draft, which expresses the reset as seconds remaining in the current window. Until it lands, the de-facto pattern is X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, where the reset is often a Unix timestamp instead, so check each API's documentation. Use both forms, since clients in the wild expect either.
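On the client side, honouring Retry-After mostly means parsing it correctly, since the header may carry either a number of seconds or an HTTP date. The helper below is a sketch: the function name, the one-second default, and the 60-second cap are arbitrary choices for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay_seconds(headers: dict, now=None, default=1.0, cap=60.0) -> float:
    """Return how long to wait before retrying, based on Retry-After."""
    value = headers.get("Retry-After")
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        delay = float(value)  # delta-seconds form, e.g. "12"
    else:
        try:
            when = parsedate_to_datetime(value)  # HTTP-date form
        except (TypeError, ValueError):
            return default  # unparseable header: fall back
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        now = now or datetime.now(timezone.utc)
        delay = (when - now).total_seconds()
    # Never wait a negative time, and never trust an unbounded server value.
    return max(0.0, min(delay, cap))
```

A caller would sleep for the returned value (ideally with a little jitter) before resending the request.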

Error responses — RFC 9457 problem+json

The single biggest interoperability win on REST in the last decade is the problem-details format (RFC 9457, formerly 7807). Every error has the same shape, carries a stable machine-readable type, and is friendly to both humans and SDK code:

HTTP/1.1 422 Unprocessable Content
Content-Type: application/problem+json

{
  "type": "https://api.example.com/errors/insufficient-balance",
  "title": "Insufficient balance",
  "status": 422,
  "detail": "Account 12345 has 0.50 USD; the requested charge is 1.00 USD.",
  "instance": "/v1/payments/pi_abc123",
  "balance_usd": 0.50,
  "minimum_required_usd": 1.00
}

The type is a URI that uniquely identifies the error kind; clients switch on it like an enum. Any number of additional fields can be added (balance_usd, etc.) to give the application enough context to act. Stripe and GitHub use their own error envelopes built on the same idea (a stable machine-readable code next to a human-readable message); problem+json is the standardised form of it.

Why a stable type matters. Error messages get rewritten as the product evolves; type URLs don't. Code that does if err.type === '.../insufficient-balance' works across years of copy edits. Code that does if err.message.includes('insufficient') breaks the moment someone tweaks the phrasing.
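Both halves of that contract can be sketched briefly: a server helper that builds the envelope and a client that branches on type. The helper names and the returned action tuples are invented for the example; the type URL and extension fields are the ones from the sample response above.

```python
import json

INSUFFICIENT_BALANCE = "https://api.example.com/errors/insufficient-balance"


def problem(type_url, title, status, detail=None, instance=None, **extensions):
    """Server side: build (status, headers, body) for a problem-details error."""
    body = {"type": type_url, "title": title, "status": status}
    if detail is not None:
        body["detail"] = detail
    if instance is not None:
        body["instance"] = instance
    body.update(extensions)  # extra members such as balance_usd
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(body)


def handle_error(content_type: str, raw_body: str) -> tuple:
    """Client side: branch on the stable type URI, never on the message text."""
    if not content_type.startswith("application/problem+json"):
        return ("unknown", None)
    err = json.loads(raw_body)
    if err.get("type") == INSUFFICIENT_BALANCE:
        return ("top_up", err["minimum_required_usd"])
    return ("unknown", None)
```

Rewording the title or detail on the server leaves handle_error untouched, which is the point of keying on type.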

A pragmatic checklist

✅ Use plural nouns: /users, not /user.
✅ Lowercase, hyphenated paths. /order-line-items.
✅ Filter via query string: GET /orders?status=open&limit=50.
✅ Wrap collections: { "data": [...], "next_cursor": "..." }.
✅ Use If-Match + ETag for optimistic concurrency.
✅ Always include Content-Type and a stable X-Request-Id.
✅ Document with OpenAPI from day one — generate clients, don't write them.
✅ Idempotency keys on every state-mutating POST.
✅ Real status codes. RFC 9457 application/problem+json error envelopes.
🚫 Don't return 200 with {"error":"..."}.
🚫 Don't tunnel everything through POST.
🚫 Don't bake auth tokens into URLs (they end up in logs).
🚫 Don't reuse paths for different resource types — once /users/42 is a user, it stays a user forever.
