API Gateway.
AWS's managed API front-door. Two flavours — REST API (the old, full-featured one) and HTTP API (the newer, cheaper, simpler one). The choice is mostly about features you'd reach for: WAF, Cognito user-pool auth, request transforms, and per-method throttling on REST; everything else on HTTP API at a third the cost.
1 · What API Gateway actually is (and isn't)
The mental model that survives every conversation: API Gateway is a managed HTTP request-transformation pipeline. Every request goes through the same shape — terminate TLS at the edge, authenticate, throttle, transform the request, forward to an integration target, transform the response, return to the caller. The interesting product decisions sit in the transform steps and in which features you turn on at each stage.
What API Gateway isn't: a reverse proxy you control. There's no nginx config, no Lua hooks, no way to drop a sidecar into the request path. The transforms you get are the ones AWS provides — Velocity Template Language (VTL) mapping templates on REST API, simpler parameter mappings on HTTP API. If you need request rewriting that doesn't fit those primitives, you reach for a Lambda integration and write the logic in code.
What API Gateway is also: three different products that share a name. REST API (announced 2015) is the original, full-featured, expensive one — VTL transforms, request validation, Cognito user-pool authorizers, AWS WAF integration, API Gateway caching, usage plans, the works. HTTP API (announced 2019) is the simpler, ~3x cheaper, lower-latency successor, deliberately stripped of the VTL machinery — most requests go through unchanged to an integration. WebSocket API is a separate persistent-connection product with a route-key model for client→server messages. The pricing and feature comparison below assumes you know which one you're talking about.
| API Gateway is good for | API Gateway is bad for |
|---|---|
| Putting a managed HTTPS endpoint in front of Lambda | p99 latency under 10 ms (each hop adds ~10–30 ms; use a Lambda Function URL or ALB) |
| Auth, throttling, and CORS without writing them yourself | Long-running requests (29-second hard timeout, no escape hatch) |
| Multi-stage deployments (dev/staging/prod from the same definition) | Heavy request bodies (10 MB hard cap on payload) |
| Public-internet APIs where AWS handles DDoS / WAF / TLS | WebSocket + REST in one product (they're separate APIs) |
| Selling API access with per-customer usage plans and keys | Streaming responses (no SSE / chunked transfer — use a Function URL or ALB) |
2 · How a request flows — the transformation pipeline
Every API Gateway request traverses roughly the same path; the differences between REST and HTTP API are mostly which stages are present and how rich each stage's logic is.
A few practical implications fall out of this shape. First, authorizer caching matters a lot at scale. A Lambda authorizer invoked on every request can cost more than the function it protects; the default 5-minute cache keyed by identity source removes most of those invocations (about 99% on an API where the same callers return within the TTL). Second, request transforms are where REST API earns its ~3.5× price premium. If you're proxying JSON straight to Lambda you don't need VTL, and HTTP API is fine. Third, stages aren't free. Each stage runs its own throttle bucket and its own CloudWatch log group; deploying to dev/staging/prod gives you three URLs and three sets of metrics, not three views of one deployment.
3 · REST API vs HTTP API
| | REST API | HTTP API |
|---|---|---|
| Price (per request) | ~3.5× HTTP API | 1× (baseline) |
| Latency | ~30 ms p50 | ~10 ms p50 |
| WAF | Yes | No (use CloudFront WAF in front) |
| Authorizers | Cognito User Pools, IAM, Lambda | JWT (any OIDC IdP), Lambda |
| Request transforms | Velocity templates (full rewrites, painful) | Limited mapping |
| Caching | Built-in API Gateway cache (optional, $) | No native cache; use CloudFront |
| Private endpoints | Yes (private API behind VPC endpoint) | No private endpoint type; private integrations via VPC Link |
| WebSocket API | Separate WebSocket-only product | Not WebSocket-shaped |
| Reach for it when | You need WAF, Cognito user pools, complex transforms, request validation | Anything else (which is most things) |
4 · Routing and integrations
A route in HTTP API (or method on a resource in REST API) maps "HTTP verb + path pattern" to an integration — the backend that handles the request:
- Lambda — invoke a function, return its response. Most common pattern.
- HTTP — proxy to another HTTPS endpoint (internal ALB, external service).
- VPC Link — proxy to a private NLB or service inside your VPC.
- AWS service — call any AWS service directly (S3 GetObject, DynamoDB Query). Bypasses Lambda for simple proxies.
- Mock — return canned responses. Useful for development, OPTIONS preflight, deprecation messages.
Paths support parameters (/users/{userId}) and greedy matches (/{proxy+}). The greedy match is the "I have one Lambda that handles everything" pattern — your Lambda runs an Express-style router internally. Trade-off: gives up API Gateway's per-route features (per-method throttling, per-route authorizers).
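The greedy-route pattern can be sketched as one handler that dispatches on method and path itself. This is a minimal illustration, assuming the HTTP API payload format 2.0 event shape (`requestContext.http.method` and `rawPath`) and the `$default` stage, so no stage prefix appears in the path. The `ROUTES` table, the `route` decorator, and both example handlers are hypothetical names, not part of any AWS library.

```python
import json
import re

# Hypothetical in-Lambda route table: (method, compiled path regex, handler).
ROUTES = []

def route(method, pattern):
    """Register a handler for a pattern such as "/users/{userId}"."""
    # Turn each "{name}" path parameter into a named regex group.
    regex = re.compile("^" + re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", pattern) + "$")

    def register(fn):
        ROUTES.append((method, regex, fn))
        return fn

    return register

@route("GET", "/users/{userId}")
def get_user(params):
    return {"userId": params["userId"]}

@route("GET", "/health")
def health(params):
    return {"ok": True}

def handler(event, context):
    """Single entry point behind a greedy /{proxy+} route."""
    method = event["requestContext"]["http"]["method"]
    path = event["rawPath"]
    for route_method, regex, fn in ROUTES:
        match = regex.match(path)
        if route_method == method and match:
            return {"statusCode": 200, "body": json.dumps(fn(match.groupdict()))}
    return {"statusCode": 404, "body": json.dumps({"message": "Not Found"})}
```

Everything the gateway would otherwise decide per route (which authorizer applies, which throttle applies) now has to be decided inside this function, which is the trade-off described above.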
5 · Authorizers — JWT, Lambda, IAM
| Type | How it works | Use |
|---|---|---|
| JWT (HTTP API) | Validates a JWT against an OIDC issuer's public keys; matches audience | Anything using Auth0, Okta, Cognito, Firebase Auth, your own OIDC |
| Cognito User Pool (REST) | Validates a Cognito-issued JWT | You're already using Cognito for users |
| Lambda authorizer | Your Lambda checks the request and returns Allow/Deny + IAM policy | Custom auth (API keys with rate limits, mTLS, signed payloads) |
| IAM (SigV4) | Caller signs requests with AWS creds; API Gateway verifies | Service-to-service inside AWS. Hard for browsers. |
| None | Public route | Health checks, public read APIs |
The JWT authorizer is the most common pick and the one worth understanding end-to-end. The mechanism: client obtains a JWT from an OIDC provider (Cognito, Auth0, Okta), sends it as Authorization: Bearer <jwt>, API Gateway verifies the signature against the issuer's published JWKS, checks the audience claim, and either passes the request through or returns 401.
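The claim checks in that flow can be sketched with the standard library alone. This is an illustration only: it decodes the payload and checks `iss`, `aud`, and `exp`, and it deliberately skips signature verification against the JWKS, which any real verifier must perform. The function names and the example issuer and audience values are hypothetical.

```python
import base64
import json
import time

def decode_segment(segment):
    """Base64url-decode one JWT segment, restoring the stripped padding."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

def check_claims(token, issuer, audience, now=None):
    """Return the claims if iss/aud/exp pass, else None.

    NOTE: there is no signature check here, so this sketch is NOT safe
    to use for real authentication.
    """
    now = time.time() if now is None else now
    try:
        _header, payload, _signature = token.split(".")
        claims = decode_segment(payload)
    except ValueError:
        return None  # wrong segment count, bad base64, or invalid JSON
    if not isinstance(claims, dict):
        return None
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if claims.get("iss") != issuer:
        return None
    if audience not in audiences:
        return None
    if claims.get("exp", 0) <= now:
        return None  # expired, or no exp claim at all
    return claims
```

A request that fails any of these checks corresponds to the 401 path in the description above.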
Lambda authorizers cache their result by identity source (default 5 min) — without that, you'd pay for an authorizer Lambda invocation on every request, doubling your Lambda cost and adding ~30 ms to every call. Set the TTL to your token's worst-case acceptable staleness (usually 1–5 min). The authorizer Lambda returns both an IAM policy and a free-form context object; the context is exposed to the downstream integration as $context.authorizer.<key>, which is the standard place to pass userId, tenantId, or any other already-validated claim downstream — your application Lambda doesn't need to re-parse the JWT.
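A minimal sketch of such an authorizer, assuming the REST API token-authorizer event shape (`authorizationToken`, `methodArn`) and the IAM-policy response format. The `lookup_token` helper and its hard-coded token table are hypothetical stand-ins for real token validation.

```python
# Hypothetical token store; a real authorizer would validate a JWT or
# look the token up in a database.
KNOWN_TOKENS = {"demo-token": {"userId": "u-123", "tenantId": "t-9"}}

def lookup_token(token):
    """Return the identity for a token, or None if it is unknown."""
    return KNOWN_TOKENS.get(token)

def build_policy(principal_id, effect, resource, context=None):
    """Build the authorizer response: an IAM policy plus optional context."""
    response = {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": resource,
            }],
        },
    }
    if context:
        # Context values should be flat strings, numbers, or booleans.
        response["context"] = context
    return response

def handler(event, context):
    token = event.get("authorizationToken", "")
    if token.startswith("Bearer "):
        token = token[len("Bearer "):]
    identity = lookup_token(token)
    if identity is None:
        return build_policy("anonymous", "Deny", event["methodArn"])
    # userId and tenantId travel downstream as $context.authorizer.<key>.
    return build_policy(identity["userId"], "Allow", event["methodArn"], identity)
```

One design point to check when caching is on: the cached policy is reused for later requests from the same identity source, so a `Resource` scoped to a single `methodArn` may not cover the next route that caller hits. Some teams return a broader resource pattern for that reason.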
6 · Throttling — the token bucket model
API Gateway throttling is a classic token bucket: each route has a bucket of capacity B (burst) that refills at rate R (req/s). A request consumes one token; if the bucket is empty, the request is rejected with HTTP 429 (TooManyRequestsException). The two levers are R and B, evaluated in order: per-method (REST only) → per-stage → per-account.
| Scope | Default | Where it lives |
|---|---|---|
| Account, region | 10,000 req/s steady · 5,000 burst | Service Quotas — raise via ticket for high-volume APIs |
| Per-stage | Inherits account; configurable lower | Stage settings → throttling |
| Per-route / per-method (REST) | Inherits stage; configurable lower | Route-level override on REST API only |
| Per-API-key (REST, via usage plans) | None until you attach a usage plan | Usage plans + API key |
The shape of the bucket matters. A bucket of 5,000 burst + 10,000 refill/s can absorb a 5,000-request instantaneous spike on top of steady 10,000 RPS — useful when traffic is bursty but bounded. Anything above either limit returns 429 and your retry logic kicks in. The SDK retries with exponential backoff on 429 by default; browsers do not. If your API is browser-facing, surface the rate limit to the client UI (don't just let the request silently fail).
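The token bucket itself is small enough to sketch. This is the textbook algorithm with burst B and refill rate R, not a description of API Gateway's internal implementation; the class name and the tiny numbers in the demo are illustrative.

```python
class TokenBucket:
    """Token bucket with burst capacity B and refill rate R (tokens/second)."""

    def __init__(self, rate, burst):
        self.rate = rate            # R: tokens added per second
        self.burst = burst          # B: maximum tokens the bucket can hold
        self.tokens = float(burst)  # the bucket starts full
        self.last = 0.0             # timestamp of the last refill

    def allow(self, now):
        """Consume one token at time `now` (seconds); False maps to HTTP 429."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=2, burst=3)
# Five requests at t=0: the first three drain the burst, the rest are throttled.
print([bucket.allow(0.0) for _ in range(5)])  # [True, True, True, False, False]
# One second later the bucket has refilled by two tokens.
print([bucket.allow(1.0) for _ in range(3)])  # [True, True, False]
```

Scaling the same two parameters up to the account defaults gives the behaviour described above: the burst absorbs a spike, and the refill rate bounds sustained throughput.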
Usage plans identify callers by an API key sent in the x-api-key header. The keys are statically issued, can't easily be rotated per-user, and don't integrate with your user database. Most B2B SaaS APIs that ship public docs end up not using usage plans; they front API Gateway with their own authorizer Lambda that looks up the customer's plan in a database and applies rate limits via DynamoDB-backed counters. Stripe, Twilio, and Algolia document this pattern in their engineering blogs.
7 · REST vs HTTP vs WebSocket vs ALB-with-Lambda
Four AWS products give you "managed HTTPS endpoint that calls a Lambda" with overlapping but distinct trade-offs. The picks are not interchangeable.
| | REST API | HTTP API | WebSocket API | ALB + Lambda |
|---|---|---|---|---|
| Price (per request) | ~3.5× HTTP API | 1× (baseline) | ~1× + connection-mins | ~0.4× + per-LCU floor |
| Latency overhead | 20–40 ms | 10–20 ms | persistent — no per-msg setup | 5–10 ms |
| Request transforms | Full VTL | Limited parameter mapping | None (raw frames) | None |
| WAF | Yes (native) | No — front with CloudFront WAF | No | Yes (ALB WAF) |
| Authorizers | Cognito, IAM, Lambda | JWT, Lambda | Connect-time Lambda | OIDC or Cognito (HTTPS listener rules) |
| Bidirectional | No | No | Yes (server can push) | No (HTTP request/response) |
| Max timeout | 29 s | 29 s | 10-minute idle, 2-hour total | Lambda's 15 minutes |
| Reach for it when | Need WAF, VTL, request validation, usage plans | Most new HTTP APIs | Chat, live dashboards, IoT shadows | Already running an ALB; low latency matters |
The under-discussed answer is ALB + Lambda: roughly 0.4× the per-request cost of HTTP API, lower latency, and no 29-second timeout (Lambda's 15-minute ceiling applies instead). The catch is that the ALB carries a fixed hourly charge regardless of traffic, on top of its per-LCU usage pricing, so it's only cheaper above tens of millions of requests per month. Below that, HTTP API wins on absolute spend.
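The break-even point is simple arithmetic: the fixed monthly ALB charge divided by the per-request saving. The sketch below makes that explicit. All three inputs are illustrative assumptions rather than quoted prices: HTTP API at $1.00 per million requests, ALB at 0.4× that (the ratio from the table), and a fixed ALB charge of $18 per month.

```python
def break_even_requests(fixed_monthly_usd, http_api_per_million, alb_per_million):
    """Monthly request count above which ALB + Lambda undercuts HTTP API."""
    saving_per_million = http_api_per_million - alb_per_million
    if saving_per_million <= 0:
        raise ValueError("ALB must be cheaper per request for a break-even to exist")
    return fixed_monthly_usd / saving_per_million * 1_000_000

# Assumed inputs (see the paragraph above); substitute your region's real prices.
print(round(break_even_requests(18.0, 1.00, 0.40)))  # 30000000
```

Under these assumptions the crossover sits around 30 million requests per month, which is consistent with the "tens of millions" threshold; a higher fixed charge or a smaller per-request gap pushes it further out.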
8 · Real-world case studies
Three publicly documented API Gateway deployments give a sense of how the product fits real workloads.
LEGO.com — Black Friday on serverless. LEGO re-platformed lego.com to a Lambda + API Gateway + CloudFront architecture in 2019, documented in the AWS case study and a re:Invent 2020 talk by their VP of Technology. The headline number: handling 13x peak traffic during Black Friday with zero capacity planning calls. The architecture has CloudFront in front of API Gateway, with CloudFront absorbing the bulk of read traffic via caching; API Gateway only sees the per-customer requests (cart state, checkout). They explicitly chose REST API over HTTP API for the WAF integration — bot traffic during sales events is the limiting concern, not request transform features. The piece worth stealing: CloudFront cache TTLs are tuned aggressively per route (catalog pages: 5 min, product pages: 1 hour, account: zero), so traffic spikes are mostly absorbed at the edge.
Liberty Mutual — claims platform. Liberty Mutual re-built parts of its claims platform as a microservices-on-serverless architecture, documented in the AWS case study and a re:Invent 2020 talk by their distinguished engineer Gillian McCann. They run hundreds of HTTP APIs behind API Gateway, each fronting a small Lambda — the "microservice-per-endpoint" pattern. The architectural decision that's most replicable: a shared Lambda authorizer across the whole API surface, doing token validation once and caching the result for 5 minutes, instead of every team building their own auth. This is the pattern most enterprises end up at when API Gateway scales past one team: centralise auth, federate everything else.
Comcast — entitlements API. Comcast runs the entitlements service for Xfinity Stream on Lambda + API Gateway, documented in a 2020 architecture blog post and several re:Invent talks. The interesting design: every request to "can this customer watch this stream" is a single API Gateway call that fans out to multiple downstream services via the Lambda integration, with the entire request finishing in under 200 ms p95 despite touching four backends. They explicitly call out API Gateway throttling as their first defense layer during incidents — when a downstream service degrades, they cap the route's req/s at the gateway and let 429s flow back to the client rather than letting the backend fall over.
The through-line: API Gateway works best when treated as the policy and routing layer, not as a transformation engine. Auth, throttling, CORS, and WAF are what justify the per-million cost; complex business logic belongs in the integration target.
9 · Build it yourself — HTTP API → Lambda
- Reuse the Lambda from the previous page.

      # or create:
      cat > /tmp/idx.py <<'EOF'
      def handler(event, context):
          return {
              "statusCode": 200,
              "body": f"hi from {context.function_name}"
          }
      EOF
      cd /tmp && zip h.zip idx.py
      ROLE=$(aws iam get-role --role-name lab-lambda --query 'Role.Arn' --output text)
      aws lambda create-function --function-name api-fn --runtime python3.12 --role "$ROLE" \
        --handler idx.handler --zip-file fileb:///tmp/h.zip

- Create the HTTP API.

      API_ID=$(aws apigatewayv2 create-api --name lab-api --protocol-type HTTP \
        --query ApiId --output text)
      FN_ARN=$(aws lambda get-function --function-name api-fn \
        --query Configuration.FunctionArn --output text)
      INTEG_ID=$(aws apigatewayv2 create-integration --api-id "$API_ID" \
        --integration-type AWS_PROXY --integration-uri "$FN_ARN" \
        --payload-format-version 2.0 \
        --query IntegrationId --output text)
      aws apigatewayv2 create-route --api-id "$API_ID" \
        --route-key 'GET /hello' \
        --target "integrations/$INTEG_ID"
      aws apigatewayv2 create-stage --api-id "$API_ID" --stage-name '$default' --auto-deploy

- Let API Gateway invoke the function.

      # The source ARN hard-codes us-east-1; change the region if your API lives elsewhere.
      aws lambda add-permission --function-name api-fn \
        --statement-id apigw-invoke --action lambda:InvokeFunction \
        --principal apigateway.amazonaws.com \
        --source-arn "arn:aws:execute-api:us-east-1:$(aws sts get-caller-identity --query Account --output text):$API_ID/*/*/hello"

- Test.

      URL=$(aws apigatewayv2 get-api --api-id "$API_ID" --query ApiEndpoint --output text)
      curl "$URL/hello"

- Add a JWT authorizer (using a fake JWKS for demo).

      # In real life the Issuer points at your Cognito user pool or Auth0 domain.
      # This only creates the authorizer; GET /hello stays open until the route is updated to use it.
      aws apigatewayv2 create-authorizer --api-id "$API_ID" \
        --authorizer-type JWT --identity-source '$request.header.Authorization' \
        --jwt-configuration "Audience=my-app,Issuer=https://example.auth0.com/" \
        --name jwt-auth

- Tear down.

      aws apigatewayv2 delete-api --api-id "$API_ID"
      aws lambda delete-function --function-name api-fn
10 · What breaks
- "CORS doesn't work." Forgot to configure CORS on the API. HTTP API is one CLI flag; REST API requires per-method OPTIONS setup. The most common variant: CORS is enabled on the API but the Lambda is returning the response without the Access-Control-Allow-Origin header that API Gateway requires for proxy integrations.
- "My Lambda doesn't return: API Gateway returns 502." Lambda returned a malformed response (must have statusCode as an integer and body as a string). Returning {"body": {…}} instead of {"body": JSON.stringify(…)} is the #1 cause; CloudWatch shows the actual return value.
- 29-second integration timeout (hard limit). API Gateway hard caps every integration response at 29 s; there is no support-ticket path to extend it. For long requests use WebSocket API, async via SQS, or a Lambda Function URL (which inherits Lambda's 15-minute timeout).
- 10 MB payload limit. Both request and response. Larger uploads must use pre-signed S3 URLs (the standard pattern); larger responses must stream from a Function URL or ALB. The limit applies even after binary media type handling.
- WAF attaches natively to REST API only. If you need AWS WAF rules on an HTTP API, you have to front the whole thing with CloudFront and attach WAF to the distribution. Adds latency and cost; sometimes pushes teams back to REST API after a security review surfaces the gap.
- Authorizer not invoked. Routes attach authorizers individually — adding an authorizer to the API doesn't auto-attach to all routes. Common after the first refactor when a new route is added and silently exposed without auth.
- Deployment vs stage confusion (REST API). Editing the API definition doesn't deploy it; you must explicitly call create-deployment and point a stage at the new deployment ID. Teams ship a Terraform change, see no behaviour change, and spend an hour debugging before realising they didn't deploy.
- Usage plan + API key clunkiness for customer-facing APIs. Keys are statically issued and don't rotate with user lifecycle. Most real B2B tiered APIs end up implementing rate limits in a Lambda authorizer against a DynamoDB counter and ignoring usage plans entirely.
- Cost spike. Per-request pricing adds up at high TPS, and REST API costs ~3.5× HTTP API per request. Switch to HTTP API; or front API Gateway with CloudFront for cacheable responses; or skip it entirely for an ALB + Lambda once you're into tens of millions of requests per month.
- "My API works in us-east-1 but not eu-west-1." REST APIs default to edge-optimised (CloudFront in front globally) — but the origin runs in one region. For multi-region active-active, use a regional REST API with Route 53 latency routing, or HTTP API which is regional by default.
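For the 502 item in the list above, one way to rule out the malformed-response cause is to build every proxy response through a single helper. This is a sketch in Python rather than the JavaScript hinted at by JSON.stringify; the `proxy_response` name is hypothetical.

```python
import json

def proxy_response(status_code, payload, headers=None):
    """Build a Lambda proxy response with an int statusCode and a string body."""
    # Dicts and lists are serialised; strings pass through unchanged.
    body = payload if isinstance(payload, str) else json.dumps(payload)
    return {
        "statusCode": int(status_code),
        "headers": {"Content-Type": "application/json", **(headers or {})},
        "body": body,
    }

def handler(event, context):
    # Returning {"body": {"message": "hi"}} (a dict) is the shape that
    # produces the 502; serialising it first avoids that.
    return proxy_response(200, {"message": "hi"})
```

The same helper is a natural place to add an Access-Control-Allow-Origin header for proxy integrations, by passing it in `headers`.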
11 · Further reading
- API Gateway developer guide. The canonical reference; the "best practices" pages are worth reading.
- LEGO.com on AWS. 13x Black Friday spike absorbed by API Gateway + CloudFront.
- Liberty Mutual case study. Microservice-per-endpoint pattern at enterprise scale.
- Comcast architecture blog. Entitlements API on Lambda + API Gateway, low-latency fanout.
- RFC 7519 (JWT). The token format the JWT authorizer validates.
- How API gateways work. The protocol-and-pattern primer.
- API design Codex. Protocol-layer companion.