CloudFront Cache Simulator: a request in three layers.
Edge POP. Regional edge cache. Optional Origin Shield. Then the origin. A request walks that ladder; on miss it goes deeper; on the way back, every layer it passed populates its own cache. A CDN is, at heart, that staircase, plus TTL arithmetic and a few invalidation tricks. Click a request below and watch the cascade.
The stacked lanes are the four stops a request walks through: Edge POP, Regional Edge Cache, the optional Origin Shield, and the Origin, which always holds the object. Send request lights each lane in turn as the request probes it for a HIT, then drops deeper on a MISS. When it finally finds the object, the response flows back and every layer it passed populates its own copy with the TTL you set. The stat strip tracks where hits landed, the running hit ratio, and bytes saved; the log narrates each hop with its latency.
Send the same URL twice in a row. The first request is all MISSes down to the origin and returns slowest, around 180ms plus each hop. The second is an instant edge HIT at roughly 8ms, because the way back warmed every layer. Then hit Fast-forward 30s a couple of times to expire the TTLs and watch the next request cascade to origin all over again. Toggle Origin Shield off and the ladder loses a rung — fewer layers to warm, but every region now hammers the origin directly.
What CloudFront actually is
A CDN with three caching layers, plus a few protection and routing services bolted on. AWS advertises 600+ points of presence (edge locations) plus 13 regional edge caches. The edge locations are where TLS terminates, where the request first lands. Regional edge caches sit one tier deeper, each one backing the edge locations across a wide geographic area rather than mapping one-to-one to an AWS region. Origin Shield is an optional single regional cache designated as the choke point in front of your origin, so all CloudFront traffic converges through it before reaching you.
On top of caching, CloudFront does DDoS mitigation, WAF rule evaluation, and lets you run code at the edge — either short JavaScript via CloudFront Functions or full Node/Python via Lambda@Edge. The cache hierarchy is the part you tune for cost and hit rate; the rest is configuration.
The cache hierarchy
From client to origin, in order: Edge POP (closest to the user, terminates TLS, fastest answer) → Regional Edge Cache (shared by many edge locations, larger, longer-lived) → Origin Shield (optional, a single regional cache that consolidates traffic before it leaves CloudFront) → Origin (S3, ALB, EC2, MediaPackage, custom). Each layer caches independently and answers with a HIT or a MISS. On a MISS, the request moves deeper; on a HIT, the response flows back, populating every layer it passed.
Without Origin Shield, the default path is three layers (edge, regional, origin). With Origin Shield enabled, it's four. The shield adds roughly 5–15ms of latency on a cold miss but cuts origin-facing request rates dramatically for many-region deployments.
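The ladder can be sketched in a few lines of Python. This is a toy model, not CloudFront's implementation: the `Layer` class, the `send_request` function, and the single shared TTL are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One cache tier: maps a URL to the time (in seconds) its copy expires."""
    name: str
    entries: dict = field(default_factory=dict)

    def lookup(self, url, now):
        # A HIT requires an entry that has not yet expired.
        return self.entries.get(url, float("-inf")) > now


def send_request(layers, url, now, ttl):
    """Walk the ladder and return the name of the tier that answered."""
    missed = []
    for layer in layers:
        if layer.lookup(url, now):
            answered_by = layer.name
            break
        missed.append(layer)
    else:
        answered_by = "origin"  # the origin always holds the object
    # On the way back, every layer that missed stores its own copy.
    for layer in missed:
        layer.entries[url] = now + ttl
    return answered_by


layers = [Layer("edge"), Layer("regional"), Layer("shield")]
print(send_request(layers, "/app.js", now=0, ttl=30))   # origin
print(send_request(layers, "/app.js", now=1, ttl=30))   # edge
print(send_request(layers, "/app.js", now=60, ttl=30))  # origin
```

Dropping the `shield` entry from the list models the three-layer path: one rung fewer to warm, and one tier fewer between the regional caches and the origin.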
Cache keys — what makes two requests "the same"
CloudFront hashes the URL plus whichever headers, query strings, and cookies you've told it to include via a cache policy. Anything not in the cache policy is invisible to caching. Two requests with the same cache key share the same response even if their other headers differ. Two requests that differ on any field included in the cache key produce two cache entries.
Origin request policies are separate. They control what gets forwarded to the origin on a MISS, independent of what's in the cache key. Common mistake: putting Accept-Language in the origin request policy but not the cache policy — the origin returns different content per language, but CloudFront caches whichever one arrived first and serves it to everybody.
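A rough sketch of the idea, assuming a made-up key scheme (CloudFront does not publish how it derives keys internally). The `cache_key` function and its `cached_headers` / `cached_query` parameters are hypothetical stand-ins for a cache policy.

```python
import hashlib
from urllib.parse import urlsplit, parse_qsl


def cache_key(url, headers, cached_headers=(), cached_query=()):
    """Hash only the parts of the request that the (hypothetical) policy names."""
    parts = urlsplit(url)
    # Query parameters outside the policy are dropped before hashing.
    query = sorted((k, v) for k, v in parse_qsl(parts.query) if k in cached_query)
    # Header names are case-insensitive, so compare them lowercased.
    lowered = {k.lower(): v for k, v in headers.items()}
    kept = sorted((h.lower(), lowered.get(h.lower(), "")) for h in cached_headers)
    material = repr((parts.path, query, kept))
    return hashlib.sha256(material.encode()).hexdigest()[:12]


en = {"Accept-Language": "en"}
de = {"Accept-Language": "de"}

# Header left out of the policy: both languages collapse into one entry.
print(cache_key("/home", en) == cache_key("/home", de))  # True

# Header included in the policy: each language gets its own entry.
print(cache_key("/home", en, ["Accept-Language"])
      == cache_key("/home", de, ["Accept-Language"]))    # False
```

The first comparison is the Accept-Language mistake in miniature: the origin may vary its answer, but the cache sees one key and stores one response.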
TTL — minimum, default, maximum
A cache policy declares three TTLs: minimum, default, maximum. The actual TTL CloudFront uses for a given response is the clamp of Cache-Control: max-age (or s-maxage, which wins for CDNs) into [min, max]. If the origin sends no Cache-Control at all, CloudFront uses the default.
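The clamp can be written out as a small function. This is a simplified sketch: `effective_ttl` is a hypothetical helper, it ignores `Expires`, `no-cache`, `no-store`, and `private`, and falling back to the default when a Cache-Control header carries no usable max-age is an assumption of the sketch.

```python
def effective_ttl(cache_control, min_ttl, default_ttl, max_ttl):
    """Return the seconds to cache, using the min/default/max rule described above."""
    if cache_control is None:
        return default_ttl  # origin sent no Cache-Control at all
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value.strip()
    # s-maxage is checked first because it takes precedence for shared caches.
    for name in ("s-maxage", "max-age"):
        value = directives.get(name, "")
        if value.isdigit():
            return max(min_ttl, min(int(value), max_ttl))
    return default_ttl  # assumption: no usable max-age means "use the default"


print(effective_ttl("max-age=5", 60, 86400, 3600))                 # 60
print(effective_ttl("max-age=600, s-maxage=30", 0, 86400, 3600))   # 30
print(effective_ttl("max-age=31536000, immutable", 0, 86400, 3600))  # 3600
print(effective_ttl(None, 0, 86400, 31536000))                     # 86400
```

The third call shows why the maximum matters: an origin asking for a year gets capped to the policy's one hour.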
Two TTL extremes are common. Static assets (JS, CSS, images with content hash in the URL): set max-age=31536000, immutable — cache for a year. HTML and API responses: short TTLs (30s–5min) so changes propagate fast. Anything in between is usually a misconfiguration.
Cache hit ratios in production
What's good. Static-asset buckets routinely hit 95%+ at the edge. API caching (where applicable) lands 30–70%, depending on how much of the surface is cacheable. Video streaming sits at 90%+ — that's the whole reason CDNs exist. Below 80% on a content-heavy site usually means a cache-key bug (something is being keyed on that shouldn't be), not a fundamental limit.
The metric you actually want is edge hit ratio from CloudFront's reports. AWS also reports regional and origin hit rates separately so you can see where the misses are landing.
Origin Shield — when the extra hop earns its cost
Origin Shield designates one regional cache as a global choke point. All CloudFront traffic flows through that one cache before reaching your origin. Two outcomes follow: your origin sees a tiny fraction of the requests it would otherwise see (deduplication of identical requests across all regions), and you pay an additional per-request Origin Shield charge for requests that reach the shield as an extra layer.
When it earns its cost: many viewer regions hitting the same content (global software downloads, popular video). When it doesn't: small footprint, low cache hit ratio (shield adds latency without much dedup benefit), or origin that already lives behind a fast cache (a second CDN in front).
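A back-of-the-envelope version of the dedup argument, as a worst-case sketch. It assumes cold caches, that every regional cache requests every object once within a single TTL window, and the 13 regional edge caches quoted earlier; `origin_fetches` is a hypothetical helper.

```python
def origin_fetches(regional_caches, objects, shield):
    """Cold-cache origin fetches when every regional cache wants every object once."""
    # Without a shield each regional cache misses to the origin on its own.
    # With one, the shield fetches each object once and serves the other regions.
    return objects if shield else regional_caches * objects


print(origin_fetches(13, 1_000, shield=False))  # 13000
print(origin_fetches(13, 1_000, shield=True))   # 1000
```

With a single viewer region the two numbers converge, which is the small-footprint case where the shield adds a hop without removing much origin load.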
Invalidation — and why versioned URLs are better
CloudFront invalidation marks paths as stale across all edges. Propagation: ~60 seconds. Cost: first 1,000 paths/month free, then $0.005 each. Wildcards count as one path.
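The pricing rule is simple enough to check by hand. A minimal sketch using only the figures quoted above (the constant and function names are illustrative):

```python
FREE_PATHS_PER_MONTH = 1_000
PRICE_PER_EXTRA_PATH = 0.005  # USD per path beyond the free allowance


def invalidation_cost(paths_this_month):
    """Monthly invalidation bill in USD for a given number of submitted paths."""
    billable = max(0, paths_this_month - FREE_PATHS_PER_MONTH)
    return billable * PRICE_PER_EXTRA_PATH


print(invalidation_cost(800))    # 0.0
print(invalidation_cost(5_000))  # 20.0
```

Because a wildcard such as /static/* counts as one path, the bill tracks how many invalidation paths you submit, not how many objects they match.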
The shipping-engineer reflex: don't invalidate. Version assets in the URL — /app.v347.js, /photo-abc123hash.jpg. Each release ships new URLs; old ones expire naturally. Invalidation is reserved for the case where you must purge something that's already been cached, like a bad image or a leaked URL.
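One common way to get versioned URLs is to derive the version from the file's content at build time. A minimal sketch, assuming a hypothetical `versioned_name` helper and an eight-character SHA-256 prefix as the version tag:

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path, content, digest_len=8):
    """Insert a short content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old = versioned_name("/static/app.js", b"console.log('v1');")
new = versioned_name("/static/app.js", b"console.log('v2');")
print(old != new)  # True: changed bytes produce a new URL, so no purge is needed
```

Unchanged files keep their old name across releases, so their cached copies stay valid and the hit ratio survives a deploy.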
Lambda@Edge vs CloudFront Functions
| | CloudFront Functions | Lambda@Edge |
|---|---|---|
| Runtime | JavaScript (subset) | Node.js, Python |
| p99 latency | ~1 ms | ~50 ms |
| Runs at | Every edge POP | Regional edges only |
| Triggers | Viewer request, viewer response | Viewer request/response + origin request/response |
| Memory | 2 MB | Up to 10 GB |
| Best for | Header rewrites, URL normalisation, A/B routing, simple auth | Image resizing on the fly, full JWT validation, server-side rendering at the edge |
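Functions are several times cheaper per million invocations than Lambda@Edge: about $0.10 versus $0.60 in request charges at published list prices, before Lambda@Edge's additional per-duration billing. The rule: if you can do it in Functions, do it in Functions; reach for Lambda@Edge only for the heavier transformations.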
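To show what an edge handler looks like, here is a minimal Lambda@Edge viewer-request sketch in Python that rewrites directory-style URIs. Python is used only to keep this page's examples in one language; per the rule above, a rewrite this small would normally live in a CloudFront Function written in JavaScript. The index.html convention is an assumption about the origin's layout.

```python
def handler(event, context):
    """Viewer-request trigger: rewrite directory-style URIs to index.html."""
    # Lambda@Edge delivers the CloudFront request under Records[0].cf.request.
    request = event["Records"][0]["cf"]["request"]
    if request["uri"].endswith("/"):
        request["uri"] += "index.html"
    # Returning the request tells CloudFront to continue processing it.
    return request


# Local smoke test with a hand-built event of the same shape.
fake_event = {"Records": [{"cf": {"request": {"uri": "/docs/", "headers": {}}}}]}
print(handler(fake_event, None)["uri"])  # /docs/index.html
```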
Origin protection — OAC, signed URLs, signed cookies
Don't let the world reach your S3 origin directly. Use Origin Access Control (OAC, the modern replacement for OAI). CloudFront signs requests to S3 with SigV4 using a service principal; S3 only allows access from CloudFront's principal via its bucket policy. The bucket stays private.
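The bucket-policy half of that arrangement can be generated like this. It is a sketch of the usual OAC policy shape; the bucket name, account ID, and distribution ID are placeholders, and `oac_bucket_policy` is a hypothetical helper.

```python
import json


def oac_bucket_policy(bucket, distribution_arn):
    """Build an S3 bucket policy that lets one CloudFront distribution read objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowCloudFrontServicePrincipalReadOnly",
            "Effect": "Allow",
            "Principal": {"Service": "cloudfront.amazonaws.com"},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            # Restrict access to requests made on behalf of this distribution.
            "Condition": {"StringEquals": {"AWS:SourceArn": distribution_arn}},
        }],
    }


policy = oac_bucket_policy(
    "example-bucket",  # placeholder bucket name
    "arn:aws:cloudfront::111122223333:distribution/EXAMPLE",  # placeholder ARN
)
print(json.dumps(policy, indent=2))
```

The condition on AWS:SourceArn is what keeps other distributions, including ones in other accounts, from reading the bucket through the same service principal.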
For paid content, sign URLs or cookies with a CloudFront key pair. Signed URLs are good for one-off downloads; signed cookies are good for sessions where the user authenticates once and then loads many resources (HLS / DASH video).
Real-world deployments
Disney+ launch (Nov 2019). Architected entirely on CloudFront, with 250+ Tbps of peak edge capacity reported in the months after launch. The team has publicly discussed how Origin Shield kept the encoders from getting hammered when millions of viewers tuned in to the same premiere.
Netflix Open Connect. Netflix runs its own embedded ISP-located cache appliances, but the source-of-truth content sits in S3 and CloudFront fronts the rest of the surface (web, metadata APIs, image art). The same pattern shows up in BBC iPlayer, Hulu, and most large video platforms.
Static-site hosting. S3 + CloudFront + ACM is the default modern hosting stack for documentation sites, marketing pages, and SPAs. Vercel, Netlify, and Cloudflare Pages all sit on similar primitives elsewhere.
What breaks — the five gotchas
- Forgetting headers in the cache key. Origin returns different content per Accept-Language; CloudFront caches the first response and serves it to everyone. Always include any header the origin varies on.
- Aggressive TTL on dynamic content. Caching API responses for an hour because "it'll be fast", and then your status page shows stale order counts for 60 minutes after a deploy.
- Misconfigured Vary header. Origins that send Vary: * or Vary: User-Agent kill the hit rate. CloudFront treats every Vary'd field as a cache-key dimension implicitly via the policy; mixing the two leads to surprises.
- CORS preflight not cached. If OPTIONS isn't among the cache behavior's cached methods, every preflight goes to origin. Front-ends that issue many CORS requests slow down for everyone.
- SigV4 origin-auth headers stripped. If your origin request policy doesn't forward Authorization and the SigV4 date / token headers, your private S3 origin returns 403 to CloudFront. OAC handles this correctly; legacy OAI configurations sometimes don't.
Tuning knobs that matter
- Cache policy. The setting that moves your hit ratio most. Pick from the AWS-managed presets first (CachingOptimized, CachingDisabled), then customise only what you need.
- Origin request policy. What gets forwarded to origin on MISS. Forward only what the origin needs.
- TTL clamp. Default TTL when origin sends no Cache-Control. Set this deliberately; don't accept whatever CloudFront ships out of the box.
- Compression. Enable both gzip and Brotli at the edge. Brotli at level 6 beats gzip at level 9 on text payloads.
- HTTP/2 and HTTP/3. Enable both. HTTP/3 (QUIC) helps mobile and high-latency clients noticeably.
- Price class. If your audience is purely US/EU, restrict to the cheaper price class and save on edge serving from Asia/South America POPs.
Further reading
- CloudFront developer guide. The canonical reference. The performance and caching pages are the must-reads.
- Origin Shield announcement post. AWS's own framing of when and why to enable it.
- CloudFront in the AWS Codex. The longer-form reference that pairs with this simulator.
- DNS resolution simulator. The other layer of CDN routing — how the client finds the right edge POP in the first place.
- Load balancer simulator. Origin-side traffic distribution after CloudFront has done its job.