Q: What does a recursive resolver do?

It is the server your client asks. It walks the namespace tree on the client's behalf, querying the root, the TLD, and the authoritative servers in turn, caching every answer for the duration of its TTL. Clients never talk to root or TLD servers themselves; they only ever speak to the recursive resolver.

Q: How long is a DNS answer cached?

For the TTL of the record, set by the zone owner. Common values: 60 seconds for records intended to support fast cutover; 300 seconds for content delivery; 3600 seconds for stable records; 86400 seconds for NS records. RFC 2308 also requires negative answers (NXDOMAIN) to be cached.

DNS Resolution Simulator: six servers, one answer.

Q: What is DNS?

The Domain Name System is a globally distributed key-value store mapping human-readable names like www.example.com to machine-readable records, most commonly IP addresses. It runs on UDP and TCP port 53, with modern transports DoH (443) and DoT (853) layering it over TLS.

Q: Are there really only thirteen root servers?

There are thirteen named root server identities labeled A through M, but each one is anycast-routed across hundreds of physical instances worldwide. A query to k.root-servers.net reaches whichever physical instance is closest in BGP routing terms, not a single server in one data centre.

DNS resolution turns a name like example.com into an IP address by walking a chain of servers — browser cache, OS resolver, recursive resolver, root, TLD, authoritative — and caching the answer at each hop so the next lookup is faster. Watch a name walk the chain and the answer cache itself on the way back.

[Interactive simulator. Counters: queries, cached, external, step. Controls: domain, mode, TTL, clear cache. Trace panel: quiet until a lookup runs.]

Resolution path: browser cache (empty) → OS resolver (empty) → recursive resolver (empty) → root NS → TLD NS (.com) → authoritative NS.

Static records held by the simulated servers:

Root NS: NS com a.gtld-servers.net
TLD NS (.com): NS example.com dns1.example.com
Authoritative NS: A www.example.com 93.184.216.34; A example.com 93.184.216.34; NS example.com dns1.example.com

Browser → OS → recursive walk the cache chain; root → TLD → authoritative walk the namespace.

Why DNS works at all

A globally distributed key-value store can't survive without caching. Every layer of the resolution chain stores answers for the TTL of each record. By the time a query is "popular", almost every recursive resolver in the world has the answer locally.

Thirteen root servers

A through M. Each is anycast-routed across hundreds of physical machines. A query to k.root-servers.net reaches whichever instance is closest in BGP terms, never one specific data centre.

TTL trade-off

Short TTLs (60s) let you cut over quickly when you change records but multiply query load on your authoritative servers. Long TTLs (86400s) keep load light but make planned migrations slow. Most CDN records use 60s; most NS records use 86400s.

DoH and DoT

DNS over HTTPS (port 443) and DNS over TLS (port 853) wrap classic DNS in transport encryption. They add privacy, and DoH in particular is harder for middleboxes to block because it shares port 443 with ordinary HTTPS. They are not faster, since encryption adds a handshake, but they remove the cleartext side channel.

What you're looking at

The six rows are the layers a name passes through on its way to an address: browser cache, OS resolver, recursive resolver, then the root, the TLD, and the authoritative server. Each row lists the records that layer currently holds and the seconds left on each TTL; the highlighted row is the one answering right now, and the trace underneath narrates every hop. The counters split your lookups into cached hits and full external walks.

Resolve www.example.com with the caches cold and watch the query climb all the way to the authoritative server, seeding a copy of the answer at every layer on the way back. Resolve the same name again and the browser cache answers in a single step, with no network at all. The thing to notice is how rarely the deeper servers get touched once a name is warm — almost everything stops at the first or second row. Then fast-forward the TTL until the short browser and OS entries expire and resolve once more: the walk grows back layer by layer as each cache drops its record.

What DNS actually is

A globally distributed key-value store.

DNS is the Domain Name System. It maps human-readable names like www.example.com to machine-readable records, most commonly an IPv4 A record or IPv6 AAAA record, but also mail exchanges, text records for verification, certificate authorities, and pointers to other names. RFC 1034 (1987, Paul Mockapetris) defines the concepts and RFC 1035 defines the wire format. The system runs on UDP and TCP port 53; modern transports add DNS over HTTPS on port 443 (DoH, RFC 8484) and DNS over TLS on port 853 (DoT, RFC 7858).

The protocol is small. A classic query is a single UDP packet of at most 512 bytes (larger, commonly up to 4096 bytes, when EDNS, RFC 6891, is in use) containing the name and the record type. A response is a single packet containing the answer, or a referral to another server, or an authoritative "no such name" (NXDOMAIN); a response too large for UDP is truncated and retried over TCP. Over UDP there is no session, no handshake, no state: every query is independent. That property is what lets DNS scale to billions of queries per second across the planet on commodity hardware.
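To make "the name and the record type" concrete, here is a minimal sketch of how a single-question query could be encoded in the RFC 1035 wire format. The function name build_query and the fixed query ID are illustrative choices, not part of any library.

```python
import struct

def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Encode a single-question DNS query in RFC 1035 wire format."""
    # Header: ID, flags (only the RD bit, 0x0100), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is a length byte plus its bytes; a zero byte ends the name.
    labels = [label for label in name.split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    # QTYPE (1 = A, 28 = AAAA) followed by QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)

packet = build_query("www.example.com")
print(len(packet))   # 33 bytes: 12 header + 17 name + 4 type/class
print(packet.hex())
```

The whole question fits in 33 bytes, far below the 512-byte ceiling, which is why a plain lookup never needs more than one datagram.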

The data itself lives in a tree. The root of the tree is . (yes, a single dot). Below it sit the top-level domains: .com, .org, .net, country codes like .uk, .jp, and .io, and the newer gTLDs (.app, .dev). Below each TLD sit second-level domains: example.com, semicolony.dev. Each level is delegated: the root knows which servers run .com, the .com servers know which servers run example.com, and so on. NS (name-server) records glue the levels together.

The simulator above shows the journey of a single query through the chain. Try www.example.com with caches cold (Clear cache, then Resolve): the browser asks the OS resolver, the OS resolver asks the recursive resolver, the recursive resolver asks the root, the root refers it to the .com TLD servers, those refer it to dns1.example.com, and that authoritative server finally answers with 93.184.216.34. Each hop between servers is a separate exchange, classically over UDP. Then try the same query again with caches warm: the browser cache answers it locally in microseconds.
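The cold-then-warm behaviour can be sketched in a few lines. This is a toy model, not the simulator's own code: the CacheLayer class, the per-layer TTL caps (60 s for the browser, 120 s for the OS) and the 300-second record TTL are all assumptions chosen for illustration, and time is passed in explicitly so the run is deterministic.

```python
# Toy authoritative data: name -> (address, TTL in seconds). Values are illustrative.
AUTHORITATIVE = {"www.example.com": ("93.184.216.34", 300)}

class CacheLayer:
    def __init__(self, name, max_ttl):
        self.name = name
        self.max_ttl = max_ttl      # hypothetical cap this layer puts on any TTL
        self.entries = {}           # name -> (address, expiry time)

    def lookup(self, name, now):
        entry = self.entries.get(name)
        if entry is None or entry[1] <= now:
            return None
        return entry[0], entry[1] - now   # address and remaining TTL

    def store(self, name, address, ttl, now):
        self.entries[name] = (address, now + min(ttl, self.max_ttl))

def resolve(name, layers, now):
    """Walk the cache chain; on a full miss, walk root -> TLD -> authoritative."""
    trace, missed, hit = [], [], None
    for layer in layers:
        trace.append(layer.name)
        hit = layer.lookup(name, now)
        if hit is not None:
            break
        missed.append(layer)
    if hit is None:
        trace += ["root", "tld", "authoritative"]
        hit = AUTHORITATIVE[name]
    address, ttl = hit
    for layer in missed:              # seed every layer that missed on the way back
        layer.store(name, address, ttl, now)
    return address, trace

layers = [CacheLayer("browser", 60), CacheLayer("os", 120), CacheLayer("recursive", 86400)]
for t in (0, 1, 90, 200, 400):
    print(t, resolve("www.example.com", layers, t)[1])
```

At t=0 the trace runs through all six layers; at t=1 it stops at the browser; at t=90 and t=200 it grows back as the shorter-lived entries expire; at t=400 every copy has aged out and the full walk repeats.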

The recursive resolver — your only DNS contact

The one server you actually talk to.

A client (your laptop, phone, server) never talks to the root, the TLD, or the authoritative servers directly. It only ever talks to one server: its configured recursive resolver. The resolver does the walking. The client asks "A record for www.example.com?" once and gets one answer back; the handful of UDP exchanges that happened in between are entirely hidden.

Which resolver? Whichever one your OS is configured to use. On a home network it's usually your router's resolver, which forwards to your ISP's resolver. On a corporate network it's the corporate resolver. Many people now run public resolvers directly: Google Public DNS (8.8.8.8, 8.8.4.4), Cloudflare 1.1.1.1, Quad9 (9.9.9.9), OpenDNS, AdGuard. Public resolvers compete on latency, on privacy posture (Cloudflare publishes a no-logs commitment), on DoH/DoT support, on DNSSEC validation, and on filtering (Quad9 blocks known malware domains).

The recursive resolver is the layer where caching matters most. A query for www.google.com from a busy resolver hits the cache; the resolver returns the answer in under a millisecond without touching the wider network. Resolvers serve billions of queries per second this way. Cloudflare's 1.1.1.1 blog post details their architecture: BGP anycast to put a resolver near every user; LRU caches sized to billions of entries; per-query latency under 10ms for cache hits, under 50ms for misses to popular zones.

The namespace tree — root, TLD, SLD, subdomain

Delegation, level by level.

The DNS namespace is a tree, read right-to-left. www.example.com. (note the trailing dot — the root) parses as: the root, then the com top-level domain, then the example second-level domain, then the www subdomain. Each level is a zone, and each zone is delegated to a set of authoritative name servers.

NS records are the glue. The root zone holds NS records for every TLD: com NS a.gtld-servers.net. The .com zone holds NS records for every domain registered under it: example.com NS dns1.example.com. The example.com zone holds the actual A, AAAA, MX, TXT records. A recursive resolver walking down the tree follows the NS chain at every step.
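As a rough illustration of following the NS chain, the sketch below splits a name into its enclosing zones from the root downward and looks up each delegation in a toy table. The DELEGATIONS dictionary only mirrors the two NS records quoted above; the function names are hypothetical.

```python
# Toy delegation data mirroring the NS records quoted in the text.
DELEGATIONS = {
    ".": {"com.": "a.gtld-servers.net."},
    "com.": {"example.com.": "dns1.example.com."},
}

def zone_chain(name: str) -> list[str]:
    """Return the name's ancestors from the root down, e.g. '.', 'com.', ..."""
    labels = [label for label in name.split(".") if label]
    chain = ["."]
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain

def referrals(name: str) -> list[tuple[str, str, str]]:
    """List (parent zone, child zone, name server) for each delegation followed."""
    chain = zone_chain(name)
    hops = []
    for parent, child in zip(chain, chain[1:]):
        ns = DELEGATIONS.get(parent, {}).get(child)
        if ns is None:
            break  # no further delegation: the last zone reached answers directly
        hops.append((parent, child, ns))
    return hops

print(zone_chain("www.example.com"))
for hop in referrals("www.example.com"):
    print(hop)
```

The walk stops after two referrals because www is not delegated any further: it is just a record inside the example.com zone.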

Delegation explains why DNS is decentralised in practice even though it has a single root. The root operators run the root zone. ICANN delegates TLDs to registries (Verisign for .com and .net; PIR for .org; Google Registry for .dev). Registrars sell domains under those TLDs. The owner of example.com runs (or pays someone to run) the authoritative servers for that zone, and controls every record under it. No outside party can change example.com's records without access to those authoritative servers or to the registrar account that sets the zone's delegation.

The thirteen root servers — anycast, not single boxes

A through M, hundreds of instances each.

There are thirteen root server identities, named a.root-servers.net through m.root-servers.net, run by twelve independent organisations: Verisign runs A and J; USC-ISI runs B; Cogent runs C; the University of Maryland runs D; NASA Ames runs E; Internet Systems Consortium runs F; the US Department of Defense NIC runs G; the US Army Research Lab runs H; Netnod runs I; RIPE NCC runs K; ICANN runs L; and the WIDE Project runs M. The list at root-servers.org is public.

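The "thirteen" traces back to a packet-size limit. RFC 1035 caps DNS over UDP at 512 bytes, and the root NS records plus their A records need to fit in a single response. With name compression, thirteen NS records and thirteen A records come to roughly 436 bytes, which fits with a little headroom. EDNS (introduced by RFC 2671 in 1999, now specified by RFC 6891) lets resolvers advertise larger UDP payloads, commonly 4096 bytes, and AAAA records were added later, but the thirteen-identity tradition stayed.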
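The packet-size argument can be checked with back-of-envelope arithmetic. The sketch below assumes the usual RFC 1035 name compression (the first server name is written in full, later ones as a one-letter label plus a 2-byte pointer) and counts only NS and IPv4 A records; the function name is made up for the example.

```python
HEADER = 12                 # fixed DNS header
QUESTION = 1 + 2 + 2        # root name (a single zero byte), QTYPE, QCLASS
RR_FIXED = 2 + 2 + 4 + 2    # TYPE, CLASS, TTL, RDLENGTH in every resource record

def priming_response_size(n_servers: int) -> int:
    """Estimate the size of a root NS response with one A record per server."""
    full_name = len(b"\x01a\x0croot-servers\x03net\x00")  # 20 bytes, uncompressed
    compressed_name = 1 + 1 + 2          # length byte, one letter, pointer
    owner_root = 1                       # NS records are owned by the root name
    first_ns = owner_root + RR_FIXED + full_name
    other_ns = owner_root + RR_FIXED + compressed_name
    a_record = 2 + RR_FIXED + 4          # owner as a pointer, 4-byte IPv4 address
    return (HEADER + QUESTION + first_ns
            + (n_servers - 1) * other_ns + n_servers * a_record)

print(priming_response_size(13))  # 436
```

Under these assumptions the thirteen-server response comes to 436 bytes, inside the 512-byte limit.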

Each identity is not a single machine. Through BGP anycast (RFC 4786), every identity is advertised from hundreds of physical instances worldwide. A query to k.root-servers.net is routed to whichever instance is closest in BGP terms — usually in the same metro as the querying resolver. The root-servers.org map counts well over a thousand root server instances combined. The root system handles tens of billions of queries per day, mostly answered in single-digit milliseconds, mostly cacheable.

This is also why no one has "taken down DNS" by attacking root servers. The 2002 and 2007 attacks on the root nominally hit named identities but failed to disrupt service: anycast means the actual servers are spread across hundreds of locations, and the per-query work is trivial because nearly every recursive resolver in the world caches the TLD delegations it learns from the root for up to two days at a time.

Record types — A, AAAA, CNAME, MX, TXT, NS, SOA, CAA

What lives at a name.

A DNS query carries a name and a record type. The type tells the resolver which record to return. The major types and what they're for:

Type	Maps name to	Used for
A	IPv4 address	web, generic IPv4 services
AAAA	IPv6 address	IPv6 services, dual-stack hosts
CNAME	another name (alias)	CDN front ends; pointing one name at another
MX	mail server name + priority	SMTP routing
TXT	arbitrary text	SPF, DKIM, domain ownership proofs
NS	name server for a zone	delegation
SOA	start-of-authority for a zone	zone metadata, refresh intervals, negative TTL
CAA	certificate authority allowed	restricts which CAs may issue for the name
PTR	reverse lookup IP → name	logs, sender reputation
SRV	service host + port	SIP, XMPP, internal service discovery
DS / DNSKEY / RRSIG	DNSSEC key digests, keys, and signatures	chain-of-trust validation

The CNAME record is unusual: it returns another name instead of an address. Resolvers follow the chain transparently. The CDN industry depends on this — www.example.com CNAME example.cdn-provider.net means every lookup for the customer's name eventually hits the CDN's authoritative servers, which return a latency-optimised A record per client geography. CNAME chains have a rule: a name with a CNAME record cannot also hold other record types, which is why the apex of a zone (example.com, no www) can't be CNAMEd — it must hold the SOA and NS records of the zone.
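Following a CNAME chain is a short loop with two guards, one for loops and one for runaway length. The sketch below works over an in-memory table; the record data reuses the names from the paragraph above, the address is from a documentation range, and resolve_a with its max_hops limit is a hypothetical helper rather than any resolver's real API.

```python
# Toy record table: (name, type) -> value. Address is from a documentation range.
RECORDS = {
    ("www.example.com.", "CNAME"): "example.cdn-provider.net.",
    ("example.cdn-provider.net.", "A"): "203.0.113.10",
}

def resolve_a(name, records, max_hops=8):
    """Follow CNAMEs from name until an A record is found."""
    chain = [name]
    for _ in range(max_hops):
        address = records.get((chain[-1], "A"))
        if address is not None:
            return address, chain
        target = records.get((chain[-1], "CNAME"))
        if target is None:
            raise LookupError(f"no A or CNAME record for {chain[-1]}")
        if target in chain:
            raise LookupError(f"CNAME loop at {target}")
        chain.append(target)
    raise LookupError("CNAME chain too long")

print(resolve_a("www.example.com.", RECORDS))
# ('203.0.113.10', ['www.example.com.', 'example.cdn-provider.net.'])
```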

Caching and TTL — why DNS works under load

Every layer keeps its own copy.

The reason DNS scales at all is caching. Every recursive resolver on the planet caches the answer to a popular query (www.google.com, api.cloudflare.com, www.github.com) for the duration of that record's TTL. The authoritative server is therefore queried only when an existing cache entry expires — typically every few minutes for short-TTL records, every few days for stable ones. Without caching, the root and TLD servers would receive every query from every client on Earth and the system would collapse on day one.

The TTL is set by the zone owner. Trade-offs:

60 seconds — used by CDNs (Cloudflare, Fastly, Akamai) and by hosts about to migrate. Lets you cut over quickly when you change the underlying IP, but multiplies query load on your authoritative servers.
300 seconds (5 min) — typical default for application hosts. Reasonable speed of propagation; tolerable load.
3600 seconds (1 hour) — typical for stable web records.
86400 seconds (24h) — typical for NS records and large infrastructure that doesn't change. Light load; you cannot reroute traffic within the day.
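The load side of the trade-off is simple division. As a rough model, assume a fixed population of recursive resolvers that each re-fetch a popular record once per TTL; the resolver count of 100,000 is an invented figure for illustration.

```python
def authoritative_qps(resolver_count: int, ttl_seconds: int) -> float:
    """Steady-state queries per second if every resolver refetches once per TTL."""
    return resolver_count / ttl_seconds

for ttl in (60, 300, 3600, 86400):
    print(f"TTL {ttl:>5}s -> {authoritative_qps(100_000, ttl):8.2f} queries/s")
```

Under this model, dropping from a one-day TTL to a one-minute TTL multiplies authoritative load by 1440, which is the price of a fast cutover.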

When you plan a DNS-based cutover (changing the IP behind a name), reduce the TTL ahead of the change, by at least one full period of the old TTL, so caches don't hold the old value past the cutover window. This is the "DNS propagation" effect users complain about: it's not propagation, it's TTL expiry. Recursive resolvers everywhere will independently re-query when their local TTL hits zero, and only then will they see the new value. If you forgot to lower the TTL before the change, some users see the old IP for as long as the old TTL allows, which can be hours.
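The timing rule reduces to two date calculations: a cached copy can live for at most the TTL it was served with, so the TTL has to be lowered at least one old-TTL period before the cutover. The function names and dates below are illustrative.

```python
from datetime import datetime, timedelta

def latest_time_to_lower_ttl(cutover: datetime, old_ttl: int) -> datetime:
    """Last moment to lower the TTL so every old-TTL copy expires before cutover."""
    return cutover - timedelta(seconds=old_ttl)

def worst_case_stale_until(change_time: datetime, ttl_in_caches: int) -> datetime:
    """Latest time a cache can still serve the old address after the change."""
    return change_time + timedelta(seconds=ttl_in_caches)

cutover = datetime(2024, 6, 2, 12, 0)
print(latest_time_to_lower_ttl(cutover, 86400))   # 2024-06-01 12:00:00
print(worst_case_stale_until(cutover, 60))        # 2024-06-02 12:01:00
print(worst_case_stale_until(cutover, 86400))     # 2024-06-03 12:00:00 if the TTL was never lowered
```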

Negative caching — NXDOMAIN is a real answer

Per RFC 2308.

A name that does not exist returns an NXDOMAIN response. RFC 2308 (1998) specified that NXDOMAIN responses must be cached, with the negative TTL drawn from the zone's SOA record (the lower of the SOA record's own TTL and its MINIMUM field). Otherwise a typo like www.googel.com typed into a million browsers would keep hammering the root and the .com servers forever.

Negative TTLs are typically shorter than positive ones — 300 seconds is common for SOA minimum. The reasoning: a positive answer rarely flips to "no such name", but a non-existent name might be created at any moment by the zone owner. Long negative caching would leave users staring at "this domain does not exist" for hours after the owner spun up the new record.
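A negative cache is small enough to sketch in full. Under RFC 2308 the lifetime of a cached NXDOMAIN is the lower of the SOA record's TTL and its MINIMUM field; the NegativeCache class and the sample numbers are assumptions for illustration, and time is passed in explicitly.

```python
def negative_ttl(soa_ttl: int, soa_minimum: int) -> int:
    """RFC 2308: cache a negative answer for min(SOA TTL, SOA MINIMUM) seconds."""
    return min(soa_ttl, soa_minimum)

class NegativeCache:
    def __init__(self):
        self._expiry = {}   # name -> time at which the NXDOMAIN entry expires

    def remember_nxdomain(self, name, soa_ttl, soa_minimum, now):
        self._expiry[name] = now + negative_ttl(soa_ttl, soa_minimum)

    def is_known_missing(self, name, now):
        expiry = self._expiry.get(name)
        return expiry is not None and expiry > now

cache = NegativeCache()
cache.remember_nxdomain("www.googel.com.", soa_ttl=3600, soa_minimum=300, now=0)
print(cache.is_known_missing("www.googel.com.", now=299))  # True: answered locally
print(cache.is_known_missing("www.googel.com.", now=300))  # False: ask upstream again
```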

The 2021 Facebook outage exposed an unhappy interaction. When Facebook's BGP routes withdrew, their authoritative DNS servers also became unreachable. Recursive resolvers' caches for facebook.com records expired, leading to SERVFAIL responses (not NXDOMAIN, but treated similarly by client retry logic). Resolvers retried, hard. The query rate from the world's recursive resolvers to Facebook's now-unreachable name servers spiked into the millions per second, choking the upstream networks for hours after Facebook brought BGP back. SERVFAIL is handled differently from NXDOMAIN: RFC 2308 makes caching of server failures optional and caps it at five minutes, deliberately keeping it short so transient failures self-heal.

Modern transports — DoH and DoT

Encryption, not speed.

Classic DNS travels in cleartext UDP. Every router on the path, every WiFi sniffer, every captive portal can see which names you resolve, and many actively rewrite the responses (split-horizon, ad-injection, NXDOMAIN-to-search-page hijacks at hotel networks). Two newer transports wrap DNS in TLS to fix this.

DoT — DNS over TLS (RFC 7858) runs classic DNS-over-TCP inside TLS on port 853. A long-lived TLS connection between the client and the recursive resolver carries every query and response. Easy for resolvers to deploy (the protocol inside the TLS is unchanged); easy for network operators to identify (port 853) and potentially block.
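"Classic DNS-over-TCP" means each message on the stream is preceded by a two-byte, big-endian length, and DoT carries exactly that framing inside the TLS session. A minimal sketch of the framing alone, with no sockets or TLS involved (the helper names are illustrative):

```python
import struct

def frame(message: bytes) -> bytes:
    """Prefix a DNS message with its 2-byte big-endian length, as DNS over TCP does."""
    return struct.pack("!H", len(message)) + message

def unframe(stream: bytes) -> list[bytes]:
    """Split a byte stream back into complete DNS messages; ignore a trailing partial one."""
    messages, offset = [], 0
    while offset + 2 <= len(stream):
        (length,) = struct.unpack_from("!H", stream, offset)
        end = offset + 2 + length
        if end > len(stream):
            break  # the rest of this message has not arrived yet
        messages.append(stream[offset + 2:end])
        offset = end
    return messages

stream = frame(b"query-one") + frame(b"query-two")
print(unframe(stream))  # [b'query-one', b'query-two']
```

The length prefix is what lets many queries share one long-lived connection without any other delimiter.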

DoH — DNS over HTTPS (RFC 8484) encapsulates DNS queries in HTTPS POST or GET on port 443. The query becomes indistinguishable from any other HTTPS traffic to an outside observer. Firefox enables DoH by default in some regions, using a resolver from its trusted-resolver programme such as Cloudflare or NextDNS. Chrome supports DoH but does not switch resolvers on its own; it upgrades to DoH only when the system's configured resolver is known to offer it.
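For the GET form, RFC 8484 puts the ordinary wire-format query into a dns= parameter, base64url-encoded with the padding removed, and suggests a query ID of 0 so HTTP caches can reuse responses. The sketch below only builds the URL and sends nothing; the endpoint https://doh.example/dns-query is a placeholder, not a real service.

```python
import base64
import struct

def doh_get_url(name: str, endpoint: str = "https://doh.example/dns-query") -> str:
    """Build an RFC 8484 GET URL for an A-record query (endpoint is hypothetical)."""
    # Wire-format query: ID 0, RD flag set, one question, type A, class IN.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    message = header + qname + struct.pack("!HH", 1, 1)
    # base64url without '=' padding, as the RFC requires.
    encoded = base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")
    return f"{endpoint}?dns={encoded}"

print(doh_get_url("www.example.com"))
# https://doh.example/dns-query?dns=AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB
```

A real client would send this with an Accept: application/dns-message header and parse the binary response body as a normal DNS message.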

Neither protocol is faster than classic DNS — TLS adds a handshake, and the first query suffers the TCP plus TLS round-trip penalty. Sessions are reused for subsequent queries, so the warm-state latency is comparable to UDP DNS, but the cold-start cost is higher. What you gain is privacy from on-path observers (your ISP no longer sees your DNS), resistance to NXDOMAIN hijacking, and integrity (no in-flight rewrites). What you lose is local resolver autonomy if the client is configured to bypass the system resolver and go straight to a public DoH provider.

DNSSEC — the chain of trust

Origin authentication for DNS answers.

DNSSEC (RFCs 4033, 4034, 4035) signs DNS records cryptographically. Every record set in a zone is signed by the zone's key; the zone key is in turn vouched for by a signed record in the parent zone; the chain bottoms out at the root, whose key is hardcoded into resolvers. A DNSSEC-validating resolver can detect a tampered answer because its signature won't verify against a key that chains back to the root.

The wire records: DNSKEY holds the public signing key for a zone. RRSIG holds the signature of a record set. DS (Delegation Signer) sits in the parent zone and asserts the hash of the child's DNSKEY — this is the link that ties one zone's trust to the next. The chain runs from the root DNSKEY (the "Trust Anchor") down through TLD DS records, TLD DNSKEY, second-level DS, second-level DNSKEY, and finally the RRSIG over the leaf record.
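The DS link is a plain hash. Per RFC 4034, the digest covers the child's owner name in canonical wire form followed by the DNSKEY RDATA (flags, protocol, algorithm, public key). The sketch below computes a SHA-256 digest over a placeholder key; the 32 bytes used as the public key are dummy data, not a real key, and the helper names are illustrative.

```python
import hashlib
import struct

def name_to_wire(name: str) -> bytes:
    """Canonical wire form: lowercase labels, length-prefixed, ending in a zero byte."""
    labels = [label for label in name.lower().split(".") if label]
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"

def ds_digest_sha256(owner: str, flags: int, protocol: int,
                     algorithm: int, public_key: bytes) -> str:
    """Digest = SHA-256(owner name | DNSKEY RDATA), as used in a type-2 DS record."""
    dnskey_rdata = struct.pack("!HBB", flags, protocol, algorithm) + public_key
    return hashlib.sha256(name_to_wire(owner) + dnskey_rdata).hexdigest()

dummy_key = bytes(range(32))  # placeholder bytes standing in for a real public key
# flags 257 = key-signing key, protocol is always 3, algorithm 13 = ECDSA P-256.
print(ds_digest_sha256("example.com.", 257, 3, 13, dummy_key))
```

The parent publishes this digest in its own signed zone, so swapping the child's key without updating the parent's DS record breaks validation.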

DNSSEC adoption is uneven. The root and the generic TLDs are signed, as are most ccTLDs. Most large second-level zones are not. Operating DNSSEC correctly is harder than running plain DNS: key rotations require careful timing (RFC 6781), and a misconfigured signature breaks the zone entirely until fixed. The 2014 Cloudflare engineering write-up on DNSSEC deployment is candid about the operational cost.

What DNSSEC does not do: encrypt the query. Anyone on the path still sees which name you asked for. DNSSEC protects integrity and authenticity, not confidentiality. Combine with DoH or DoT for both.

CDNs and DNS — latency-based routing

Different clients get different answers.

A content delivery network places points of presence in dozens of metros worldwide. The trick that ties them together is the DNS layer: a request from Tokyo for cdn.example.com resolves to a Tokyo POP; the same name from London resolves to a London POP. Two clients query the same name; they get two different answers.

This works because the authoritative servers behind cdn.example.com are owned by the CDN, and they pick the answer based on the resolver's IP address (a rough geographic proxy) or the client subnet hint (ECS, RFC 7871) if the resolver supports it. GeoDNS is the simple version: a geographic lookup table maps resolver IP to POP. Latency-based routing goes further by measuring real round-trip latency from each POP to each resolver and picking the lowest.

The other half of the trick is anycast. The CDN advertises the same IP from every POP via BGP; whichever POP is BGP-closest receives the packet. DNS-based routing decides "which IP" — anycast decides "which physical location for that IP". A modern CDN uses both layers: DNS picks a regional address (or set of addresses); anycast picks the actual POP inside that region.

This is also why DNS-based load balancers are limited. The granularity is the resolver, not the client. A million clients behind one corporate resolver all see the same answer. The ECS extension exposes a hint about the client subnet but is not universally honoured. For per-client traffic shaping you need load balancers in the data path, not DNS.

What breaks — TTL races, resolver bugs, BGP cascades

The failure modes are all in the seams.

TTL races on cutover. You change www.example.com from the old IP to the new IP, expecting it to flip within a minute because you set TTL to 60. Half your users see the new server immediately. The other half see the old server for hours because their resolver cached the answer with the previous (longer) TTL, before you lowered it. Best practice: lower the TTL at least one full old-TTL period before the change (24 hours if the old TTL was 86400), so every cached copy in the world has aged out and re-queried at the short value before the cutover.

Resolver bugs and quirks. Resolvers do not all behave identically. Some respect the SOA negative TTL strictly; some clamp it to a maximum (often a few minutes). Some honour ECS, most don't. Some retry SERVFAIL aggressively; some don't. Some implement DNSSEC validation; some don't. Some leak queries even for cached names (Windows' DNS Client Service has had this bug periodically). Designing for the diversity is mostly about not assuming uniform behaviour: short TTLs for things that change, redundant authoritative servers, and authoritative-server health alarms that look at actual query volume, not just uptime.

Facebook 2021 — the BGP-DNS cascade. On October 4 2021, a Facebook configuration command withdrew the BGP routes to their authoritative DNS servers (along with everything else in their global network). Recursive resolvers worldwide started timing out on facebook.com, instagram.com, and whatsapp.com queries. The retries flooded upstream networks; Cloudflare's 1.1.1.1 saw a 30x spike in SERVFAIL responses. Facebook engineers couldn't get into the data centres because their door access depended on the same internal DNS. Recovery took six hours. The lesson the industry took: keep your authoritative DNS reachable through a path that does not depend on your service being up.

Split-horizon gone wrong. Many enterprises run two views of a zone — internal (for VPN'd employees) and external (for the public Internet). When the two views drift, employees on the corporate network can't access services that work from home, or vice versa. Worse, when the internal view is taken offline for maintenance and the external view is the only copy left, clients on the VPN see SERVFAIL and the helpdesk floods.

Reachability rule

Your authoritative DNS, your monitoring, and your incident-response runbooks must work when your service is down. If they all live behind your own network and DNS, the worst-case outage becomes recursive.

Further reading on DNS