QUIC and HTTP/3
QUIC is the transport protocol that took the IETF roughly five years to standardise and the wider industry about twenty years to want. It runs on top of UDP, ships its own loss recovery and congestion control, bakes TLS 1.3 into the handshake, and gives applications independent streams instead of one big byte queue. HTTP/3 is the HTTP semantics layer that sits on top. Together they cut connection setup on a fresh connection from two round trips to one, and to zero on a return visit, while sidestepping the head-of-line blocking that has haunted HTTP/2 since 2015.
Why QUIC exists — the head-of-line problem
TCP delivers an ordered byte stream. The kernel hands bytes up to the application in the order they were sent, no exceptions. When a packet is lost, every byte that arrived after it sits in the receive buffer and waits until the retransmission lands and fills the gap. The application doesn’t see the packetisation; it sees a stall. This is head-of-line blocking, and it’s fundamental — you can’t fix it without changing the transport.
HTTP/1.1 hid the problem by opening six TCP connections per origin. A loss on one only stalled the request riding on it. HTTP/2, shipped in 2015, tried to do better: one connection, many multiplexed streams interleaved with HTTP/2 frames. Smaller memory footprint, fewer handshakes, better header compression. But the multiplexed streams still ride a single TCP byte stream — and one lost segment stalls every stream behind it. On a clean fibre link the difference is invisible. On a 1% loss mobile link, with a page that pulls 80 small subresources, it’s painfully visible.
Why did it take twenty years to fix? Because the obvious solution — change TCP — runs into a wall. TCP lives in the kernel. Middleboxes (NATs, firewalls, traffic shapers) inspect TCP options and drop packets they don’t recognise. A new TCP extension can take a decade to deploy widely enough to be useful; even MPTCP, standardised in 2013, is still rare in practice in 2026. QUIC sidesteps the whole problem by riding inside UDP, where middleboxes only see opaque datagrams, and by living in userspace, where every application can ship a new version with its next release.
QUIC's two big ideas
Strip QUIC down and there are two real innovations underneath, both pragmatic responses to twenty years of lessons from HTTP-over-TCP.
The first is independent streams. A QUIC connection carries many streams at once, each with its own ordering and flow control. A lost packet only stalls the streams whose bytes it carried; everything else keeps delivering. Stream 0 can be 80% delivered while stream 4 is blocked on a retransmission, and the receiving application can act on stream 0’s data right away. This is the feature HTTP/2 wanted but couldn’t have, because HTTP/2 was bolted onto TCP.
The second is the integrated TLS 1.3 handshake. Classic TCP+TLS is a chain: one round trip to establish TCP, then one or two more for the TLS handshake, then your first HTTP byte goes on the wire. That’s 2–3 RTTs of dead air before the application sees anything. QUIC merges the two: the very first packet a client sends carries both transport setup and the TLS ClientHello, protected with Initial keys derived from the Destination Connection ID the client picked for that packet. Anyone who sees the packet can derive those keys, so they guard against off-path injection rather than eavesdropping; real confidentiality starts with the handshake keys. By the end of one round trip you have a working transport and a working encrypted session. On a return visit, a saved TLS session ticket lets you skip even that one RTT.
Everything else in QUIC — connection migration, packet number spaces, the specific frame types — falls out of these two choices. Independent streams need a richer frame format than TCP’s flat segment header. The integrated handshake forces packet number spaces, because you have to send and acknowledge packets before you’ve negotiated app-data keys.
The QUIC handshake — one RTT, often zero
QUIC sorts its packets into three packet number spaces: Initial, Handshake, and Application data. Each space has its own monotonically increasing packet number, and each is protected with its own keys (the Application data space carries both 0-RTT and 1-RTT packets, which use different keys). The handshake interleaves them: the client sends an Initial packet carrying a TLS ClientHello; the server responds with an Initial (ServerHello) and a Handshake packet (EncryptedExtensions, Certificate, CertificateVerify, Finished); the client completes with a Handshake (Finished) and immediately starts sending 1-RTT (application-data) packets.
Compared to the legacy stack: TCP costs you one RTT for SYN / SYN-ACK / ACK, then TLS 1.3 costs one more for ClientHello / ServerHello+Finished / Finished, then your first HTTP byte. Two RTTs of dead air, and the second one only began once the first finished. QUIC does both in one round trip. On a 50 ms path that’s 50 ms saved on every cold connection — small per request, large in aggregate across a page that touches multiple hostnames.
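The arithmetic above is easy to put in a table. The sketch below only counts setup round trips before the first request byte can leave the client; it ignores server think time, loss, and packet sizes. The names `SETUP_RTTS` and `setup_delay_ms` are illustrative, not from any library.

```python
# Round trips spent on connection setup before the first HTTP request
# can be sent, using the counts given in the text.
SETUP_RTTS = {
    "TCP + TLS 1.2": 3,  # 1 for the TCP handshake, 2 for TLS 1.2
    "TCP + TLS 1.3": 2,  # 1 for the TCP handshake, 1 for TLS 1.3
    "QUIC (cold)": 1,    # transport setup and TLS share one round trip
}


def setup_delay_ms(rtt_ms: float, stack: str) -> float:
    """Dead air before the first request byte, for a given path RTT."""
    return rtt_ms * SETUP_RTTS[stack]


def saving_ms(rtt_ms: float, old: str, new: str) -> float:
    """Setup time saved by moving from one stack to another."""
    return setup_delay_ms(rtt_ms, old) - setup_delay_ms(rtt_ms, new)


if __name__ == "__main__":
    for stack in SETUP_RTTS:
        print(f"{stack:14s} {setup_delay_ms(50, stack):5.0f} ms")
    print("saved vs TCP + TLS 1.3:",
          saving_ms(50, "TCP + TLS 1.3", "QUIC (cold)"), "ms")
```

On the 50 ms path from the text, the difference between TCP + TLS 1.3 and a cold QUIC connection comes out as exactly one RTT.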
0-RTT works by remembering. After a successful 1-RTT handshake, the server sends a TLS session ticket. The client stashes it, and on the next connection sends its Initial packet together with 0-RTT packets that carry application data encrypted with keys derived from the ticket. The server can read the request and start sending the response before the handshake has finished.
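Early data arrives before the handshake has proven the client is live, so a captured 0-RTT flight can be replayed. A client therefore has to decide which requests may ride it. The sketch below is one plausible policy, not any particular stack's behaviour: the `SAFE_METHODS` set and both function names are assumptions made for illustration.

```python
# Assumption: only HTTP methods defined as "safe" are allowed in early data.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def can_send_as_early_data(method: str, has_session_ticket: bool) -> bool:
    """True if this request may go out in 0-RTT packets."""
    return has_session_ticket and method.upper() in SAFE_METHODS


def split_first_flight(requests, has_session_ticket):
    """Partition queued (method, path) pairs into early and deferred lists.

    Deferred requests wait until the handshake completes and go out in
    1-RTT packets instead.
    """
    early, deferred = [], []
    for method, path in requests:
        if can_send_as_early_data(method, has_session_ticket):
            early.append((method, path))
        else:
            deferred.append((method, path))
    return early, deferred


if __name__ == "__main__":
    queue = [("GET", "/index.html"), ("POST", "/charge"), ("GET", "/app.css")]
    print(split_first_flight(queue, has_session_ticket=True))
```

Without a cached ticket everything lands in the deferred list, which is the ordinary 1-RTT case.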
Streams — independent byte streams in one connection
A QUIC stream is a lightweight, ordered byte sequence with its own flow control. Inside QUIC packets, application data rides in STREAM frames; each frame carries a stream ID, an offset into that stream, and a chunk of bytes. The receiver reassembles them in order per stream. If a packet is lost, only the streams whose STREAM frames were in it are blocked; everything else delivers normally.
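A toy receiver makes the per-stream reassembly concrete. This is a minimal sketch, assuming retransmitted chunks never partially overlap chunks that are still buffered; `StreamReassembler` and `Connection` are hypothetical names, and real stacks also enforce flow-control limits and handle FIN.

```python
class StreamReassembler:
    """Buffers out-of-order chunks for one stream, releases contiguous bytes."""

    def __init__(self):
        self.next_offset = 0   # first byte not yet delivered to the app
        self.pending = {}      # offset -> chunk waiting for a gap to fill

    def on_frame(self, offset: int, data: bytes) -> bytes:
        # Drop the part of the chunk that was already delivered.
        if offset < self.next_offset:
            data = data[self.next_offset - offset:]
            offset = self.next_offset
        if data:
            self.pending.setdefault(offset, data)
        # Deliver everything that is now contiguous.
        out = bytearray()
        while self.next_offset in self.pending:
            chunk = self.pending.pop(self.next_offset)
            out += chunk
            self.next_offset += len(chunk)
        return bytes(out)


class Connection:
    """Routes STREAM frames to one reassembler per stream ID."""

    def __init__(self):
        self.streams = {}

    def on_stream_frame(self, stream_id: int, offset: int, data: bytes) -> bytes:
        if stream_id not in self.streams:
            self.streams[stream_id] = StreamReassembler()
        return self.streams[stream_id].on_frame(offset, data)


if __name__ == "__main__":
    conn = Connection()
    print(conn.on_stream_frame(0, 0, b"hello"))   # stream 0 delivers at once
    print(conn.on_stream_frame(4, 3, b"def"))     # stream 4 waits for bytes 0..2
    print(conn.on_stream_frame(0, 5, b" world"))  # stream 0 is not held up
    print(conn.on_stream_frame(4, 0, b"abc"))     # gap filled, stream 4 catches up
```

The gap on stream 4 never touches stream 0, which is the whole point of carrying an offset per stream rather than one sequence number per connection.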
Stream IDs are 62-bit integers with structure baked into the low two bits. Bit 0 says client-initiated (0) or server-initiated (1). Bit 1 says bidirectional (0) or unidirectional (1). So stream IDs 0, 4, 8, 12… are client-initiated bidi; 1, 5, 9… server-initiated bidi; 2, 6, 10… client-initiated uni; 3, 7, 11… server-initiated uni. The structure means either endpoint can open a new stream without coordinating IDs.
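The bit layout is small enough to check by hand. The helpers below are an illustrative sketch (the function names are made up here); they decode the low two bits and build the n-th stream ID of a given type.

```python
def describe_stream(stream_id: int) -> tuple[str, str]:
    """Return (initiator, direction) encoded in the low two bits."""
    if not 0 <= stream_id < 2**62:
        raise ValueError("stream IDs are 62-bit unsigned integers")
    initiator = "server" if stream_id & 0x1 else "client"
    direction = "unidirectional" if stream_id & 0x2 else "bidirectional"
    return initiator, direction


def nth_stream_id(n: int, initiator: str, direction: str) -> int:
    """ID of the n-th stream (counting from 0) of the given type."""
    type_bits = (0x1 if initiator == "server" else 0x0) | \
                (0x2 if direction == "unidirectional" else 0x0)
    return (n << 2) | type_bits


if __name__ == "__main__":
    for sid in (0, 1, 2, 3, 4):
        print(sid, describe_stream(sid))
    print(nth_stream_id(3, "client", "bidirectional"))  # 12
```

Because each endpoint only ever allocates IDs with its own initiator bit, two endpoints opening streams at the same moment can never collide.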
QUIC packet payload (decrypted):
+------------------+--------------------+--------+------------+
| frame type=0x0e  | stream id (varint) | offset | length     |
| (STREAM, with    | e.g. 0 (client     | 0      | 312        |
| off+len flags)   | bidi stream 0)     |        |            |
+------------------+--------------------+--------+------------+
| ... 312 bytes of stream 0 application data ...              |
+-------------------------------------------------------------+
| frame type=0x0e  | stream id          | offset | length     |
| (STREAM)         | 4 (client bidi 4)  | 0      | 89         |
+------------------+--------------------+--------+------------+
| ... 89 bytes of stream 4 application data ...               |
+-------------------------------------------------------------+
One packet can carry frames from many streams. One lost packet only stalls those specific streams until the bytes are resent.
Flow control is per-stream and per-connection: the receiver tells the sender both "stream X can accept up to byte Y" and "the whole connection can accept up to byte Z". The two limits are advertised with MAX_STREAM_DATA and MAX_DATA frames respectively, both can be updated mid-flight, and both do the same job as TCP’s window: keep the sender from drowning the receiver.
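The two limits compose as a simple minimum. Here is a sender-side sketch under simplifying assumptions: every stream starts with the same initial limit, and limits are absolute byte offsets that only ever grow. `FlowControlledSender` and its methods are hypothetical names, not a real library's API.

```python
class FlowControlledSender:
    """Tracks how many bytes the peer's flow-control limits still allow."""

    def __init__(self, max_data: int, initial_max_stream_data: int):
        self.max_data = max_data                      # connection-wide limit
        self.initial_max_stream_data = initial_max_stream_data
        self.max_stream_data = {}                     # stream_id -> limit
        self.sent_total = 0
        self.sent_on_stream = {}                      # stream_id -> bytes sent

    def sendable(self, stream_id: int, wanted: int) -> int:
        stream_limit = self.max_stream_data.get(
            stream_id, self.initial_max_stream_data)
        stream_room = stream_limit - self.sent_on_stream.get(stream_id, 0)
        conn_room = self.max_data - self.sent_total
        return max(0, min(wanted, stream_room, conn_room))

    def send(self, stream_id: int, wanted: int) -> int:
        """Consume credit and return how many bytes may actually go out."""
        n = self.sendable(stream_id, wanted)
        self.sent_on_stream[stream_id] = self.sent_on_stream.get(stream_id, 0) + n
        self.sent_total += n
        return n

    def on_max_stream_data(self, stream_id: int, limit: int) -> None:
        # A stale frame with a lower limit is ignored.
        current = self.max_stream_data.get(stream_id, self.initial_max_stream_data)
        self.max_stream_data[stream_id] = max(current, limit)

    def on_max_data(self, limit: int) -> None:
        self.max_data = max(self.max_data, limit)


if __name__ == "__main__":
    s = FlowControlledSender(max_data=1000, initial_max_stream_data=600)
    print(s.send(0, 800))   # capped by the stream limit
    print(s.send(4, 800))   # capped by what is left of the connection limit
    s.on_max_data(1500)
    print(s.send(4, 800))   # now capped by stream 4's own limit
```

A stream can be blocked by either limit, which is why a sender that stalls has to check both before deciding which one to wait on.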
Packet number spaces
TCP uses one sequence number space for the whole connection. QUIC uses three: Initial, Handshake, and Application data. Each space has its own packet number counter starting at zero and its own acknowledgement and loss-detection state, and packets in different spaces are protected with different keys. The split exists because the handshake itself has to send and acknowledge packets before the application-data keys are available, and mixing them in one sequence space would either leak handshake structure or force re-keying gymnastics.
Packet numbers are monotonically increasing within a space and never reused, even across retransmissions. If you send packet 42 and it gets lost, the retransmission rides packet 67 (or whatever the current value is) carrying the same frames. That removes TCP’s retransmission ambiguity, where the sender can’t tell whether an ACK is for the original send or the retransmission. QUIC’s loss recovery gets a clean RTT sample from every acknowledged packet, retransmissions included, which TCP can only approximate with the timestamps option.
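A few lines of bookkeeping show why fresh packet numbers make RTT sampling unambiguous. This is a minimal sketch with times in integer milliseconds; `SentPacketLog` is a hypothetical name, and real loss recovery tracks far more state per packet.

```python
class SentPacketLog:
    """Remembers when each packet number was sent, within one number space."""

    def __init__(self):
        self.next_pn = 0
        self.sent = {}  # packet number -> (send_time_ms, frames)

    def send(self, frames, now_ms: int) -> int:
        pn = self.next_pn
        self.next_pn += 1          # packet numbers are never reused
        self.sent[pn] = (now_ms, frames)
        return pn

    def retransmit(self, lost_pn: int, now_ms: int) -> int:
        """Resend the frames of a lost packet under a brand-new number."""
        _, frames = self.sent.pop(lost_pn)
        return self.send(frames, now_ms)

    def on_ack(self, pn: int, now_ms: int):
        """Return an RTT sample in ms, or None for an unknown packet."""
        entry = self.sent.pop(pn, None)
        if entry is None:
            return None
        return now_ms - entry[0]


if __name__ == "__main__":
    log = SentPacketLog()
    original = log.send(["STREAM 0"], now_ms=0)
    resent = log.retransmit(original, now_ms=300)
    # The ACK names the retransmission's number, so the sample is measured
    # from the retransmission's send time, not the original's.
    print(resent, log.on_ack(resent, now_ms=350))
```

Had the ACK named a reused number, the sample could have been either 50 ms or 350 ms, and the sender would have no way to tell which.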
On the wire, the packet number is truncated to one, two, three, or four bytes. The sender chooses the length from the largest packet number its peer has acknowledged, and the receiver reconstructs the full value from the largest packet number it has successfully processed in that space. This saves header bytes (most packets carry one or two) without losing the monotonicity property. Header protection (an XOR mask derived from a sample of the encrypted payload) then hides the packet number itself from passive observers, so middleboxes can’t learn anything from it.
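The reconstruction step follows the sample decoding algorithm in RFC 9000, Appendix A.3. The version below is a direct Python transcription of that pseudocode; the receiver picks the full packet number closest to "largest seen plus one" whose low bits match what was on the wire.

```python
def decode_packet_number(largest_pn: int, truncated_pn: int, pn_nbits: int) -> int:
    """Recover a full packet number from its truncated wire encoding.

    largest_pn   -- largest packet number successfully processed so far
    truncated_pn -- the 8/16/24/32-bit value read from the header
    pn_nbits     -- number of bits in the truncated encoding
    """
    expected_pn = largest_pn + 1
    pn_win = 1 << pn_nbits
    pn_hwin = pn_win // 2
    pn_mask = pn_win - 1
    # Splice the received low bits onto the high bits of the expected value.
    candidate_pn = (expected_pn & ~pn_mask) | truncated_pn
    # If the candidate is more than half a window away, the true value
    # lies in the neighbouring window.
    if candidate_pn <= expected_pn - pn_hwin and candidate_pn < (1 << 62) - pn_win:
        return candidate_pn + pn_win
    if candidate_pn > expected_pn + pn_hwin and candidate_pn >= pn_win:
        return candidate_pn - pn_win
    return candidate_pn


if __name__ == "__main__":
    # Worked example from RFC 9000, Appendix A.3.
    print(hex(decode_packet_number(0xA82F30EA, 0x9B32, 16)))  # 0xa82f9b32
```

The second call in the test shows the window wrapping: with 255 as the largest seen value, a one-byte `0x01` decodes to 257, not 1.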
Connection IDs and connection migration
TCP pins a connection to a five-tuple — protocol, source IP, source port, destination IP, destination port. Change any of those and the connection is gone. That’s why your TCP-based video call drops when your phone switches from wifi to LTE: the source IP and source port change, the server’s kernel doesn’t recognise the new packets as part of your existing connection, the connection breaks, and the app has to reconnect with a fresh TCP+TLS handshake.
QUIC connections are identified by Connection IDs, not by the five-tuple. Each endpoint assigns IDs that the other end uses on its outgoing packets. When your phone hands off to LTE and your source IP changes, the packets still carry the same Connection ID, the server recognises them as part of the existing session, and the connection survives. No re-handshake, no fresh TLS, no dropped call.
Migration isn’t free of risk. An attacker who sends packets carrying your Connection ID from a victim’s spoofed address could trick the server into aiming a large response at that victim, an amplification attack. QUIC defends with path validation: when packets arrive from a new address, the server sends a PATH_CHALLENGE frame carrying 8 unpredictable bytes and waits for a PATH_RESPONSE echoing them before it will send much traffic on the new path. Until validation completes, the server may send no more than three times the bytes it has received from the new address.
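Path validation and the three-times limit fit in one small state object. The sketch below is illustrative only: `PathState` and its methods are hypothetical, and a real server also handles timeouts, multiple outstanding challenges, and the decision of when to abandon the old path.

```python
import os


class PathState:
    """Server-side view of one network path until it has been validated."""

    AMPLIFICATION_FACTOR = 3  # send at most 3x what was received, unvalidated

    def __init__(self, address):
        self.address = address
        self.validated = False
        self.bytes_received = 0
        self.bytes_sent = 0
        self.challenge = None

    def on_datagram(self, size: int) -> None:
        self.bytes_received += size

    def can_send(self, size: int) -> bool:
        if self.validated:
            return True
        budget = self.AMPLIFICATION_FACTOR * self.bytes_received
        return self.bytes_sent + size <= budget

    def record_sent(self, size: int) -> None:
        self.bytes_sent += size

    def start_validation(self) -> bytes:
        """Payload for a PATH_CHALLENGE frame: 8 unpredictable bytes."""
        self.challenge = os.urandom(8)
        return self.challenge

    def on_path_response(self, data: bytes) -> bool:
        """Mark the path validated only if the peer echoed our challenge."""
        if self.challenge is not None and data == self.challenge:
            self.validated = True
        return self.validated


if __name__ == "__main__":
    path = PathState(("203.0.113.7", 4433))   # documentation-range address
    path.on_datagram(1200)
    print(path.can_send(3600), path.can_send(3601))
    token = path.start_validation()
    print(path.on_path_response(token), path.can_send(10**6))
```

A spoofing attacker never sees the challenge, because it is sent to the address they claimed rather than the one they control, so the budget stays capped.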
In practice, browsers do connection migration silently for HTTP/3, and apps like Chrome hold connections through wifi-to-cellular transitions that would have cost TCP a full reconnect. The benefit is largest for long-lived connections — video calls, gRPC streams, server-sent events — and modest for short HTTP fetches.
Pacing — why QUIC needs to slow down sending
Kernel TCP has had pacing built into the qdisc layer for years. tc-fq (fair queue) spreads sends out evenly so a 100 Mbit/s flow doesn’t fire packets in 1 Gbit/s bursts. Without pacing, congestion controllers like CUBIC or BBR will happily send a whole cwnd of packets back-to-back, drown some downstream buffer, and cause the very loss the controller was trying to avoid.
QUIC runs in userspace over UDP. The kernel has no idea this is a connection, let alone what its congestion window is, so kernel pacing doesn’t apply. The QUIC implementation has to pace itself. The simple way is a timer or token bucket that releases one packet every RTT / cwnd, with cwnd counted in packets, but sendto calls have overhead, and at 10 Gbit/s you’re calling it on the order of a million times a second.
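The simple pacer is one division. The sketch below computes the inter-packet gap and a send schedule from it, assuming cwnd is counted in packets and the RTT is already smoothed; both function names are made up for illustration, and production pacers usually send a little faster than this so the window can still grow.

```python
def pacing_interval_us(rtt_us: float, cwnd_packets: int) -> float:
    """Gap between packets so one cwnd is spread evenly over one RTT."""
    if cwnd_packets <= 0:
        raise ValueError("cwnd must be at least one packet")
    return rtt_us / cwnd_packets


def send_schedule_us(start_us: float, n_packets: int,
                     rtt_us: float, cwnd_packets: int) -> list[float]:
    """Departure times for the next n_packets, paced at RTT / cwnd."""
    gap = pacing_interval_us(rtt_us, cwnd_packets)
    return [start_us + i * gap for i in range(n_packets)]


if __name__ == "__main__":
    # 40 ms RTT, 100-packet window: one packet every 400 microseconds.
    print(pacing_interval_us(40_000, 100))
    print(send_schedule_us(0, 4, 40_000, 100))
```

Compare that with the unpaced case, where all 100 packets leave in a single burst at line rate and the bottleneck queue has to absorb them.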
Linux gave QUIC two helpers. SO_TXTIME lets you stamp each outgoing packet with a future send time, and the kernel emits it at that moment, so the sender gets pacing without a userspace timer per packet. UDP Generic Segmentation Offload (GSO) lets you pass a 64 KB buffer in one syscall and have the kernel split it into MTU-sized datagrams. Modern QUIC stacks use both, plus io_uring, plus AF_XDP on the highest-throughput servers. Without these, a naive QUIC sender on Linux burns roughly three times the CPU per byte that kernel TCP does. With them, the gap closes to maybe 30%.
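A back-of-the-envelope calculation shows what batching buys. The sketch assumes 1200-byte datagrams (an assumption; the text does not fix a size) and a 64 KB GSO buffer, and counts only send syscalls, not crypto or receive-side cost.

```python
def packets_per_second(link_bps: float, datagram_bytes: int) -> float:
    """Datagrams needed per second to fill the link."""
    return link_bps / (datagram_bytes * 8)


def send_syscalls_per_second(link_bps: float, datagram_bytes: int,
                             batch_bytes: int = 0) -> float:
    """Send syscalls per second, with or without GSO-style batching.

    batch_bytes == 0 means one syscall per datagram.
    """
    pps = packets_per_second(link_bps, datagram_bytes)
    if batch_bytes <= 0:
        return pps
    datagrams_per_call = batch_bytes // datagram_bytes  # whole datagrams only
    return pps / datagrams_per_call


if __name__ == "__main__":
    link = 10e9  # 10 Gbit/s
    print(round(send_syscalls_per_second(link, 1200)))          # per-packet sendto
    print(round(send_syscalls_per_second(link, 1200, 65536)))   # 64 KB GSO batches
```

Under these assumptions the syscall rate drops from about a million per second to under twenty thousand, which is where most of the CPU gap to kernel TCP closes.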
QUIC and HTTP/3 — what's the difference
QUIC is the transport: streams, packets, congestion control, loss recovery, integrated TLS. It’s standardised in RFC 9000 (the core), RFC 9001 (TLS binding), and RFC 9002 (loss detection and congestion control). None of those documents mention HTTP. QUIC could carry anything, and it does: DNS-over-QUIC (DoQ, RFC 9250), Microsoft’s SMB-over-QUIC, and experimental gRPC transports.
HTTP/3 is the application protocol on top, standardised in RFC 9114. It maps HTTP semantics onto QUIC streams: each request/response pair gets its own bidirectional stream, request and response frames flow inside, and the stream closes when the response is complete. Independent streams mean independent requests — no head-of-line blocking between concurrent fetches, the single biggest source of HTTP/2 disappointment.
HTTP/2’s header compression scheme, HPACK, doesn’t survive the transition. HPACK needs both ends to process header table updates in order, which is fine with one TCP byte stream and impossible with many independent QUIC streams that might arrive out of order. HTTP/3 ships QPACK instead. It carries dynamic-table updates on a dedicated unidirectional stream and lets the encoder avoid referencing table entries the decoder has not yet acknowledged, trading a touch of compression efficiency for the freedom to deliver headers out of order.
| Layer | RFC | What it handles |
|---|---|---|
| QUIC core | 9000 | Streams, packets, frames, congestion control framework |
| QUIC + TLS | 9001 | How the TLS 1.3 handshake supplies the keys for QUIC packet protection |
| QUIC recovery | 9002 | Loss detection, congestion controllers, pacing |
| HTTP/3 | 9114 | HTTP semantics over QUIC streams |
| QPACK | 9204 | Header compression that tolerates out-of-order delivery |
| DoQ | 9250 | DNS over QUIC — same transport, different application |
Why userspace transport — the engineering case
TCP lives in the kernel, and that used to be a feature. The kernel can see all flows and arbitrate between them, the network stack runs at high privilege without context-switching to userspace for every packet, and optimisations like TSO and GRO sit close to the NIC. The cost is the deployment cycle. A new TCP feature first has to land in a mainline Linux release, then takes another three to five years to percolate into the LTS and vendor kernels that datacenters and devices actually run. ECN took fifteen years. MPTCP, fourteen. TCP Fast Open was published as RFC 7413 in 2014 and is still off by default on most servers.
Userspace transport flips the deployment problem. Cloudflare’s quiche, Google’s quiche (different project), Meta’s mvfst, Microsoft’s msquic, Apple’s Network.framework: all ship as libraries that update with the application. Google deployed BBR2 to YouTube’s QUIC stack in 2018; the same congestion controller took years longer to land in mainline Linux as a TCP option. The next time someone has a better loss recovery algorithm, the QUIC ecosystem can ship it in one release cycle.
The cost is CPU. Naive QUIC on commodity Linux burned roughly three times the CPU per gigabit that kernel TCP does, mainly because every packet means a sendto/recvfrom syscall and a userspace crypto operation. The kernel community’s answer has been to push the bottlenecks out: UDP GSO/GRO for batched syscalls, SO_TXTIME for pacing, io_uring for amortised submission, AF_XDP for bypassing the kernel stack entirely. Production QUIC stacks that combine these are now within 20–40% of kernel TCP on CPU and beat it on latency-sensitive workloads.
Adoption — who runs it in production
Google built the first version, called gQUIC, and turned it on for YouTube and Chrome-to-Google traffic in 2013. By 2017 gQUIC carried over 30% of Google’s egress traffic. The IETF then spent four years standardising a cleaner version (sometimes called iQUIC) with TLS 1.3 properly integrated instead of gQUIC’s bespoke crypto. RFC 9000 was published in May 2021. Google moved its production fleet from gQUIC to IETF QUIC over the next 18 months.
Cloudflare turned on HTTP/3 in late 2019 with their quiche library built in Rust. By 2023 they reported roughly 30% of all Cloudflare web requests were riding HTTP/3, most from Chrome and Safari clients. Meta uses mvfst in C++ for Facebook and Instagram; Apple shipped HTTP/3 in Safari and the Network.framework stack across iOS and macOS; Microsoft built msquic in C and uses it for HTTP/3 in Windows and as the transport for SMB-over-QUIC.
| Implementation | Language | Used by |
|---|---|---|
| Google QUICHE | C++ | Chrome, Google servers (Search, YouTube, Maps) |
| Cloudflare quiche | Rust | Cloudflare edge, Cloudflare’s HTTP/3 patch for NGINX |
| Meta mvfst | C++ | Facebook, Instagram, WhatsApp |
| Apple Network.framework | C / Swift | Safari, iOS, macOS, all Apple system traffic |
| Microsoft msquic | C | Windows HTTP/3, SMB-over-QUIC, .NET HttpClient |
| quinn | Rust | The de-facto Rust ecosystem QUIC library |
| quic-go | Go | Caddy HTTP/3, much of Go’s third-party QUIC tooling |
| ngtcp2 | C | curl (paired with nghttp3 for HTTP/3), embeddable in C/C++ apps |
On the client side, Chrome, Firefox, Safari, and Edge all ship HTTP/3 enabled by default. curl speaks HTTP/3 when it is built against an HTTP/3 backend and invoked with the --http3 flag. Server-side adoption lags a little (nginx’s HTTP/3 support stabilised in 2023, Apache’s mod_http3 is newer still), but the major CDNs and any cloud load balancer worth using support it today.
Common mistakes
- Assuming QUIC works through every middlebox. Some carrier-grade NATs and corporate firewalls block UDP on all ports except 53 (DNS). Browsers handle this gracefully — Chrome retries with HTTP/2 over TCP after one failed QUIC connection attempt — but server operators see "HTTP/3 didn’t help on this network" and the cause is upstream of them.
- Using 0-RTT for non-idempotent operations. A POST that charges a credit card sent over 0-RTT can be replayed by an on-path attacker. Most HTTP/3 stacks refuse to send non-GET requests on 0-RTT by default, but if you’re using QUIC for a custom protocol you have to enforce this yourself.
- Blocking UDP at the firewall and expecting HTTP/3 to work. QUIC rides UDP on port 443 by default. If the firewall allows TCP/443 and blocks UDP/443, clients silently downgrade to HTTP/2 and you wonder why the latency improvements never showed up. Check that the Alt-Svc: h3=":443" advertisement reaches clients and that the QUIC connection it points to actually completes.
- Not pacing your sender. A QUIC implementation that lets its congestion controller fire packets back-to-back will lose performance on bottleneck links. Use SO_TXTIME if you’re on Linux; many userspace QUIC libraries do this for you, but if you wrote the loop, you wrote the bug too.
- Reusing Connection IDs predictably. A connection ID is a tracking primitive; if it never rotates, an observer can correlate your traffic across networks. RFC 9000 has each endpoint supply its peer with spare CIDs through NEW_CONNECTION_ID frames, and an endpoint that migrates to a new path should switch to an unused CID so the old and new paths can’t be linked.
- Treating HTTP/3 enablement as a magic latency button. HTTP/3’s biggest wins are on cold connections, lossy links, and pages with many small subresources. On a warm pool of HTTP/2 connections over fibre, HTTP/3 makes very little difference. Measure on your real traffic before celebrating.
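For the firewall item above, the first thing to verify is that the server advertises HTTP/3 at all. The helper below is a rough sketch that pulls the `h3` ports out of an Alt-Svc header value. It assumes a well-formed header with no commas inside quoted strings, and `advertised_h3_ports` is a name invented here; it says nothing about whether UDP to that port is actually reachable.

```python
def advertised_h3_ports(alt_svc_header: str) -> list[int]:
    """Return the ports on which an Alt-Svc header value advertises h3."""
    ports = []
    if alt_svc_header.strip().lower() == "clear":
        return ports  # the origin withdrew all alternatives
    for entry in alt_svc_header.split(","):
        # Drop parameters such as "; ma=86400".
        alternative = entry.split(";")[0].strip()
        protocol, sep, authority = alternative.partition("=")
        if not sep or protocol.strip() != "h3":
            continue  # skips h2, draft versions like h3-29, and junk
        authority = authority.strip().strip('"')
        _host, _, port = authority.rpartition(":")
        if port.isdigit():
            ports.append(int(port))
    return ports


if __name__ == "__main__":
    print(advertised_h3_ports('h3=":443"; ma=86400, h3-29=":443"; ma=86400'))
    print(advertised_h3_ports('h2=":443"'))
```

An empty list means clients will never even try QUIC; a non-empty list with no HTTP/3 traffic points at UDP being blocked somewhere on the path.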
Further reading
- RFC 9000 — QUIC: A UDP-Based Multiplexed and Secure Transport — the core specification. Long, but the architectural sections at the front are very readable and worth a slow first pass.
- RFC 9001 — Using TLS to Secure QUIC — how TLS 1.3 keys derive the QUIC packet protection, the integrated handshake, and the key update rules.
- RFC 9002 — QUIC Loss Detection and Congestion Control — the recovery and pacing rules. The reference loss detection algorithm in the appendix is the cleanest write-up of any loss recovery I’ve read.
- RFC 9114 — HTTP/3 — the HTTP semantics layer on top of QUIC. Short relative to the QUIC RFCs; QPACK (RFC 9204) is the harder companion read.
- HTTP/3 Explained — Daniel Stenberg — a free book from the curl maintainer. Covers QUIC and HTTP/3 together at a level above the RFCs without losing accuracy.
- Cloudflare — The Road to QUIC — and the rest of the Cloudflare QUIC blog series. The most accessible production-operator writing on the protocol.
- Langley et al. — The QUIC Transport Protocol, SIGCOMM 2017 — Google’s retrospective on gQUIC at internet scale. The numbers on handshake cost, head-of-line wins, and middlebox interactions are still the canonical reference.
- Jana Iyengar — Building QUIC: from research to deployment — SREcon talk from one of the QUIC working group editors. Best ninety minutes you can spend if you want the design rationale.