A typo in Pennsylvania, felt everywhere.
On 24 June 2019, a small Pittsburgh-area ISP (DQE Communications, AS33154) leaked roughly 20,000 BGP routes through Verizon Business (AS701). Verizon happily re-advertised the leak to the rest of the internet. For about three hours, a meaningful fraction of global traffic to Cloudflare, AWS, Google, Facebook and others was funnelled through a small network in western Pennsylvania that could not possibly carry it.
TL;DR
DQE Communications (AS33154), a small Pittsburgh ISP, was running a BGP optimiser that split routes learned from its transit providers into more-specific prefixes for better path selection inside its own network. Those more-specifics were never meant to leave AS33154. A misconfiguration let them escape (according to Cloudflare's write-up, by way of DQE's customer Allegheny Technologies) upstream to Verizon Business (AS701), which had no prefix filters or max-prefix limits on the session and re-announced roughly 20,000 routes, including more-specifics of Cloudflare, AWS, Google, Facebook and many others, to the rest of the internet. Because routers forward on the longest matching prefix, the more-specifics won wherever they were heard, and global traffic to those prefixes converged on a small Pennsylvania network. The outage lasted about three hours, from roughly 10:30 UTC to 13:30 UTC.
Timeline
| UTC | Event |
|---|---|
| 10:30 | DQE Communications (AS33154) begins announcing roughly 20,000 more-specific prefixes — produced by an internal BGP optimiser — over its session with Verizon Business (AS701). |
| 10:33 | Verizon accepts the announcements without prefix filtering or a max-prefix limit and re-advertises them to its peers and customers across the global default-free zone. |
| 10:35 | Cloudflare's monitoring picks up a sudden traffic drop on numerous anycast prefixes. Service latency spikes; many edges show partial reachability. |
| 10:40 | External BGP monitoring services (BGPMon, RIPE RIS) confirm the route leak. Cloudflare, AWS, Linode, Google, and Facebook prefixes appear with AS 33154 in the AS_PATH. |
| 10:45 | Cloudflare's NOC begins paging Verizon. Status page updated; engineers start tracing upstream. |
| 11:00 – 12:30 | Sustained outage. Cloudflare's blog post — published while the incident is still ongoing — names Verizon publicly. Industry monitoring sites (ThousandEyes, Catchpoint) report errors across hundreds of services. |
| 12:39 | DQE Communications stops re-announcing the optimised prefixes upstream after operator contact. |
| 13:00 | Verizon begins withdrawing the leaked routes from its peers. |
| 13:30 | BGP convergence completes across most of the internet. Cloudflare traffic returns to normal levels. |
Total time from leak onset to recovery: roughly three hours. The withdrawal-to-convergence gap alone is about 30 minutes, which is normal for BGP: even after the offending announcements stop, every router in the path has to re-evaluate and propagate the change.
What went wrong, technically
Three failures stacked. None of them alone would have caused the outage; the combination took the internet's edges down.
The optimiser fired more-specifics into eBGP. DQE was running a Noction IRP-style optimiser that ingests transit routes, breaks them into more-specific prefixes along the path it prefers, and re-injects them into the local routing table for hot-potato path selection inside the network. These optimised routes were tagged for internal use only. A configuration change — the public reports suggest a session was reconfigured without the no-export community in place — caused the optimiser's more-specifics to start flowing out via eBGP to Verizon.
Verizon had no prefix filter or max-prefix limit on the session. Tier-1 BGP hygiene calls for two safeguards on every customer session: a prefix-list that restricts what the customer can announce to the prefixes they actually own, and a max-prefix limit that tears down the session if the customer announces orders of magnitude more routes than agreed. Verizon's session with AS33154 had neither configured. When the leak began, 20,000 unauthorised prefixes simply walked in.
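The two safeguards can be sketched in a few lines of Python. This is a minimal illustration, not router code: the `CustomerSession` class, the `SessionTornDown` exception, and the documentation prefixes are all hypothetical, and the sketch assumes the limit counts every prefix received before filtering (real implementations differ on whether they count received or accepted prefixes).

```python
import ipaddress


class SessionTornDown(Exception):
    """Raised when the customer exceeds the agreed max-prefix limit."""


class CustomerSession:
    """Hypothetical model of one customer eBGP session on a transit router."""

    def __init__(self, allowed_prefixes, max_prefixes):
        # Prefix-list: the exact prefixes this customer may announce.
        self.allowed = [ipaddress.ip_network(p) for p in allowed_prefixes]
        self.max_prefixes = max_prefixes
        self.received = set()
        self.accepted = set()
        self.up = True

    def receive(self, prefix):
        """Return True if the route is accepted, False if it is filtered."""
        if not self.up:
            raise SessionTornDown("session is already down")
        network = ipaddress.ip_network(prefix)
        # Assumption: the limit counts all received prefixes, before filtering.
        self.received.add(network)
        if len(self.received) > self.max_prefixes:
            self.up = False
            raise SessionTornDown(
                f"received {len(self.received)} prefixes, limit is {self.max_prefixes}"
            )
        # Exact-match prefix-list: more-specifics of other networks are dropped.
        if network not in self.allowed:
            return False
        self.accepted.add(network)
        return True


session = CustomerSession(["192.0.2.0/24"], max_prefixes=3)
for prefix in ["192.0.2.0/24", "1.1.1.0/25", "1.1.1.128/25", "203.0.113.0/24"]:
    try:
        print(prefix, "accepted" if session.receive(prefix) else "filtered")
    except SessionTornDown as exc:
        print(prefix, "session torn down:", exc)
```

Either check on its own would have stopped the leak at the border: the prefix-list rejects routes the customer has no business announcing, and the limit acts as a backstop if the prefix-list is missing or wrong.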
More-specific routes always win. Routers forward on the longest matching prefix, so a router with both 1.1.1.0/24 and 1.1.1.128/25 in its table will send traffic for 1.1.1.200 to whoever advertised the /25, regardless of AS_PATH length. BGP's best-path comparison only runs between routes for the same prefix, so the /24 and the /25 never compete with each other. The optimiser's whole point was producing more-specifics. Once those more-specifics escaped into the global table, every router that heard them preferred them, and traffic for chunks of Cloudflare, AWS, and others started flowing through DQE.
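The forwarding decision described above can be sketched with Python's standard `ipaddress` module. The `ROUTES` table and the `lookup` helper are hypothetical names for illustration, and the next-hop labels are not real router output.

```python
import ipaddress

# Hypothetical forwarding table: prefix -> description of the next hop.
ROUTES = {
    "1.1.1.0/24": "via Cogent (AS_PATH 174 13335)",
    "1.1.1.128/25": "via Verizon (AS_PATH 701 33154 13335)",
}


def lookup(destination, routes):
    """Return (prefix, next_hop) for the longest matching prefix, or None."""
    address = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in routes.items():
        network = ipaddress.ip_network(prefix)
        if address not in network:
            continue
        # Longest prefix wins; AS_PATH length is never consulted here.
        if best is None or network.prefixlen > best[0].prefixlen:
            best = (network, next_hop)
    if best is None:
        return None
    return str(best[0]), best[1]


print(lookup("1.1.1.200", ROUTES))  # falls inside the leaked /25
print(lookup("1.1.1.5", ROUTES))    # only the /24 covers this address
```

In this two-route table, only the upper half of the /24 is pulled towards the leaked path. During the incident both halves were announced as /25s, so nothing was left for the legitimate /24 to catch.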
```
# Normal BGP routes for Cloudflare's 1.1.1.0/24 (only one path is marked best with >)
*> 1.1.1.0/24    AS_PATH: 174 13335        (Cogent → Cloudflare)
*  1.1.1.0/24    AS_PATH: 3356 13335       (Lumen → Cloudflare)

# What appeared on 2019-06-24 around 10:35 UTC
*> 1.1.1.0/25    AS_PATH: 701 33154 13335  (Verizon → DQE → Cloudflare)
*> 1.1.1.128/25  AS_PATH: 701 33154 13335  (Verizon → DQE → Cloudflare)

# More-specifics beat the /24, so all global traffic for 1.1.1.x
# now flows: <wherever you are> → Verizon → DQE → Cloudflare
# DQE has roughly a 10 Gbps uplink. Cloudflare has terabits.
```
Why Cloudflare was disproportionately affected
Cloudflare runs an anycast network: the same IP prefix (for example 1.1.1.0/24)
is announced from dozens of points of presence around the world, and BGP delivers each
client's traffic to the nearest PoP by AS_PATH and policy. This is normally a feature —
capacity and latency both win — but it depends on Cloudflare's announcements being the
ones routers see.
The DQE leak inserted a more-specific announcement for Cloudflare prefixes into the global table. More-specifics override the anycast announcement entirely, regardless of how many PoPs Cloudflare was advertising from. Suddenly every router in reach of the Verizon-propagated route was sending Cloudflare-bound traffic to AS33154 instead of to the nearest Cloudflare PoP. The anycast topology that usually distributes load across hundreds of edges collapsed onto a single regional ISP.
AWS and Google were hit too, but their announcement footprints are different — fewer prefixes, more uniform PoP layout — so the proportional impact on Cloudflare's anycast-heavy design was larger. Cloudflare's then-CTO John Graham-Cumming wrote the incident up the same afternoon.
The fix during the incident
There was nothing Cloudflare could do at the protocol level. They did not own AS33154, they did not have a BGP session to Verizon they could deprioritise, and they could not stop a more-specific announcement someone else was injecting. The only fix was to make the announcement stop.
Cloudflare's NOC paged Verizon repeatedly starting around 10:45 UTC. The public reporting suggests that escalation took a long time because the NOC contacts were either not staffed or did not have authority to tear down the offending session. DQE Communications eventually withdrew the optimised announcements upstream around 12:39 UTC. Verizon then withdrew the routes it had propagated. Normal BGP convergence — every router in the default-free zone re-running best-path selection and propagating withdrawals — took about another 30 minutes, putting full recovery at roughly 13:30 UTC.
The post-incident finger-pointing focused on Verizon. A tier-1 transit provider running eBGP sessions to small customers without a prefix-list and without a max-prefix limit is, by current operational consensus, an unforced error.
Lessons that propagated industry-wide
The 2019 leak became a forcing function for several BGP-security initiatives that had been moving slowly for years.
RPKI Route Origin Validation (ROV) adoption accelerated. RPKI lets a prefix owner publish a cryptographically signed Route Origin Authorisation (ROA) saying "AS13335 is allowed to originate 1.1.1.0/24, and nothing longer than a /24". A router doing ROV would have looked at 1.1.1.128/25 AS_PATH 701 33154 13335 and found a covering ROA for AS13335. Provided that ROA capped the prefix length at /24, the router would have marked the /25 as invalid, even though the origin AS matched, and dropped it. Cloudflare turned ROV on for its own peers shortly after the incident. Major transits (including AT&T, NTT, and eventually Verizon) followed over the next 18 months.
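The validation step can be sketched as follows, loosely following the valid / invalid / not-found outcomes of RFC 6811. The `ROA` dataclass, the `validate` function, and the single ROA with a maximum length of /24 are illustrative assumptions, not a copy of any published ROA or of a real validator's API.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ROA:
    prefix: str       # covering prefix the owner signed
    max_length: int   # longest prefix length the owner authorises
    origin_asn: int   # AS allowed to originate it


def validate(prefix, origin_asn, roas):
    """Return 'valid', 'invalid', or 'not-found' for one announcement."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # Both the origin AS and the prefix length must be authorised.
        if roa.origin_asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered by at least one ROA but matched by none: invalid.
    return "invalid" if covered else "not-found"


# Assumed ROA for illustration: AS13335 may originate 1.1.1.0/24, up to /24.
ROAS = [ROA("1.1.1.0/24", 24, 13335)]

print(validate("1.1.1.0/24", 13335, ROAS))    # legitimate announcement
print(validate("1.1.1.128/25", 13335, ROAS))  # leaked more-specific
```

In this sketch the leaked /25 fails on prefix length alone: its origin AS is still 13335, so a check that compared only origins would have let it through.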
MANRS (Mutually Agreed Norms for Routing Security) gained signatures. MANRS commits a network to four practices: prefix filtering, anti-spoofing, coordination contact information, and global validation (publishing ROAs and routing policy). The number of network operators publicly committing to MANRS roughly doubled in the 18 months after June 2019.
Max-prefix limits became table stakes. The defence that would have caught the leak at Verizon's border ("this customer normally advertises 40 prefixes, tear the session down if they suddenly advertise 4,000") is a one-line setting on mainstream router platforms: maximum-prefix on Cisco IOS, prefix-limit on Junos. It is now standard in tier-1 customer onboarding checklists. The 2019 incident is one of the canonical references operators cite when arguing for the configuration.
What Cloudflare actually changed
Cloudflare's own follow-up work focused on detection and response rather than prevention (since the prevention happens at other people's routers).
Internal monitoring for prefix hijacks and leaks of Cloudflare-owned prefixes was expanded. The team built tooling that watches global BGP feeds (RIPE RIS, RouteViews) for any announcement of a Cloudflare prefix whose origin AS is not AS13335 or whose AS_PATH contains unexpected hops. Detection latency for a leak of this shape dropped from "a customer tells us" to "alert fires within a couple of minutes".
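A check of this kind might look roughly like the sketch below. It is not Cloudflare's tooling: the monitored prefix, the `EXPECTED_NEIGHBOURS` allow-list, and the `check_announcement` function are hypothetical, and the input is assumed to be a prefix plus an AS_PATH already parsed from a feed such as RIPE RIS or RouteViews.

```python
import ipaddress

EXPECTED_ORIGIN = 13335
# Hypothetical configuration: prefixes to watch, and the ASes that are
# expected to sit directly next to the origin in a healthy AS_PATH.
MONITORED = [ipaddress.ip_network("1.1.1.0/24")]
EXPECTED_NEIGHBOURS = {174, 3356}


def check_announcement(prefix, as_path):
    """Return a list of alert strings; an empty list means nothing suspicious."""
    network = ipaddress.ip_network(prefix)
    covering = [
        m for m in MONITORED
        if m.version == network.version and network.subnet_of(m)
    ]
    if not covering:
        return []  # not one of our prefixes
    if not as_path:
        return ["empty AS_PATH"]
    alerts = []
    origin = as_path[-1]
    if origin != EXPECTED_ORIGIN:
        alerts.append(f"unexpected origin AS{origin}")
    if network not in MONITORED:
        alerts.append(f"unexpected more-specific {network} of {covering[0]}")
    # First AS before the origin, skipping any prepending of the origin itself.
    neighbour = next((asn for asn in reversed(as_path) if asn != origin), None)
    if neighbour is not None and neighbour not in EXPECTED_NEIGHBOURS:
        alerts.append(f"unexpected AS{neighbour} adjacent to origin")
    return alerts


print(check_announcement("1.1.1.0/24", [174, 13335]))
print(check_announcement("1.1.1.128/25", [701, 33154, 13335]))
```

The second call models the shape of the 2019 leak: the origin is still AS13335, so an origin-only check stays silent, and the alert comes from the unexpected prefix length and the unexpected neighbour.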
Automated NOC escalation was hardened: pre-shared contact information for major transit providers, scripted incident creation, and a public commitment in the blog post that "we will name the upstream causing the leak in real time" — which became a notable cultural shift in how outages were communicated, and not just at Cloudflare. The 2019 post itself was published during the incident, naming Verizon, with traffic graphs showing the drop. That was unusual at the time.
Cloudflare also became one of the loudest public advocates for RPKI, publishing isbgpsafeyet.com to track which major ISPs do ROV. It is, in part, a campaigning tool — name and shame as a fix for an industry collective-action problem.
The broader lesson
BGP is a trust-based protocol designed in 1989 when the operators ran the network as a small club. A single typo, or a single optimiser misconfiguration, at a small ISP can ripple through to a meaningful fraction of the internet's edges in minutes — because the larger networks in the path treated the announcement as authoritative and re-told the lie at scale. The 2019 incident is one of the cleanest demonstrations of that structural weakness.
RPKI is the long-term cryptographic fix: prefix owners sign ROAs, routers verify them, invalid routes get dropped. The rollout has been steady but not fast. As of late 2024, ROAs cover roughly 50% of internet routes by some measures. Coverage is not the same as enforcement, though: a ROA only protects a prefix on networks that actually run ROV and drop invalid announcements. The networks that do not are still working off operational hygiene, prefix-lists, and luck.
For protocol depth — how BGP actually picks paths, what attributes propagate, why withdrawal takes minutes — see the BGP deep dive in the networking stack.
| Field | Value |
|---|---|
| Start | 10:30 UTC, 24 June 2019 |
| Peak impact | ~11:00 – 12:30 UTC |
| Restored | ~13:30 UTC |
| Total downtime | ~3 hours from leak onset to global convergence |
| Services affected | Cloudflare, AWS, Linode, Google, Facebook, many others reachable via Verizon |
| Root cause | BGP optimiser at DQE (AS33154) leaked ~20,000 more-specific prefixes to Verizon (AS701); Verizon had no prefix-list or max-prefix limit on the session |
| Fix | DQE withdrew the optimised announcements upstream around 12:39 UTC; Verizon withdrew the propagated routes; normal BGP convergence over the next 30 minutes |
Further reading
- Cloudflare blog, How Verizon and a BGP optimizer knocked large parts of the internet offline (24 June 2019) — the primary source, published the same afternoon by John Graham-Cumming and Tom Strickx.
- Cloudflare blog, A deep dive into the route leak (25 June 2019) — the technical follow-up with BGP table snapshots and AS_PATH walks.
- NIST — Robust Inter-Domain Routing programme (RPKI) — US government reference material on RPKI deployment and best practices.
- MANRS — Mutually Agreed Norms for Routing Security — the operator-led initiative for prefix filtering, anti-spoofing, and ROA publication.
- RFC 7908 — Problem Definition and Classification of BGP Route Leaks — the six leak types. The 2019 incident is a textbook Type 1.
- isbgpsafeyet.com — Cloudflare's RPKI scorecard for major ISPs. A useful weather vane for industry progress.
- RIPE RIS Live — global BGP play-by-play archive — the data source used by all the post-incident BGP analyses.
- BGP — the protocol that makes the internet a network — Semicolony deep dive on path selection, attributes, convergence, and why withdrawal takes minutes.