11 min read · Guide · Networking

Five ways to push data to a browser in real time.

Polling. Long polling. SSE. WebSockets. WebRTC. Each makes a different bet on latency, direction, and operational cost. Pick the one your problem actually needs.

Parts: 01–10 · Interactive: Mode picker · Prereq: HTTP · TCP · sockets

What is realtime communication?

Three axes: latency, direction, durability.

Realtime communication is the family of techniques that push data from server to client (and sometimes peer-to-peer) with low latency. Five modes dominate: polling, long-polling, Server-Sent Events (SSE), WebSockets, and WebRTC. Each makes different trade-offs around latency, direction (one-way vs bidirectional), and durability (does the connection survive transient loss?).

"Realtime" is a wide tent. A live build status updating every 5 seconds is realtime. A multiplayer game with 20 ms ceilings is realtime. A video call hitting 100 ms RTT is realtime. Those budgets span more than two orders of magnitude, and the mechanism you reach for changes with them.

Three questions decide it: How fresh do updates need to be? One direction or two? Is this a 100-RPS service or an app with 100k concurrent users (CCU)? Answer those and the right mode picks itself.


The five realtime modes side by side

Pick a mode, see the trade.

Each mode below is in production at scale somewhere. None is universally right. Read the trade-offs, then map them to your problem.

Mode 03 · Server-Sent Events

One-way stream over HTTP.

A persistent HTTP/1.1 or HTTP/2 connection where the server pushes text events. Built into browsers via EventSource. Simple, with automatic reconnection via Last-Event-ID, and it works through most reverse proxies that speak HTTP as long as they don't buffer the response. There is no client-to-server stream; for that, use a separate POST.

Latency: ~milliseconds. Bandwidth: minimal. Server load: one connection per client; tune fd limits.

Server-Sent Events (SSE): the simplest push channel

Server-to-client streaming over ordinary HTTP.

Server-Sent Events is HTTP — your existing observability, auth, rate limits, and reverse proxies all just work. The wire format is text: data: payload\n\n. The client uses EventSource; reconnection and event IDs are handled automatically.

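// Server side (Node)
const http = require('node:http');

http.createServer((req, res) => {
  // Mark the response as an event stream and keep caches out of the way.
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  // One named event; the blank line (\n\n) terminates it.
  res.write(`event: progress\ndata: ${JSON.stringify({ pct: 42 })}\n\n`);
}).listen(3000); // arbitrary port for this sketch

// Client side (browser)
const es = new EventSource('/api/build/123/events');
// render() stands for the page's own display function.
es.addEventListener('progress', (e) => render(JSON.parse(e.data)));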
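
The wire format has a few more fields than the one-line example shows. A minimal sketch of a formatter, assuming a hypothetical helper named formatSseEvent (it is not part of Node or any library): it emits the optional event and id fields and splits a multi-line payload into one data: line per line, as the event-stream format requires.

```javascript
// Hypothetical helper: serialise one event into the text/event-stream wire format.
function formatSseEvent({ event, id, data }) {
  const lines = [];
  if (event !== undefined) lines.push(`event: ${event}`);
  // The id is what the browser echoes back as Last-Event-ID on reconnect.
  if (id !== undefined) lines.push(`id: ${id}`);
  // A payload containing newlines must be sent as one "data:" line per line.
  for (const part of String(data).split(/\r\n|\r|\n/)) {
    lines.push(`data: ${part}`);
  }
  // A blank line terminates the event.
  return lines.join('\n') + '\n\n';
}

const frame = formatSseEvent({
  event: 'progress',
  id: 7,
  data: JSON.stringify({ pct: 42 }),
});
// JSON.stringify makes the newlines visible when printed.
console.log(JSON.stringify(frame));
```

On the server, the result would be passed straight to res.write(). Setting an id on every event is what makes Last-Event-ID useful later.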

Scaling WebSockets to 100k connections

The hard part is fan-out, not the connections.

WebSockets are cheap per-connection but persistent. Each open socket consumes a file descriptor, a small kernel buffer, and your application's per-connection state. With C10K-style tuning (epoll, large fd limit, generous TCP buffers) one box handles 100k concurrent connections comfortably; 1M is achievable with care.

The real scaling problem is not the connections but the fan-out. When a message needs to reach everyone in a chat room, that's an N-write problem — and the connections are spread across many backend instances. Solution: a pub/sub backplane (Redis, NATS, Kafka). Each instance subscribes to room channels; one publish fans out to all instances; each instance writes to its locally-held sockets.
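
The fan-out pattern can be sketched in a few lines. This is an in-memory illustration only: the Backplane class stands in for Redis, NATS, or Kafka, and every name here (Backplane, Instance, fakeSocket) is hypothetical. What it shows is the shape: each instance subscribes once per room, a publish reaches every instance, and each instance writes only to the sockets it holds locally.

```javascript
// In-memory stand-in for the backplane (Redis, NATS, Kafka in production).
class Backplane {
  constructor() {
    this.subscribers = new Map(); // channel -> Set of callbacks
  }
  subscribe(channel, callback) {
    if (!this.subscribers.has(channel)) this.subscribers.set(channel, new Set());
    this.subscribers.get(channel).add(callback);
  }
  publish(channel, message) {
    for (const callback of this.subscribers.get(channel) ?? []) callback(message);
  }
}

// One backend instance: holds its own sockets, grouped by room.
class Instance {
  constructor(backplane) {
    this.backplane = backplane;
    this.rooms = new Map(); // room -> Set of locally held sockets
  }
  join(room, socket) {
    if (!this.rooms.has(room)) {
      this.rooms.set(room, new Set());
      // First local member: subscribe this instance to the room channel once.
      this.backplane.subscribe(room, (msg) => this.deliverLocal(room, msg));
    }
    this.rooms.get(room).add(socket);
  }
  deliverLocal(room, msg) {
    for (const socket of this.rooms.get(room) ?? []) socket.send(msg);
  }
  // A message from a local client goes to the backplane, never straight to sockets.
  broadcast(room, msg) {
    this.backplane.publish(room, msg);
  }
}

// Fake sockets that record what they were sent.
const fakeSocket = () => ({ received: [], send(msg) { this.received.push(msg); } });

const bus = new Backplane();
const instanceA = new Instance(bus);
const instanceB = new Instance(bus);
const alice = fakeSocket();
const bob = fakeSocket();
instanceA.join('room:1', alice);
instanceB.join('room:1', bob);

// Alice's instance publishes once; both instances deliver to their own sockets.
instanceA.broadcast('room:1', 'hello');
console.log(alice.received, bob.received);
```

Routing even local messages through the backplane keeps a single delivery path, so ordering and deduplication only have to be reasoned about in one place.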


WebRTC: peer-to-peer when latency has to be lowest

No server in the data path, at the cost of complexity.

If your latency floor is "as low as physics allows" (multiplayer FPS, video calls, CRDT cursor sync at gaming-grade latency), WebRTC is the established choice in the browser. When a direct path exists, no server is in the data path: packets go between peers over UDP, secured with DTLS, with congestion control that's tuned for media (Google Congestion Control).

The price is operational complexity. You need signaling (which you provide), STUN (cheap; free public servers exist), and TURN (a relay for when NAT traversal fails and no direct path can be found; you pay for its bandwidth). And the protocol itself is complex: codecs, simulcast, SVC, jitter buffers. For 90% of "realtime" web apps, SSE or WebSockets are simpler.
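
To make the STUN and TURN pieces concrete, here is a sketch of the configuration object a browser's RTCPeerConnection accepts. The helper name buildRtcConfig, the hostnames, and the credentials are all placeholders; signaling is left out entirely because it is application-specific.

```javascript
// Build the configuration object that RTCPeerConnection accepts in browsers.
// All hostnames and credentials below are placeholders.
function buildRtcConfig({ turnUsername, turnCredential, relayOnly = false }) {
  return {
    iceServers: [
      // STUN: lets a peer discover its public address. Cheap to run.
      { urls: 'stun:stun.example.org:3478' },
      // TURN: relays traffic when no direct path exists. You pay for the bandwidth.
      {
        urls: 'turn:turn.example.org:3478?transport=udp',
        username: turnUsername,
        credential: turnCredential,
      },
    ],
    // 'relay' forces all traffic through TURN; 'all' tries direct paths first.
    iceTransportPolicy: relayOnly ? 'relay' : 'all',
  };
}

const rtcConfig = buildRtcConfig({ turnUsername: 'demo', turnCredential: 'secret' });
console.log(JSON.stringify(rtcConfig, null, 2));
// In a browser: const pc = new RTCPeerConnection(rtcConfig);
```

Forcing 'relay' is a handy way to test the TURN path in isolation, since a direct connection would otherwise hide a misconfigured relay.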


Connection durability: every persistent connection eventually drops

All persistent connections drop. Plan for it.

Networks change. Phones move between Wi-Fi and cellular. Laptops sleep. Reverse proxies have idle timeouts. Every persistent-connection mode (SSE, WS, WebRTC) must handle "the connection died and we missed messages between disconnect and reconnect".

SSE has Last-Event-ID built in: the client sends the last ID it saw on reconnect, and the server can replay from there, provided it tagged its events with id: and kept enough history. WebSockets give you nothing here; you must design your own resume protocol (sequence numbers, a server-side ring buffer of recent messages, and a client that sends "I last saw N" on reconnect). WebRTC has its own session resumption story via ICE restart.
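
The resume protocol for WebSockets can be sketched as a bounded buffer keyed by sequence number. This is one possible shape, not a standard: the ReplayBuffer class and its method names are hypothetical, and the transport is left out so that only the bookkeeping is visible.

```javascript
// Server-side buffer of recent messages, keyed by an increasing sequence number.
class ReplayBuffer {
  constructor(capacity) {
    this.capacity = capacity;
    this.messages = []; // oldest first: { seq, payload }
    this.nextSeq = 1;
  }
  append(payload) {
    const entry = { seq: this.nextSeq++, payload };
    this.messages.push(entry);
    if (this.messages.length > this.capacity) this.messages.shift(); // drop the oldest
    return entry;
  }
  // Returns the messages after lastSeen, or null when the gap reaches back past
  // the buffer and the client needs a full resync instead of a replay.
  since(lastSeen) {
    const oldest = this.messages.length ? this.messages[0].seq : this.nextSeq;
    if (lastSeen + 1 < oldest) return null;
    return this.messages.filter((m) => m.seq > lastSeen);
  }
}

const buffer = new ReplayBuffer(3);
for (const text of ['a', 'b', 'c', 'd', 'e']) buffer.append(text);

// The buffer now holds seq 3, 4 and 5.
console.log(buffer.since(3)); // the entries with seq 4 and 5
console.log(buffer.since(1)); // null: seq 2 has already been dropped
```

On reconnect the client sends the last sequence number it processed; a null result is the signal to fall back to fetching full state.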


Choosing a realtime mode

Pick the simplest mode that meets your latency budget.

Need server-to-client only, OK with HTTP semantics? SSE. Need bidirectional, low latency? WebSockets. Updates a few times a minute and don't need OS-level keepalives? Polling. Sub-100ms latency, browser-to-browser, audio/video? WebRTC. Stuck on infrastructure that doesn't speak any of these? Long polling.

Don't reach for WebRTC because "it's the most realtime". The complexity is real. Most chat apps are perfectly happy on WebSockets; most live dashboards are happy on SSE.
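
The rules of thumb above can be written down as a small function. Treat it as a sketch of the reasoning, not a rule engine: the name chooseMode, the field names, and the threshold of five updates per minute (a stand-in for "a few times a minute") are assumptions.

```javascript
// Assumed threshold for "a few times a minute".
const FEW_PER_MINUTE = 5;

// Hypothetical helper that mirrors the checklist in the text.
function chooseMode({
  bidirectional = false,
  peerToPeerMedia = false,
  updatesPerMinute,
  persistentConnectionsAllowed = true,
}) {
  // Browser-to-browser audio/video with a sub-100 ms budget.
  if (peerToPeerMedia) return 'WebRTC';
  // Infrequent one-way updates do not justify a held-open connection.
  if (!bidirectional && updatesPerMinute <= FEW_PER_MINUTE) return 'polling';
  // Infrastructure that cannot carry a stream or an upgraded connection.
  if (!persistentConnectionsAllowed) return 'long polling';
  if (bidirectional) return 'WebSockets';
  return 'SSE';
}

console.log(chooseMode({ updatesPerMinute: 20 }));                      // live dashboard
console.log(chooseMode({ bidirectional: true, updatesPerMinute: 60 })); // chat
console.log(chooseMode({ updatesPerMinute: 2 }));                       // build status
```

The order of the checks is the design choice: the most constraining requirement is tested first, and SSE is what remains when nothing forces a more complex mode.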


Worked example: live order tracking

One product, walked through the decision.

Take a concrete case: the order-tracking screen in a food-delivery app. A status line ("being prepared", "picked up", "arriving") and a courier dot moving on a map. Walk it through the three axes.

Direction: every update flows server to client. The customer sends almost nothing back (a cancel tap, maybe a tip), and those are ordinary POSTs with their own responses. Nothing on this screen needs a client-to-server stream.

Frequency: a position fix every two to five seconds while the courier is moving, plus a handful of status changes over the order's life. A few hundred events across forty-five minutes, not a firehose.

Fan-in: the courier fleet does push a lot of GPS upstream, but that's a separate ingest path, where the courier app POSTs fixes or holds its own socket. The customer-facing channel stays one-way and low-volume.

Score the modes against that shape. WebRTC is out immediately: there is no peer relationship here, and you want the server in the path — it owns order state, dispatch, and the audit trail. WebSockets would work, but you'd be paying for bidirectionality the screen never uses, and writing your own reconnect protocol on top. SSE matches the shape exactly: one-way, plain HTTP, passes through corporate proxies and your existing auth. Plain polling at five seconds is also defensible — the latency floor of this product is the GPS sampling rate, not the transport.

Now the part that actually settles it: failure behaviour. This screen lives on a phone walking out of a restaurant. The network flips from Wi-Fi to cellular, the OS suspends the app in a pocket, an elevator drops the link. Every persistent option will disconnect, repeatedly. With SSE, EventSource reconnects on its own and sends Last-Event-ID — but notice the deeper property: order tracking is latest-state-wins. A customer who was offline for ten minutes does not want 120 stale GPS fixes replayed; they want the current position and status, once. So make every event a full snapshot of the order. Replay then collapses to "send the newest event on connect", reconnection logic mostly disappears, and the polling fallback becomes trivially correct — a poll response and an SSE event carry the same payload.
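
A sketch of what "every event is a full snapshot" looks like on the server side. The OrderChannel class and its fields are hypothetical and the transport is omitted; the point is that a reconnecting client receives exactly one event, the newest, and that the polling fallback returns the same payload.

```javascript
// Hypothetical per-order channel that only ever remembers the latest snapshot.
class OrderChannel {
  constructor() {
    this.latest = null;         // most recent full snapshot
    this.version = 0;           // increases with every publish
    this.listeners = new Set(); // send functions of connected clients
  }
  publish(state) {
    this.latest = { ...state, version: ++this.version };
    for (const send of this.listeners) send(this.latest);
  }
  // Used for a first connection and for every reconnect:
  // no replay, just the newest snapshot, once.
  connect(send) {
    this.listeners.add(send);
    if (this.latest) send(this.latest);
    return () => this.listeners.delete(send); // call to disconnect
  }
  // The polling fallback hands out the same payload.
  poll() {
    return this.latest;
  }
}

const channel = new OrderChannel();
channel.publish({ status: 'being prepared', position: null });

const beforeDrop = [];
const disconnect = channel.connect((snapshot) => beforeDrop.push(snapshot));
disconnect(); // the phone goes into a pocket

// Three updates happen while the client is offline.
channel.publish({ status: 'picked up', position: [52.52, 13.4] });
channel.publish({ status: 'picked up', position: [52.521, 13.402] });
channel.publish({ status: 'arriving', position: [52.523, 13.405] });

const afterReconnect = [];
channel.connect((snapshot) => afterReconnect.push(snapshot));
console.log(afterReconnect.length, afterReconnect[0].status);
```

Because the server keeps no history, memory per order stays constant no matter how long a client is away.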

The decision, written down: SSE for the watch path, POSTs for the few upstream actions, a five-second poll as the fallback when the stream won't open. WebSockets earn their place the day this screen grows courier chat.


Latency budgets: what each realtime mode actually costs

The numbers, by mode.

Each realtime technique pays a different latency tax. Approximate numbers on a same-region client-server connection (10 ms RTT) and a global connection (150 ms RTT):

Polling · 1-second interval
Average lag = ½ × interval = 500 ms. Predictable cost, no persistent connection. Wastes ~3,600 requests/hour even if nothing changes.
Long polling
Lag ≈ RTT (10–150 ms). One persistent request held open until data is ready, then a new request. Used to be the default before WebSockets. Still ships in many enterprise products that can't open WebSockets through a corporate proxy.
Server-Sent Events (SSE)
Lag ≈ RTT/2 (5–75 ms one-way). One persistent HTTP response stream from server to client, over HTTP/1.1 or HTTP/2. The server pushes events as they arrive; the client gets them with one network hop's latency. No polling overhead.
WebSocket
Lag ≈ RTT/2 each way (5–75 ms one-way). Bidirectional. Both sides can send at any time. Connection setup costs the TCP and TLS handshakes plus an HTTP upgrade: about 3 RTTs in total on a fresh connection with TLS 1.3, one more with TLS 1.2.
WebRTC · peer-to-peer
Lag ≈ peer-to-peer RTT, often 20–60 ms even cross-region because traffic doesn't pass through a server. Cost: NAT traversal via STUN/TURN, ICE candidate negotiation (~1–3 seconds before the connection is usable), media-stream complexity.
HTTP/3 datagrams · WebTransport
Lag ≈ RTT/2 (no head-of-line blocking from TCP). WebTransport is a W3C Working Draft and not every browser engine ships it — check current support before betting a product on it. Promising for game traffic and real-time telemetry.
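
The arithmetic behind the polling and WebSocket entries is simple enough to check in code. The function names are hypothetical, and the three round trips for WebSocket setup are the approximation quoted in the list, passed in as a default so it can be changed.

```javascript
// Polling: average lag is half the interval, and requests are sent whether or
// not anything changed.
function pollingCost(intervalMs) {
  const msPerHour = 60 * 60 * 1000;
  return {
    averageLagMs: intervalMs / 2,
    requestsPerHour: msPerHour / intervalMs,
  };
}

// WebSocket: one-off setup cost on a fresh connection, in round trips.
function webSocketSetupMs(rttMs, roundTrips = 3) {
  return roundTrips * rttMs;
}

console.log(pollingCost(1000));     // the 1-second interval from the list
console.log(webSocketSetupMs(10));  // same-region connection, 10 ms RTT
console.log(webSocketSetupMs(150)); // global connection, 150 ms RTT
```

Stretching the polling interval to five seconds cuts the request count to 720 per hour per client and raises the average lag to 2.5 seconds, which is the trade the worked example accepts for its fallback path.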

What Slack, Figma, Twitch, and Discord actually use

Real systems, real choices.

Slack: WebSockets per user, with reconnect logic. Each connected user holds one WebSocket to Slack's edge. Workspace-wide events fan out from a central event bus to the WebSockets of users in that workspace. Slack ran ~10 million concurrent connections per region in 2023 (public re:Invent talk). The connection multiplexer (called "Flannel") is the heart of Slack's realtime stack: written in Go, it sits between the event bus and the public-facing WebSocket gateways.

Figma: WebSockets + a CRDT-inspired sync model. Figma's collaborative design canvas is one of the most demanding realtime apps shipping. Each user with a file open holds one WebSocket, carrying changes that merge automatically. Figma describes the approach as inspired by CRDTs rather than a true CRDT, because a central server has the final say on ordering. The server-side process runs a single-threaded event loop per file (~one CPU core max per file) for lock-free updates. Figma published a detailed engineering blog in 2019 on the architecture.

Twitch: HLS for video, IRC-style chat over WebSockets. Live video streaming uses HLS (HTTP Live Streaming) with 2–6 second segments, which is not really realtime but is resilient and CDN-friendly. The chat overlay is on a separate channel: WebSockets backed by an IRC-derived protocol Twitch built. Chat lag and video lag are independent; this is why Twitch chat reactions sometimes precede the video event by a few seconds.

Discord: voice on WebRTC, text on WebSockets. Voice and video channels use WebRTC for the sub-100 ms lag end users expect from voice chat. Text channels use WebSockets, with a custom binary protocol on top. Discord's 2017 "How Discord Stores Billions of Messages" and 2023 "How Discord Stores Trillions of Messages" engineering posts document the storage side of the architecture; the realtime layer alone handles millions of concurrent connections per region.

Google Docs: long polling, then WebSockets. Surprisingly, Google Docs ran on long polling for years (chosen for compatibility with corporate proxies). Modern Google Docs uses WebSockets where supported, falling back to long polling. The lesson: production realtime systems are often hybrid for compatibility, not just performance.


A closing note

Realtime is a design choice, not a technology choice. Pick the lowest-complexity mode that meets your latency budget; pick a higher-complexity one only when you've measured and the simpler thing isn't enough.
