
WebSockets and SSE

Standard HTTP is request/response. Sometimes the server has news the client needs to know about (a new message, a price update, a job completing) and waiting for the client to ask is the wrong shape. There are four established ways to get that news from server to browser, plus the still-emerging WebTransport, and each fits a different traffic pattern. None of them is hard to set up; all of them have a few production gotchas worth knowing about ahead of time.


Why request/response cannot push

Plain HTTP has one shape: the client asks, the server answers, the exchange ends. The server has no channel back to a client it is not currently answering. If a chat message arrives, a stock ticks, or a long job finishes, the server holds news that the client has no way of hearing until the client happens to ask again. That asymmetry is fine for loading a page or submitting a form. It is the wrong shape the moment the interesting event originates on the server and the client wants to know about it now rather than on its next poll.

The naive fix is to ask more often. Poll every second and you cut the worst-case delay to a second, but you also pay for a full request when there is usually nothing to report, and you still carry up to a second of staleness. Every one of the techniques on this page is a different way to keep a channel open so the server can speak first. They differ in how the channel is held open, whether the client can also speak, and how gracefully each one survives the proxies, load balancers, and timeouts that sit between a browser and your servers. For a higher-level tour of the same ground see real-time communication.

| Approach | Direction | Wire | Good for |
| --- | --- | --- | --- |
| Polling | Server → client (one update per request) | Plain HTTP | Low-frequency updates; older infra; fallback |
| Long-polling | Server → client (held request) | Plain HTTP | Push over old infra; the universal fallback |
| Server-Sent Events | Server → client (continuous stream) | HTTP, text/event-stream | Notifications, live dashboards, anything one-way |
| WebSockets | Bidirectional | RFC 6455 over upgraded HTTP | Chat, collaborative editing, multiplayer |
| WebTransport | Bidirectional, datagrams + streams | HTTP/3 over QUIC | Latency-sensitive workloads; still emerging |

The diagram below puts the three workhorses side by side. Polling spends a request per check and learns nothing most of the time. SSE keeps one response open and the server writes into it as events happen. A WebSocket keeps one connection open that either side can write to at any moment.

[Figure: three client/server sequence diagrams side by side. Polling: mostly "nothing yet". SSE: one open response. WebSocket: both directions, anytime. Blue arrows are messages the server sent without being asked.]
Three ways to get server news to a browser. Polling pays per check; SSE and WebSockets hold the channel open so the server can speak first.

Polling and long-polling

Plain polling is the floor. The client sets a timer and requests an endpoint on a fixed interval; the server answers with whatever has happened since last time, or with an empty result. It needs nothing special from the network and nothing special from the server, so it always works. The cost is the tradeoff between freshness and waste: a short interval means low latency and many empty responses, a long interval means cheap traffic and stale data. There is no setting that gives you both.

Long-polling keeps the simplicity but removes most of the waste. The server holds the request open until it actually has something to send, then replies. The client reads the reply and immediately opens another request. As long as one request is always in flight, the server can push the instant it has news, and the only empty responses are the ones that hit the hold timeout.

GET /events?after=124 HTTP/1.1
Host: example.com

# server holds the request for up to 30s, then:
HTTP/1.1 200 OK
Content-Type: application/json

[{"id":125, "type":"message.created", "...":"..."}]

# client immediately requests:
GET /events?after=125 HTTP/1.1
...

Long-polling works through any HTTP infrastructure (load balancers, proxies, CDNs) without special configuration, because every exchange is an ordinary request and an ordinary response. That is why it remains the universal fallback: when a WebSocket cannot get through a corporate proxy or an SSE stream gets buffered to death, libraries quietly fall back to long-polling and most users never notice. The trade is one full HTTP request per update plus the latency of opening the next request, and a window where no request is in flight (between the reply landing and the next request leaving) during which a fresh event has to wait. That gap is small but real, which is the main reason it loses to the streaming transports once updates get frequent.
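The server-side "hold" can be sketched without any web framework. The following is a minimal in-memory illustration, not a production design: `EventLog`, `publish`, and `wait_after` are hypothetical names, and a real handler would call something like `wait_after` from inside its HTTP route for `GET /events?after=N`.

```python
import threading


class EventLog:
    """In-memory event log with a long-poll style wait (hypothetical helper)."""

    def __init__(self):
        self._events = []                    # (id, payload) pairs, ids 1, 2, 3, ...
        self._cond = threading.Condition()

    def publish(self, payload):
        """Append an event and wake every request that is currently held."""
        with self._cond:
            event_id = len(self._events) + 1
            self._events.append((event_id, payload))
            self._cond.notify_all()
            return event_id

    def wait_after(self, after, timeout=30.0):
        """Return events with id > after, holding for up to `timeout` seconds.

        An empty list means the hold timed out with nothing to report,
        which is the only "empty response" long-polling produces.
        """
        with self._cond:
            # ids are dense (1..n), so "more than `after` events" means news exists
            self._cond.wait_for(lambda: len(self._events) > after, timeout=timeout)
            return [e for e in self._events if e[0] > after]
```

The client's half is a loop: call the endpoint with the last ID it saw, process whatever comes back, and immediately call again.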

Server-Sent Events

SSE is the smallest possible "real" push channel: a single HTTP response that never ends. The server replies with Content-Type: text/event-stream and then, instead of closing the body, keeps writing into it. Each event is a few lines of UTF-8 text followed by a blank line. There is no second protocol to learn, no upgrade handshake, no framing layer. It is HTTP all the way down, which is exactly why it slips through nearly every proxy and load balancer without special handling.

HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache

id: 125
event: message.created
data: {"id":125, "channel":"ops"}

id: 126
event: message.created
data: {"id":126, "channel":"ops"}

: keepalive comment

The format has exactly four fields. data: carries the payload, and several data: lines in a row are joined with newlines so you can stream multi-line text. event: names the event type, which the client can listen for by name. id: sets the event ID, and this one field is what makes reconnection work. retry: tells the client how long to wait before reconnecting. A line that starts with a colon is a comment, which is how the keepalive at the bottom works: the server sends a no-op comment every 15 to 30 seconds so that idle intermediaries do not decide the connection is dead and cut it.
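To make the wire format concrete, here is a small sketch of an encoder and a simplified decoder for it. The function names are illustrative rather than taken from any library, and the parser covers only the rules described above (it assumes `\n` line endings and ignores `retry:`).

```python
def format_sse(data, event=None, event_id=None, retry_ms=None):
    """Serialize one event; multi-line data becomes several data: lines."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    if retry_ms is not None:
        lines.append(f"retry: {retry_ms}")
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"      # the blank line ends the event


def keepalive():
    """A comment line: ignored by the client, but keeps the connection busy."""
    return ": keepalive\n\n"


def parse_sse(text):
    """Simplified client-side parse of a complete chunk of stream text."""
    events, data, event, last_id = [], [], None, None
    for line in text.split("\n"):
        if line == "":                    # blank line: dispatch if there is data
            if data:
                events.append({"id": last_id,
                               "event": event or "message",
                               "data": "\n".join(data)})
            data, event = [], None
            continue
        if line.startswith(":"):          # comment, e.g. a keepalive
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):         # one leading space is stripped
            value = value[1:]
        if field == "data":
            data.append(value)
        elif field == "event":
            event = value
        elif field == "id":
            last_id = value               # persists until the next id: line
    return events
```

Note that the last seen ID carries over to later events that do not set their own, which is what lets the browser report it on reconnect.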

On the browser side you do almost nothing. new EventSource('/events') opens the stream, fires a message event for each unnamed event and a named event for each event: line, and — this is the part that earns SSE its reputation — reconnects on its own if the connection drops. When it reconnects it sends the ID of the last event it saw back to the server in a Last-Event-ID request header. A server that keeps a short buffer of recent events can replay everything after that ID, so a blip in the network turns into a gap-free resume rather than a hole in the stream.

[Figure: EventSource/server sequence. GET /events; id: 125 data: …; id: 126 data: …; connection drops and events 127, 128 are missed; GET /events with Last-Event-ID: 126; id: 127 (replayed); id: 128 (replayed).]
The browser reconnects on its own and sends the last ID it saw. A server that buffers recent events replays the gap, so the client misses nothing.
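The "short buffer of recent events" on the server can be as simple as a bounded deque. This is a sketch under assumptions: `ReplayBuffer` is a hypothetical name, IDs are integers assigned by the buffer itself, and the header value is assumed to be a numeric string.

```python
from collections import deque


class ReplayBuffer:
    """Keeps the most recent events so a reconnecting client can resume."""

    def __init__(self, capacity=1000):
        self._events = deque(maxlen=capacity)   # oldest entries fall off the front
        self._next_id = 1

    def append(self, data):
        event_id = self._next_id
        self._next_id += 1
        self._events.append((event_id, data))
        return event_id

    def replay_after(self, last_event_id):
        """Events newer than the Last-Event-ID header value.

        Returns [] for a fresh client (no header), and None when the gap
        reaches past the oldest buffered event and cannot be filled.
        """
        if last_event_id is None:
            return []
        last = int(last_event_id)
        oldest = self._events[0][0] if self._events else self._next_id
        if last + 1 < oldest:
            return None                          # needed events were evicted
        return [e for e in self._events if e[0] > last]
```

The `None` case is the one worth designing for: when the client has been away longer than the buffer remembers, the server has to tell it to reload state from scratch rather than pretend the resume was gap-free.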

There are two real constraints to plan around. The first is the per-origin connection limit. Over HTTP/1.1 a browser opens at most six connections to one origin, and an open SSE stream holds one of them for as long as it lives. Open a stream in each of several tabs and you can starve the page of connections for ordinary requests. HTTP/2 turns those connections into multiplexed streams and raises the practical ceiling to around a hundred, which is usually enough, so SSE and HTTP/2 are a natural pair. The second is buffering. Some older corporate proxies and a few CDN edges hold an HTTP response until it completes before forwarding it, which for an endless stream means the client sees nothing at all. Disable response buffering on the streaming route, set Cache-Control: no-cache, and on nginx send X-Accel-Buffering: no.

SSE has two genuine limits. It is one-way: the client cannot send anything back over the stream, only over separate ordinary requests. And the spec only carries UTF-8 text, so binary has to be base64-encoded, which costs about a third more bytes. Neither matters for the things SSE is good at — notifications, live dashboards, progress and log streams, and, as it happens, the token-by-token output of an LLM, which is why most chat APIs stream their responses over SSE.

When to pick SSE over WebSockets. If your traffic is server-to-client and most messages are small JSON updates, SSE is simpler in every way: no upgrade handshake, no framing format to think about, automatic reconnect with replay, and it works through any HTTP-aware proxy. Use WebSockets only when the client needs to talk back over the same channel.

WebSockets

A WebSocket is a full-duplex channel: both sides can send at any moment, with no request and no reply, until one of them closes. It does not start that way. It starts as an ordinary HTTP/1.1 GET that asks to be promoted. The client sends Upgrade: websocket along with a random Sec-WebSocket-Key; if the server agrees, it answers 101 Switching Protocols and from that point the bytes on the same TCP connection stop being HTTP and become WebSocket frames. The handshake is the bridge between the HTTP world (where proxies, auth, and routing all already work) and a raw two-way pipe.

[Figure: browser/server sequence. HTTP phase: GET /ws with Upgrade: websocket and Sec-WebSocket-Key, answered by 101 Switching Protocols with Sec-WebSocket-Accept. Frames phase: text frame (masked), binary frame, ping →, ← pong.]
The connection begins as HTTP. After the 101 reply the same socket carries WebSocket frames in both directions, including ping/pong control frames.

The Sec-WebSocket-Accept value in the reply is not decoration. The server takes the client's key, appends a fixed magic string defined by the spec, hashes it with SHA-1, and base64-encodes the result. The client checks that the server returned exactly that value. A caching proxy that blindly replayed an old response would fail this check, so the handshake doubles as proof that a real WebSocket-aware server answered and not some intermediary serving a stale page.

# client → server (handshake)
GET /ws HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

# server → client
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

# from here on, both sides exchange WebSocket frames
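The accept value in the exchange above can be reproduced in a few lines. This sketch uses only the standard library; the function name is illustrative, and the fixed string is the GUID defined in RFC 6455.

```python
import base64
import hashlib

# Fixed "magic string" from RFC 6455, appended to the client's key.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

Feeding it the key from the handshake above, `dGhlIHNhbXBsZSBub25jZQ==`, yields the `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=` shown in the server's reply.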

Once the handshake is done, everything travels as frames. A WebSocket frame is a small header of 2 to 14 bytes followed by a payload. The header carries an opcode that says what the frame is — text, binary, ping, pong, or close — a FIN bit that marks the last fragment of a message, a mask bit, and a length field that grows as the payload grows. A single logical message can be split across several frames, so a large upload need not be buffered whole before the first byte goes out. Text frames must be valid UTF-8; binary frames carry any bytes you like, which is the big practical win over SSE — you can send protobuf, images, or compressed blobs without base64.
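The way the header grows from 2 to 14 bytes is easier to see in code. A minimal sketch of header encoding, assuming no extensions (the reserved bits stay zero); `encode_header` is an illustrative name, not a library function.

```python
import struct

# Opcode values from RFC 6455.
OPCODES = {"text": 0x1, "binary": 0x2, "close": 0x8, "ping": 0x9, "pong": 0xA}


def encode_header(opcode, payload_len, fin=True, mask_key=None):
    """Build a frame header: 2 bytes minimum, 14 with 64-bit length and mask."""
    first = (0x80 if fin else 0x00) | opcode          # FIN bit + opcode
    mask_bit = 0x80 if mask_key is not None else 0x00
    if payload_len < 126:
        # length fits in the 7 bits of the second byte
        header = struct.pack("!BB", first, mask_bit | payload_len)
    elif payload_len < 65536:
        # marker 126, then a 16-bit length
        header = struct.pack("!BBH", first, mask_bit | 126, payload_len)
    else:
        # marker 127, then a 64-bit length
        header = struct.pack("!BBQ", first, mask_bit | 127, payload_len)
    return header + (mask_key or b"")                 # 4-byte key if masked
```

An unmasked five-byte text message needs only `0x81 0x05` in front of it; a masked 70,000-byte binary frame needs the full 14.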

One quirk surprises people: every frame the client sends must be masked with a four-byte XOR key, regenerated per frame, while server-to-client frames are never masked. This is not about secrecy — the key travels in the clear right next to the data. It exists so that a confused intermediary cannot be tricked into treating attacker-controlled frame bytes as a cached HTTP request, a cache-poisoning attack that was demonstrated against early proxies. Masking randomizes the bytes on the wire so they cannot be steered to look like a valid request line. Your library handles it; the cost is a small per-frame XOR.
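The masking step itself is tiny. A sketch of what a library does on your behalf, with illustrative function names:

```python
import os


def new_mask_key():
    """A fresh, unpredictable 4-byte key; clients pick a new one per frame."""
    return os.urandom(4)


def mask_payload(payload, key):
    """XOR each payload byte with the key, cycling through its 4 bytes.

    XOR is its own inverse, so the same function unmasks on the server.
    """
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))
```

Because the operation is symmetric, `mask_payload(mask_payload(data, key), key)` returns the original bytes, which is how the server recovers the payload.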

Beyond the handshake, the operational details are where WebSockets get more demanding than SSE, precisely because they are no longer ordinary HTTP and the surrounding infrastructure no longer manages them for free:

  • No heartbeat by default. The protocol defines ping and pong frames but no schedule for sending them and no timeout. If the connection silently drops (a laptop sleeps, a mobile network changes towers, a NAT entry expires) neither side notices until its next write fails, which on an idle connection may be never. Send ping control frames on a timer and treat a missing pong within a few seconds as a dead connection. Without this you accumulate half-open sockets that consume memory and let you "send" to clients that left long ago.
  • Load balancers need configuring. Most reverse proxies apply an idle timeout to connections, often around 60 seconds, and a quiet WebSocket looks idle even though it is alive. Either raise that timeout above your heartbeat interval or make the heartbeat frequent enough to keep the connection looking busy. The proxy also has to be told to pass the Upgrade and Connection headers through, or the handshake never completes.
  • Sticky sessions. A WebSocket lives on one server for its whole life, so the load balancer must keep routing that client to that server. And when the client reconnects after a drop, it may land on a different replica entirely. That new server needs to recover the client's state, which means either sticky routing plus a way to rehydrate, or — better — a shared backplane (Redis pub/sub, NATS, Kafka) so that any server can serve any client.
  • Closing matters. The WebSocket close is a two-step handshake: one side sends a close frame with a status code, the other replies with its own close frame, then the TCP connection is torn down. Skip it and you leak file descriptors and miss the status code that tells you why a client left. Many libraries half-close and never finish, so check that yours completes the handshake.
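The ping-and-reap habit from the first bullet can be sketched as bookkeeping that is independent of any WebSocket library. `HeartbeatTracker` and its methods are hypothetical; the intervals are placeholders, and actually sending the ping frame or closing the socket is left to the caller.

```python
import time


class HeartbeatTracker:
    """Decides which connections to ping and which to reap."""

    def __init__(self, ping_interval=20.0, pong_timeout=5.0, clock=time.monotonic):
        self.ping_interval = ping_interval
        self.pong_timeout = pong_timeout
        self._clock = clock                 # injectable so it can be tested
        self._conns = {}                    # conn_id -> timing state

    def add(self, conn_id):
        self._conns[conn_id] = {"last_seen": self._clock(), "ping_sent": None}

    def on_pong(self, conn_id):
        """Call when a pong arrives: the peer is alive, stop waiting."""
        conn = self._conns.get(conn_id)
        if conn is not None:
            conn["last_seen"] = self._clock()
            conn["ping_sent"] = None

    def tick(self):
        """Run on a timer. Returns (ids to ping now, ids to treat as dead)."""
        now = self._clock()
        to_ping, dead = [], []
        for conn_id, conn in list(self._conns.items()):
            if conn["ping_sent"] is not None:
                # a ping is outstanding: reap if the pong is overdue
                if now - conn["ping_sent"] > self.pong_timeout:
                    dead.append(conn_id)
                    del self._conns[conn_id]
            elif now - conn["last_seen"] >= self.ping_interval:
                conn["ping_sent"] = now
                to_ping.append(conn_id)
        return to_ping, dead
```

Keeping `ping_interval` comfortably below the proxy idle timeout mentioned in the second bullet lets the same heartbeat serve both purposes.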

Choosing between them

The decision is mostly about direction, and only then about everything else. If the data flows one way — server to client — SSE is the right default. It is less code, it survives proxies, the browser handles reconnection and replay for you, and there is no second protocol to operate. Reach for WebSockets when the client needs to push over the same channel with low latency: a chat where you are typing, a collaborative document where your cursor moves, a game where your inputs go up as often as state comes down. Reach for WebSockets too when you need to send binary efficiently. If you only think you might need two-way later, start with SSE plus ordinary POST requests for the rare upstream message; that combination covers a large share of "real-time" apps without the operational weight of a long-lived two-way socket.

| Question | If yes |
| --- | --- |
| Does only the server push? | SSE |
| Does the client push often, with low latency? | WebSockets |
| Do you need to send binary efficiently? | WebSockets |
| Must it work through unknown proxies with zero config? | SSE, or long-poll fallback |
| Is it occasional, low-frequency news? | Polling or long-polling |
| Are you streaming an LLM response token by token? | SSE |

For a longer side-by-side that weighs the two against each other feature by feature, see WebSockets vs SSE. The short version is that "bidirectional" is the only word that should push you off SSE; almost everything else SSE does at least as cleanly.

WebTransport

WebTransport is the newer option, layered on HTTP/3 (which itself runs on QUIC). It gives you both reliable bidirectional streams and unreliable datagrams over the same connection — useful for things like multiplayer game state where the latest state matters more than historical updates.

In 2025 it's supported in Chrome, Edge, and Firefox; Safari support is partial. The server side is still maturing — most production servers reach for it through Cloudflare or aioquic-based stacks. For most teams, WebSockets remains the safer choice today; WebTransport is worth tracking for the next few years.

Scaling long-lived connections

Request/response servers scale by being stateless: any box can answer any request, so you add boxes. Push servers break that assumption, because each one holds thousands of open connections that each have a little state. Three problems show up at scale, and they are the same whether you chose SSE or WebSockets.

The first is connection count. An idle connection is cheap on CPU but not free: each costs a file descriptor, some kernel socket memory, and a slice of your app's per-connection state. A single server can hold tens or hundreds of thousands of mostly-idle connections, but the ceiling is real, and the figure people quote for "requests per second" tells you nothing about "concurrent connections held open." Size for the connections, not the request rate.

The second is fan-out. When an event needs to reach many clients, the server holding those connections has to write to each one, but the event usually originates somewhere else — a different replica, a background worker, a database change. So you put a publish/subscribe backplane between the producers of events and the servers holding connections. A producer publishes once to Redis, NATS, or Kafka; every connection server subscribes and pushes to the clients it happens to hold. This is what lets a message typed on one server reach a recipient connected to another, and it is the piece people most often forget when a prototype that worked on one box falls apart on three.

[Figure: a producer publishes once to a pub/sub broker (Redis · NATS · Kafka); connection server A holds clients 1–3, B holds clients 4–6, C holds clients 7–9; each server pushes to only the clients it holds.]
One publish, many pushes. The backplane decouples whoever produces an event from whichever servers happen to hold the connected clients.
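The shape of the backplane can be shown with an in-process stand-in. This sketch replaces Redis, NATS, or Kafka with a plain dictionary of callbacks, and replaces sockets with lists, so every name here is hypothetical; only the topology matches the description above.

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for a pub/sub broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver to every subscriber; returns how many were notified."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)


class ConnectionServer:
    """Holds some clients and pushes broker events to only those clients."""

    def __init__(self, broker, topic):
        self.clients = {}                       # client_id -> delivered messages
        broker.subscribe(topic, self._on_event)

    def connect(self, client_id):
        self.clients[client_id] = []            # a list stands in for a socket

    def _on_event(self, message):
        for outbox in self.clients.values():
            outbox.append(message)              # real code would write to the socket
```

The producer never learns which server holds which client. It publishes once, and each connection server fans the message out locally.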

The third is liveness. A connection can die without either end being told, so servers must heartbeat (pings on WebSockets, comment keepalives on SSE) and reap connections that stop answering. Reaping matters as much for memory as for correctness: a server that never notices dead connections slowly fills with zombies until it runs out of descriptors. Pair heartbeats with sensible idle timeouts on both the app and every proxy in the path, and make sure the two agree, since a proxy timeout shorter than your heartbeat will sever live connections for no reason.

Reliability gaps and reconnect strategy

None of these transports guarantees delivery on its own. TCP makes sure the bytes that arrive do so in order, but it does nothing for bytes that were in flight when a connection died, and it cannot tell you whether your peer processed a message or merely received it. Treat every push channel as best-effort and build the guarantees you need on top. Whichever transport you pick, the client should expect to reconnect, and a few habits make that go smoothly.

  • Backoff with jitter. If a server has a momentary outage and a thousand clients all reconnect on a fixed timer, they arrive in a synchronized wave that knocks the server straight back over. Backoff doubles the wait after each failed attempt, capped at maybe 30 seconds; jitter adds random spread of around half the delay so the wave smears out into a trickle the server can absorb.
  • A monotonic cursor. Give every event a number that only goes up — an event ID for SSE, an application sequence number for a WebSocket, a "since" timestamp for long-polling — and have the client send the last one it saw on reconnect. The server resumes from there. Without a cursor a reconnect either drops events or replays the whole history.
  • Idempotent message handling. Resumes and retries deliver duplicates, so the receiver must be able to recognize a message it has already processed and drop it. Carry a stable message ID and dedupe on it; do not let "apply" and "apply again" produce different results.
  • Back-pressure. A slow client cannot drain messages as fast as the server produces them, so the server's send buffer for that connection grows. Decide in advance what happens: drop the slowest clients, coalesce updates into the latest state, or apply a bounded buffer and disconnect anyone who overflows it. An unbounded buffer is a memory leak waiting for one slow phone.
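Three of the habits above fit in a short sketch. All names are hypothetical. The jitter here is one interpretation of "around half the delay" (a uniform pick between 50% and 100% of the capped delay), the receiver assumes in-order delivery so it can dedupe on the cursor alone, and the outbox shows only the bounded-buffer policy.

```python
import random
from collections import deque


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random):
    """Seconds to wait before reconnect attempt `attempt` (0-based)."""
    delay = min(cap, base * (2 ** attempt))     # double each time, then cap
    return delay * rng.uniform(0.5, 1.0)        # spread clients over half the delay


class Receiver:
    """Applies each sequence number at most once and tracks the cursor."""

    def __init__(self):
        self.cursor = 0                         # last sequence number applied
        self.applied = []

    def handle(self, seq, payload):
        if seq <= self.cursor:
            return False                        # duplicate from a resume: drop it
        self.applied.append(payload)
        self.cursor = seq                       # send this on the next reconnect
        return True


class BoundedOutbox:
    """Per-connection send buffer that refuses to grow without limit."""

    def __init__(self, limit):
        self.limit = limit
        self.overflowed = False
        self._queue = deque()

    def push(self, message):
        if len(self._queue) >= self.limit:
            self.overflowed = True              # caller should disconnect this client
            return False
        self._queue.append(message)
        return True
```

Coalescing to latest state or dropping the slowest clients are equally valid policies; the point is that one of them is chosen before the slow phone shows up.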
