
Network poller

Go lets you write network code as if every read and write were a blocking syscall, and yet a single process can hold a million open connections on a handful of OS threads. The trick is the runtime netpoller: every socket is registered with the kernel's async interface — epoll on Linux, kqueue on BSD, IOCP on Windows — and goroutines are parked and woken on socket readiness without the code ever seeing it.


The goroutine-per-connection model

Go's network model is straightforward: every accepted connection gets its own goroutine, and that goroutine reads and writes the socket using calls that look synchronous on the page. conn.Read(buf) reads bytes. conn.Write(buf) writes them. There's no callback, no event loop, no future, no async/await. The handler just reads and writes.

Underneath, those calls are anything but synchronous. The socket is put in non-blocking mode and registered with the netpoller when it is created. When a syscall would block (there are no bytes ready to read, or the kernel's send buffer is full), the runtime records the goroutine as the socket's waiter, parks it, and frees the OS thread to run another goroutine. When the kernel reports the socket is ready, the runtime marks the parked goroutine runnable and the scheduler eventually resumes it. This illusion of blocking I/O on top of readiness notification is the single most important property of the net package, and net/http inherits it.

// What you write:
func handle(c net.Conn) {
    defer c.Close()
    buf := make([]byte, 4096)
    n, err := c.Read(buf) // looks blocking
    if err != nil {
        return
    }
    reply := process(buf[:n]) // process is a placeholder for your own logic
    c.Write(reply)            // looks blocking
}

// What actually happens on Read:
//   1. attempt a non-blocking read(fd)
//   2. if EAGAIN: record this goroutine on the fd's poll descriptor, gopark()
//   3. some time later: netpoller reports fd readable
//   4. goready(g), scheduler resumes us
//   5. retry read(fd), return bytes

netpoll — the runtime's I/O subsystem

The runtime's network poller lives in runtime/netpoll.go, with one platform-specific backend per kernel interface: netpoll_epoll.go for Linux, netpoll_kqueue.go for BSD and macOS, and netpoll_windows.go for IOCP, among others. They all expose the same interface to the rest of the runtime (netpollopen, netpollclose, netpoll) and translate it to whatever the host kernel speaks.

There is one netpoller per Go process. Every socket opened by net registers itself with it on creation. When a goroutine calls conn.Read and the syscall returns EAGAIN, the runtime calls netpollblock, which stashes the goroutine pointer on the file descriptor's poll entry and parks the goroutine. The descriptor sits in the kernel's epoll set waiting for an event.

Why this matters. A traditional blocking-I/O server needs one OS thread per connection. Each thread carries a kernel stack (16 KB on x86-64 Linux), a user-space stack (typically 8 MB of reserved address space with glibc defaults), and scheduler state, so such designs tend to hit a wall at maybe ten thousand connections. Go needs OS threads only for goroutines that are actually running (at most GOMAXPROCS of them execute Go code at any moment), and the parked goroutines waiting on sockets cost only a small g struct and a 2 KB user-space stack. The kernel tracks readiness for all of them through a single epoll descriptor.

Sysmon and the netpoll integration

The scheduler asks the netpoller for ready sockets in two main places. First, sysmon (the dedicated runtime monitor thread) calls netpoll(0) whenever no other thread has polled for roughly 10 ms, draining any pending readiness events without blocking. Second, when a thread finds nothing runnable and is about to go idle, it calls netpoll with a blocking delay and parks inside the kernel's epoll_wait until something is ready.

Goroutines blocked on those sockets get pulled off the netpoller's per-fd waiter list and marked runnable. From there they're regular runnable goroutines — pushed onto a P's local run-queue, eventually picked up by an M, eventually running. The handler resumes inside its conn.Read as if nothing happened.

Why a million connections is feasible

An idle TCP connection in a Go server costs roughly: one goroutine (g struct plus its 2 KB initial stack, which often stays at 2 KB for read-heavy connections), one file descriptor (a few hundred bytes in kernel memory plus the socket buffers), and one entry in the netpoller's epoll set.

Per idle connection                     Approx. cost
Goroutine (g struct + 2 KB stack)       ~2.5 KB user space
Kernel struct sock + minimum buffers    ~3 KB kernel space
Netpoller poll entry                    ~100 bytes
Total per idle connection               ~5–6 KB

A million idle connections is therefore roughly 5 to 6 GB of memory plus some kernel tuning: fs.file-max, fs.nr_open, and the process's RLIMIT_NOFILE must allow a million descriptors, and net.core.somaxconn sizes the accept backlog for connection bursts. Go services are reported to have crossed the million-connection line on commodity hardware. The number isn't the point; the point is that the design scales until something else (TLS, real CPU work, socket buffers) runs out first.

Where it breaks

The model is forgiving until connections start doing real work. A few honest failure modes:

  • Active connections under sustained load. Idle is cheap; busy is not. Each read/write goroutine that's actually running needs a real CPU slice. A million connections all chatting at once is still a million goroutines competing for GOMAXPROCS threads.
  • TLS connections. The handshake is real CPU work (an RSA or ECDSA signature, AEAD setup) and the per-connection tls.Conn state is several KB on top of the bare TCP cost. A million TLS connections is a very different number from a million TCP connections.
  • HTTP/2 multiplexing. One TCP connection can carry many streams. The per-stream concurrency story is different; see below.
  • Slow clients. A client that trickles bytes one per second holds a goroutine parked on every Read. With no timeouts you can be slowloris'd into the ground.

http.Server tuning

The default http.Server has no timeouts. ReadTimeout, ReadHeaderTimeout, WriteTimeout, and IdleTimeout all default to zero, which the documentation describes as "no timeout" and which production engineers describe as "the bug". Always set them.

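// Needs: context, errors, log, net/http, os, os/signal, syscall, time.
// One way to obtain ctx: cancel it on SIGTERM or Ctrl-C.
ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, os.Interrupt)
defer stop()

srv := &http.Server{
    Addr:              ":8080",
    Handler:           mux,               // your http.Handler
    ReadHeaderTimeout: 5 * time.Second,   // slowloris guard
    ReadTimeout:       30 * time.Second,  // entire request: headers plus body
    WriteTimeout:      30 * time.Second,  // entire response
    IdleTimeout:       120 * time.Second, // keep-alive between requests
    MaxHeaderBytes:    1 << 20,           // 1 MiB cap on headers
}

// Graceful drain on SIGTERM:
drained := make(chan struct{})
go func() {
    defer close(drained)
    <-ctx.Done()
    shutCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()
    if err := srv.Shutdown(shutCtx); err != nil {
        log.Printf("shutdown: %v", err)
    }
}()

// ListenAndServe returns http.ErrServerClosed as soon as Shutdown starts,
// so wait for the drain to finish instead of exiting right away.
if err := srv.ListenAndServe(); !errors.Is(err, http.ErrServerClosed) {
    log.Fatal(err)
}
<-drained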

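MaxHeaderBytes bounds the memory a client can consume with request headers. Left at zero it falls back to http.DefaultMaxHeaderBytes (1 MB), so setting it explicitly mainly documents the limit and lets you lower it; request bodies need their own cap, for example http.MaxBytesReader. Shutdown stops accepting new connections, lets in-flight requests finish, and closes idle keep-alive connections, which is preferable to Close, which yanks the rug out. Because ListenAndServe returns as soon as Shutdown is called, the program has to wait for Shutdown itself to return before exiting.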

TLS

net/http uses crypto/tls by default for HTTPS. The handshake is CPU-heavy — a few milliseconds of asymmetric crypto on every fresh connection — and the per-connection state holds buffers and keying material on top of the plain TCPConn.

Session resumption via TLS 1.3 session tickets is enabled by default on the server; a Go client resumes sessions only when its tls.Config has a ClientSessionCache set. A resumed handshake skips the certificate exchange and signature verification, so reconnecting clients avoid most of the asymmetric-crypto cost. Hardware AES-NI keeps the steady-state encryption cost small, typically a percent or two of CPU at gigabit speeds, so the bottleneck is almost always the handshake, not the bulk crypto.

When to terminate TLS at the edge. If your service does tens of thousands of new TLS handshakes per second, terminate at a separate layer (Envoy, nginx, an ALB) and run your http.Server on plaintext upstream. You keep the Go code simple and let a dedicated proxy own the certificate lifecycle and the handshake CPU budget.

HTTP/2

Go's HTTP/2 server is on by default for any HTTPS listener; net/http ships a bundled copy of golang.org/x/net/http2 and enables it automatically. One TCP connection per client, with streams multiplexed inside it. The handler-per-request abstraction looks the same to your code; underneath, each stream's handler runs in its own goroutine, while a small set of per-connection goroutines reads frames, serializes writes, and tracks the flow-control state the streams share.

The "one goroutine per logical request" pattern still applies; what changes is that the underlying TCP connection is shared. Flow control works at two levels: a stream whose receiver doesn't drain its window stalls only itself, but unread data also counts against the connection-level window, so enough of it can stall every stream on the connection. Frames are interleaved with a 16 KiB default maximum frame size, so a large response from one stream doesn't fully starve another, but head-of-line blocking at the TCP layer is still a real failure mode under packet loss (which is why HTTP/3 moved to QUIC).
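
A handler can see which protocol carried a request through r.Proto, which reads HTTP/2.0 for requests arriving over HTTP/2. The following sketch uses net/http/httptest to start a throwaway TLS server with HTTP/2 enabled and report what the handler saw; negotiatedProto is an illustrative name.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
)

// negotiatedProto starts a temporary TLS test server with HTTP/2 enabled,
// makes one request with the server's own preconfigured client, and returns
// the protocol string the handler observed.
func negotiatedProto() (string, error) {
	h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, r.Proto) // the handler code is the same for HTTP/1.1 and HTTP/2
	})
	ts := httptest.NewUnstartedServer(h)
	ts.EnableHTTP2 = true // must be set before StartTLS
	ts.StartTLS()
	defer ts.Close()

	resp, err := ts.Client().Get(ts.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	proto, err := negotiatedProto()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(proto)
}
```

The handler body is identical to what an HTTP/1.1 server would run, which is the point: multiplexing changes the transport, not the programming model.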

Common pitfalls

  • No timeouts. The single most common production bug in Go HTTP servers. Defaults are zero, which means infinite. Set all four on http.Server; set Timeout on every http.Client too — http.DefaultClient has none either.
  • Goroutines leaked on disconnected clients. If your handler kicks off background work that holds a goroutine, watch r.Context().Done(): the server cancels the request context when the client disconnects or the handler returns, and goroutines that ignore it keep running and accumulating.
  • Connection pool exhaustion on the client. http.Transport.MaxIdleConnsPerHost defaults to 2. For a service that makes lots of concurrent requests to a single upstream, that's a quiet bottleneck: only two connections per host are kept idle for reuse, the rest are closed when their request finishes, and later bursts open fresh TCP+TLS connections. Bump it to something like 100 for chatty internal services.
  • TLS handshake costs at fleet scale. A thousand pods each opening fresh connections to a service is a lot of handshakes. Long-lived clients, connection reuse, and session resumption all help.
  • Ignoring HTTP/2 flow control. A slow consumer that doesn't drain its window can stall a stream while the rest of the connection runs fine — but the stalled goroutine is still parked, holding resources.

Production checklist

  • Set all four timeouts on http.Server: ReadHeaderTimeout, ReadTimeout, WriteTimeout, IdleTimeout.
  • Set MaxHeaderBytes to bound per-request memory.
  • Use http.Server.Shutdown for graceful drain on SIGTERM.
  • On http.Client: set Timeout, bump Transport.MaxIdleConnsPerHost, reuse the client across requests.
  • Terminate TLS at the edge if you're doing tens of thousands of fresh handshakes per second.
  • Monitor goroutine count over time — a steadily climbing number is your leak detector.
  • Measure handshake rate, cipher distribution, and HTTP version split with tls.ConnectionState in middleware.
  • Prefer HTTP/2 for many small requests to the same host; prefer HTTP/1.1 for long downloads where you want raw throughput per connection.
