Service discovery in Kubernetes.

A pod boots up. A second later, requests flow to it. Between those two events sit the kubelet, the API server, etcd, the EndpointSlice controller, kube-proxy on every node, and an iptables rewrite. The full chain involves eight components and eight steps, and takes about one second of wall time.

Step 1 of 8: New pod starts · kubelet reports ready

The kubelet on the node sees its newly scheduled pod pass its readiness probes. It updates the pod's status in the API server: status.podIP=10.244.3.7, conditions.Ready=True. The pod is now eligible to receive traffic.
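
As a rough illustration of the status update described above, the sketch below models the relevant slice of a pod's status as a plain dictionary and checks whether the pod is eligible for traffic. The field layout follows the Pod API (status.podIP, status.conditions), but the pod name and the is_ready helper are hypothetical, and nothing here talks to a real cluster.

```python
# Minimal sketch: the slice of Pod status that the kubelet updates, as plain data.
pod = {
    "metadata": {"name": "web-7d9f"},  # hypothetical pod name
    "status": {
        "podIP": "10.244.3.7",
        "conditions": [
            {"type": "PodScheduled", "status": "True"},
            {"type": "Ready", "status": "True"},
        ],
    },
}


def is_ready(pod: dict) -> bool:
    """Return True if the pod has an IP and its Ready condition is "True"."""
    status = pod.get("status", {})
    if not status.get("podIP"):
        return False
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in status.get("conditions", [])
    )


print(is_ready(pod))
```

Note that condition statuses are the strings "True", "False", or "Unknown" rather than booleans, which is why the comparison is against a string.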

Readiness probe: A health check that gates traffic. Until the probe succeeds, the pod is in the Service's endpoints but marked NotReady — kube-proxy won't route to it.
Endpoint: A single (pod IP, port, ready?) tuple. A Service points at a set of endpoints managed by the EndpointSlice controller.
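
The two definitions above can be put together in a few lines. This is a minimal sketch, assuming an endpoint is just the (pod IP, port, ready?) tuple described here; the Endpoint class and the routable helper are illustrative names, not part of any Kubernetes client library.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    ip: str
    port: int
    ready: bool


def routable(endpoints):
    """Keep only the endpoints a proxy would send traffic to."""
    return [ep for ep in endpoints if ep.ready]


endpoints = [
    Endpoint("10.244.3.7", 8080, True),
    Endpoint("10.244.1.4", 8080, False),  # readiness probe not passing yet
    Endpoint("10.244.2.9", 8080, True),
]

print([ep.ip for ep in routable(endpoints)])
```

The not-ready endpoint stays in the list, matching the description above: it is known to the Service but excluded from routing.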

Why DNS doesn't cut it alone

Kubernetes Services could be just DNS: round-robin A records, one per backend pod IP. But DNS is cached at several layers (client, resolver, OS) with TTLs you don't fully control. A pod that crashed 10 seconds ago might still receive traffic because a cached record hasn't expired. The iptables/IPVS layer adds a much faster update path: when a pod becomes unready, kube-proxy on every node drops it from rotation within about a second, regardless of any DNS cache. DNS resolves the Service name to a stable ClusterIP; the dataplane handles per-connection routing.
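
To make the staleness problem concrete, here is a small simulation of a client-side DNS cache. It is only a sketch: the CachedDNS class, the 30-second TTL, and the timestamps are all assumptions chosen for illustration, and time is passed in explicitly instead of being read from a clock.

```python
class CachedDNS:
    """A client-side resolver cache that serves a record until its TTL expires."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.cached = None
        self.fetched_at = None

    def resolve(self, now: float, authoritative: list) -> list:
        expired = self.fetched_at is None or now - self.fetched_at >= self.ttl
        if expired:
            # Copy the authoritative answer so later changes don't leak in.
            self.cached = list(authoritative)
            self.fetched_at = now
        return self.cached


backends = ["10.244.3.7", "10.244.1.4"]
dns = CachedDNS(ttl_seconds=30)

dns.resolve(now=0, authoritative=backends)   # cache is filled at t=0
backends.remove("10.244.1.4")                # pod crashes at t=5

stale = dns.resolve(now=15, authoritative=backends)  # 10 s after the crash
fresh = dns.resolve(now=31, authoritative=backends)  # TTL has expired

print(stale)  # still includes the crashed pod
print(fresh)  # crashed pod is gone
```

With a stable ClusterIP in DNS, the cached answer never goes stale in this way, because the set of backends changes behind the IP instead of in the record.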

Service mesh alternative

Istio and Linkerd push service discovery into a sidecar proxy running alongside each pod (Envoy in Istio, linkerd2-proxy in Linkerd). The proxy receives endpoint updates from the mesh control plane (over xDS in Envoy's case) and does L7 routing (HTTP/gRPC paths, retries, circuit breaking) per request. kube-proxy still exists but is mostly bypassed. The trade-off: more CPU and latency overhead per pod (an extra hop), in exchange for much richer routing and observability.
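
The per-request behavior described above can be sketched in plain Python. This is a toy model, not Envoy's or Linkerd's actual logic: the ROUTES table, pick_cluster, send_with_retries, and the fake transport are all hypothetical, and request paths are assumed to start with "/".

```python
import itertools

# Hypothetical route table: path prefix -> backend endpoints for that route.
ROUTES = {
    "/api/": ["10.244.3.7:8080", "10.244.2.9:8080"],
    "/": ["10.244.1.4:8080"],
}


def pick_cluster(path):
    """Longest-prefix match on the request path."""
    prefix = max((p for p in ROUTES if path.startswith(p)), key=len)
    return ROUTES[prefix]


def send_with_retries(path, send, max_attempts=3):
    """Try endpoints in turn until one succeeds or attempts run out."""
    endpoints = itertools.cycle(pick_cluster(path))
    for _ in range(max_attempts):
        endpoint = next(endpoints)
        if send(endpoint, path):
            return endpoint
    return None


def fake_send(endpoint, path):
    """Fake transport: the first /api/ backend is down, the others answer."""
    return endpoint != "10.244.3.7:8080"


print(send_with_retries("/api/users", fake_send))
```

Because the decision is made for each request, the proxy can route on the path and retry on a different backend, which a connection-level dataplane such as iptables cannot do.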

Service discovery outside Kubernetes

Consul, ZooKeeper, etcd, and AWS Cloud Map all do roughly the same thing: they provide a place to register services and a way to discover them. Consul gained popularity alongside HashiCorp's tooling and ships with health checking. On AWS, Cloud Map and ALBs handle the same problem for ECS/Fargate. The pattern is consistent, "registry + watchers + dataplane"; only the implementations differ.
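
The "registry + watchers + dataplane" pattern is small enough to sketch in memory. The Registry and Dataplane classes below are hypothetical and do not mirror the API of Consul, ZooKeeper, etcd, or Cloud Map; they only show how a change in the registry reaches a local routing table through a watcher.

```python
class Registry:
    """In-memory service registry that notifies watchers on every change."""

    def __init__(self):
        self._services = {}   # service name -> set of "ip:port" strings
        self._watchers = []

    def watch(self, callback):
        self._watchers.append(callback)

    def register(self, service, address):
        self._services.setdefault(service, set()).add(address)
        self._notify(service)

    def deregister(self, service, address):
        self._services.get(service, set()).discard(address)
        self._notify(service)

    def _notify(self, service):
        snapshot = sorted(self._services.get(service, set()))
        for callback in self._watchers:
            callback(service, snapshot)


class Dataplane:
    """Keeps a local routing table in sync with registry events."""

    def __init__(self):
        self.table = {}

    def on_change(self, service, addresses):
        self.table[service] = addresses


registry = Registry()
dataplane = Dataplane()
registry.watch(dataplane.on_change)

registry.register("web", "10.0.0.1:80")
registry.register("web", "10.0.0.2:80")
registry.deregister("web", "10.0.0.1:80")

print(dataplane.table)
```

In Kubernetes terms, the API server backed by etcd plays the registry, kube-proxy on each node is the watcher, and the iptables rules are the dataplane's routing table.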
