Multi-page · for infra + operator authors
Kubernetes internals

How Kubernetes actually works.

The control plane, the data plane, and every protocol that binds them. For infrastructure engineers building the platform, SREs running it in production, and the people writing the next generation of operators and controllers.

Fourteen sub-pages, all live below. Each is a 4,000-word deep dive with sequence diagrams, code excerpts, and links into the source tree.


The system, one canvas

Every component, every wire.

Kubernetes is not one program; it is roughly a dozen, each with a single responsibility, talking to each other over a small set of well-defined protocols. The diagram below is the whole system from above. Every sub-page deep-dives into one slice of it.

[Diagram: the whole system on one canvas.]
Control plane (usually 3 nodes): api-server on :6443 (HTTPS, REST + watch); etcd on :2379 (gRPC, Raft + MVCC); scheduler (watches Pods, writes nodeName); controller-manager (~30 controllers in one binary, leader-elected).
Data plane (every node, large fleets): each node runs kubelet on :10250, kube-proxy (iptables), a CRI runtime (containerd), and a CNI plugin (Cilium).
Legend: dashed lines are watch streams; solid lines are direct calls.

The api-server is the only component that talks to etcd. Everything else watches the api-server. That single property is the most important architectural fact in Kubernetes.
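To make that property concrete, here is a minimal sketch of the shape: one hub that owns the store and fans every change out to subscribers. `Hub`, `Event`, `Put`, and `Watch` are hypothetical names invented for this illustration, not Kubernetes APIs, and the in-memory map only stands in for etcd.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical, simplified change notification.
type Event struct {
	Revision int64
	Key      string
	Value    string
}

func (e Event) String() string {
	return fmt.Sprintf("rev=%d %s=%q", e.Revision, e.Key, e.Value)
}

// Hub stands in for the api-server: the only writer to the store and
// the only source of change events for every other component.
type Hub struct {
	mu       sync.Mutex
	rev      int64
	store    map[string]string // stands in for etcd
	watchers []chan Event
}

func NewHub() *Hub { return &Hub{store: map[string]string{}} }

// Watch registers a subscriber with a buffered channel of size buf.
func (h *Hub) Watch(buf int) <-chan Event {
	h.mu.Lock()
	defer h.mu.Unlock()
	ch := make(chan Event, buf)
	h.watchers = append(h.watchers, ch)
	return ch
}

// Put writes to the store, bumps the revision, and notifies every watcher.
// In this sketch a full watcher buffer blocks the write; a real server
// has to deal with slow watchers some other way.
func (h *Hub) Put(key, value string) int64 {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.rev++
	h.store[key] = value
	ev := Event{Revision: h.rev, Key: key, Value: value}
	for _, w := range h.watchers {
		w <- ev
	}
	return h.rev
}

func main() {
	hub := NewHub()
	scheduler := hub.Watch(8)
	kubelet := hub.Watch(8)

	hub.Put("pods/default/web", "nodeName=")
	hub.Put("pods/default/web", "nodeName=node-1")

	// Both subscribers see the same ordered stream; neither touches the store.
	fmt.Println(<-scheduler)
	fmt.Println(<-scheduler)
	fmt.Println(<-kubelet)
}
```

Neither subscriber holds a reference to the store. Adding a new component means adding another watcher, not another client of the database.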


Live deep dives

Start here.

01 Live

Architecture

The control plane (api-server, etcd, scheduler, controller-manager, cloud-controller-manager) and the data plane (kubelet, kube-proxy, container runtime, CNI). Who calls whom, on which port, with which protocol.

control plane · data plane · process boundaries · gRPC + JSON wires · leader election

02 Live

The lifecycle of `kubectl apply`

A complete trace from the moment you press Enter on the kubectl command to the moment the pod is Running. Twelve hops, named, timed, and explained.

client-side validation · auth + admission · storage in etcd · watch fan-out · reconcile

03 Live

Pod scheduling, end to end

How the scheduler picks a node: filtering and scoring (historically called predicates and priorities), the scheduling framework, plugin order. Then how the kubelet on that node actually starts the containers via the CRI.

scheduler framework · filtering and scoring · kubelet sync loop · CRI / containerd · probes

04 Live

The controller pattern

Informers, listers, work queues, reconciliation loops. The pattern that every built-in and custom controller follows. Pseudocode you can ship.

informer + lister · work queue + rate-limiting · reconcile loop · leader election · controller-runtime

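As a taste of the pattern, here is a minimal single-threaded sketch of a de-duplicating work queue feeding a level-triggered reconcile function. All names (`Queue`, `State`, `Reconcile`, `Run`) are hypothetical and written for this illustration; this is not the client-go workqueue, which is thread-safe and rate-limited, neither of which is modelled here.

```go
package main

import "fmt"

// Queue holds keys, not events. A key already waiting is not added twice,
// so ten changes to one object cost one reconcile.
type Queue struct {
	order  []string
	queued map[string]bool
}

func NewQueue() *Queue { return &Queue{queued: map[string]bool{}} }

func (q *Queue) Add(key string) {
	if q.queued[key] {
		return
	}
	q.queued[key] = true
	q.order = append(q.order, key)
}

func (q *Queue) Get() (string, bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key := q.order[0]
	q.order = q.order[1:]
	delete(q.queued, key)
	return key, true
}

// State is a toy cluster: desired replica counts versus actual ones.
type State struct {
	Desired map[string]int
	Actual  map[string]int
}

// Reconcile is level-triggered: it looks only at current desired state,
// never at what kind of event put the key on the queue.
func Reconcile(s *State, key string) error {
	want, ok := s.Desired[key]
	if !ok {
		delete(s.Actual, key) // object is gone: clean up
		return nil
	}
	if want < 0 {
		return fmt.Errorf("%s: invalid replica count %d", key, want)
	}
	s.Actual[key] = want
	return nil
}

// Run drains the queue, re-adding a failed key up to maxRetries times.
func Run(q *Queue, s *State, maxRetries int) {
	retries := map[string]int{}
	for {
		key, ok := q.Get()
		if !ok {
			return
		}
		if err := Reconcile(s, key); err != nil {
			if retries[key] < maxRetries {
				retries[key]++
				q.Add(key)
			}
			continue
		}
		delete(retries, key)
	}
}

func main() {
	s := &State{
		Desired: map[string]int{"default/web": 3},
		Actual:  map[string]int{"default/old": 1},
	}
	q := NewQueue()
	q.Add("default/web")
	q.Add("default/web") // duplicate, collapsed
	q.Add("default/old") // no longer desired, will be removed
	Run(q, s, 3)
	fmt.Println(s.Actual)
}
```

The queue carries keys rather than objects on purpose: by the time a worker picks a key up, the only state worth acting on is the latest one.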
05 Live

The scheduler framework

PreFilter, Filter, PostFilter (runs only when no node passes Filter; this is where preemption happens), PreScore, Score, NormalizeScore, Reserve, Permit, PreBind, Bind, PostBind. The plugin chain a Pod walks through, in order.

scheduling framework · plugin extension points · percentageOfNodesToScore · preemption · multi-scheduler

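The two stages that decide placement, Filter and Score, reduce to a small loop. The sketch below assumes toy `Pod` and `Node` types and plain function plugins; the names are hypothetical and much simpler than the real framework interfaces, and ties go to the first node seen, which is a simplification made here.

```go
package main

import "fmt"

// Pod and Node are toy stand-ins with a single resource dimension.
type Pod struct {
	Name string
	CPU  int
}

type Node struct {
	Name          string
	FreeCPU       int
	Unschedulable bool
}

// FilterFn answers "can this pod run here at all?".
type FilterFn func(Pod, Node) bool

// ScoreFn answers "how good a fit is this node?". Higher is better.
type ScoreFn func(Pod, Node) int

func passes(p Pod, n Node, filters []FilterFn) bool {
	for _, f := range filters {
		if !f(p, n) {
			return false
		}
	}
	return true
}

// Schedule drops infeasible nodes, sums the scores of the rest, and
// returns the best one. Ties go to the first node seen in this sketch.
func Schedule(p Pod, nodes []Node, filters []FilterFn, scorers []ScoreFn) (string, error) {
	best, bestScore, found := "", 0, false
	for _, n := range nodes {
		if !passes(p, n, filters) {
			continue
		}
		total := 0
		for _, score := range scorers {
			total += score(p, n)
		}
		if !found || total > bestScore {
			best, bestScore, found = n.Name, total, true
		}
	}
	if !found {
		return "", fmt.Errorf("pod %s: none of %d nodes is feasible", p.Name, len(nodes))
	}
	return best, nil
}

// Example plugins, invented for this illustration.
func FitsCPU(p Pod, n Node) bool       { return n.FreeCPU >= p.CPU }
func IsSchedulable(_ Pod, n Node) bool { return !n.Unschedulable }
func MostFreeCPU(p Pod, n Node) int    { return n.FreeCPU - p.CPU }

func main() {
	nodes := []Node{
		{Name: "node-1", FreeCPU: 2},
		{Name: "node-2", FreeCPU: 8, Unschedulable: true},
		{Name: "node-3", FreeCPU: 6},
	}
	pod := Pod{Name: "web", CPU: 4}
	name, err := Schedule(pod, nodes,
		[]FilterFn{FitsCPU, IsSchedulable},
		[]ScoreFn{MostFreeCPU})
	fmt.Println(name, err)
}
```

Here node-1 fails the CPU filter and node-2 is unschedulable, so node-3 is the only node left to score.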
06 Live

etcd — the consistent store

Raft consensus, MVCC, watch streams, compaction. Why an etcd disaster makes the API server useless. Backups, restores, and the lease model.

Raft · MVCC + revision · watch stream · compaction · lease + TTL

07 Live

The API server pipeline

Authentication chain, authorisation chain, mutating + validating admission webhooks, conversion, storage. The 11-stage request handler.

authn / authz · admission webhooks · conversion · list / watch · priority + fairness

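The pipeline idea itself fits in a few lines: an ordered list of stages, each of which can reject the request or mutate it before the next one runs. The sketch below models five stages only and does not attempt to enumerate the eleven covered on the sub-page. `Request`, `Stage`, `Handle`, and `Pipeline` are hypothetical names, and the token table and permission map are toy stand-ins for real authenticators and authorisers.

```go
package main

import (
	"errors"
	"fmt"
)

// Request is a toy write request.
type Request struct {
	Token  string
	User   string // filled in by authentication
	Verb   string
	Labels map[string]string
}

// Stage is one step in the handler chain.
type Stage struct {
	Name string
	Run  func(*Request) error
}

// Handle runs the stages in order and stops at the first rejection.
func Handle(req *Request, stages []Stage) error {
	for _, st := range stages {
		if err := st.Run(req); err != nil {
			return fmt.Errorf("%s: %w", st.Name, err)
		}
	}
	return nil
}

// Pipeline builds the chain. tokens maps bearer token to user name;
// allowed is keyed by "user:verb". Both are invented for this sketch.
func Pipeline(tokens map[string]string, allowed map[string]bool, store *[]Request) []Stage {
	return []Stage{
		{Name: "authentication", Run: func(r *Request) error {
			user, ok := tokens[r.Token]
			if !ok {
				return errors.New("unknown token")
			}
			r.User = user
			return nil
		}},
		{Name: "authorisation", Run: func(r *Request) error {
			if !allowed[r.User+":"+r.Verb] {
				return errors.New("forbidden")
			}
			return nil
		}},
		{Name: "mutating admission", Run: func(r *Request) error {
			// Mutation runs before validation, so defaults get validated too.
			if r.Labels == nil {
				r.Labels = map[string]string{}
			}
			if _, ok := r.Labels["team"]; !ok {
				r.Labels["team"] = "unassigned"
			}
			return nil
		}},
		{Name: "validating admission", Run: func(r *Request) error {
			if r.Labels["team"] == "" {
				return errors.New("team label must not be empty")
			}
			return nil
		}},
		{Name: "storage", Run: func(r *Request) error {
			*store = append(*store, *r)
			return nil
		}},
	}
}

func main() {
	var store []Request
	stages := Pipeline(
		map[string]string{"token-1": "alice"},
		map[string]bool{"alice:create": true},
		&store,
	)
	fmt.Println(Handle(&Request{Token: "token-1", Verb: "create"}, stages))
	fmt.Println(Handle(&Request{Token: "token-1", Verb: "delete"}, stages))
	fmt.Println(len(store))
}
```

Order is the whole point: nothing reaches storage unless every earlier stage has let it through, and a rejected request leaves no trace in the store.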
08 Live

CRDs and operators

Defining a custom resource, building a controller for it with kubebuilder, the OperatorHub maturity model, when to ship a CRD vs a Helm chart.

CRD schema · kubebuilder · controller-runtime · finalizers · subresources

09 Live

Networking — CNI to kube-proxy

The four assumptions in the Kubernetes networking model. CNI plugin spec. kube-proxy in iptables / IPVS / nftables mode, and the eBPF data planes that replace it. Service IPs that are not IPs.

CNI spec · pod-to-pod · pod-to-service · kube-proxy modes · NetworkPolicy backends

10 Live

kubelet internals

Sync loop, CRI calls, image pull, volume mount, probe execution, eviction signals, cgroups setup. How a pod actually starts on a node.

sync loop · CRI · CSI volume mount · probes · eviction

11 Live

Storage — CSI, PV / PVC, attach / detach

CSI plugin lifecycle. Volume binding modes. The provisioner / attacher / resizer split. Why StatefulSet pods stick to their volume.

CSI driver · PV / PVC binding · attach / detach controller · volume snapshots · topology

12 Live

Authentication & authorisation

The authn chain: client certs, bearer tokens (ServiceAccount, OIDC), webhook authn. The authz chain: RBAC, ABAC, Webhook, Node. ServiceAccount projection.

webhook chain · ServiceAccount tokens · projected tokens · RBAC · impersonation

13 Live

Informers and the shared cache

The list-watch loop, the DeltaFIFO, the thread-safe store, and the work queue. How every controller reads cluster state without hammering the API server.

list-watch · DeltaFIFO · shared informer · work queue · resync

14 Live

client-go internals

The Go client every controller is built on. RESTClient, typed clientsets, the discovery client, informers, and the work queue, from the bottom up.

RESTClient · clientset · discovery · informers · rate limiting