Sub-page 14 · for controller + operator authors

Kubernetes internals · client-go

One library, every conversation.

Almost every Kubernetes process you have ever read about — kubelet, the scheduler, the controller-manager, every operator, every kubectl invocation, every operator SDK on top — opens its TCP connection to the api-server through the same Go library. Six packages, ninety thousand lines, and a single end-to-end opinion about how clients should be built. That library is client-go.

This page is a vertical slice through everything in client-go that is not the cache. The REST client and its verbs, the transport stack with TLS and exec plugins, dynamic and discovery clients, kubeconfig versus in-cluster auth, client-side QPS / Burst, DefaultRetry, the four patch types, server-side apply, and the code-generation toolchain. Roughly 4,200 words. Pair it with the informers sub-page for the cache layer, and the api-server sub-page for what is on the other end of the wire.

Why a separate client library exists at all.

The Kubernetes API is a REST surface, and in principle anyone with curl and a kubeconfig can drive it. In practice, almost no production code does. The reason is not the HTTP (that part is fine); it is everything around it. The api-server speaks at least three media types per resource (JSON, YAML, protobuf), supports four mutually exclusive patch formats, requires one of half a dozen authentication schemes (bearer tokens, client certs, exec plugins, OIDC, AWS IAM, GCP application-default), expects large lists to be paged with a continue token, demands a particular content-negotiation dance for protobuf, and answers a watch whose resourceVersion has aged out of its watch cache with 410 Gone, at which point the client must re-list and start over. Implementing all of that correctly, once, is hard. Implementing it correctly twenty times, once per controller, is masochism.

client-go is the Kubernetes project's answer: a Go module living at k8s.io/client-go that contains the canonical implementation of every one of those concerns, structured as a layered stack so that callers can reach in at the level of abstraction that fits their problem. At the top is a typed, code-generated clientset where you call clientset.CoreV1().Pods("prod").Get(ctx, "web", metav1.GetOptions{}) and a *v1.Pod comes back. At the bottom is a raw rest.RESTClient with verbs that take a path and a body and return bytes. In between are the dynamic client (for resources you do not know the type of at compile time), the discovery client (for asking the server what resources exist), and the transport stack (for actually sending the bytes over TLS).

The library's design is unusually opinionated. It assumes you want connection pooling on a long-lived http.Client, not a fresh dial per call. It assumes you want exponential-backoff retries on idempotent verbs, not on POSTs. It assumes you want client-side rate limiting (QPS plus Burst) so a buggy controller cannot DoS the api-server. It assumes you want optimistic concurrency on Updates and graceful 409 handling. Every assumption is configurable, but the defaults are right for almost every controller, which is why almost every controller uses them unmodified.

The library is also the load-bearing thing that makes Kubernetes' multi-version API work. The api-server speaks v1, v1beta1, v1beta2, and so on, of every resource simultaneously, with on-the-fly conversion between them. Every version has its own Go struct. client-go's typed clientset is generated for one version at a time, but its runtime.Scheme machinery — the type registry that knows how to unmarshal an arbitrary apiVersion/kind pair into the right Go type — is shared. That is what lets a single binary compiled today keep working tomorrow when the cluster's storage version flips.
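The type-registry idea can be sketched in a few lines of standard-library Go. This is a toy, not the real runtime.Scheme: the TypeMeta struct, the two Widget types, the example.com group, and the scheme map are all hypothetical names made up for the illustration. It only shows the core move of peeking at apiVersion and kind and then decoding into whichever Go struct was registered for that pair.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TypeMeta mirrors the two identifying fields every Kubernetes object carries.
type TypeMeta struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
}

// Two hypothetical versions of one resource, each with its own Go struct.
type WidgetV1beta1 struct {
	TypeMeta
	Width int `json:"width"`
}

type WidgetV1 struct {
	TypeMeta
	Size int `json:"size"`
}

// scheme maps "apiVersion/kind" to a constructor for the matching Go type.
type scheme map[string]func() any

func newScheme() scheme {
	return scheme{
		"example.com/v1beta1/Widget": func() any { return &WidgetV1beta1{} },
		"example.com/v1/Widget":      func() any { return &WidgetV1{} },
	}
}

// decode reads apiVersion and kind first, then unmarshals the same bytes
// into the type registered for that pair.
func (s scheme) decode(data []byte) (any, error) {
	var tm TypeMeta
	if err := json.Unmarshal(data, &tm); err != nil {
		return nil, err
	}
	newObj, ok := s[tm.APIVersion+"/"+tm.Kind]
	if !ok {
		return nil, fmt.Errorf("no type registered for %s, kind %s", tm.APIVersion, tm.Kind)
	}
	obj := newObj()
	if err := json.Unmarshal(data, obj); err != nil {
		return nil, err
	}
	return obj, nil
}

// describe shows that callers get back a concrete, version-specific type.
func describe(obj any) string {
	switch o := obj.(type) {
	case *WidgetV1:
		return fmt.Sprintf("v1 widget, size=%d", o.Size)
	case *WidgetV1beta1:
		return fmt.Sprintf("v1beta1 widget, width=%d", o.Width)
	default:
		return "unknown type"
	}
}

func main() {
	s := newScheme()
	docs := []string{
		`{"apiVersion":"example.com/v1","kind":"Widget","size":3}`,
		`{"apiVersion":"example.com/v1beta1","kind":"Widget","width":7}`,
	}
	for _, doc := range docs {
		obj, err := s.decode([]byte(doc))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(describe(obj))
	}
}
```

The real Scheme also handles conversion and defaulting between versions; the sketch stops at lookup and decode.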

There are two consequences worth holding. First, "writing your own client to talk to the api-server" is, ninety-nine times out of a hundred, the wrong instinct. Even kubectl is built on client-go (with its own command-routing layer on top). Even the operator-SDK is built on controller-runtime, which is built on client-go. Second, when client-go's behaviour surprises you (a 409, a sudden 429, a watch that closed at 90s), the answer is almost always in the library's source, not in the api-server's. Read it. The package boundaries are clean and the code is mostly idiomatic Go; you can find the relevant function in five minutes.

Mental model — client-go is to the api-server what a database driver is to a database: it absorbs the boring, error-prone parts of the protocol so that callers can write idiomatic Go. The difference is that this driver also runs the cache and the rate limiter for you.

Four layers — REST, typed, dynamic, discovery.

Every call from a Go process to the api-server passes through one of four client surfaces. They are stacked, in the sense that each one is implemented in terms of the one below it, and the right choice depends on what the caller knows at compile time. If you know the resource type (v1.Pod), use the typed clientset. If you know only the group-version-resource at runtime (apps/v1/deployments), use the dynamic client. If you do not even know what resources the cluster has, ask the discovery client first, and feed the answer to the dynamic one. If you want raw bytes — usually because you are implementing a new client for some new resource — drop to the REST client.

The typed clientset is what every controller in kube-controller-manager uses. It is generated, not handwritten: you can find the source under k8s.io/client-go/kubernetes/typed/<group>/<version>/, and every file in there is the output of client-gen. The shape is predictable: a top-level Clientset with one accessor per API group; each group has one method per resource; each resource exposes Get, List, Watch, Create, Update, UpdateStatus, Delete, DeleteCollection, Patch, and Apply. All of them return concrete Go types. All of them take a context.Context as their first argument; if you are reading code that does not, you are reading code from before 1.18.

The dynamic client is the same shape but parameterised by GVR (group-version-resource) instead of Go type. It is the workhorse for tools that need to operate on resources they do not know about at compile time: kubectl (which has to handle CRDs that did not exist when it was built), the garbage collector (which has to walk owner references across arbitrary resources), backup and replication operators, GitOps engines like ArgoCD. Every value in and out is an unstructured.Unstructured, which is a thin wrapper around map[string]interface{}. You give up type safety and gain generality.

The discovery client answers a simpler question: what resources does this cluster have, and at what versions? It hits two endpoints — /api and /apis — pages through the API groups they describe, and produces a RESTMapper: a lookup table from kind names ("Deployment") to the preferred GVR (apps/v1/deployments) plus its scope (namespaced or cluster). RESTMapper is what powers the magic in kubectl get deploy — turning a short name into a real REST path. Most users of dynamic clients pair them with a cached discovery client and a RESTMapper built on top.

The REST client at the bottom is the only one that touches HTTP directly. It exposes one method per HTTP verb (.Get(), .Post(), .Put(), .Patch(), .Delete()) which return a fluent Request builder. You add path segments with .Resource(), .Namespace(), .Name(), query params with .Param(), a body with .Body(), and finally call .Do(ctx) to issue the request. The result is a rest.Result that can be decoded into whatever Go type you registered in the runtime.Scheme.
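What the builder ultimately produces is a URL path, and the path layout is simple enough to sketch without the library. The gvr struct and restPath function below are hypothetical stand-ins, not client-go types; the sketch only encodes the convention that the core group lives under /api and every other group under /apis/<group>.

```go
package main

import "fmt"

// gvr is a hypothetical stand-in for a group-version-resource triple.
type gvr struct {
	Group, Version, Resource string
}

// restPath builds the URL path for a single named object.
// The core group ("") lives under /api; named groups live under /apis/<group>.
// An empty namespace means the resource is cluster-scoped.
func restPath(r gvr, namespace, name string) string {
	path := "/api/" + r.Version
	if r.Group != "" {
		path = "/apis/" + r.Group + "/" + r.Version
	}
	if namespace != "" {
		path += "/namespaces/" + namespace
	}
	return path + "/" + r.Resource + "/" + name
}

func main() {
	fmt.Println(restPath(gvr{"", "v1", "pods"}, "prod", "web"))
	fmt.Println(restPath(gvr{"apps", "v1", "deployments"}, "prod", "web"))
	fmt.Println(restPath(gvr{"", "v1", "nodes"}, "", "node-1"))
}
```

The first call yields /api/v1/namespaces/prod/pods/web, which is the path the typed Pods("prod").Get(ctx, "web", ...) call from earlier resolves to.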

Layer	What it offers	Who uses it
rest.RESTClient	Lowest layer. Verbs, content negotiation, retries.	used by every higher layer; rarely touched directly
kubernetes.Clientset	Typed, generated. clientset.CoreV1().Pods("ns").Get(...)	every controller in the controller-manager
dynamic.Interface	Untyped. Operates on unstructured.Unstructured by GVR.	kubectl, garbage collector, custom-resource tooling
discovery.DiscoveryClient	Asks the api-server which resources exist.	kubectl, RESTMapper, dynamic clients

If your code reaches for the dynamic client, double-check that you actually need it. Operators written against a single CRD almost always want a generated typed clientset (run client-gen on your types), because the type safety eliminates a class of bugs that production has a hard time catching. The dynamic client is for polymorphic code, not for skipping the build step.

Authentication — kubeconfig, in-cluster, exec plugins.

Every client-go program starts the same way: build a *rest.Config, then hand it to kubernetes.NewForConfig or dynamic.NewForConfig. The rest.Config is a struct containing the server URL, the TLS settings, the authentication credentials, and the rate-limiter / retry / user-agent knobs. The interesting question is where it comes from. There are exactly two recommended sources, and one fallback: kubeconfig (for humans and CI), in-cluster ServiceAccount (for processes running inside pods), and a manual struct construction (for tests, never for production).

The kubeconfig path runs through k8s.io/client-go/tools/clientcmd. The library reads ~/.kube/config (or the paths in $KUBECONFIG, a list separated by colons on Linux and macOS and by semicolons on Windows, merged in order), parses the YAML into an api.Config object with three top-level lists (clusters, users, and contexts), and selects the current context. The context's cluster gives the server URL and CA bundle; its user gives the credentials. From a credentials perspective, kubeconfig is a union type: it can hold a bearer token, a client cert / key pair, a username and password (rarely used), or, most interestingly, an exec stanza that names an external command to invoke.

The exec credential plugin is the modern answer for cloud-managed clusters. When you run aws eks update-kubeconfig, what gets written is not your IAM credentials but an exec block that says "to authenticate, run aws eks get-token --cluster-name foo and parse its JSON output as an ExecCredential". Every time client-go needs a fresh token (the cached one has expired, or you have not made a request yet this run), it spawns that subprocess, reads its stdout, extracts the token plus an expiration timestamp, and uses the token in an Authorization: Bearer ... header. The plugin protocol is documented as a Kubernetes API: client.authentication.k8s.io/v1.

The in-cluster path is what every pod running with a ServiceAccount uses. Inside a pod, kubelet mounts the ServiceAccount token at /var/run/secrets/kubernetes.io/serviceaccount/token, the cluster CA bundle at .../ca.crt, and the namespace at .../namespace. It also injects two env vars, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT, that point at the in-cluster Service for the api-server. rest.InClusterConfig() assembles a *rest.Config from the token, the CA bundle, and the two env vars; the namespace file is there for your own code to read. The token, post-1.21, is a projected ServiceAccount token with a finite lifetime (default one hour, configurable via the expirationSeconds field in the projected volume), and client-go's token-file source re-reads the file periodically, so rotation is transparent.
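To make the in-cluster assembly concrete, here is a minimal standard-library sketch of the same idea. The inClusterSettings struct and inClusterFromEnv function are hypothetical names for this illustration and are much thinner than the real rest.Config; the environment lookup is passed in as a function so the logic can be exercised outside a pod.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

const (
	tokenPath = "/var/run/secrets/kubernetes.io/serviceaccount/token"
	caPath    = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
)

// inClusterSettings is a hypothetical, trimmed-down stand-in for rest.Config.
type inClusterSettings struct {
	Host      string
	TokenFile string
	CAFile    string
}

// inClusterFromEnv derives the api-server URL from the two injected env vars
// and pairs it with the well-known ServiceAccount file paths.
func inClusterFromEnv(getenv func(string) string) (inClusterSettings, error) {
	host := getenv("KUBERNETES_SERVICE_HOST")
	port := getenv("KUBERNETES_SERVICE_PORT")
	if host == "" || port == "" {
		return inClusterSettings{}, errors.New("not in a cluster: KUBERNETES_SERVICE_HOST/PORT unset")
	}
	return inClusterSettings{
		// JoinHostPort adds brackets around IPv6 literals when needed.
		Host:      "https://" + net.JoinHostPort(host, port),
		TokenFile: tokenPath,
		CAFile:    caPath,
	}, nil
}

func main() {
	s, err := inClusterFromEnv(os.Getenv)
	if err != nil {
		fmt.Println("falling back to kubeconfig:", err)
		return
	}
	fmt.Printf("%+v\n", s)
}
```

Run on a laptop this prints the fallback message; inside a pod it prints the derived host and file paths.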

All three paths flow into the same destination: the transport.Config, which is what rest.Config.TransportConfig() returns. From there, transport.New builds a wrapped http.RoundTripper that adds TLS configuration, credential injection (client certificates or a bearer token), the User-Agent header, and request-debug logging if enabled. Client-side rate limiting is applied one layer up, in rest.Request, before a call ever reaches the transport. The wrapped RoundTripper is then plugged into a single http.Client shared by every layer above. One TCP connection pool, one TLS handshake, one credential refresh loop, every verb funnels through it.

// k8s.io/client-go/rest + k8s.io/client-go/tools/clientcmd: pick the right config.
import (
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
    "k8s.io/client-go/tools/clientcmd"
)

func buildConfig(kubeconfigPath string) (*rest.Config, error) {
    // In-cluster first: this succeeds only when KUBERNETES_SERVICE_HOST/PORT
    // and the ServiceAccount files are present, i.e. we are running inside a pod.
    if cfg, err := rest.InClusterConfig(); err == nil {
        return cfg, nil
    }
    // Otherwise read kubeconfig from disk.
    return clientcmd.BuildConfigFromFlags("", kubeconfigPath)
}

func newClientset(kubeconfigPath string) (*kubernetes.Clientset, error) {
    cfg, err := buildConfig(kubeconfigPath)
    if err != nil {
        return nil, err
    }
    // Standard tuning for a controller. Defaults of 5/10 are too low for a busy operator.
    cfg.QPS = 50.0
    cfg.Burst = 100
    cfg.UserAgent = "my-operator/v1.4.2 (linux/amd64) kubernetes/abc1234"
    cfg.Timeout = 0 // 0 = no overall deadline; per-call ctx still applies
    return kubernetes.NewForConfig(cfg)
}

# An exec credential plugin invocation, as run by client-go on every token refresh.
# From kubeconfig:
#   users:
#   - name: aws-eks
#     user:
#       exec:
#         apiVersion: client.authentication.k8s.io/v1
#         command: aws
#         args: ["eks", "get-token", "--cluster-name", "prod-east"]

$ aws eks get-token --cluster-name prod-east
{
  "kind": "ExecCredential",
  "apiVersion": "client.authentication.k8s.io/v1",
  "status": {
    "expirationTimestamp": "2026-05-03T19:42:11Z",
    "token": "k8s-aws-v1.aHR0cHM6Ly9zdHMuYW1hem9uYXdzLmNvbS8..."
  }
}
# client-go caches this token in memory until expirationTimestamp passes (or the
# api-server answers 401), then runs the command again on the next request.
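The parsing side of that exchange is small. The sketch below decodes an ExecCredential document like the one shown and decides whether a cached token is still usable. The struct, parseExecCredential, and needsRefresh are hypothetical names written for this illustration (client-go has its own types for this), and the token value in main is a placeholder, not a real credential.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// execCredential models only the fields this sketch needs.
type execCredential struct {
	Kind       string `json:"kind"`
	APIVersion string `json:"apiVersion"`
	Status     struct {
		ExpirationTimestamp time.Time `json:"expirationTimestamp"`
		Token               string    `json:"token"`
	} `json:"status"`
}

// parseExecCredential extracts the bearer token and its expiry from plugin stdout.
func parseExecCredential(stdout []byte) (string, time.Time, error) {
	var c execCredential
	if err := json.Unmarshal(stdout, &c); err != nil {
		return "", time.Time{}, err
	}
	if c.Kind != "ExecCredential" {
		return "", time.Time{}, fmt.Errorf("unexpected kind %q", c.Kind)
	}
	if c.Status.Token == "" {
		return "", time.Time{}, errors.New("plugin returned no token")
	}
	return c.Status.Token, c.Status.ExpirationTimestamp, nil
}

// needsRefresh reports whether a cached token has reached its expiry.
// A zero expiry means the plugin gave no timestamp; this sketch treats that as "never".
func needsRefresh(expiry, now time.Time) bool {
	return !expiry.IsZero() && !now.Before(expiry)
}

func main() {
	out := []byte(`{
	  "kind": "ExecCredential",
	  "apiVersion": "client.authentication.k8s.io/v1",
	  "status": {"expirationTimestamp": "2026-05-03T19:42:11Z", "token": "example-token"}
	}`)
	token, expiry, err := parseExecCredential(out)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("Authorization: Bearer", token)
	fmt.Println("refresh needed now:", needsRefresh(expiry, time.Now()))
}
```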

A subtle production gotcha — exec plugins are spawned with the same environment your controller binary sees. If your operator runs in a pod whose $HOME is unset, the aws CLI cannot find its credentials cache and will re-exec STS on every refresh. Set HOME in the pod spec, or pre-bake the token via IRSA / Workload Identity instead. See the auth sub-page for the cluster-side view.

Four patch types — and the one that won.

Update is the verb most people reach for first, but in production controller code it is almost always the wrong one. Update writes the entire object; it requires you to GET first, modify, then PUT, and it fights every other writer on the cluster for the resourceVersion token. Patch writes only the fields you care about, lets the api-server merge them with whatever else is on the object, and is far harder to accidentally race. The trade-off is that there are four flavours of patch and each behaves slightly differently. The constants live in k8s.io/apimachinery/pkg/types: StrategicMergePatchType, MergePatchType (RFC 7396), JSONPatchType (RFC 6902), and ApplyPatchType (server-side apply, KEP-555).

Strategic-merge patch is the legacy default for built-in resources. It looks like a JSON-merge patch but with one key addition: it understands the patchStrategy and patchMergeKey struct tags on the Go types, which tell it that lists of containers in a Pod spec should be merged by name, not replaced wholesale. This is what makes kubectl edit deploy not blow away your envs when you change a single image. The downside is that strategic-merge is implemented by the api-server using compiled-in knowledge of the resource's Go types, so it is not available for custom resources: the api-server rejects a strategic-merge patch against a CRD-backed resource with 415 Unsupported Media Type.

JSON-merge patch (RFC 7396) is the simple cousin: send a JSON document whose shape mirrors the target, and the server overlays it. Lists are replaced atomically, and a null value deletes the key. It is what kubectl falls back to for custom resources, where strategic-merge is unavailable, and it is what most operator-SDK generated code emits. JSON-patch (RFC 6902) is the explicit alternative: an array of operations (add, remove, replace, test) addressed by JSON-pointer paths. It is the only patch type that lets you remove a single list element by index, or atomically test-and-set a value. It is rarely the right choice for human-authored controllers but it is what GitOps tooling sometimes prefers because the operations are deterministic and easy to dry-run.
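The JSON-merge rules are short enough to implement in full, which makes the "lists are replaced" behaviour easy to see. The sketch below follows the merge algorithm described in RFC 7396 using only the standard library; mergePatch and applyMergePatch are names chosen for this illustration, and this is not the api-server's implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergePatch applies an RFC 7396 merge patch to target and returns the result.
// Objects merge key by key, null deletes a key, and anything else (including
// arrays) replaces the target value wholesale. The target map is modified in place.
func mergePatch(target, patch any) any {
	patchObj, ok := patch.(map[string]any)
	if !ok {
		return patch
	}
	targetObj, ok := target.(map[string]any)
	if !ok {
		targetObj = map[string]any{}
	}
	for key, value := range patchObj {
		if value == nil {
			delete(targetObj, key)
			continue
		}
		targetObj[key] = mergePatch(targetObj[key], value)
	}
	return targetObj
}

// applyMergePatch is a convenience wrapper that works on JSON text.
func applyMergePatch(doc, patch string) (string, error) {
	var d, p any
	if err := json.Unmarshal([]byte(doc), &d); err != nil {
		return "", err
	}
	if err := json.Unmarshal([]byte(patch), &p); err != nil {
		return "", err
	}
	out, err := json.Marshal(mergePatch(d, p))
	return string(out), err
}

func main() {
	doc := `{"spec":{"replicas":1,"paused":true,"ports":[80,443]}}`
	patch := `{"spec":{"replicas":3,"paused":null,"ports":[8080]}}`
	out, err := applyMergePatch(doc, patch)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // {"spec":{"ports":[8080],"replicas":3}}
}
```

Note that the ports list ends up as [8080], not [80,443,8080]: there is no way to append to a list with this patch type, which is exactly the gap strategic-merge was built to fill.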

Apply patch — server-side apply — is the type the project's KEP-555 picked as the long-term answer. The body is a YAML or JSON fragment of the desired state, sent with the application/apply-patch+yaml content type, plus a query parameter ?fieldManager=<name> that names the actor. The api-server merges this fragment with whatever is already there, recording per-field ownership in metadata.managedFields. We will dive into SSA's mechanics in part six; here the relevant fact is that it is the only patch type that solves multi-controller coordination cleanly. Two controllers writing different fields with SSA do not race; with strategic-merge, they would.

The decision tree is short. If you are writing a new controller for a CRD: use server-side apply. If you are writing a controller against a built-in and you cannot use SSA for some reason (rare; legacy code path; pre-1.22 cluster): use strategic-merge. If you are doing very narrow, surgical edits and you need the predictability: JSON-patch. If you are mirroring an external system and need exact replacement semantics: JSON-merge. The default that client-go's .Patch() uses is the one you pass; there is no implicit choice.

types.PatchType	Content-Type header	Notes
StrategicMergePatchType	application/strategic-merge-patch+json	kubectl default for built-ins; honours patchMergeKey on lists
MergePatchType	application/merge-patch+json	RFC 7396; replaces lists wholesale, no merge keys
JSONPatchType	application/json-patch+json	RFC 6902; explicit ops add/remove/replace/test
ApplyPatchType	application/apply-patch+yaml	Server-side apply (KEP-555); requires fieldManager

// One Patch call, four shapes. Note the body and PatchType change together;
// everything else (path, name, namespace) is identical.
// Imports: metav1 "k8s.io/apimachinery/pkg/apis/meta/v1", "k8s.io/apimachinery/pkg/types",
// and "k8s.io/utils/ptr". Each call returns (*appsv1.Deployment, error); check the error in real code.

// 1. Strategic merge: built-ins; honours patchMergeKey on lists.
sm := []byte(`{"spec":{"replicas":3}}`)
clientset.AppsV1().Deployments("prod").Patch(ctx, "web",
    types.StrategicMergePatchType, sm, metav1.PatchOptions{})

// 2. JSON merge (RFC 7396): the CRD-friendly default.
jm := []byte(`{"spec":{"replicas":3}}`)
clientset.AppsV1().Deployments("prod").Patch(ctx, "web",
    types.MergePatchType, jm, metav1.PatchOptions{})

// 3. JSON patch (RFC 6902): explicit ops by JSON-pointer.
jp := []byte(`[{"op":"replace","path":"/spec/replicas","value":3}]`)
clientset.AppsV1().Deployments("prod").Patch(ctx, "web",
    types.JSONPatchType, jp, metav1.PatchOptions{})

// 4. Server-side apply (KEP-555). fieldManager is required.
ap := []byte(`apiVersion: apps/v1
kind: Deployment
metadata: {name: web, namespace: prod}
spec: {replicas: 3}`)
clientset.AppsV1().Deployments("prod").Patch(ctx, "web",
    types.ApplyPatchType, ap, metav1.PatchOptions{
        FieldManager: "my-operator",
        Force:        ptr.To(true), // take ownership on conflict
    })

The Force flag on apply patches is the conflict-resolution lever. If another field manager already owns a field and you re-apply with a different value, you get a 409 Conflict by default. Setting Force: true says "take ownership". Use it carefully; two controllers both setting Force on the same field will ping-pong.

Rate limiting and retries — QPS, Burst, DefaultRetry.

A misbehaving controller can DoS its own api-server. This has happened in every production cluster long enough to have outage post-mortems: a tight reconcile loop, a missing rate limiter, a Deployment that updates an annotation on a pod every iteration of the loop, and suddenly the api-server is melting under sixty thousand requests per second from one operator. client-go ships with two layers of defence, both on by default, both worth understanding.

The first layer is client-side rate limiting via a token bucket. Two knobs: QPS (steady-state queries per second) and Burst (the bucket size). Defaults are QPS=5, Burst=10, which are deliberately conservative — too low for a busy operator, but the right shape for a one-off kubectl. Before every request, the rate limiter calls Wait() on the bucket; if the bucket is empty, the goroutine blocks until a token is available. The blocking is reflected as latency in your metrics; if you see workqueue_queue_duration_seconds climbing for no apparent reason, your QPS is too low.
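A token bucket is easy to reason about once you see the arithmetic. The sketch below is a deliberately simplified bucket with an explicit clock, written for this page; the bucket type, reserve, and waitForNth are hypothetical names, and client-go's real limiter is more careful than this. It answers one question: if a controller fires n requests back-to-back, how long does the n-th one block?

```go
package main

import (
	"fmt"
	"time"
)

// bucket is a minimal token bucket. tokens may go negative: a negative
// balance is debt that the caller pays for by waiting.
type bucket struct {
	qps    float64
	burst  float64
	tokens float64
	last   time.Time
}

func newBucket(qps float64, burst int, now time.Time) *bucket {
	return &bucket{qps: qps, burst: float64(burst), tokens: float64(burst), last: now}
}

// reserve takes one token at time now and returns how long the caller must
// wait before sending the request.
func (b *bucket) reserve(now time.Time) time.Duration {
	// Refill for the time elapsed since the last call, capped at burst.
	b.tokens += now.Sub(b.last).Seconds() * b.qps
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = now
	b.tokens--
	if b.tokens >= 0 {
		return 0
	}
	return time.Duration(-b.tokens * float64(time.Second) / b.qps)
}

// waitForNth issues n requests at the same instant and returns the wait
// imposed on the last one.
func waitForNth(n int, qps float64, burst int) time.Duration {
	now := time.Unix(0, 0)
	b := newBucket(qps, burst, now)
	var wait time.Duration
	for i := 0; i < n; i++ {
		wait = b.reserve(now)
	}
	return wait
}

func main() {
	for _, n := range []int{10, 11, 20} {
		fmt.Printf("request %2d waits %v\n", n, waitForNth(n, 5, 10))
	}
}
```

With the defaults of QPS=5 and Burst=10, the first ten requests go out immediately, the eleventh waits 200 ms, and the twentieth waits a full 2 s. That is where unexplained reconcile latency tends to come from.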

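The second layer is the api-server's own server-side rate limiting, surfaced as 429 Too Many Requests with a Retry-After header. client-go honours this automatically: on a 429, it sleeps for the indicated duration and retries. On modern clusters with API Priority and Fairness (APF) enabled, the api-server allocates per-flow concurrency budgets. FlowSchemas match requests on the authenticated user, group, or ServiceAccount and on the resource being requested, and the flow distinguisher is the user name or the namespace, so clients that share an identity share a budget. Setting a unique cfg.UserAgent on every controller does not change its APF flow, but it is still operationally useful: the string is recorded in the api-server's audit log, which lets you attribute traffic to a specific binary and version.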

Retries on errors that are not 429 are handled by a small library at k8s.io/client-go/util/retry. The canonical helper is retry.RetryOnConflict(retry.DefaultRetry, fn), which retries fn if it returns a 409, with a fixed schedule: DefaultRetry = wait.Backoff{Steps:5, Duration:10ms, Factor:1.0, Jitter:0.1}. That is five attempts spaced ~10 ms apart, with up to 10 percent jitter. The DefaultBackoff variant grows instead of staying flat: Steps:4, Duration:10ms, Factor:5.0, Jitter:0.1, which means waits of roughly 10, 50, and 250 milliseconds between its four attempts. That is the right shape when the conflict comes from another controller that needs a moment to finish its own write; neither variant is meant to ride out an api-server outage.

Watches are a special case. A watch can fail for benign reasons — the api-server rolling, a load balancer rotating, the watch cache rotating past your resourceVersion — and naively reconnecting in a tight loop can cause a thundering-herd outage. client-go's cache.Reflector uses a WatchErrorHandler with exponential backoff: 800ms, 1.6s, 3.2s, capped at 30s, jittered. If you write your own watch loop (you should not, but if), copy this shape. Unbounded reconnects on a 5xx are a well-known way to keep a recovering api-server down.
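The reconnect schedule described above (start at 800 ms, double each time, cap at 30 s) can be sketched as a small pure function. The schedule function is a hypothetical helper for this page, and jitter is left out so the output is deterministic; a real loop should add it.

```go
package main

import (
	"fmt"
	"time"
)

// schedule returns the first n delays of a capped exponential backoff:
// start at initial, multiply by factor after each attempt, never exceed max.
func schedule(initial time.Duration, factor float64, max time.Duration, n int) []time.Duration {
	delays := make([]time.Duration, 0, n)
	d := initial
	for i := 0; i < n; i++ {
		delays = append(delays, d)
		d = time.Duration(float64(d) * factor)
		if d > max {
			d = max
		}
	}
	return delays
}

func main() {
	fmt.Println(schedule(800*time.Millisecond, 2.0, 30*time.Second, 8))
}
```

The seventh delay is where the cap kicks in: 25.6 s would double to 51.2 s, so it is clamped to 30 s and stays there for every later attempt.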

// k8s.io/client-go/util/retry — DefaultRetry is for 409 Conflict on optimistic-concurrency Updates.
var DefaultRetry = wait.Backoff{
    Steps:    5,
    Duration: 10 * time.Millisecond,
    Factor:   1.0,
    Jitter:   0.1,
}

// Canonical Update-with-retry pattern. The api-server returns 409 if someone else
// bumped the resourceVersion between our Get and our Update; we re-Get and try again.
err := retry.RetryOnConflict(retry.DefaultRetry, func() error {
    pod, err := clientset.CoreV1().Pods("prod").Get(ctx, "web", metav1.GetOptions{})
    if err != nil {
        return err
    }
    if pod.Annotations == nil {
        pod.Annotations = map[string]string{} // assigning into a nil map panics
    }
    pod.Annotations["reconciled-at"] = time.Now().Format(time.RFC3339)
    _, err = clientset.CoreV1().Pods("prod").Update(ctx, pod, metav1.UpdateOptions{})
    return err
})
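Why the re-Get inside the closure matters is easier to see with the api-server replaced by a tiny in-memory model. Everything in the sketch below is hypothetical (the object, store, and retryOnConflict names are made up for this page, and backoff sleeps are omitted); it only models the rule that an update is rejected when its resourceVersion is stale.

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("409 Conflict: resourceVersion is stale")

type object struct {
	ResourceVersion int
	Annotations     map[string]string
}

// store models one object on the server with optimistic concurrency.
type store struct{ current object }

// get returns a copy, the way a fresh GET would.
func (s *store) get() object {
	c := object{ResourceVersion: s.current.ResourceVersion, Annotations: map[string]string{}}
	for k, v := range s.current.Annotations {
		c.Annotations[k] = v
	}
	return c
}

// update succeeds only if the caller saw the latest resourceVersion.
func (s *store) update(o object) error {
	if o.ResourceVersion != s.current.ResourceVersion {
		return errConflict
	}
	o.ResourceVersion++
	s.current = o
	return nil
}

// retryOnConflict reruns fn while it keeps returning a conflict.
func retryOnConflict(attempts int, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); !errors.Is(err, errConflict) {
			return err
		}
	}
	return err
}

// demo has a competing writer sneak in during the first attempt.
func demo() (attempts int, finalVersion int, err error) {
	s := &store{current: object{ResourceVersion: 1, Annotations: map[string]string{}}}
	err = retryOnConflict(5, func() error {
		attempts++
		obj := s.get() // re-read on every attempt
		if attempts == 1 {
			other := s.get()
			other.Annotations["owner"] = "someone-else"
			_ = s.update(other) // bumps the version behind our back
		}
		obj.Annotations["reconciled"] = "true"
		return s.update(obj)
	})
	return attempts, s.current.ResourceVersion, err
}

func main() {
	attempts, version, err := demo()
	fmt.Println("attempts:", attempts, "final resourceVersion:", version, "err:", err)
}
```

The first attempt fails because the competing write moved the version from 1 to 2; the second attempt re-reads, keeps the other writer's annotation, and lands at version 3.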

A common mistake — bumping QPS to 1000 to "fix" a slow controller. If you are hitting your QPS limit you have a reconcile-loop bug, not a rate-limit bug. Find the logic that emits unnecessary writes (usually status updates that do not actually change anything) and remove them. The api-server can do 1000 QPS, but your etcd disk fsync cannot.

Server-side apply, deeply — managed fields and conflicts.

Server-side apply is, on the wire, just another patch type. What makes it interesting is the bookkeeping the api-server does on its behalf. Every time a client applies a fragment to a resource, the api-server records — in a field on the object called metadata.managedFields — exactly which JSON paths that client now owns. Two clients can own different paths on the same object simultaneously; if a third applies and tries to set a path another already owns, the api-server returns a 409 Conflict unless the third explicitly passes Force. This is the cluster's first-class story for multi-controller coordination, and it is the reason KEP-555 was a five-year project rather than a one-week one.

The key argument to every apply call is the fieldManager string. It identifies the actor: "kubectl-client-side-apply", "kube-controller-manager", "my-operator". Pick something stable and unique per controller; if you let it default, you will end up sharing ownership records across reconcile loops in confusing ways. Apply calls that mutate paths owned by another field manager fail-closed by default. If your controller is the canonical owner of a field — say, the HPA owning spec.replicas on a Deployment — set Force: true, and the apply takes ownership.
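The fail-closed rule and the Force override can be modelled with a single map. The sketch below is a heavy simplification, written for this page with hypothetical names: it tracks one owner per field path and assumes every apply carries a different value, whereas the real api-server allows shared ownership and only raises a conflict when the value would actually change.

```go
package main

import "fmt"

// ownership maps a field path to the manager that owns it.
type ownership map[string]string

// apply records manager as the owner of each path. Without force it fails
// closed: if any path belongs to someone else, nothing changes and the
// conflicting paths are returned.
func (o ownership) apply(manager string, paths []string, force bool) []string {
	var conflicts []string
	for _, p := range paths {
		if owner, ok := o[p]; ok && owner != manager && !force {
			conflicts = append(conflicts, p)
		}
	}
	if len(conflicts) > 0 {
		return conflicts
	}
	for _, p := range paths {
		o[p] = manager
	}
	return nil
}

func main() {
	o := ownership{}
	fmt.Println(o.apply("cd-system", []string{"spec.replicas", "spec.template"}, false))
	fmt.Println(o.apply("my-operator", []string{"spec.replicas"}, false))
	fmt.Println(o.apply("my-operator", []string{"spec.replicas"}, true))
	fmt.Println(o["spec.replicas"], o["spec.template"])
}
```

The second apply is rejected and leaves ownership untouched; the third, with force set, moves spec.replicas to my-operator while spec.template stays with cd-system.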

The structure of managedFields is worth pulling open at least once. Each entry is a ManagedFieldsEntry with the manager name, the operation (Apply or Update), the API version, the timestamp, and a FieldsV1 blob. The blob is a recursive JSON structure where every key is either an f:-prefixed field name or a k:-prefixed associative-list key. So "f:spec":{"f:replicas":{},"f:template":{"f:spec":{"f:containers":{"k:{\"name\":\"web\"}":{...}}}}} means "I own spec.replicas, and within spec.template.spec.containers I own the entry whose name is "web"". You will never need to construct one of these by hand; the api-server does it. But when you debug an SSA conflict, this is the structure you are reading.
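When you are staring at one of these blobs during an incident, flattening it into dotted paths helps. The sketch below is a small reader written for this page, not a library API: ownedPaths and walk are hypothetical names, it understands only the f: and k: prefixes plus the "." self-marker, and it silently skips anything else. The f:image leaf in the example input is made up to fill in the elided part of the blob above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// ownedPaths flattens a FieldsV1 blob into sorted, human-readable paths.
func ownedPaths(fieldsV1 []byte) ([]string, error) {
	var root map[string]any
	if err := json.Unmarshal(fieldsV1, &root); err != nil {
		return nil, err
	}
	var out []string
	walk("", root, &out)
	sort.Strings(out)
	return out, nil
}

func walk(prefix string, node map[string]any, out *[]string) {
	for key, child := range node {
		var path string
		switch {
		case key == ".":
			// "." marks the enclosing node itself as owned.
			*out = append(*out, prefix)
			continue
		case strings.HasPrefix(key, "f:"):
			path = strings.TrimPrefix(key, "f:")
			if prefix != "" {
				path = prefix + "." + path
			}
		case strings.HasPrefix(key, "k:"):
			path = prefix + "[" + strings.TrimPrefix(key, "k:") + "]"
		default:
			continue // other key kinds are ignored in this sketch
		}
		children, _ := child.(map[string]any)
		if len(children) == 0 {
			*out = append(*out, path) // an empty object is an owned leaf
			continue
		}
		walk(path, children, out)
	}
}

func main() {
	blob := []byte(`{"f:spec":{"f:replicas":{},"f:template":{"f:spec":{"f:containers":{"k:{\"name\":\"web\"}":{"f:image":{}}}}}}}`)
	paths, err := ownedPaths(blob)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

For the example input this prints spec.replicas followed by spec.template.spec.containers[{"name":"web"}].image.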

Apply calls have one subtle behaviour you must know: fields you stop sending are removed from your manager's ownership. If your controller previously applied spec: {replicas: 3, paused: true} and you now apply just spec: {replicas: 3}, you have abandoned ownership of paused. If no other manager owns it, the api-server removes the field from the live object or resets it to its default (false, for paused). If another manager also owns it, only your ownership is dropped and the value stays. Re-applying spec: {replicas: 3} a second time changes nothing. If you later apply spec: {replicas: 3, paused: false}, you take ownership of paused again. SSA is fully declarative: every apply is the complete desired state from your manager's perspective, and the diff against the previous apply produces the ownership and value changes.

A few production patterns. The HPA writes spec.replicas with fieldManager:"horizontal-pod-autoscaler" and Force: true; if you also have a CD system that applies spec.replicas, expect a fight unless you exclude it from your CD manifest. Argo CD has a "Respect SSA" mode that filters out fields owned by other managers from its diff. The kubelet writes parts of status with fieldManager:"kubelet"; controllers that touch status should use a distinct fieldManager so that ownership is unambiguous in audit logs.

// Server-side apply with field manager and Force.
// applyConfiguration types are generated by applyconfiguration-gen; see Part 07.

deploy := appsv1ac.Deployment("web", "prod").
    WithSpec(appsv1ac.DeploymentSpec().
        WithReplicas(3).
        WithSelector(metav1ac.LabelSelector().WithMatchLabels(map[string]string{"app":"web"})))

_, err := clientset.AppsV1().Deployments("prod").
    Apply(ctx, deploy, metav1.ApplyOptions{
        FieldManager: "my-operator",
        Force:        true,  // take spec.replicas from the HPA on conflict
    })

// What gets recorded on the object after the call:
// metadata.managedFields:
// - manager: my-operator
//   operation: Apply
//   apiVersion: apps/v1
//   fieldsType: FieldsV1
//   fieldsV1: { f:spec: { f:replicas: {}, f:selector: {...} } }

The migration trap: clusters older than 1.18 wrote everything with the Update operation, which SSA treats as a single catch-all manager named "before-first-apply". When you start applying with a real fieldManager, you may have to re-apply twice (or use Force) before the legacy ownership is fully replaced. The migration guide is required reading.

Code generation — the toolchain that produces clientsets.

Almost everything user-facing in client-go is generated from a small set of declarations. You write Go structs that describe your CRD's types, you mark them up with a few magic comments, you run a script, and out pop a typed clientset, listers, informers, deepcopy implementations, and server-side-apply applyconfiguration helpers. The toolchain lives in k8s.io/code-generator, and the entry-point script is generate-groups.sh (or its newer cousin kube_codegen.sh). Both call into four executables, each generating a different shape of code.

deepcopy-gen is the simplest. It walks every type marked with // +k8s:deepcopy-gen=true and emits a DeepCopyInto method that recursively copies the struct. This is required because Go's assignment is shallow, and Kubernetes APIs pass objects by pointer through informers and caches; without deep copies, two consumers could mutate the same in-memory object. Hand-writing these for a non-trivial CRD is a hundred-line task; deepcopy-gen does it in a second and the output is mechanical and readable.
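The aliasing problem that deepcopy-gen exists to solve takes only a few lines to reproduce. The sketch below hand-writes the pair of methods for a hypothetical WidgetSpec type, in roughly the shape generated code follows; it is an illustration of the pattern, not actual generator output.

```go
package main

import "fmt"

// WidgetSpec is a hypothetical CRD spec with reference-typed fields.
type WidgetSpec struct {
	Replicas int
	Labels   map[string]string
	Ports    []int
}

// DeepCopyInto copies the receiver into out, allocating fresh maps and slices
// so the two values share no memory.
func (in *WidgetSpec) DeepCopyInto(out *WidgetSpec) {
	*out = *in // copies scalars; map and slice headers still alias at this point
	if in.Labels != nil {
		out.Labels = make(map[string]string, len(in.Labels))
		for k, v := range in.Labels {
			out.Labels[k] = v
		}
	}
	if in.Ports != nil {
		out.Ports = make([]int, len(in.Ports))
		copy(out.Ports, in.Ports)
	}
}

// DeepCopy returns a newly allocated copy of the receiver.
func (in *WidgetSpec) DeepCopy() *WidgetSpec {
	if in == nil {
		return nil
	}
	out := new(WidgetSpec)
	in.DeepCopyInto(out)
	return out
}

func main() {
	cached := WidgetSpec{Replicas: 1, Labels: map[string]string{"app": "web"}, Ports: []int{80}}

	shallow := cached // plain assignment: the map is shared
	shallow.Labels["app"] = "oops"
	fmt.Println("after shallow edit, cached label:", cached.Labels["app"])

	deep := cached.DeepCopy()
	deep.Labels["app"] = "safe"
	deep.Ports[0] = 8080
	fmt.Println("after deep edit, cached label:", cached.Labels["app"], "port:", cached.Ports[0])
}
```

The shallow edit leaks into the cached object; the deep edit does not. This is why objects read from an informer's store must be copied before they are mutated.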

client-gen emits the typed clientset. Given a package of types and a marker like // +groupName=widgets.example.com, it produces the full Clientset, WidgetsV1Interface, WidgetInterface with Get/List/Watch/Create/Update/UpdateStatus/Delete/DeleteCollection/Patch/Apply, and a fake variant for tests. It is what you import as github.com/example/my-operator/pkg/client/clientset/versioned.

lister-gen emits cache-backed listers, which are read-only views over the informer's local store, and informer-gen emits the informer factory and per-resource informers themselves. The two are paired: the informer populates a thread-safe store; the lister wraps the store with a typed lookup API. Most controllers consume both. If you have read the informers sub-page already, you have seen what these produce in motion; here the relevant fact is just that they are generated, not handwritten, and the generation is driven by marker comments on your types (// +genclient and friends), so you rerun the generators whenever the types change.

A fifth generator, applyconfiguration-gen, was added in 1.21 to support server-side apply. It produces fluent builder types that let you express an apply patch in code rather than as a YAML string — appsv1ac.Deployment("web","prod").WithSpec(appsv1ac.DeploymentSpec().WithReplicas(3)) — which the typed clientset's Apply method then serialises to the wire. Without this generator, every apply is a YAML literal with all the typo risk that implies.

The output of all five generators lives under pkg/client/ in most operator repos, gitignored or checked-in depending on team taste. If you check it in (most do), the build step is a pre-commit hook that runs make generate; if you do not, every CI run regenerates from scratch. Either works; the important rule is that the generated code is reproducible from the type declarations, and any local edits to it will be obliterated next regeneration.

# A typical operator repo's hack/update-codegen.sh — what the generators emit.

$ ./hack/update-codegen.sh
Generating deepcopy funcs
  → pkg/apis/widgets/v1/zz_generated.deepcopy.go
Generating applyconfiguration for widgets/v1
  → pkg/client/applyconfiguration/widgets/v1/widget.go
Generating clientset for widgets/v1 at pkg/client/clientset
  → pkg/client/clientset/versioned/clientset.go
  → pkg/client/clientset/versioned/typed/widgets/v1/widget.go
Generating listers for widgets/v1 at pkg/client/listers
  → pkg/client/listers/widgets/v1/widget.go
Generating informers for widgets/v1 at pkg/client/informers
  → pkg/client/informers/externalversions/widgets/v1/widget.go
  → pkg/client/informers/externalversions/factory.go

# Total output for one CRD type with three fields: ~3,200 lines of Go.
# Hand-writing equivalents that handle every edge case correctly: weeks of work.

A practical note: controller-runtime (used by Operator SDK, Kubebuilder, and most modern operators) ships its own client.Client abstraction that resolves each object's REST path at run time from the runtime.Scheme and a RESTMapper, so it works with your typed structs (and with unstructured objects) and does not require client-gen / lister-gen / informer-gen at all. Many operator repos therefore skip those generators entirely and only run deepcopy-gen. This is fine; you trade type-strict generated code for a more dynamic runtime, and the controller-runtime cache does the same job as a generated informer.

Diagnostic tooling and further reading.

The single most useful debugging tool for client-go is kubectl's -v flag. At verbosity 6 you see the URLs and status codes; at 7 you also see the request headers; at 8 you see the response headers and the bodies, truncated; at 9 the bodies are no longer truncated. Controllers that log through klog honour the same -v flag, so when an operator misbehaves in production, raising its log level to 8 gives you the same wire trace. Most production incidents involving "the api-server returned a weird error" are diagnosable from one v=8 trace and ten minutes of reading.

The trace below is what a single kubectl get pods -n prod looks like at v=8. Note the User-Agent (always set this on production clients), the Accept header (protobuf preferred, JSON fallback), the limit=500 page size on the list, and the response that lands in one round trip. Read an operator's trace the same way and the request shape of its reconcile loop is right there in the log.

$ kubectl get pods -n prod -v=8 2>&1 | head -40
I0503 12:14:08.421 loader.go:373] Config loaded from file:  /home/u/.kube/config
I0503 12:14:08.430 round_trippers.go:466] curl -v -XGET  -H "Accept: application/vnd.kubernetes.protobuf,application/json" \
                                              -H "User-Agent: kubectl/v1.30.2 (linux/amd64) kubernetes/abc1234" \
                                              "https://prod.example.com:6443/api/v1/namespaces/prod/pods?limit=500"
I0503 12:14:08.498 round_trippers.go:495] HTTP Trace: DNS Lookup for prod.example.com resolved to [{52.84.x.x 0}]
I0503 12:14:08.501 round_trippers.go:510] HTTP Trace: Dial to tcp:52.84.x.x:6443 succeeded
I0503 12:14:08.529 round_trippers.go:553] GET https://prod.example.com:6443/api/v1/namespaces/prod/pods?limit=500 200 OK in 99 milliseconds
I0503 12:14:08.529 round_trippers.go:570] Response Headers:
I0503 12:14:08.530 round_trippers.go:573]     Audit-Id: 7c3b1f0e-8e2a-4d11-9e6f-...
I0503 12:14:08.530 round_trippers.go:573]     Cache-Control: no-cache, private
I0503 12:14:08.530 round_trippers.go:573]     Content-Type: application/vnd.kubernetes.protobuf
I0503 12:14:08.530 round_trippers.go:573]     X-Kubernetes-Pf-Flowschema-Uid: 8b...   # APF flow that handled the request
I0503 12:14:08.530 round_trippers.go:573]     X-Kubernetes-Pf-Prioritylevel-Uid: c1...
I0503 12:14:08.531 round_trippers.go:577] Response Body: 0000 6b 38 73 00 0a 0c 0a 02 76 31 12 06 50 6f 64 4c  k8s.....v1..PodL ...

A few other tools earn their keep. go tool trace, run on an execution trace captured from a controller, will show you exactly where time is going across goroutines, including time spent waiting on rate.Limiter.Wait, which is the smoking gun for "my operator is slow" incidents. curl --cacert ... --cert ... --key ... with the same TLS material from your kubeconfig isolates whether a problem is in client-go's request building or in the cluster's response (when in doubt, replicate with curl). And the api-server's own /metrics endpoint surfaces apiserver_request_total labelled by verb, resource and response code, while the audit log records the userAgent of each request, so between the two you can attribute QPS to specific controllers from the server side without instrumenting them yourself.

When client-go itself misbehaves — rare but it happens — the source tree is the answer. The four packages worth knowing by heart are k8s.io/client-go/rest (the REST client and its retry logic), k8s.io/client-go/tools/clientcmd (kubeconfig parsing), k8s.io/client-go/transport (TLS, exec plugins, the wrapped RoundTripper), and k8s.io/client-go/util/flowcontrol (the rate limiter). All four are short, well-commented, and idiomatic. Reading them is the fastest way to resolve "is this a bug or expected behaviour" questions, and it pays back the time many times over.

And the rest of the Semicolony ladder. The informers sub-page picks up where this one stops, at the cache layer that almost every controller stacks on top of client-go. The controllers sub-page then layers the workqueue and reconcile loop on top of that. The api-server sub-page is what is on the other end of every byte client-go sends, and the auth sub-page covers the cluster-side primitives — RBAC, ServiceAccounts, OIDC trust — that the kubeconfig and exec-plugin paths in this page exist to satisfy.

One closing observation. client-go is the load-bearing piece of Go infrastructure in the entire cloud-native ecosystem; it is what every operator, every controller, every kubectl plugin, every Helm install ultimately runs through. It is also smaller than you would expect: a handful of packages you can hold in your head, and a code-generator that produces most of the user-visible surface. When you next debug a flaky operator, suspect the layer you have just read about: the wrong patch type, a too-low QPS, an exec plugin that re-runs every call, an apply with the wrong field manager. Those are the four bugs you will see most. Knowing where they live in the source tree is half the fix.

Next in the internals series

Keep going.

Informers and the cache

The cache layer that stacks on top of client-go: Reflector, DeltaFIFO, Indexer, Lister, workqueue.

The controller pattern

Reconcile loops, idempotence, work queues, the controller-runtime opinion.

The api-server, deeply

What is on the other end of every byte client-go sends. Authn, authz, admission, storage.

Authentication and RBAC

The cluster-side counterpart to kubeconfig and exec plugins: ServiceAccounts, OIDC, RBAC.
