Who you are, what you can do.
Every request to kube-apiserver passes through two chains before it touches a single byte of state: authentication establishes who the caller claims to be, authorisation decides what that identity is allowed to verb at what resource. The chains are pluggable, ordered, and surprisingly opinionated. This page walks each plugin, each token format, each rule-evaluation step.
Roughly 4,400 words. Pair it with the api-server sub-page for where these chains live in the request pipeline, and the architecture sub-page for the network and TLS context that makes any of it meaningful.
Auth in Kubernetes, in two hundred words.
Kubernetes has no user database. Read that twice. There is no /etc/passwd inside etcd, no User object in any built-in API group, no place where the cluster records the list of human beings allowed to talk to it. What it has instead is a chain of authenticators, each of which inspects an incoming request, attempts to extract an identity from the credentials presented, and either succeeds (returning a user.Info with a username, UID, and a list of groups) or abstains so the next authenticator in the chain can try. If every authenticator abstains, the request is anonymous and the cluster decides separately whether to allow that.
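The chain described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the api-server's Go implementation: `UserInfo`, `authenticate`, and the two toy authenticators are hypothetical names, and the token table stands in for real credential validation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class UserInfo:
    username: str
    uid: str = ""
    groups: tuple = ()


# An authenticator inspects a request and returns an identity, or None to abstain.
Authenticator = Callable[[dict], Optional[UserInfo]]

# Identity used when every authenticator abstains.
ANONYMOUS = UserInfo("system:anonymous", groups=("system:unauthenticated",))


def client_cert_authenticator(request: dict) -> Optional[UserInfo]:
    # Assumes the TLS layer already verified the cert against the client CA.
    cert = request.get("client_cert")
    if cert is None:
        return None
    return UserInfo(cert["CN"], groups=tuple(cert.get("O", [])))


def bearer_token_authenticator(request: dict) -> Optional[UserInfo]:
    # Toy lookup table standing in for real token validation.
    known = {"s3cr3t": UserInfo("alice@example.com", "42", ("platform-eng",))}
    header = request.get("Authorization", "")
    if not header.startswith("Bearer "):
        return None
    return known.get(header[len("Bearer "):])


def authenticate(request: dict, chain: List[Authenticator]) -> UserInfo:
    for authenticator in chain:
        user = authenticator(request)
        if user is not None:
            return user   # first success wins
    return ANONYMOUS      # everyone abstained


chain = [client_cert_authenticator, bearer_token_authenticator]
```

Whether the anonymous identity is then allowed to do anything is a separate decision, made by the authorizers.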
Once an identity is established, a second chain runs: the authorizers. Each one is asked the same question (should this user, in this group, be allowed to do this verb on this resource in this namespace) and may answer yes, no, or no-opinion. The first authorizer to return a definite yes ends the chain. A definite no from any single authorizer also ends it, short-circuiting the rest. If everyone abstains, the request is denied. The chain is deny-by-default.
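A minimal sketch of that short-circuit logic, assuming each authorizer is a function returning one of three decisions. The `Decision` enum and the toy authorizers are hypothetical, chosen only to exercise the three outcomes.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NO_OPINION = "no-opinion"


def authorize(attrs: dict, chain: list) -> bool:
    """Return True only if some authorizer allows before any authorizer denies."""
    for authorizer in chain:
        decision = authorizer(attrs)
        if decision is Decision.ALLOW:
            return True    # definite yes ends the chain
        if decision is Decision.DENY:
            return False   # definite no ends the chain too
    return False           # everyone abstained: deny by default


def toy_blocklist(attrs: dict) -> Decision:
    return Decision.DENY if attrs["user"] == "mallory" else Decision.NO_OPINION


def toy_rbac(attrs: dict) -> Decision:
    allowed = {("alice", "get", "pods"), ("mallory", "get", "pods")}
    key = (attrs["user"], attrs["verb"], attrs["resource"])
    return Decision.ALLOW if key in allowed else Decision.NO_OPINION


chain = [toy_blocklist, toy_rbac]
```

Order matters: with `toy_blocklist` first, mallory is denied even though a later authorizer would have allowed the request.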
Both chains are ordered, both are configured at api-server startup with flags, and both have a default that the project considers safe: there is no --authentication-mode flag (the authenticator chain is wired implicitly by which other flags you set), and --authorization-mode=Node,RBAC is the standard baseline. The plugins available for each chain, which the rest of this page covers, are surprisingly few in number. Four authenticators carry the entire production world: client certificates, ServiceAccount tokens, OIDC, and webhooks. Three authorizers carry it: Node, RBAC, and Webhook. That is it.
The third leg of the stool is observability. Every request, regardless of how the auth chains rule, passes through an audit filter that can record the identity, the verb, the resource, the response code, and (at the highest level) the request and response bodies. Without the audit log, "who deleted the namespace at 3am" is unanswerable; with it, every action in the cluster is attributable. We close on audit because it is the meta-system that makes the rest believable.
Mental model: authn answers "who are you", authz answers "what may you do", admission answers "is the thing you sent legal", audit answers "what just happened". They run in that order, every request, every time. There is no fast path.
Client certificates and the kubeconfig pattern.
The oldest authenticator in Kubernetes, and still the one that signs the cluster's own administrator into existence on day one, is mutual-TLS client-certificate authentication. The api-server is started with a client-CA bundle via --client-ca-file=/etc/kubernetes/pki/ca.crt. Any incoming TLS handshake that presents a client certificate signed by that CA is recognised, and the certificate's subject becomes the user identity. The convention is exact and load-bearing: the certificate's CN (Common Name) becomes the username, and any O (Organization) entries become groups. There is no other place to encode the identity; if you forget the O=system:masters on the bootstrap admin cert, your supposed administrator has no group memberships and RBAC will refuse them.
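The CN-to-username and O-to-groups convention is mechanical enough to sketch. The function below is a simplified illustration: `identity_from_subject` is a hypothetical helper, and it assumes a comma-separated subject string with no escaped commas, which real X.509 parsing does not guarantee.

```python
def identity_from_subject(subject: str):
    """Map 'CN=alice,O=team-a,O=team-b' to (username, [groups]).

    Simplified: assumes comma-separated KEY=value pairs without escaping.
    """
    username = None
    groups = []
    for part in subject.split(","):
        key, _, value = part.strip().partition("=")
        if key == "CN":
            username = value        # Common Name becomes the username
        elif key == "O":
            groups.append(value)    # every Organization entry becomes a group
    if username is None:
        raise ValueError("certificate subject has no CN")
    return username, groups
```

A certificate with a CN but no O entries yields an identity with an empty group list, which is exactly the "administrator with no group memberships" failure described above.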
The kubeconfig file is the client-side of this. By convention it lives at ~/.kube/config, a YAML document that holds three lists (clusters, users, contexts) and one pointer at a current-context. A cluster entry pins the api-server URL and the server-CA bundle that the client will trust. A user entry holds the credential, which may be a client cert and key, a bearer token, or an exec hook that produces one. A context binds a cluster to a user and an optional default namespace; switching contexts is just rewriting one string. The whole file is plaintext-readable; the secrets in it are protected only by the filesystem permissions you give it.
Multiple contexts are the bread-and-butter of operating more than one cluster. A working engineer typically has a dozen, named after environments (prod-eu, staging-us, sandbox-test-7), and switches between them with kubectl config use-context. The same kubeconfig can hold three users (a personal cert for emergency break-glass, an OIDC exec hook for daily work, an IAM-role-assumed token for CI) and bind whichever pair makes sense for whichever cluster. kubeconfig is thus less a credential store than a credential map: how to authenticate against each named cluster.
# ~/.kube/config: two clusters, two users, three contexts
apiVersion: v1
kind: Config
current-context: prod-eu
clusters:
- name: prod-eu
  cluster:
    server: https://k8s-prod-eu.example.com:6443
    certificate-authority-data: LS0tLS1CRUdJTi…
- name: staging-us
  cluster:
    server: https://k8s-staging.us.example.com:6443
    certificate-authority-data: LS0tLS1CRUdJTi…
users:
- name: alice@example.com
  user:
    # client certificate: encodes CN=alice@example.com, O=platform-eng, O=on-call
    client-certificate: /home/alice/.kube/alice.crt
    client-key: /home/alice/.kube/alice.key
- name: oidc-alice
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1
      command: kubectl
      args: [oidc-login, get-token, --oidc-issuer-url=https://dex.example.com]
      interactiveMode: IfAvailable   # required by the v1 exec credential API
contexts:
- name: prod-eu
  context: { cluster: prod-eu, user: alice@example.com, namespace: platform }
- name: prod-eu-oidc
  context: { cluster: prod-eu, user: oidc-alice, namespace: platform }
- name: staging-us
  context: { cluster: staging-us, user: oidc-alice }
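To make the "credential map" idea concrete, here is a hedged sketch of how a client might resolve a context into a server and a credential. It operates on an already-parsed dictionary; `resolve_context` and the trimmed `KUBECONFIG` literal are illustrative, not kubectl's actual loader.

```python
def _by_name(items):
    # kubeconfig lists are keyed by their "name" field
    return {item["name"]: item for item in items}


def resolve_context(config: dict, context_name=None) -> dict:
    """Follow context -> (cluster, user) and return what a client needs to connect."""
    name = context_name or config["current-context"]
    context = _by_name(config["contexts"])[name]["context"]
    cluster = _by_name(config["clusters"])[context["cluster"]]["cluster"]
    user = _by_name(config["users"])[context["user"]]["user"]
    return {
        "server": cluster["server"],
        "credential": user,
        "namespace": context.get("namespace", "default"),
    }


# Trimmed, already-parsed stand-in for a kubeconfig file.
KUBECONFIG = {
    "current-context": "prod-eu",
    "clusters": [
        {"name": "prod-eu", "cluster": {"server": "https://k8s-prod-eu.example.com:6443"}},
        {"name": "staging-us", "cluster": {"server": "https://k8s-staging.us.example.com:6443"}},
    ],
    "users": [
        {"name": "alice@example.com", "user": {"client-certificate": "/home/alice/.kube/alice.crt"}},
        {"name": "oidc-alice", "user": {"exec": {"command": "kubectl"}}},
    ],
    "contexts": [
        {"name": "prod-eu", "context": {"cluster": "prod-eu", "user": "alice@example.com", "namespace": "platform"}},
        {"name": "staging-us", "context": {"cluster": "staging-us", "user": "oidc-alice"}},
    ],
}
```

Switching contexts changes only the `name` used for the lookup; the clusters and users lists are untouched.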
Client certificates have one large operational drawback that drives the rest of this page: they cannot be revoked. The api-server has no concept of a CRL or an OCSP responder. Once a certificate is issued, it is valid until it expires, and the only way to "revoke" it is to rotate the entire client CA, which invalidates every cert anyone holds, including the cluster's own internal ones, or to remove every RBAC binding that grants anything to the user's CN or groups (RBAC has no deny rules, so this means auditing every binding, and it does nothing against a cert that carries O=system:masters). Neither is operationally acceptable for a leaving employee. This is why production clusters do not issue long-lived client certs to humans; humans authenticate with OIDC, where revocation is the identity provider's problem and Kubernetes only sees a short-lived token. Client certs are reserved for the bootstrap admin and for service-to-service flows where the client is a Pod and the cluster issues the cert via the CertificateSigningRequest API with a short TTL.
The bootstrap admin cert deserves special mention because it is the cluster's most dangerous credential. It is signed by the cluster CA, with CN=kubernetes-admin and O=system:masters. The system:masters group is hard-coded into the api-server to bypass every authorizer. RBAC does not even run for requests from this group; the request is just allowed. This is a deliberate escape hatch so that a misconfigured cluster can be repaired even when RBAC is broken, but it means the admin cert is a master key. Treat the /etc/kubernetes/admin.conf file as the equivalent of root SSH access to every node in the cluster, because that is what it is. Most production playbooks store it offline, in a vault, and use OIDC for the operator's day-to-day work.
Diagnostic: when a kubeconfig "stops working" with no error change, the most common root cause is the client cert expired. openssl x509 -in alice.crt -noout -dates tells you instantly. Cluster CAs are typically valid for ten years, but client certs issued from them are usually one year, and nobody puts them in their calendar.
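If you would rather have the expiry in a calendar than discover it in an outage, the openssl output is easy to parse. The sketch below assumes the usual `notAfter=Mon DD HH:MM:SS YYYY GMT` line format and an English/C locale for month names; `parse_not_after` and `days_remaining` are hypothetical helpers.

```python
from datetime import datetime, timezone


def parse_not_after(openssl_dates_output: str) -> datetime:
    """Extract the expiry from `openssl x509 -noout -dates` output."""
    for line in openssl_dates_output.splitlines():
        if line.startswith("notAfter="):
            stamp = line[len("notAfter="):].strip()
            parsed = datetime.strptime(stamp, "%b %d %H:%M:%S %Y GMT")
            return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("no notAfter line found")


def days_remaining(openssl_dates_output: str, now: datetime) -> int:
    """Whole days until expiry; negative once the certificate has expired."""
    return (parse_not_after(openssl_dates_output) - now).days


SAMPLE = "notBefore=Mar  3 09:00:00 2024 GMT\nnotAfter=Mar  3 09:00:00 2025 GMT"
```

Feeding it the captured output of the openssl command and the current UTC time gives a number you can alert on.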
ServiceAccount tokens: from long-lived to projected.
Pods running inside the cluster need an identity to talk back to the api-server, and that identity is the ServiceAccount object. Every namespace gets a default ServiceAccount when it is created, and every Pod that does not specify a different one is assigned to it. The api-server then arranges for a JSON Web Token bound to that ServiceAccount to be available inside the Pod, and the in-cluster client libraries (client-go, the Python and Java clients) all know to read it. The token's bearer is, from the api-server's perspective, the user system:serviceaccount:<namespace>:<name>, a member of two groups: system:serviceaccounts and system:serviceaccounts:<namespace>. This is the workhorse identity of the cluster: every controller, every operator, every workload that calls the api-server does so as a ServiceAccount.
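The naming scheme is regular enough to encode directly. Both helpers below are hypothetical names for illustration; they cover only the username and the two ServiceAccount groups named above.

```python
SA_PREFIX = "system:serviceaccount:"


def service_account_identity(namespace: str, name: str):
    """Return (username, groups) as the api-server would see a ServiceAccount."""
    username = f"{SA_PREFIX}{namespace}:{name}"
    groups = ["system:serviceaccounts", f"system:serviceaccounts:{namespace}"]
    return username, groups


def parse_service_account_username(username: str):
    """Return (namespace, name), or None if this is not a ServiceAccount username."""
    if not username.startswith(SA_PREFIX):
        return None
    parts = username[len(SA_PREFIX):].split(":")
    if len(parts) != 2 or not all(parts):
        return None
    return parts[0], parts[1]
```

The per-namespace group is what lets a single RoleBinding grant something to every ServiceAccount in one namespace.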
The legacy mechanism, in place from Kubernetes 1.0 through about 1.21, was to materialise the token into a Secret of type kubernetes.io/service-account-token and mount it at /var/run/secrets/kubernetes.io/serviceaccount/token inside the Pod via an automatically-attached volume. This token had three problems. It had no expiry, so anyone who exfiltrated it could use it forever. It had no audience claim, so a token meant for the api-server could be replayed against any other JWT-validating service that happened to trust the same signing key. And it lived in etcd as a Secret, base64-encoded but not encrypted, accessible to anyone with read on the namespace. Together these properties were a constant source of incidents: a leaked Pod log with the token visible, a malicious sidecar reading the file from the shared volume, a backup of etcd carrying every token ever issued.
The modern mechanism is the projected ServiceAccount token, introduced as the BoundServiceAccountTokenVolume feature, default-on in 1.21, and mandatory by 1.24 (the legacy Secret token is no longer auto-created). Instead of materialising to a Secret, the kubelet asks the api-server's TokenRequest API for a fresh JWT scoped to a specific Pod, with a specific audience claim, with a specific expiry, and projects it into the Pod's filesystem through a projected volume. The token lives for one hour by default (ten minutes is the minimum you can request), the kubelet replaces it once it has passed 80% of its lifetime, and the client library re-reads the file periodically. If the Pod is deleted, the Pod UID recorded in the token's kubernetes.io claim no longer matches anything, and the api-server rejects it.
# Pod spec: projected ServiceAccount token with audience and expiry
apiVersion: v1
kind: Pod
metadata:
  name: app-prod                       # illustrative name
spec:
  serviceAccountName: app-prod
  containers:
  - name: app
    image: registry.example.com/app:v1.42.0
    volumeMounts:
    - name: api-token
      mountPath: /var/run/secrets/api
      readOnly: true
  volumes:
  - name: api-token
    projected:
      sources:
      - serviceAccountToken:
          path: token
          audience: vault.example.com   # bound to one consumer
          expirationSeconds: 3600       # 1h, kubelet rotates at 80%
      - configMap:
          name: kube-root-ca.crt
          items: [ { key: ca.crt, path: ca.crt } ]
The audience claim is the key innovation. By default, a projected token is issued with the api-server's own audience (normally its issuer URL), so it is valid only against the cluster's api-server. When a Pod needs to authenticate to a third party (Vault, AWS via IRSA, GCP via Workload Identity, a private signing service), you set the audience to that third party's identifier. The third party validates the JWT signature against the cluster's jwks_uri (exposed by the api-server at /openid/v1/jwks) and rejects any token whose aud claim does not match its own identifier. A token minted for a third party and leaked from a log line cannot be replayed against the api-server, because its audience is not the api-server. The blast radius of any single leaked token is bounded to the one service it was minted for.
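The consumer-side audience check looks roughly like this. The sketch deliberately skips signature verification, which a real relying party must perform against the cluster's JWKS before trusting any claim; `decode_claims`, `audience_ok`, and `make_unsigned_token` are hypothetical helpers for illustration only.

```python
import base64
import json
import time


def decode_claims(jwt: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    payload = jwt.split(".")[1]
    payload += "=" * (-len(payload) % 4)   # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def audience_ok(claims: dict, expected_audience: str, now=None) -> bool:
    """Accept only unexpired tokens minted for this exact audience."""
    now = time.time() if now is None else now
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    return expected_audience in audiences and claims.get("exp", 0) > now


def make_unsigned_token(claims: dict) -> str:
    """Build a token-shaped string for the demo; the signature is a placeholder."""
    def b64(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return ".".join([b64({"alg": "none"}), b64(claims), "placeholder-signature"])


TOKEN = make_unsigned_token({"aud": ["vault.example.com"], "exp": 2000})
```

A token minted for vault.example.com passes the check at Vault and fails it anywhere that expects a different audience, which is the whole point of the claim.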
Bound tokens close the second hole as well: the token is bound to a specific Pod's UID via the pod.uid field of its kubernetes.io claim, and the api-server's ServiceAccount authenticator checks at validation time that the Pod still exists and still has that UID. A token whose Pod has been deleted is rejected even if it has not yet expired. This is the "Bound" in BoundServiceAccountTokenVolume: the token's lifecycle is bound to the Pod's lifecycle, not to a Secret's lifecycle, and there is no cluster artefact left behind for an attacker to grep through. For the JWT mechanics underneath, see the JWT lifecycle simulator; for the OAuth context that JWTs grew up in, the OAuth guide covers the bearer-token model end-to-end.
Migration gotcha: older client libraries (Java client < 9.0.0, Python < 12.0.0) read the token file once at startup and cache it. With projected tokens expiring after an hour by default, those clients start receiving 401 Unauthorized once the first token expires and never recover until the Pod restarts. The fix is a client-library upgrade, not a longer expiry. Long expiries defeat the entire point.
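What a well-behaved client does instead is re-read the file on a short interval. The class below is a minimal sketch of that behaviour, not the code of any official client library; `ReloadingToken`, its 60-second default, and the injectable `reader` and `clock` parameters are assumptions made so the logic can be exercised without a real Pod.

```python
import time

DEFAULT_TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def _read_file(path: str) -> str:
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()


class ReloadingToken:
    """Return the projected token, re-reading it once the cached copy is stale."""

    def __init__(self, path=DEFAULT_TOKEN_PATH, max_age_seconds=60.0,
                 clock=time.monotonic, reader=None):
        # `reader` is injectable for tests; by default the token file is read.
        self._reader = reader if reader is not None else (lambda: _read_file(path))
        self._max_age = max_age_seconds
        self._clock = clock
        self._token = None
        self._read_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now - self._read_at >= self._max_age:
            self._token = self._reader()   # picks up whatever the kubelet rotated in
            self._read_at = now
        return self._token
```

Calling `get()` before every request keeps the client at most `max_age_seconds` behind the kubelet's rotation.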
OIDC: humans, dex, Keycloak, and the refresh dance.
Authenticating humans against the api-server with client certificates does not scale past about a dozen engineers. The replacement, in essentially every production install since 1.16, is OpenID Connect — an authentication layer on top of OAuth 2.0 that produces signed, short-lived, revocable JWTs. The api-server has a built-in OIDC authenticator that knows how to fetch a provider's jwks_uri, validate signatures, and translate claims into users and groups. The provider can be anything that speaks OIDC: a managed identity service like Okta, Azure AD, or Google Workspace; a self-hosted server like Keycloak, dex, or Authelia; or the cloud's native broker like AWS IAM Identity Center or GCP Cloud Identity. The api-server does not care; it only cares about the JWT.
The configuration is a small set of flags at api-server startup. You point it at the provider's issuer URL, tell it which client ID to expect in the aud claim, name the username claim and the groups claim, and it handles the rest. The api-server fetches the provider's /.well-known/openid-configuration document at startup, caches the JWKS for the period the provider says it should, and from that point forward every incoming bearer token is checked against the cached keys. There is no callback to the provider on each request; OIDC validation is offline once the keys are loaded.
# kube-apiserver flags: OIDC configured against a self-hosted dex
--oidc-issuer-url=https://dex.example.com
--oidc-client-id=kubernetes
--oidc-username-claim=email
--oidc-username-prefix=oidc:
--oidc-groups-claim=groups
--oidc-groups-prefix=oidc:
--oidc-ca-file=/etc/kubernetes/oidc-ca.crt

# Result on a successful login:
# user.Info { name: "oidc:alice@example.com", groups: ["oidc:platform-eng", "oidc:on-call"] }
The username and groups prefixes are not cosmetic; they are a defence-in-depth measure. Without the oidc: prefix, membership of an IdP group named system:masters would be granted cluster-admin via the hard-coded group bypass mentioned earlier. By prefixing every OIDC-derived identity with a distinguishing string, you make it impossible for an OIDC identity to collide with the cluster's built-in users and groups. Always set the prefix. Always.
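The claim-to-identity translation, including the prefixing, can be sketched as follows. `user_from_claims` is a hypothetical function whose parameters mirror the flag names above; it assumes the token's signature, issuer, and audience have already been validated.

```python
def user_from_claims(claims: dict,
                     username_claim: str = "email",
                     groups_claim: str = "groups",
                     username_prefix: str = "oidc:",
                     groups_prefix: str = "oidc:") -> dict:
    """Translate validated OIDC claims into a username and prefixed groups."""
    if username_claim not in claims:
        raise KeyError(f"token has no {username_claim!r} claim")
    groups = claims.get(groups_claim, [])
    if isinstance(groups, str):          # tolerate a single string value
        groups = [groups]
    return {
        "name": username_prefix + claims[username_claim],
        "groups": [groups_prefix + group for group in groups],
    }


CLAIMS = {"email": "alice@example.com", "groups": ["platform-eng", "system:masters"]}
```

With the prefix in place, an IdP group called system:masters surfaces as oidc:system:masters, which matches no built-in group.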
The client side is more interesting. The api-server expects an id_token in the Authorization: Bearer header, but kubectl is not a browser; it cannot run the OAuth interactive login flow itself. The standard solution is the kubectl oidc-login credential plugin, an external binary registered in kubeconfig under the exec credential provider. When kubectl needs a token, it shells out to the plugin, which runs the OIDC authorisation-code-with-PKCE flow, opens a browser to the IdP, captures the redirect, exchanges the code for an id_token + refresh_token, caches them locally, and prints the id_token on stdout in a JSON format kubectl knows how to read. From that point on, kubectl uses the id_token as a bearer token to the api-server. When the id_token expires (typically one hour), the plugin uses the refresh token to silently mint a new one without re-prompting the human. Refresh tokens are typically valid for a day or a week, configurable at the IdP, and they are the durable credential — the id_token is ephemeral.
The flow has three trust boundaries that operators routinely conflate. The api-server trusts the IdP's signing key; the kubectl plugin trusts the IdP's TLS cert; the IdP trusts the user's password and second factor. Compromising any of the three compromises the cluster's human identity layer, but the impact is different. A compromised api-server signing-key trust forces every cluster trusting that IdP to be reconfigured; a compromised plugin TLS trust opens up MITM attacks on the redirect; a compromised user credential is a single user, and the IdP's session revocation handles it. Most production setups put the IdP behind a separate CA from the cluster's internal CA so that the blast radius of either compromise is bounded. For the OAuth and OIDC mechanics in detail, see how OAuth works and how OIDC works; this page assumes you know the difference.
Operational gotcha: the api-server caches the OIDC provider's JWKS and does not re-fetch it on every request. If the IdP starts signing with a new key before the api-server has refreshed its cached key set, valid tokens will be rejected. The remediation is to have the IdP publish a new key well before it starts signing with it and to keep the rotation cadence longer than the api-server's JWKS refresh interval, or to restart the api-server after a key rotation. Most teams pin a long rotation window for exactly this reason.
Webhook authentication: when nothing else fits.
The webhook authenticator is Kubernetes' escape hatch for credential formats nobody anticipated. If your tokens are not JWTs, your certs are not X.509, your identity is brokered by a system that does not speak OIDC, you can still wire it in: configure the api-server with --authentication-token-webhook-config-file, and any bearer token that is not validated by an earlier authenticator gets POSTed to a webhook URL that returns either an authenticated identity or a refusal. AWS EKS uses this for IAM-derived authentication (a static binary called aws-iam-authenticator runs as the webhook, validates a signed STS token, returns an IAM-derived user-and-groups identity); GKE used it historically for Google account integration; many enterprise installs use it to bridge to LDAP, Kerberos, or proprietary SSO systems.
The protocol is straightforward. The api-server constructs a TokenReview object containing the bearer token it received, serialises it as JSON, and POSTs it to the configured webhook over mTLS. The webhook inspects the token, performs whatever validation it likes (call out to its own database, verify a signature against an external KMS, check group memberships in LDAP), and returns a TokenReview with status.authenticated: true and a populated user field, or with authenticated: false if it does not recognise the token. The api-server treats a failure response as "abstain" and continues to the next authenticator; there is no way for a webhook to actively deny a request, only to recognise or not recognise it.
This is the same protocol used by the kubelet's webhook authentication: when something hits kubelet's :10250 with a bearer token, the kubelet POSTs a TokenReview back to the api-server, which acts as the webhook. For bearer tokens the kubelet validates nothing itself; it delegates entirely. This is why misconfiguring api-server authn breaks kubelet authn at the same time, and why the same identity (a ServiceAccount, an OIDC user) authenticates uniformly whether you are talking to api-server or to a kubelet.
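The TokenReview exchange can be sketched from the webhook's side as a pure function from request body to response body. The `KNOWN_TOKENS` table and `review_token` are hypothetical; a real webhook would validate against its own identity system and serve this over HTTPS.

```python
# Hypothetical token table standing in for LDAP, a database, or an external KMS.
KNOWN_TOKENS = {
    "opaque-token-123": {
        "username": "alice@example.com",
        "uid": "42",
        "groups": ["platform-eng", "on-call"],
    },
}


def review_token(token_review: dict) -> dict:
    """Answer a TokenReview: recognise the token or report it as unauthenticated."""
    token = token_review.get("spec", {}).get("token", "")
    user = KNOWN_TOKENS.get(token)
    if user is None:
        status = {"authenticated": False}   # the api-server treats this as "abstain"
    else:
        status = {"authenticated": True, "user": user}
    return {
        "apiVersion": token_review.get("apiVersion", "authentication.k8s.io/v1"),
        "kind": "TokenReview",
        "status": status,
    }


REQUEST = {
    "apiVersion": "authentication.k8s.io/v1",
    "kind": "TokenReview",
    "spec": {"token": "opaque-token-123"},
}
```

Note that the only two outcomes are "recognised" and "not recognised"; there is no field for an active denial.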
The performance and reliability cost is real. Every request whose token is not recognised by the static or built-in authenticators incurs an HTTPS round-trip to the webhook. The api-server caches the result for --authentication-token-webhook-cache-ttl (default two minutes), but on a cold cache or a cluster doing many distinct identities, the webhook becomes a hot dependency. EKS has had multiple high-profile outages whose root cause was the IAM authenticator pod going down or being slow; nobody who depends on that authenticator can log in until it recovers. The mitigation is the same shape as for any sidecar dependency: run multiple replicas, tune the api-server's cache TTL (--authentication-token-webhook-cache-ttl) and set realistic timeouts on the webhook call, monitor the webhook's latency separately from the api-server's, and make sure the webhook itself is not authenticating to the api-server (a recursive deadlock that has bitten more than one team).
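The caching behaviour is the part that decides how hot the webhook runs, so it is worth seeing in miniature. This is a simplified sketch: `TokenCache` is a hypothetical class, it keys on the raw token for brevity, and it ignores cache size limits and separate TTLs for failures.

```python
import time


class TokenCache:
    """Cache webhook verdicts per token for a fixed TTL (two minutes by default)."""

    def __init__(self, ttl_seconds: float = 120.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}   # token -> (stored_at, result)

    def lookup(self, token: str, validate):
        """Return the cached result, calling `validate(token)` only on a miss."""
        now = self._clock()
        hit = self._entries.get(token)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        result = validate(token)             # the HTTPS round-trip in real life
        self._entries[token] = (now, result)
        return result
```

A longer TTL means fewer round-trips but a longer window in which a token revoked at the webhook is still accepted.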
There is also a webhook authorizer, structurally identical but for the authorisation step, which we cover in Part 07. Do not confuse the two: webhook authn extracts an identity, webhook authz decides whether an identity may do something. They are configured separately (--authentication-token-webhook-config-file vs --authorization-webhook-config-file), they have different request and response payloads (TokenReview vs SubjectAccessReview), and they run at different points in the pipeline. A surprising number of operators run both, against the same external service, to bridge a wholly external identity-and-policy system into Kubernetes.
Production constraint: the webhook authenticator must not authenticate to the api-server using a token that itself requires the webhook to validate. This recursive dependency is the single most common deadlock in webhook deployments. The fix is to give the webhook its own client certificate signed directly by the cluster CA, so its identity is established at the TLS handshake and the webhook chain is never invoked for it.
RBAC: rule evaluation, deeply.
RBAC is the authoriser that does most of the work in any modern Kubernetes cluster. It has four object kinds and one evaluation algorithm, and the relationships between them are the entire model. Role and ClusterRole describe a set of permissions: which API groups, which resources, which verbs. A Role is namespaced and only confers permissions inside that namespace; a ClusterRole is cluster-scoped and can confer permissions either across the whole cluster (when bound by a ClusterRoleBinding) or scoped to a namespace (when bound by a RoleBinding). RoleBinding and ClusterRoleBinding are the bindings: each names one Role/ClusterRole and a list of subjects (users, groups, ServiceAccounts) and grants the role's permissions to those subjects.
The rule-evaluation algorithm is straightforward and worth memorising in full. When a request arrives at the RBAC authoriser, it has a known user.Info (from authn) and a known target (verb + resource + namespace, parsed from the URL path). RBAC enumerates every binding in the cluster (every ClusterRoleBinding, plus every RoleBinding in the namespace of the request) and for each binding checks whether any of the binding's subjects match the user. A subject match is exact-name on user, exact-name on ServiceAccount, or any group-membership match. For each matching binding, the referenced role's rules are scanned, and each rule is checked for whether its API groups, verbs, resources, and (optional) resourceNames include the request. The first rule that matches returns "allow". If no rule in any binding matches, RBAC returns "no opinion", and the next authorizer (typically there is none, since RBAC is usually last) gets a turn. Crucially, RBAC has no deny rules. There is no way to express "alice can do X everywhere except in namespace Z"; you express it by granting X through a RoleBinding in each namespace other than Z and binding nothing for alice in Z.
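The algorithm above can be sketched in Python as follows. This is an illustration of the described logic over plain dictionaries, not the api-server's implementation: the data layout (`roles` keyed by namespace and name, `cluster_roles` keyed by name) and all function names are assumptions.

```python
def subject_matches(subject: dict, user: str, groups: list) -> bool:
    kind = subject["kind"]
    if kind == "User":
        return subject["name"] == user
    if kind == "Group":
        return subject["name"] in groups
    if kind == "ServiceAccount":
        return user == f"system:serviceaccount:{subject['namespace']}:{subject['name']}"
    return False


def rule_allows(rule: dict, req: dict) -> bool:
    def has(values, wanted):
        return "*" in values or wanted in values
    if not (has(rule["verbs"], req["verb"])
            and has(rule["apiGroups"], req["apiGroup"])
            and has(rule["resources"], req["resource"])):
        return False
    names = rule.get("resourceNames", [])
    return not names or req.get("name", "") in names


def rbac_allows(req, roles, cluster_roles, role_bindings, cluster_role_bindings) -> bool:
    """True means "allow"; False means "no opinion" (RBAC never denies)."""
    candidates = [(b, cluster_roles.get(b["roleRef"]["name"], []))
                  for b in cluster_role_bindings]
    for binding in role_bindings:
        if binding["namespace"] != req["namespace"]:
            continue   # RoleBindings only apply inside their own namespace
        ref = binding["roleRef"]
        if ref["kind"] == "ClusterRole":
            rules = cluster_roles.get(ref["name"], [])
        else:
            rules = roles.get((binding["namespace"], ref["name"]), [])
        candidates.append((binding, rules))
    for binding, rules in candidates:
        if not any(subject_matches(s, req["user"], req["groups"]) for s in binding["subjects"]):
            continue
        if any(rule_allows(rule, req) for rule in rules):
            return True   # first matching rule wins
    return False


CLUSTER_ROLES = {"pod-reader": [
    {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]}]}
ROLE_BINDINGS = [{"namespace": "prod",
                  "roleRef": {"kind": "ClusterRole", "name": "pod-reader"},
                  "subjects": [{"kind": "Group", "name": "oidc:platform-eng"}]}]


def request(verb, namespace):
    return {"user": "oidc:alice@example.com", "groups": ["oidc:platform-eng"],
            "verb": verb, "apiGroup": "", "resource": "pods", "namespace": namespace}
```

The example data binds a ClusterRole through a RoleBinding, so the permission exists in prod and nowhere else.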
A few non-obvious behaviours follow from this algorithm. First, ClusterRoles bound by a RoleBinding only confer permissions in the binding's namespace, even though the role itself is cluster-scoped; this is the standard pattern for reusing a role definition across many namespaces without duplicating it. Second, a ClusterRole that lists resourceNames still binds against any namespace if used in a ClusterRoleBinding, which is rarely what you want for a sensitive named resource; prefer a RoleBinding in the relevant namespace. Third, a handful of verb sets account for 95% of real rules: * (any verb), get + list + watch (the read trio), and create + update + patch + delete (the mutating quad). Learn to recognise them at sight.
Aggregating ClusterRoles is the project's answer to the problem of extending the built-in roles. The well-known view, edit, and admin ClusterRoles ship empty; their rules are populated by aggregation. Each is annotated with a label selector (aggregationRule.clusterRoleSelectors), and the ClusterRoleAggregation controller in kube-controller-manager continuously merges the rules of every ClusterRole that matches the selector into the aggregating role. The mechanism is how CRD authors extend RBAC: when you ship a CRD, you also ship a ClusterRole labelled rbac.authorization.k8s.io/aggregate-to-admin: "true", and your custom resource's verbs become available to anyone bound to admin. No human action needed.
# A CRD operator's ClusterRole that aggregates into the built-in admin role
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: app-operator-aggregate-admin
  labels:
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
rules:
- apiGroups: [apps.example.com]
  resources: [databases, databases/status, databases/scale]
  verbs: [get, list, watch, create, update, patch, delete]
- apiGroups: [apps.example.com]
  resources: [databasebackups]
  verbs: [get, list, watch, create]
---
# The aggregating ClusterRole (built-in, ships empty)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: admin
aggregationRule:
  clusterRoleSelectors:
  - matchLabels:
      rbac.authorization.k8s.io/aggregate-to-admin: "true"
rules: []   # populated by ClusterRoleAggregation controller
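What the aggregation controller computes can be approximated in a few lines. The function below is a sketch under assumptions: it handles only `matchLabels` selectors, works on plain dictionaries, and `aggregate_rules` is a hypothetical name rather than the controller's own code.

```python
def aggregate_rules(aggregating_role: dict, all_cluster_roles: list) -> list:
    """Merge the rules of every ClusterRole whose labels match any selector."""
    selectors = aggregating_role.get("aggregationRule", {}).get("clusterRoleSelectors", [])
    own_name = aggregating_role["metadata"]["name"]
    merged = []
    for role in all_cluster_roles:
        if role["metadata"]["name"] == own_name:
            continue   # never aggregate a role into itself
        labels = role["metadata"].get("labels", {})
        selected = any(
            all(labels.get(key) == value for key, value in sel.get("matchLabels", {}).items())
            for sel in selectors
        )
        if not selected:
            continue
        for rule in role.get("rules", []):
            if rule not in merged:   # drop exact duplicates
                merged.append(rule)
    return merged


ADMIN = {
    "metadata": {"name": "admin"},
    "aggregationRule": {"clusterRoleSelectors": [
        {"matchLabels": {"rbac.authorization.k8s.io/aggregate-to-admin": "true"}}]},
    "rules": [],
}
OPERATOR = {
    "metadata": {"name": "app-operator-aggregate-admin",
                 "labels": {"rbac.authorization.k8s.io/aggregate-to-admin": "true"}},
    "rules": [{"apiGroups": ["apps.example.com"], "resources": ["databasebackups"],
               "verbs": ["get", "list", "watch", "create"]}],
}
UNRELATED = {"metadata": {"name": "other", "labels": {}},
             "rules": [{"apiGroups": [""], "resources": ["nodes"], "verbs": ["get"]}]}
```

Only the labelled role contributes; the unlabelled one is ignored even though it is also a ClusterRole.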
Operationally, RBAC is the single most common source of "why doesn't this work" complaints. The diagnostic tools are excellent and rarely used. kubectl auth can-i and the --as flag let you simulate any user against any verb; kubectl auth reconcile applies RBAC manifests with a sane diff; kubectl describe clusterrole admin shows the merged rule set after aggregation. The audit2rbac tool from the SIG-Auth maintainers takes an audit log of a user's requests and emits the RBAC rules needed to cover them, which is useful when you are tightening a previously-permissive role and need to know what to keep.
Tuning rule of thumb: the api-server keeps an in-memory index of bindings keyed by subject, so RBAC evaluation is sub-millisecond even on clusters with thousands of bindings. The exception is wildcards: a ClusterRole with verbs: ["*"] on resources: ["*"] defeats the index and forces a linear scan. If your audit log shows RBAC latency p99 over a millisecond, somebody has shipped a wildcard rule. Find it.
Node, Webhook, ABAC, and impersonation.
RBAC carries the human-and-controller load, but it is not the only authoriser, and the others exist for very particular jobs. The Node authorizer is the reason a kubelet cannot read every Secret in the cluster. Conceptually, kubelet is a kind of controller — it watches Pods assigned to its node, fetches the Secrets and ConfigMaps and PVCs those Pods reference, and reports status back. If kubelet authenticated as a user with broad permissions, it could read any Secret in any namespace, which is a known-bad blast radius for the worst node-compromise scenarios. The Node authorizer narrows it. It runs before RBAC in the default chain (--authorization-mode=Node,RBAC) and grants kubelet access only to objects that the kubelet's own node is reasonably expected to need: Pods scheduled to this node, Secrets and ConfigMaps mounted by those Pods, the Node object for this node, and a small handful of others. Anything else, the Node authorizer abstains and RBAC takes over, and RBAC is configured to grant kubelets nothing extra.
The Node authorizer's identity check is exact and load-bearing. It only grants permissions to clients authenticated as the user system:node:<nodeName> in the group system:nodes. The first time a node joins the cluster, the kubelet bootstrap process uses a bootstrap token to submit a CertificateSigningRequest and obtain a client certificate with exactly that subject. If you ever see a request denied with "Node authorizer: not from a node identity", it is almost always because the kubelet is presenting the wrong cert or the cert has expired and the kubelet is using its bootstrap token instead. The Node authorizer reference lists the exact set of allowed verbs by resource; it is shorter than you might expect, perhaps thirty entries.
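The two checks described in this section, identity first and then the relationship between node and object, can be sketched for the Secret case. The pod records and function names below are hypothetical, and the real authorizer covers many more resource types.

```python
NODE_PREFIX = "system:node:"


def node_name_from_identity(user: str, groups: list):
    """Return the node name for a kubelet identity, or None for anyone else."""
    if "system:nodes" not in groups or not user.startswith(NODE_PREFIX):
        return None
    return user[len(NODE_PREFIX):] or None


def node_may_get_secret(user: str, groups: list, namespace: str, secret: str, pods: list) -> bool:
    """Allow only if a Pod scheduled to this node references the Secret."""
    node = node_name_from_identity(user, groups)
    if node is None:
        return False   # not a node identity: the real authorizer abstains here
    return any(
        pod["node"] == node and pod["namespace"] == namespace and secret in pod["secrets"]
        for pod in pods
    )


PODS = [
    {"node": "worker-1", "namespace": "prod", "secrets": ["db-password"]},
    {"node": "worker-2", "namespace": "prod", "secrets": ["tls-key"]},
]
```

A compromised worker-1 can read db-password but not tls-key, which is the narrowing the authorizer exists to provide.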
The Webhook authorizer is the structural counterpart of webhook authn: every request that survives earlier authorisers gets POSTed (as a SubjectAccessReview) to an external HTTP endpoint, which returns yes/no/no-opinion. This is how policy systems like OPA Gatekeeper used to integrate before they moved to ValidatingAdmissionWebhook (admission is more flexible, since it sees the request body and can mutate); it is still the integration point for cloud IAM systems that want to do per-request authorisation against an external policy engine. The performance cost is the same as for webhook authn. Every request that does not short-circuit incurs an HTTPS round-trip, which is why webhook authz is rare in practice. RBAC carries the load; webhook authz handles the exceptions.
ABAC (Attribute-Based Access Control) is the original Kubernetes authoriser, configured by feeding the api-server a JSONLines policy file via --authorization-policy-file. Each line is one rule, of the form "if the user is X and the verb is Y and the resource is Z, allow". It works, but it has none of RBAC's reusability (no role concept, no binding concept, every rule duplicated for every user), and it requires an api-server restart to reload the policy file. ABAC is documented for historical reasons and because some hardened government installs still use it (the policy file is easier to audit than RBAC objects scattered across etcd), but every modern install should use RBAC. Treat ABAC as a known-deprecated curiosity.
Impersonation is the orthogonal feature that ties the chains together. By adding the headers Impersonate-User, Impersonate-Group, and optionally Impersonate-Uid / Impersonate-Extra to a request, an authenticated client can ask the api-server to act as a different user. The api-server first authorises the original caller against the special impersonate verb on the users/groups/serviceaccounts resources (only specifically-permitted callers can impersonate), and if that succeeds, runs the rest of the pipeline as the impersonated identity. kubectl exposes this via the --as and --as-group flags, and it is the standard mechanism for two scenarios: testing what an end-user would see ("can I list pods in their namespace as them?") and bridge-controllers that need to act on behalf of a user rather than as themselves. The audit log captures both identities, the impersonator and the impersonated, so attribution survives.
# Simulate alice's view of a namespace, from your admin context
$ kubectl get pods -n prod --as=alice@example.com --as-group=oidc:platform-eng
# the api-server runs alice's RBAC, returns the pods alice is allowed to see

# Probe whether a ServiceAccount can do a specific verb
$ kubectl auth can-i create deployments \
    --as=system:serviceaccount:ci:builder -n prod
# prints yes or no

# What the impersonating request looks like on the wire
GET /api/v1/namespaces/prod/pods HTTP/2
Authorization: Bearer eyJ…              # the impersonator's token
Impersonate-User: alice@example.com
Impersonate-Group: oidc:platform-eng
Impersonate-Group: system:authenticated
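The pre-check on the original caller can be sketched as a translation from headers to required permissions. This is a simplified model: headers arrive as a dictionary with `Impersonate-Group` as a list, each permission is a hypothetical `(resource, namespace, name)` tuple for the impersonate verb, and UID and extra headers are ignored.

```python
SA_PREFIX = "system:serviceaccount:"


def required_permissions(headers: dict) -> list:
    """List the impersonate permissions the original caller must hold."""
    needed = []
    user = headers.get("Impersonate-User")
    if user:
        parts = user[len(SA_PREFIX):].split(":") if user.startswith(SA_PREFIX) else []
        if len(parts) == 2 and all(parts):
            # ServiceAccounts are namespaced, so the check is scoped to that namespace
            needed.append(("serviceaccounts", parts[0], parts[1]))
        else:
            needed.append(("users", "", user))
    for group in headers.get("Impersonate-Group", []):
        needed.append(("groups", "", group))
    return needed


def may_impersonate(headers: dict, granted: set) -> bool:
    """Every requested identity attribute must be individually permitted."""
    needed = required_permissions(headers)
    return bool(needed) and all(permission in granted for permission in needed)


HEADERS = {
    "Impersonate-User": "alice@example.com",
    "Impersonate-Group": ["oidc:platform-eng", "system:authenticated"],
}
```

The caller needs a grant for the user and for each group; permission to impersonate alice does not imply permission to claim her groups.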
A close cousin of impersonation is the mechanism by which the API aggregation layer presents a coherent identity to extension api-servers: the front-proxy authenticates with mTLS, then passes the original user's identity in X-Remote-User and friends, which is functionally the same as impersonation but with a different header set. The principle is identical: chain trust, never forge it.
Production posture: impersonation is an escape hatch. The standard policy is to grant the impersonate verb to a small group of cluster operators only, never to ServiceAccounts, and to alert on any audit-log entry where the impersonator and impersonated differ. A leaked admin token that can impersonate is, for all practical purposes, every identity in the cluster.
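The alerting rule above can be sketched as a filter over the JSON-lines audit log. This is a minimal illustration that assumes one audit.k8s.io/v1 event per line; the function name and the shape of the returned alert are hypothetical.

```python
import json


def impersonation_alerts(lines):
    """Return one alert per audit event whose impersonated user differs from the caller."""
    alerts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        impersonated = event.get("impersonatedUser")
        if not impersonated:
            continue  # ordinary request, nothing to flag
        actor = (event.get("user") or {}).get("username")
        target = impersonated.get("username")
        if actor != target:
            alerts.append({
                "auditID": event.get("auditID"),
                "impersonator": actor,
                "impersonated": target,
                "verb": event.get("verb"),
            })
    return alerts
```

In practice the same predicate would live in the log aggregator's query language; the sketch only shows which two fields are compared.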
Audit logging: the meta-system.
Audit logging is what makes the rest of this page believable. Without it, "alice deleted the prod database namespace at 3:14am" is unprovable; with it, every request to the api-server is attributable to an identity, a verb, a resource, a response code, and a timestamp. The audit subsystem is configured by passing the api-server an audit policy file (--audit-policy-file) and a backend (--audit-log-path for log-to-file, or --audit-webhook-config-file for streaming to a log aggregator). The policy is an ordered list of rules; each request is matched against rules in order, and the first match decides what to record.
There are four audit levels and you should know all of them. None records nothing, used for high-volume read traffic that would otherwise drown the log. Metadata records who, what, when, response code, but not the request or response body; the typical level for routine traffic. Request adds the request body, useful for tracking what was being asked. RequestResponse adds the response body too, used for the most sensitive resources where you want to know what was returned. A request that matches no rule at all is not logged, which is effectively None. The common mistake is logging every request at RequestResponse on a busy cluster: the audit log becomes the largest data source in the cluster, costing more than the cluster's actual data. The discipline is to log mutations at RequestResponse, sensitive reads (Secrets) at Request, and routine reads at Metadata. One caveat: Request and RequestResponse on Secret writes put the Secret payload itself into the audit log, so the log then has to be protected as carefully as the Secrets are.
# /etc/kubernetes/audit-policy.yaml: sane production starter
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - RequestReceived   # cut log volume by ~50%; keep only ResponseStarted/Complete
rules:
  # 1. Never log read traffic on the high-cardinality endpoints
  - level: None
    verbs: [get, list, watch]
    resources:
      - group: ""
        resources: [endpoints, events]
      - group: coordination.k8s.io   # leases live here, not in the core group
        resources: [leases]
  # 2. Secrets and ConfigMaps that hold secret material: full request/response on writes
  #    (this puts Secret payloads in the audit log; protect the log accordingly)
  - level: RequestResponse
    verbs: [create, update, patch, delete]
    resources:
      - group: ""
        resources: [secrets, configmaps]
  # 3. RBAC and admin objects: RequestResponse, every verb
  - level: RequestResponse
    resources:
      - group: rbac.authorization.k8s.io
        resources: [roles, rolebindings, clusterroles, clusterrolebindings]
  # 4. Everything else: Metadata is enough for forensics
  - level: Metadata
    omitStages: [RequestReceived]
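The first-match behaviour of an audit policy is easy to get wrong when reordering rules, so it helps to see it as code. The following is a simplified sketch under stated assumptions: it matches only on verb, API group, and resource (real policies can also match users, groups, namespaces, and non-resource URLs), and `rule_matches`, `audit_level`, and `EXAMPLE_RULES` are hypothetical names.

```python
def rule_matches(rule, verb, group, resource):
    """True if a simplified policy rule applies to this request."""
    verbs = rule.get("verbs")
    if verbs and verb not in verbs:
        return False
    group_resources = rule.get("resources")
    if not group_resources:
        return True  # no resource filter: the rule matches every resource
    for entry in group_resources:
        if entry.get("group", "") != group:
            continue
        names = entry.get("resources")
        if not names or resource in names:
            return True  # empty list means every resource in that group
    return False


def audit_level(rules, verb, group, resource):
    """Return the level of the first matching rule, or "None" if nothing matches."""
    for rule in rules:
        if rule_matches(rule, verb, group, resource):
            return rule["level"]
    return "None"


# A trimmed-down rule list for illustration only.
EXAMPLE_RULES = [
    {"level": "None", "verbs": ["get", "list", "watch"],
     "resources": [{"group": "", "resources": ["endpoints", "events"]}]},
    {"level": "RequestResponse", "verbs": ["create", "update", "patch", "delete"],
     "resources": [{"group": "", "resources": ["secrets", "configmaps"]}]},
    {"level": "Metadata"},  # catch-all, must stay last
]

if __name__ == "__main__":
    print(audit_level(EXAMPLE_RULES, "update", "", "secrets"))
    print(audit_level(EXAMPLE_RULES, "get", "", "secrets"))
```

Because evaluation stops at the first match, moving the catch-all Metadata rule to the top would silently shadow every rule below it.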
Each audit event records the authenticated identity, the impersonated identity (if any), the verb, the resource, the namespace, the source IP, the user-agent, the response code, and the stage of the request at which the event was emitted (RequestReceived, ResponseStarted, ResponseComplete, Panic). It also records the names of every authentication and authorization decision made against the request: in 1.27+, you can see which RBAC rule allowed it (the authorization.k8s.io/decision and authorization.k8s.io/reason annotations), which authenticator established the identity, which webhook decided. This is the data audit2rbac and similar tools mine to generate minimal-RBAC policies after the fact.
A full forensic chain across the auth subsystem looks like this. An incident is detected at the application layer (a Pod has been deleted that should not have been). The audit log is queried for the matching verb=delete, resource=pods, name=<pod> entry. That entry names the user, the source IP, and the user-agent. The user is OIDC-derived, so the IdP's audit log is queried for the corresponding session establishment, which gives a device, a location, and an MFA event. The session is correlated with the user's calendar (was alice supposed to be on call) and the change-management record (was there an approved ticket). The whole chain is reconstructible because every layer logged its own decision. This is what good auth looks like in production: not just a working access-control system, but a working forensic story.
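The first step of that chain, finding the delete event, might look like the following. It is a minimal sketch that assumes the audit log is available as JSON lines of audit.k8s.io/v1 events; `find_pod_deletions` is a hypothetical helper, and a real investigation would run the equivalent query in the log store.

```python
import json


def find_pod_deletions(lines, namespace, pod_name):
    """Return who deleted a given Pod, from where, and with what result."""
    hits = []
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        ref = event.get("objectRef") or {}
        if (
            event.get("verb") == "delete"
            and event.get("stage") == "ResponseComplete"  # one hit per request
            and ref.get("resource") == "pods"
            and ref.get("namespace") == namespace
            and ref.get("name") == pod_name
        ):
            hits.append({
                "user": (event.get("user") or {}).get("username"),
                "sourceIPs": event.get("sourceIPs", []),
                "userAgent": event.get("userAgent"),
                "code": (event.get("responseStatus") or {}).get("code"),
                "time": event.get("requestReceivedTimestamp"),
            })
    return hits
```

The username and timestamp returned here are the keys used to pivot into the IdP's own audit log for the next step of the chain.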
The audit log itself has to be written somewhere durable, off the api-server's local disk, or it is useless after a node failure. The standard pattern is to ship the log to a webhook backend that writes to a separate cluster's centralised log store, with retention measured in years for regulated environments. The audit webhook protocol carries the same events as the audit log format, just streamed; the api-server batches events and POSTs them to the configured URL as a JSON audit.k8s.io/v1 EventList.
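To make the batching concrete, here is a rough sketch of a size-based batcher that wraps events in an EventList envelope. It is an illustration only: the `AuditBatcher` class, its `max_batch` parameter, and the `send` callback are hypothetical, and the api-server's own batching also flushes on a timer and applies throttling, which this omits.

```python
import json


class AuditBatcher:
    """Collect audit events and hand them to `send` as EventList payloads."""

    def __init__(self, send, max_batch=400):
        self._send = send            # callable taking the JSON body as a str
        self._max_batch = max_batch  # assumed size limit for this sketch
        self._pending = []

    def add(self, event):
        """Queue one event and flush when the batch is full."""
        self._pending.append(event)
        if len(self._pending) >= self._max_batch:
            self.flush()

    def flush(self):
        """Send whatever is queued as a single EventList."""
        if not self._pending:
            return
        payload = {
            "kind": "EventList",
            "apiVersion": "audit.k8s.io/v1",
            "items": self._pending,
        }
        self._pending = []
        self._send(json.dumps(payload))
```

A receiver on the other end only has to read `items` from each POST body and append the events to durable storage.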
And the rest of the Semicolony ladder. The api-server sub-page traces the full request pipeline that wraps these chains; the architecture sub-page covers the network boundaries and TLS that make any of it meaningful. For the JWT and OAuth fundamentals, the OAuth and OIDC guides are the prerequisites most engineers think they have but actually do not, and the JWT lifecycle simulator lets you watch a token rotate through issue, refresh, replay, and revoke in real time.
One closing thought. Kubernetes' auth model is, on the surface, an alphabet soup of acronyms (RBAC, OIDC, ABAC, mTLS, SAR, TR, SAN, JWT), but the underlying shape is simple and durable. Two ordered chains, four authenticators, three authorisers, one audit trail. Everything else is details, configurations, integrations. The shape has been stable since 1.6, and the additions since (projected tokens, structured policy, KEP-3331's structured authn) have all been refinements within it, not replacements of it. That is unusual for a system this complex, and it means the time you spend learning the shape pays back across every cluster you will ever touch. Learn the chains; the integrations are footnotes.