Should you use Kubernetes?

Q: Should you use Kubernetes?

Not until you run a real fleet: many services across many machines, with multiple teams deploying independently. Until then a PaaS or a few VMs is the better answer; the operational cost of Kubernetes is real, and at small scale it outweighs everything it gives back.

Kubernetes solves a fleet problem you might not have

Kubernetes is an operating system for a fleet of machines. It schedules containers onto servers, restarts them when they die, scales them up and down, rolls out new versions gradually, and wires up networking and service discovery across the whole thing. That is a hard set of problems — when you have many services running across many machines. Most teams asking "should we use Kubernetes?" do not have that problem yet.

The honest question is: how many services do you run, across how many machines, and how many teams deploy them independently? "Three services on two boxes, one team" means Kubernetes is solving problems you do not have while creating ones you do. "Forty services, dozens of nodes, six teams" means it starts looking less like overkill and more like the thing keeping you sane.

When Kubernetes earns its complexity

When you do operate a fleet, Kubernetes is excellent. Self-healing means a crashed container comes back without anyone paged. Declarative deployments mean you describe the desired state and the cluster reconciles toward it, with gradual rollouts that stall instead of proceeding when new pods fail their health checks, so a bad version can be rolled back before it takes over. Autoscaling, once configured, responds to load without intervention. Bin-packing schedules workloads onto nodes efficiently so you are not paying for idle machines.
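The "describe the desired state and reconcile toward it" idea can be sketched in a few lines. This is a toy model, not Kubernetes code: the function name `reconcile`, the dictionary shapes, and the service and replica names are all hypothetical, chosen only to show the comparison a controller makes between what is declared and what is running.

```python
def reconcile(desired: dict[str, int], running: dict[str, list[str]]) -> list[str]:
    """Return the actions needed to move `running` toward `desired`.

    desired: service name -> number of replicas wanted
    running: service name -> names of replicas currently alive
    (Both shapes are illustrative assumptions, not a real API.)
    """
    actions = []
    for service, want in sorted(desired.items()):
        have = running.get(service, [])
        if len(have) < want:
            # Too few replicas: start the missing ones.
            actions += [f"start {service}"] * (want - len(have))
        elif len(have) > want:
            # Too many replicas: stop the ones beyond the wanted count.
            actions += [f"stop {name}" for name in have[want:]]
    # Anything running that is no longer declared gets stopped.
    for service in sorted(set(running) - set(desired)):
        actions += [f"stop {name}" for name in running[service]]
    return actions


desired = {"api": 3, "worker": 1}
running = {"api": ["api-1"], "worker": ["worker-1", "worker-2"], "legacy": ["legacy-1"]}
print(reconcile(desired, running))
# ['start api', 'start api', 'stop worker-2', 'stop legacy-1']
```

Self-healing falls out of the same loop: a crashed replica simply disappears from `running`, and the next pass emits a `start` for it without anyone deciding to.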

The other real benefit is a common platform. When every team deploys to the same primitives — the same way to describe a service, a config, a secret, a scaling policy — you stop reinventing deployment per team, and you get portability across clouds and on-prem as a bonus. For an organisation with many teams, that consistency is worth a lot, and a pile of bespoke scripts cannot give it to you.

These benefits compound with scale. The question is never whether Kubernetes is capable. It is whether your scale has crossed the line where its capabilities outweigh its weight.

When to stay simpler

One or two services? A PaaS, a managed container host, or a few VMs behind a load balancer ships faster and demands far less specialist knowledge. The orchestration problems Kubernetes solves are problems of running many things across many machines, and you do not have many of either yet.

Skip it, too, when nobody on the team wants to own cluster upgrades, networking, and YAML — that tax lands on someone whether you plan for it or not — and when the real motive is padding the architecture rather than solving a scaling or fleet problem. If you cannot name the fleet problem, you are not ready.

What Kubernetes actually costs

The learning curve and operational burden are steep. You are now responsible for cluster networking, ingress, storage classes, RBAC, secrets, upgrades, and a vocabulary of objects — pods, deployments, services, config maps — that someone has to actually understand to debug an incident at 3am. When something breaks, the failure could be in your app, the container, the pod spec, the networking layer, or the cluster itself, and narrowing that down takes real expertise.

Running your own control plane multiplies this. Keeping the cluster's brain healthy and upgraded is a serious job, which is why almost everyone who can hand it off does: managed offerings exist to take that piece off your plate. Even then, the worker-side complexity stays yours.

The deeper cost is cognitive. A small team has a fixed budget of attention, and Kubernetes eats a large slice just to keep running. Every hour spent debugging a networking quirk or a YAML indentation bug is an hour not spent on the product. At small scale, the complexity grows faster than the benefit, and you feel it.

The cost math, roughly

Three buckets on the bill. A managed control plane is a fixed monthly fee per cluster — small, but fixed, which means three small clusters cost three times the fee of one, and a cluster-per-team habit multiplies a charge that buys you nothing per copy. Worker nodes are the dominant infrastructure cost, the same machines you would have paid for as VMs. The third bucket, the one that dwarfs both at small scale, is people: a slice of an engineer permanently spent on upgrades, networking, and the platform itself.

The savings story is bin-packing. A fleet of VMs sized one-service-per-box tends to idle, because each machine was sized for its own peak. Kubernetes packs workloads onto shared nodes and runs the fleet hotter, and across many nodes that reclaimed idle capacity is real money. Across five nodes it is a rounding error. There is overhead pulling the other way, too: every node reserves a slice for the kubelet and system daemons, and small clusters burn a bigger fraction of themselves on that tax.
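A rough way to see both effects, the packing and the per-node reserve, is a first-fit-decreasing sketch. This is a simplification and not how the real scheduler works: it looks at one resource only (CPU in millicores), assumes identical nodes, and every number in it is a made-up placeholder. `nodes_needed` is a hypothetical helper.

```python
def nodes_needed(requests: list[int], node_capacity: int, node_reserved: int) -> int:
    """First-fit-decreasing packing of CPU requests (millicores) onto identical nodes.

    node_reserved models the slice each node keeps for the kubelet and
    system daemons; only the remainder is available to workloads.
    """
    usable = node_capacity - node_reserved
    if usable <= 0:
        raise ValueError("reserve leaves no usable capacity")
    free: list[int] = []  # remaining usable capacity on each node opened so far
    for req in sorted(requests, reverse=True):
        if req > usable:
            raise ValueError(f"request {req} does not fit on any node")
        for i, room in enumerate(free):
            if req <= room:
                free[i] = room - req  # place on the first node with room
                break
        else:
            free.append(usable - req)  # no node had room: open a new one
    return len(free)


# Seven hypothetical services, each of which might otherwise get its own box.
requests = [2000, 1000, 1000, 500, 500, 1000, 2000]
print(nodes_needed(requests, node_capacity=4000, node_reserved=0))    # 2
print(nodes_needed(requests, node_capacity=4000, node_reserved=500))  # 3
```

In this toy case the reserve alone pushes the fleet from two nodes to three, which is the small-cluster tax in miniature: the fewer nodes you have, the more one extra node matters.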

The rule: estimate the idle capacity bin-packing would reclaim across your current fleet and weigh it against a meaningful fraction of an engineer's year. If the reclaimed compute is the smaller number, and under a dozen nodes it nearly always is, Kubernetes costs more than it saves, and the case has to be made on fleet operations, not money.
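That comparison is simple enough to write down. The sketch below is only the arithmetic of the rule; `net_saving` is a hypothetical helper, and every figure passed to it (node cost, reclaimed fraction, engineer cost, engineer fraction) is an invented placeholder to be replaced with your own estimates.

```python
def net_saving(
    nodes: int,
    node_cost_per_year: float,
    idle_fraction_reclaimed: float,
    engineer_cost_per_year: float,
    engineer_fraction: float,
) -> float:
    """Reclaimed compute minus people cost, per year. Negative means a net loss."""
    reclaimed = nodes * node_cost_per_year * idle_fraction_reclaimed
    people = engineer_cost_per_year * engineer_fraction
    return reclaimed - people


# Placeholder inputs: 3000 per node-year, a quarter of capacity reclaimed,
# and a quarter of a 150000-per-year engineer spent on the platform.
print(net_saving(8, 3000, 0.25, 150000, 0.25))    # -31500.0
print(net_saving(100, 3000, 0.25, 150000, 0.25))  # 37500.0
```

With these invented inputs the sign flips somewhere between the small fleet and the large one. Where it flips for you depends entirely on your own numbers, and the control-plane fee and per-node reserve are left out here, both of which push the small-fleet result further negative.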

The trap: YAML and operational sprawl

A real deployment is not one file; it is dozens of interlocking manifests, then a templating tool like Helm to manage them, then values files layered on top. It is easy to end up with a config surface so large that nobody fully understands what is deployed, and a typo two levels deep in a values file ships a broken release that looks fine until traffic hits it.
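One cheap guard against the deep-typo failure is to compare an override file's keys against the defaults it is meant to override. The sketch below assumes both files have already been parsed into plain dictionaries; `unknown_keys` is a hypothetical helper, not part of Helm or any other tool, and the keys shown are illustrative.

```python
def unknown_keys(defaults: dict, overrides: dict, path: str = "") -> list[str]:
    """List dotted paths in `overrides` that have no counterpart in `defaults`."""
    problems = []
    for key, value in overrides.items():
        here = f"{path}.{key}" if path else key
        if key not in defaults:
            # Nothing reads this key, so the override silently does nothing.
            problems.append(here)
        elif isinstance(value, dict) and isinstance(defaults[key], dict):
            problems += unknown_keys(defaults[key], value, here)
    return problems


defaults = {
    "image": {"tag": "latest"},
    "resources": {"limits": {"memory": "256Mi"}},
}
overrides = {
    "resources": {"limts": {"memory": "512Mi"}},  # typo: "limts"
}
print(unknown_keys(defaults, overrides))
# ['resources.limts']
```

The misspelled key is exactly the kind of error that renders without complaint: the template never looks at `limts`, the default limit stays in force, and nothing fails until load arrives.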

A subtler trap is forgetting resource requests and limits. Without requests, the scheduler cannot reason about what fits where; without limits, one greedy pod can consume a node's memory and starve its neighbours, a cascading failure that is miserable to diagnose. The biggest trap is adopting Kubernetes because it is the resume-worthy, "proper" choice rather than because you have a fleet problem. Then complexity is the deliverable, and the product pays.
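The requests-and-limits trap is easy to catch mechanically before anything is deployed. The sketch below assumes a pod spec already parsed into a dictionary, with containers carrying the usual `resources.requests` and `resources.limits` fields; `missing_resources` is a hypothetical helper and the container names are made up.

```python
def missing_resources(pod_spec: dict) -> list[str]:
    """Report containers that declare no resource requests or no limits."""
    problems = []
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        resources = container.get("resources") or {}
        for field in ("requests", "limits"):
            if not resources.get(field):
                problems.append(f"{name}: no resources.{field}")
    return problems


pod_spec = {
    "containers": [
        {
            "name": "api",
            "resources": {
                "requests": {"cpu": "250m", "memory": "128Mi"},
                "limits": {"memory": "256Mi"},
            },
        },
        {"name": "sidecar"},  # nothing declared: the scheduler is guessing
    ]
}
print(missing_resources(pod_spec))
# ['sidecar: no resources.requests', 'sidecar: no resources.limits']
```

A check like this could run in CI against rendered manifests, so the omission fails a build instead of a node.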

Kubernetes vs a PaaS or a few VMs

A PaaS takes your code and runs it — deploys, scaling, and health checks handled, almost nothing to learn. A few VMs behind a load balancer give you full control with familiar tools and a small mental model. Both carry a real product a long way, further than Kubernetes advocates tend to admit, and they leave your attention on features. Their ceiling is that they do not orchestrate many heterogeneous services across a large fleet, and at real scale you outgrow the PaaS's opinions and the VMs' manual wiring.

Kubernetes is the heavy generalist: it does the fleet orchestration the others cannot, in exchange for an operational tax the others do not charge. So the rule is to ride a PaaS or VMs until you run many services across many machines with multiple teams needing a shared platform. Then move to Kubernetes — and even then, managed, so the control plane is someone else's job.

How to adopt Kubernetes without regret

Start on a PaaS or VMs and stay there as long as it is working. Treat the move to Kubernetes as something you earn, not something you start with.

When the signals line up — many services across many machines, multiple teams needing a shared self-service platform, and either in-house expertise or a managed control plane so the cluster does not own someone's entire week — adopt it deliberately. Use a managed offering, set resource requests and limits, and wire up real health checks from day one. Reach for it before then and you have bought a heavy tax to solve a problem you do not have.