Compute — three buckets, pick one.

Every cloud compute service drops into one of three buckets: a virtual machine you treat like a server, a container scheduler that runs your image, or a function-as-a-service that runs a snippet when something happens. Picking the right bucket is the first decision in any cloud build. Once you've picked, the second decision is which managed flavour — and that's where the AWS / GCP / Azure trios all start to look the same.

1 · The three buckets

Virtual machine. You get a Linux (or Windows) box. SSH or RDP in, install whatever, run it. Long-running, mutable, full control. Default for legacy lifts, GPU workloads, anything that needs a specific kernel module or persistent local disk.
Container scheduler. You hand the platform an image plus a count, and it runs N copies somewhere. You stop thinking about hosts. Default for stateless web tiers, modern microservices, anything that benefits from immutable deploys and per-deploy isolation.
Function-as-a-service. You upload code, the platform wires it to an event (HTTP, queue, schedule, file landing in storage). It runs only when triggered. Default for event-glue, sporadic workloads, anything you'd otherwise build as a tiny always-on service.

2 · The AWS canonical version

Bucket	AWS service	What it actually is
VM	EC2	A virtual machine. Family + size (e.g. `m7i.large`) picks CPU/RAM/network. Bring your own AMI or use one Amazon ships.
Container (orchestrated for you)	ECS on Fargate	Tasks run on AWS-managed infrastructure. You never see the host. Good default if you don't need Kubernetes.
Container (you run K8s)	EKS	Kubernetes control plane managed by AWS; you bring the worker nodes (or use Fargate-backed pods). For teams already living in K8s.
Container (you run K8s, harder)	EC2 + self-managed K8s	kubeadm. Almost nobody does this on purpose any more.
Serverless functions	Lambda	Zip or container image, triggered by API Gateway, EventBridge, SQS, S3 events, etc. Up to 15 min per invocation.
Long-running container, no orchestration	App Runner	Push a container, get an HTTPS endpoint. The "Heroku on AWS" play.
Batch jobs	AWS Batch	Submit jobs, AWS provisions EC2 (often Spot), runs to completion, tears down. Built for HPC-shaped workloads.

At the senior level you'll be using EC2, ECS-on-Fargate, EKS, and Lambda for ~95% of decisions. The rest are real, but you reach for them less often.

3 · GCP and Azure equivalents

Bucket	AWS	GCP	Azure
VM	EC2	Compute Engine (GCE)	Azure Virtual Machines
Managed K8s	EKS	GKE	AKS
Container, no K8s	ECS / Fargate	Cloud Run	Container Apps
Serverless functions	Lambda	Cloud Functions (2nd gen, runs on Cloud Run under the hood)	Azure Functions
"Push a container, get a URL"	App Runner	Cloud Run	Container Apps / App Service
Batch	AWS Batch	Cloud Batch / Dataproc	Azure Batch

Cloud Run is the prettiest of the bunch. Push a container, get autoscaling from zero to thousands of instances, and pay per request. AWS App Runner is the closest AWS equivalent and a few years behind. Azure Container Apps is the third option. If you're starting fresh and don't already live in EKS, Cloud Run-shaped services are usually the right default.
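
The equivalents table above can be treated as a lookup when you are translating an architecture from one cloud to another. The sketch below is only an illustration: the `EQUIVALENTS` list copies a subset of rows from the table, and the `translate` helper is a hypothetical name, not part of any SDK.

```python
# Rows copied from the equivalents table; one dict per bucket.
EQUIVALENTS = [
    {"bucket": "VM", "aws": "EC2", "gcp": "Compute Engine", "azure": "Azure Virtual Machines"},
    {"bucket": "Managed K8s", "aws": "EKS", "gcp": "GKE", "azure": "AKS"},
    {"bucket": "Container, no K8s", "aws": "ECS on Fargate", "gcp": "Cloud Run", "azure": "Container Apps"},
    {"bucket": "Serverless functions", "aws": "Lambda", "gcp": "Cloud Functions", "azure": "Azure Functions"},
    {"bucket": "Batch", "aws": "AWS Batch", "gcp": "Cloud Batch", "azure": "Azure Batch"},
]


def translate(service: str, target_cloud: str) -> str:
    """Return the service in target_cloud that sits in the same bucket as `service`."""
    for row in EQUIVALENTS:
        if service in (row["aws"], row["gcp"], row["azure"]):
            return row[target_cloud]
    raise KeyError(f"no bucket found for {service!r}")


print(translate("EKS", "gcp"))
print(translate("Cloud Run", "azure"))
```

The mapping is deliberately coarse. Services in the same bucket differ in pricing and limits, so treat the result as a starting point for comparison rather than a drop-in replacement.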

4 · How to pick — a decision diagram

Five questions, four outcomes. The unstated rule: in 2026, "I'd reach for Fargate / Cloud Run-shaped services" is the right default for new builds unless one of the upper branches answers yes.

Event-driven and bursty? Lambda / Cloud Functions / Azure Functions. The pay-per-invocation model wins.
Long-running web service, no K8s? ECS-on-Fargate, Cloud Run, Container Apps. The sweet spot.
Already on Kubernetes? EKS / GKE / AKS. The managed control plane is worth the money; running your own K8s in 2026 is a hobby.
Specific kernel, persistent local disk, GPU? EC2 / GCE / Azure VMs. The escape hatch.
Lifting a legacy stack as-is? VMs. Refactor after, not during.
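
The five questions can be sketched as a small function. This is one possible encoding, not the diagram itself. The order of the checks is an assumption (hard constraints first, then the existing platform, then the event-driven case, with Fargate / Cloud Run-shaped services as the fall-through default), and the `Workload` fields and `pick_bucket` name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Each flag mirrors one of the five questions; all names are illustrative.
    legacy_lift_as_is: bool = False
    needs_kernel_disk_or_gpu: bool = False
    already_on_kubernetes: bool = False
    event_driven_and_bursty: bool = False


VMS = "VMs: EC2 / GCE / Azure VMs"
MANAGED_K8S = "Managed K8s: EKS / GKE / AKS"
FUNCTIONS = "Functions: Lambda / Cloud Functions / Azure Functions"
CONTAINERS = "Containers, no K8s: ECS-on-Fargate / Cloud Run / Container Apps"


def pick_bucket(w: Workload) -> str:
    """Map a workload to one of the four outcomes."""
    # Hard constraints win: these workloads cannot run anywhere else as-is.
    if w.legacy_lift_as_is or w.needs_kernel_disk_or_gpu:
        return VMS
    # An existing Kubernetes investment outweighs a platform switch.
    if w.already_on_kubernetes:
        return MANAGED_K8S
    # Sporadic, triggered work suits pay-per-invocation pricing.
    if w.event_driven_and_bursty:
        return FUNCTIONS
    # Default for new builds: a long-running service without K8s.
    return CONTAINERS


print(pick_bucket(Workload()))
print(pick_bucket(Workload(event_driven_and_bursty=True)))
```

Five questions collapse to four outcomes because both the legacy-lift and the kernel / disk / GPU questions land on VMs.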

5 · Six real workloads, six picks

Workload	Right pick	Why
Public web app, ~5K QPS, single region	ECS-on-Fargate behind ALB	Stateless, auto-scaling without K8s overhead. ~$2K/month at this scale.
S3-event glue (resize image on upload)	Lambda	Sporadic, ≤15 min, IAM-driven trigger. Pennies per million events.
Microservices fleet, 30+ services	EKS / GKE	K8s pays for itself past ~15 services. CRDs, namespaces, service mesh become useful.
Nightly batch ETL, 6 hours, 200-node burst	AWS Batch on Spot	Embarrassingly parallel, fault-tolerant per task. Typically 70–90% cheaper than on-demand.
GPU inference (model serving)	EC2 G/P-family (GPU) or Inf-family (Inferentia)	Needs specific accelerators and drivers that not every container platform exposes. SageMaker if you'd rather not manage the hosts.
Lifted legacy monolith (.NET on Windows)	EC2 Windows + RDS for SQL Server	Don't refactor while migrating. App Modernization is a separate workstream.

The mental shortcut. Match the workload to the smallest service that runs it. If a Lambda is enough, don't reach for Fargate. If Fargate is enough, don't reach for EKS. Each step up the chain doubles operational complexity without doubling capability.

6 · What breaks

Cold starts on functions. First invocation after idle is slow — anywhere from 50 ms (Node, simple) to several seconds (Java with a fat init). Mitigations: provisioned concurrency, Lambda SnapStart, or just keep the function warm with a scheduled ping.
15-minute Lambda cap. Long jobs that look serverless-friendly until they don't fit. Move to Step Functions, Fargate, or a Batch job.
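Container cold starts. Cloud Run scales to zero by default, so the first request after idle waits for a container to start. ECS-on-Fargate keeps its desired task count running, but every newly launched task still has to pull the image and boot, so scale-out lags a traffic spike. Usually fine, but a P99 killer if you serve a tiny amount of traffic.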
EKS upgrade pain. Kubernetes ships roughly three minor releases a year. Control-plane upgrades are mostly painless; node upgrades are not. Plan for a day per cluster, twice a year.
VM drift. Long-lived EC2s accumulate manual changes nobody documented. The cure is immutable AMIs and replace-don't-patch — easier said than done.
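Spot interruptions. Spot capacity can be reclaimed on short notice: AWS Spot Instances get a two-minute warning, while GCP Spot (formerly Preemptible) VMs and Azure Spot VMs get roughly 30 seconds. Fine for batch; not fine for stateful workloads unless you've designed for it.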
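
Two of the function failure modes above, cold starts and the 15-minute cap, can be softened inside the handler itself. The following is a minimal sketch, assuming a Python Lambda. `context.get_remaining_time_in_millis()` is the method the Lambda Python runtime provides on the context object; the `warmup` event key, `SAFETY_MARGIN_MS`, `process`, and `FakeContext` are hypothetical names introduced only for illustration.

```python
# Module scope runs once per execution environment (the cold start), so
# expensive setup placed here is reused by later warm invocations.
EXPENSIVE_CLIENT = {"ready": True}  # stand-in for an SDK client or loaded model

SAFETY_MARGIN_MS = 10_000  # assumption: stop taking new work with 10 s left


def process(item):
    """Placeholder for the real per-item work."""
    return item * 2


def handler(event, context):
    # A scheduled keep-warm ping carries a marker we chose ourselves.
    if event.get("warmup"):
        return {"status": "warm"}

    done = []
    pending = list(event.get("items", []))
    while pending:
        # Check the time budget before each item so we stop before the cap.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break
        done.append(process(pending.pop(0)))

    status = "partial" if pending else "complete"
    return {"status": status, "done": done, "remaining": pending}


class FakeContext:
    """Local stand-in for the Lambda context, for running the sketch off-platform."""

    def __init__(self, remaining_ms_sequence):
        self._values = list(remaining_ms_sequence)

    def get_remaining_time_in_millis(self):
        # Replay the scripted values, then keep returning the last one.
        return self._values.pop(0) if len(self._values) > 1 else self._values[0]


print(handler({"items": [1, 2, 3]}, FakeContext([900_000, 900_000, 5_000])))
```

Returning the unprocessed items lets a caller such as a Step Functions loop or a queue re-drive resubmit the remainder. A job that regularly returns "partial" is a sign it belongs on Fargate or Batch instead.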

7 · Cost note

Three line items quietly run the bill:

Idle VMs. Anything left on overnight or over the weekend is pure waste. Auto-stop on schedule, or move to a service that scales to zero.
Reserved capacity vs on-demand. Steady-state workloads billed at on-demand rates typically leave a 30–60% discount on the table. Reserved Instances, Savings Plans, GCP Committed Use, and Azure Reservations all give roughly the same discount in exchange for a 1- or 3-year commitment.
Spot for stateless / batch. Same hardware, ~70–90% cheaper, can vanish. Stateless web tiers and batch jobs are perfect candidates. Combine with on-demand baseline so an interruption doesn't take you down.

A reasonable mix at scale: ~60% on Savings Plans / RIs (your baseline), ~30% on Spot (for the elastic and batch portion), ~10% on-demand (to absorb surprises). Cuts the compute bill by roughly half versus all-on-demand.
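
The "roughly half" figure can be checked with a quick blended-rate calculation. The discount rates below are assumptions picked from inside the ranges quoted above (45% for commitments, 80% for Spot), not measured prices, and `blended_cost_factor` is a hypothetical helper name.

```python
def blended_cost_factor(mix, discounts):
    """Return the bill as a fraction of the all-on-demand bill.

    mix:       share of compute spend per purchase option (must sum to 1)
    discounts: discount versus on-demand for each option, from 0.0 to 1.0
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix shares must sum to 1")
    return sum(share * (1.0 - discounts[kind]) for kind, share in mix.items())


MIX = {"committed": 0.60, "spot": 0.30, "on_demand": 0.10}
ASSUMED_DISCOUNTS = {"committed": 0.45, "spot": 0.80, "on_demand": 0.0}

factor = blended_cost_factor(MIX, ASSUMED_DISCOUNTS)
print(f"bill is about {factor:.0%} of the all-on-demand bill")
```

With these assumed rates the factor comes out near 0.49, which matches the "roughly half" claim. A more conservative 40% commitment discount and 70% Spot discount gives about 0.55, so the saving depends heavily on the rates you actually secure.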
