Multi-page · for engineers on call
Observability

Knowing what a system is actually doing.

Monitoring tells you about the things you already knew to watch. Observability is the ability to ask a new question of a running system at 3am, without shipping new code first. The whole field comes down to a few signals used well: logs, metrics, and traces, stitched together with distributed tracing so one request can be followed across a dozen services, and measured against goals you have written down as SLOs. Get those right and most incidents turn from a guessing game into a query.

All four sub-pages are live. Practical mental models for the people who get paged, not a vendor tour.


Live deep dives

Start here.