observability
Browse all articles, tutorials, and guides about observability
3 posts
Posts
DevOps
2026-05-11|11 min read
Distributed Tracing with OpenTelemetry: From Instrumentation to Visualization
A walkthrough of instrumenting a real service with OpenTelemetry, running the Collector, and finding the slow span in Jaeger when a request hops across five microservices.
DevOps
2026-04-13|10 min read
SLOs, SLIs, and Error Budgets: A Practical Implementation Guide
Your service went down at 2 AM and nobody could agree on whether it was "bad enough" to page someone. SLOs, SLIs, and error budgets fix that. Here is how to define, measure, and act on them with real Prometheus queries and alerting rules.
DevOps
2025-03-12|6 min read
What is P99 Latency?
P99 latency is the response time at the 99th percentile: 99% of requests finish at or below it, and only the slowest 1% take longer. Learn why P99 is more important than average latency for understanding real user experience.
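As a rough illustration of the P99 idea in the summary above, here is a minimal sketch using the nearest-rank method. The `percentile` helper and the sample latencies are hypothetical, made up for this example, and monitoring systems often estimate percentiles differently (for instance from histogram buckets).

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # 1-based rank of the percentile within the sorted samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in milliseconds: 98 fast requests and 2 slow ones.
latencies_ms = [20] * 98 + [900, 1500]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p99_ms = percentile(latencies_ms, 99)

print(f"mean = {mean_ms:.1f} ms, p99 = {p99_ms} ms")
```

With this made-up data the mean stays low because the two slow requests are averaged away, while the P99 value lands on one of them, which is the gap the post describes.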