How to Finally Stop Confusing SLA, SLO, and SLI, and Why the Distinction Actually Matters
I'm currently working through a distributed systems course, and one section that genuinely made me stop and think was the breakdown of SLAs, SLOs, and SLIs. I'd seen these terms thrown around before and assumed they were basically synonyms for 'uptime stuff'. However, the course laid out the distinction clearly and it clicked in a way that made me want to write it out in my own words. I think it's one of those things that sounds simple but has real depth once you sit with it.
SLA
Say you're a software company and you have a SaaS product that other companies or individuals are using as a critical part of their workflow. They want to ensure that your service operates at an expected level of quality. To make expectations clear, you draft a Service Level Agreement, a legal contract that spells out your promised quality of service. Crucially, this contract also explicitly states the penalties for a breach, such as service credits or contract termination. SLAs are everywhere: the Amazon EC2 SLA, for example, commits to a Region-level Monthly Uptime Percentage of at least 99.99%, and if AWS misses it, you become eligible for service credits. The key thing about SLAs is that they're written for business and legal stakeholders, not engineers. That's exactly why we need the next layer.
SLO
A Service Level Objective, or SLO, translates the legal language of your SLA into something your engineering team can actually build toward. If your SLA says: “the service will be available with a Monthly Uptime Percentage of at least 99.9%”, your SLO could say: “99.9% of HTTP requests will return a successful response within 300ms, measured over a 30-day rolling window”. Notice how the SLO names a specific metric (successful responses), a latency threshold, and a measurement window. SLOs are internal and owned by the engineering team. Something important to note here is that you’ll want your SLO to be stricter than your SLA. The example above is already stricter in one sense, because a slow response counts as a failure. You can also tighten the target itself: if your SLA promises 99.9% uptime, you might aim for 99.95% internally, giving you a safety margin before you are in breach of your SLA.
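To make that safety margin concrete, here is a minimal sketch of the arithmetic. The function name `allowed_downtime` is my own invention, and it assumes the simplest possible reading of "uptime" (time-based, over a fixed 30-day window), which a real SLA may define differently.

```python
from datetime import timedelta


def allowed_downtime(target_percent: float, window: timedelta) -> timedelta:
    """Return how much of `window` may be 'down' while still meeting the target.

    Assumption: availability is measured as a share of wall-clock time.
    """
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in (0, 100]")
    return window * (1 - target_percent / 100)


if __name__ == "__main__":
    month = timedelta(days=30)
    sla_budget = allowed_downtime(99.9, month)   # what the contract tolerates
    slo_budget = allowed_downtime(99.95, month)  # what the team aims for
    print("SLA budget:", sla_budget)
    print("SLO budget:", slo_budget)
    print("Safety margin:", sla_budget - slo_budget)
```

Over 30 days (43,200 minutes), a 99.9% target tolerates roughly 43.2 minutes of downtime and a 99.95% target roughly 21.6 minutes, so the stricter SLO leaves about 21.6 minutes of slack before the SLA itself is at risk.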
SLI
An SLI, or Service Level Indicator, is an actual measurement: a number that tells you whether you’re meeting your SLO. Going back to our example: if the SLO is “99.9% of requests succeed within 300ms”, then the SLI is the ratio you are continuously calculating:
SLI = (requests_succeeded_within_300ms / total_requests) * 100
You get that number from your logs, Prometheus metrics, a Datadog dashboard, and so on. A good SLI is:
- Meaningful: It reflects user experience.
- Measurable: You can compute it reliably.
- Actionable: When it degrades, you know what to do.
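For the example SLO ("99.9% of requests succeed within 300ms"), computing the SLI from raw request records could look roughly like the sketch below. The `Request` type, the helper names, and the choice to treat any non-5xx status as "successful" are all assumptions of mine; your own definition of a good request may differ.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Request:
    status: int        # HTTP status code returned to the client
    latency_ms: float  # time taken to respond, in milliseconds


def is_good(req: Request, latency_threshold_ms: float = 300.0) -> bool:
    # Assumption: "successful" means the server did not return a 5xx error.
    return req.status < 500 and req.latency_ms <= latency_threshold_ms


def compute_sli(requests: Sequence[Request],
                latency_threshold_ms: float = 300.0) -> float:
    """Percentage of requests that were both successful and fast enough."""
    if not requests:
        raise ValueError("no requests in the window, so the SLI is undefined")
    good = sum(is_good(r, latency_threshold_ms) for r in requests)
    return 100 * good / len(requests)


def meets_slo(sli_percent: float, slo_target_percent: float = 99.9) -> bool:
    return sli_percent >= slo_target_percent


if __name__ == "__main__":
    # Made-up sample: 998 fast successes, one server error, one slow success.
    sample = [Request(200, 120.0)] * 998 + [Request(500, 80.0), Request(200, 450.0)]
    sli = compute_sli(sample)
    print(f"SLI = {sli:.2f}%, meets SLO: {meets_slo(sli)}")
```

Note that the slow 200 response counts against the SLI even though it technically succeeded. That is the "meaningful" property in action: the user waiting on that request did not have a good experience.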
Bringing it all together
The diagram below shows how all three concepts stack up together: SLIs inform SLOs. SLOs protect SLAs. Break that chain at any point and you're either flying blind or heading for a breach.

What happens when teams skip SLOs
Imagine a startup that lands their first enterprise client. The legal teams get together and draft an SLA: "the service will be available 99.9% of the time." Everyone signs. The engineering team celebrates.
Six months later, the client calls. They're claiming a breach. The engineering team pushes back: they have logs, the servers were running, nothing crashed. But here's the problem: nobody had ever defined what "available" meant in measurable terms. No SLOs were written. No SLIs were identified. The logs existed, but nobody had agreed on which number in those logs corresponded to "99.9% availability."
Was it the server uptime? The percentage of successful HTTP responses? The percentage of users who could log in? Each metric told a different story. The client's definition and the engineering team's definition didn't match and there was nothing in writing to settle it.
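To see how easily those definitions diverge, here is a small sketch that scores the same month three ways. Every number in it is invented purely for illustration, and the names `OBSERVATIONS` and `evaluate` are hypothetical.

```python
def percent(good: int, total: int) -> float:
    return 100 * good / total


# Invented figures for one 30-day month: (good events, total events).
OBSERVATIONS = {
    "server uptime": (43_200, 43_200),                    # minutes the servers were running
    "successful HTTP responses": (9_985_000, 10_000_000),
    "users who could log in": (4_940, 5_000),
}


def evaluate(observations: dict, target_percent: float = 99.9) -> dict:
    """Map each candidate definition to (measured percent, meets target?)."""
    results = {}
    for name, (good, total) in observations.items():
        value = percent(good, total)
        results[name] = (value, value >= target_percent)
    return results


if __name__ == "__main__":
    for name, (value, ok) in evaluate(OBSERVATIONS).items():
        print(f"{name}: {value:.2f}% -> {'met' if ok else 'missed'}")
```

In this made-up month the servers never went down, yet the other two readings of "available" fall short of 99.9%. Both sides could be honestly right under their own definition, which is exactly why the definition has to be written down.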
That gap between the SLA and a concrete SLO didn't just cause a legal dispute. It meant the engineering team had been building and shipping for six months with no shared definition of what "working well" actually meant.
TL;DR
SLAs, SLOs, and SLIs are essential to the survival of any dev team. SLAs legally define your promised quality of service, SLOs technically define the goals your engineers should build toward to honor the SLAs, and SLIs give you the data to determine whether you are reaching those goals. Ignore any of the three and you risk a situation where clients claim you are not meeting your promised quality of service, your dev team disagrees, and you have no way to tell who is right.