SLA Uptime Calculator

How much downtime does your SLA actually allow?

Quick reference

SLA        Per month    Per year
99%        7h 18m       3d 15h 39m
99.9%      43m 50s      8h 46m
99.99%     4m 23s       52m 36s
99.999%    26s          5m 16s

SLA, SLO, SLI: What's the difference?

SLI (Service Level Indicator)

The actual measurement. A ratio of good events to total events over a time window. Example: 99.95% of requests returned successfully in the last 30 days.

SLO (Service Level Objective)

Your internal target for an SLI. "We aim for 99.9% availability" is an SLO. Miss it and your team gets paged. It's a threshold you set for yourself.

SLA (Service Level Agreement)

A contract with your customers. If you promise 99.9% uptime in your SLA and miss it, you owe credits or refunds. SLAs are typically set looser than SLOs, which gives you a buffer before contractual penalties kick in.

In practice: You measure SLIs, set internal SLOs tighter than your public SLA, and spend error budget whenever events fail. The budget is used up once the SLI drops below the SLO threshold. If your SLA is 99.9%, your SLO might be 99.95% or higher so you have margin before the contract is breached.
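As a rough illustration of how the three terms relate, the sketch below computes an SLI from event counts and compares it with an internal SLO and a contractual SLA. The function names, the sample counts, and the choice to treat a window with no traffic as fully available are assumptions made for this example.

```python
def sli(good_events: int, total_events: int) -> float:
    """Return the SLI as the fraction of good events in the window."""
    if total_events == 0:
        return 1.0  # assumption: no traffic counts as fully available
    return good_events / total_events


def status(sli_value: float, slo: float, sla: float) -> str:
    """Classify a measured SLI against an internal SLO and a contractual SLA."""
    if sli_value >= slo:
        return "within SLO"
    if sli_value >= sla:
        return "SLO missed, SLA still met"
    return "SLA breached"


# Hypothetical 30-day window: 999,400 good requests out of 1,000,000.
measured = sli(good_events=999_400, total_events=1_000_000)
print(f"{measured:.2%}", status(measured, slo=0.9995, sla=0.999))
# prints: 99.94% SLO missed, SLA still met
```

The middle state is the buffer described above: the team has missed its own target, but no credits are owed yet.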

How to Calculate Allowed Downtime from an SLA Percentage

The formula is straightforward: take the total minutes in your period, multiply by the downtime percentage (100% minus your SLA target), and you get your allowed downtime in minutes.

Allowed downtime = Total minutes × (1 - SLA%)

Monthly example (99.9% SLA):

43,200 min × (1 - 0.999) = 43,200 × 0.001 = 43.2 minutes

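A 30-day month has 43,200 minutes (30 days × 24 hours × 60 minutes). A 365-day year has 525,600 minutes. Figures based on an average month (365.25 / 12 ≈ 30.44 days, or 43,830 minutes) come out slightly higher, which is why the tiers further down list 43m 50s for 99.9% rather than 43m 12s. Each additional "nine" in your SLA cuts allowed downtime by a factor of 10. Going from 99.9% to 99.99% doesn't sound like much, but it means going from ~43 minutes per month to ~4.3 minutes. That's the difference between "we can do a planned maintenance window" and "every second of downtime counts."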
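The formula above can be sketched in a few lines of Python. This is a minimal example, assuming a 30-day month and a 365-day year as in the text; the function and constant names are made up for illustration.

```python
from datetime import timedelta

MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60    # 43,200
MINUTES_PER_365_DAY_YEAR = 365 * 24 * 60   # 525,600


def allowed_downtime(sla_percent: float, period_minutes: float) -> timedelta:
    """Allowed downtime = total minutes x (1 - SLA), with the SLA given in percent."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    return timedelta(minutes=period_minutes * (100 - sla_percent) / 100)


def format_dhms(delta: timedelta) -> str:
    """Format a duration as days, hours, minutes, seconds (rounded to the second)."""
    total = round(delta.total_seconds())
    days, rem = divmod(total, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes, seconds = divmod(rem, 60)
    return f"{days} d {hours:02d} h {minutes:02d} m {seconds:02d} s"


print(format_dhms(allowed_downtime(99.9, MINUTES_PER_30_DAY_MONTH)))
# prints: 0 d 00 h 43 m 12 s
print(format_dhms(allowed_downtime(99.99, MINUTES_PER_30_DAY_MONTH)))
# prints: 0 d 00 h 04 m 19 s
print(format_dhms(allowed_downtime(99.9, MINUTES_PER_365_DAY_YEAR)))
# prints: 0 d 08 h 45 m 36 s
```

Passing the period length as a parameter keeps the month definition explicit, since a 30-day month and an average month give slightly different results.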

Error Budgets Explained

An error budget flips the SLA question on its head. Instead of asking "how much uptime do we need?", it asks "how much downtime can we afford to spend?" That reframing changes how teams make decisions.

If your SLO is 99.9% monthly, you have a budget of 43.2 minutes. Every incident eats into that budget. A 15-minute outage leaves you with 28.2 minutes for the rest of the month. This makes risk tangible: that risky deployment isn't "probably fine," it costs 5 minutes of budget if it goes wrong.
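The budget arithmetic in this example is easy to track in code. The sketch below assumes a 30-day month and incidents measured in minutes of full downtime; the helper names are hypothetical.

```python
def error_budget_minutes(slo_percent: float, period_minutes: float = 43_200) -> float:
    """Total error budget for the period, in minutes (default: 30-day month)."""
    return period_minutes * (100 - slo_percent) / 100


def remaining_budget_minutes(
    slo_percent: float,
    incident_minutes: list,
    period_minutes: float = 43_200,
) -> float:
    """Budget left after subtracting every incident; negative means the SLO is missed."""
    return error_budget_minutes(slo_percent, period_minutes) - sum(incident_minutes)


budget = error_budget_minutes(99.9)
left = remaining_budget_minutes(99.9, [15])
print(f"budget: {budget:.1f} min, remaining: {left:.1f} min")
# prints: budget: 43.2 min, remaining: 28.2 min
```

A negative return value signals that the incidents recorded so far have already exceeded the budget for the period.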

Google's SRE teams popularized this approach. When the budget is healthy, engineering moves fast: ship features, experiment, take calculated risks. When the budget is nearly spent, the team shifts to reliability work: fix flaky tests, improve rollback speed, add better monitoring. No arguments about priorities needed; the number decides.

The key insight: some downtime is acceptable. Each additional nine costs far more than the last, 100% uptime is not achievable in practice, and chasing it slows down everything else. Error budgets let you move fast while staying accountable to your customers.

Common SLA Uptime Tiers and When to Use Them

99% (two nines): 7h 18m/month

Fine for internal tools, staging environments, and non-critical batch processing. You get over 3.5 days of downtime per year. Most internal dashboards and admin panels operate at this level, and nobody complains.

99.9% (three nines): 43m 50s/month

The most common SLA for SaaS products and business web applications. Allows about 8 hours and 46 minutes of downtime per year. Achievable with basic redundancy, health checks, and a solid deployment process. Most startups and mid-size companies target this tier.

99.99% (four nines): 4m 23s/month

Required for payment processing, authentication services, and APIs that other businesses depend on. Roughly 52 minutes of downtime per year. Achieving this demands redundant infrastructure across availability zones, automated failover, and zero-downtime deployments.

99.999% (five nines): 26s/month

Telecommunications, emergency services, and core financial infrastructure. About 5 minutes of downtime per year. At this level, even rolling upgrades and certificate rotations need to be seamless. Very few organizations genuinely operate at five nines end-to-end.
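The monthly and yearly figures quoted for these tiers can be reproduced with a short script. It assumes a 365.25-day year and an average month of one twelfth of that, which is the convention that yields 43m 50s for 99.9%; the names used are illustrative only.

```python
AVG_YEAR_MINUTES = 365.25 * 24 * 60          # 525,960
AVG_MONTH_MINUTES = AVG_YEAR_MINUTES / 12    # 43,830


def downtime_seconds(sla_percent: float, period_minutes: float) -> float:
    """Allowed downtime in seconds for an SLA given in percent."""
    return period_minutes * 60 * (100 - sla_percent) / 100


def compact(seconds: float) -> str:
    """Render a duration like '43m 50s', skipping units that are zero."""
    total = round(seconds)
    days, rem = divmod(total, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes, secs = divmod(rem, 60)
    parts = [(days, "d"), (hours, "h"), (minutes, "m"), (secs, "s")]
    return " ".join(f"{value}{unit}" for value, unit in parts if value) or "0s"


for tier in (99, 99.9, 99.99, 99.999):
    per_month = compact(downtime_seconds(tier, AVG_MONTH_MINUTES))
    per_year = compact(downtime_seconds(tier, AVG_YEAR_MINUTES))
    print(f"{tier}%: {per_month} per month, {per_year} per year")
```

For 99.9% this yields 43m 50s per month, and for 99.99% it yields 4m 23s per month and 52m 36s per year, in line with the tiers above.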

Why SLA Monitoring Matters for Your Infrastructure

Knowing your SLA target is one thing. Knowing whether you're actually meeting it is another. Without continuous monitoring, you're guessing at your actual availability. A 3-minute blip at 2 AM that nobody noticed still counts against your SLA, and those small incidents add up faster than most teams expect.

The worst scenario is finding out you've breached your SLA from a customer, not from your own monitoring. By then, the damage is done: credits are owed, trust is lost, and you're debugging an incident that happened hours or days ago with cold logs and fuzzy memories.

Effective SLA monitoring means checking your services every 30-60 seconds, tracking availability over rolling windows that match your SLA period, and alerting your team before the error budget runs out, not after. Set up warnings at 50% budget consumed and critical alerts at 80%. That gives you time to react before a breach.
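One way to express the 50% and 80% thresholds is to convert failed checks into a share of the error budget. The sketch below assumes evenly spaced checks, each standing for an equal slice of the period, and a 30-day window probed every 60 seconds (43,200 checks); the function names and the extra "breached" level are assumptions for this example.

```python
def budget_consumed(failed_checks: int, checks_in_period: int, slo_percent: float) -> float:
    """Fraction of the error budget used, where 1.0 means fully spent."""
    allowed_failures = checks_in_period * (100 - slo_percent) / 100
    if allowed_failures == 0:
        # A 100% target leaves no budget: any failure exhausts it.
        return float("inf") if failed_checks else 0.0
    return failed_checks / allowed_failures


def alert_level(consumed: float, warn: float = 0.5, critical: float = 0.8) -> str:
    """Map budget consumption to an alert level."""
    if consumed >= 1.0:
        return "breached"
    if consumed >= critical:
        return "critical"
    if consumed >= warn:
        return "warning"
    return "ok"


CHECKS_PER_30_DAYS = 30 * 24 * 60  # one check every 60 seconds

for failed in (10, 25, 35, 44):
    used = budget_consumed(failed, CHECKS_PER_30_DAYS, slo_percent=99.9)
    print(f"{failed} failed checks: {used:.0%} of budget, {alert_level(used)}")
```

With a 99.9% target the budget is about 43 failed checks, so 25 failures triggers the warning and 35 triggers the critical alert.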
