# Customer-facing SLA + SLO error-budget worksheet (AWS)

Use this before you write an availability number into a customer contract. The
rule that most broken SLAs ignore: **you cannot promise more uptime than your
dependency stack delivers, composed.** This worksheet does the composition math
and the error-budget conversion.

> Convention: 30-day month = 43,200 minutes; yearly figures use a 365.25-day
> year. Adjust if your contract defines the measurement window differently
> (some use a 30.44-day average month).

## 1. Availability → downtime budget

| Availability | Downtime / 30-day month | Downtime / year |
|--------------|--------------------------|------------------|
| 99%      | 432 min (7.2 h)   | 3.65 days  |
| 99.5%    | 216 min (3.6 h)   | 1.83 days  |
| 99.9%    | 43.2 min          | 8.77 h     |
| 99.95%   | 21.6 min          | 4.38 h     |
| 99.99%   | 4.32 min          | 52.6 min   |
| 99.999%  | 25.9 sec          | 5.26 min   |
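To extend the table to other targets or window lengths, the conversion can be scripted. This is a minimal sketch; the function name `downtime_budget_minutes` and the 30-day default are illustrative choices that follow the convention above.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability_pct: float, window_days: float = 30.0) -> float:
    """Allowed downtime, in minutes, for an availability target over a window."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    return (1.0 - availability_pct / 100.0) * window_days * MINUTES_PER_DAY


# Reproduce the monthly column of the table above.
for target in (99.0, 99.5, 99.9, 99.95, 99.99, 99.999):
    print(f"{target}%: {downtime_budget_minutes(target):.2f} min / 30-day month")
```

Pass `window_days=30.44` (or whatever your contract specifies) to match a different measurement window.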

## 2. Compose your dependency SLA (serial path)

Multiply the SLA of every service on the **critical request path**. Assuming the
services fail independently, a request that must touch ALB → EC2 → RDS → S3 is
only as available as the product of their availabilities.

| AWS dependency | Published SLA (verify current) | On critical path? |
|----------------|-------------------------------|-------------------|
| EC2 (multi-AZ / Region-level) | 99.99% | |
| EC2 (single instance) | 99.5% | |
| RDS Multi-AZ | 99.95% | |
| S3 Standard | 99.9% | |
| ALB / NLB | 99.99% | |
| (add yours) | | |

**Worked example** — ALB (99.99%) × EC2 multi-AZ (99.99%) × RDS Multi-AZ
(99.95%) × S3 (99.9%):

```
0.9999 × 0.9999 × 0.9995 × 0.999 = 0.99830  ≈  99.83%
```

A composite dependency availability of **~99.83%** allows ~73 min/month of
downtime while every dependency still meets its own SLA. That alone exceeds the
43.2 min a 99.9% promise permits, so **you cannot contractually promise 99.9%**
on this stack without changing it.
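The same composition can be run for your own critical path. The sketch below assumes independent failures and uses the SLA figures from the table above (verify them against the current AWS SLAs); `serial_availability` is an illustrative helper name.

```python
import math


def serial_availability(slas_pct) -> float:
    """Composite availability (percent) of dependencies that must all be up."""
    return 100.0 * math.prod(s / 100.0 for s in slas_pct)


# SLA values copied from the table above; verify against current AWS SLAs.
critical_path = {
    "ALB": 99.99,
    "EC2 multi-AZ": 99.99,
    "RDS Multi-AZ": 99.95,
    "S3 Standard": 99.9,
}

composite = serial_availability(critical_path.values())
downtime_min = (1.0 - composite / 100.0) * 43_200  # 30-day month in minutes
print(f"composite: {composite:.2f}%  (~{downtime_min:.0f} min / 30-day month)")
```

Remove or add entries in `critical_path` to see how each dependency moves the composite.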

## 3. Three ways to close the gap

1. **Remove the weak link from the path** (cache S3 reads, make S3 async, etc.).
2. **Add redundancy** (multi-Region, read replicas, graceful degradation) so a
   single dependency failure does not fail the request.
3. **Lower the promise.** Promise 99.5%, deliver 99.8%, keep the goodwill.
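The effect of redundancy (option 2) can be estimated with the parallel formula: a set of replicas is down only when all of them are down at once. The sketch below assumes fully independent failures, which is optimistic for replicas that share a Region, a control plane, or a deploy pipeline. The second S3 copy is a hypothetical illustration, not a statement about any AWS feature.

```python
import math


def redundant_availability(single_pct: float, copies: int) -> float:
    """Availability (percent) of `copies` independent replicas where any one suffices."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    unavailability = (1.0 - single_pct / 100.0) ** copies
    return 100.0 * (1.0 - unavailability)


def serial_availability(slas_pct) -> float:
    """Composite availability (percent) of dependencies that must all be up."""
    return 100.0 * math.prod(s / 100.0 for s in slas_pct)


# Hypothetical: two independent 99.9% object stores, either of which can serve the read.
s3_pair = redundant_availability(99.9, copies=2)

# Same path as the worked example, with the redundant pair replacing single S3.
path = serial_availability([99.99, 99.99, 99.95, s3_pair])

print(f"two independent 99.9% copies: {s3_pair:.4f}%")
print(f"path with redundant S3:       {path:.2f}%")
```

Under these assumptions the path rises from ~99.83% to ~99.93%, at which point RDS Multi-AZ (99.95%) becomes the weakest link.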

## 4. SLO vs SLA — set the internal target tighter

- **SLA** = the external promise (with penalties). Set it *below* your measured
  capability.
- **SLO** = the internal target you operate to. Set it *above* the SLA so that
  missing the SLO is a warning, not a contract breach.
- **Error budget** = (1 − SLO) × window. Burn it on releases and risk; freeze
  changes when it is exhausted.

Example: SLA 99.5% (216 min/mo), SLO 99.9% (43.2 min/mo). The error budget is
43.2 min/mo. The further ~173 min between the SLO and SLA allowances is your
warning track: headroom you still have after the budget is spent and before you
breach the contract.
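A small status check makes the two thresholds concrete, using the definition error budget = (1 − SLO) × window. This is a sketch under stated assumptions: the function names are illustrative and the 50 minutes of downtime is a made-up input, not measured data.

```python
WINDOW_MIN = 30 * 24 * 60  # 30-day month, per the convention above


def allowance_minutes(target_pct: float, window_min: float = WINDOW_MIN) -> float:
    """Downtime allowed by an availability target over the window."""
    return (1.0 - target_pct / 100.0) * window_min


def budget_status(slo_pct: float, sla_pct: float, downtime_so_far_min: float) -> dict:
    """Compare downtime so far against the SLO error budget and the SLA allowance."""
    error_budget = allowance_minutes(slo_pct)
    sla_allowance = allowance_minutes(sla_pct)
    return {
        "error_budget_min": error_budget,
        "budget_remaining_min": error_budget - downtime_so_far_min,
        "margin_to_sla_min": sla_allowance - downtime_so_far_min,
        "freeze_changes": downtime_so_far_min >= error_budget,
        "sla_breached": downtime_so_far_min > sla_allowance,
    }


# Hypothetical month: 50 minutes of downtime against SLO 99.9% / SLA 99.5%.
status = budget_status(slo_pct=99.9, sla_pct=99.5, downtime_so_far_min=50.0)
for key, value in status.items():
    print(f"{key}: {value}")
```

In this hypothetical month the 43.2-minute error budget is already spent, so changes should freeze, while about 166 minutes remain before the 99.5% SLA itself is breached.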

## 5. What AWS service credits actually cover

AWS SLA credits (Compute SLA: 10% to 100% of the affected service bill) offset
**your AWS bill**, not your customers' losses or your SLA penalties. Never model
AWS credits as funding your own SLA payouts: the two are rarely the same order
of magnitude. Credits are not automatic. You must claim them through AWS
Support, and they are applied against future bills.
