AWS KMS Encryption Architecture (2026): The Per-Tenant CMK Trap, the 10,000 req/s Shared Quota, and When AWS-Owned Keys Win
Quick summary: Most KMS guides stop at "enable encryption." The architecture decision that actually bites is the key boundary: split one CMK into 3,200 per-tenant keys and you pay ~$3,200/mo in key storage alone while still sharing a single 10,000 req/s symmetric quota. Here is the decision matrix, the throttle math, and the encryption-context pattern that gives per-tenant isolation without per-tenant keys.
Key Takeaways
- Most KMS guides stop at "enable encryption"
- Here is the decision matrix, the throttle math, and the encryption-context pattern that gives per-tenant isolation without per-tenant keys
- As of June 2026, the hard part of KMS is not turning on encryption — it is choosing the key boundary
- This post is the architecture layer that the "enable default encryption" checklists skip
- This is for platform engineers, security architects, and CTOs designing encryption for a multi-account or multi-tenant AWS estate
As of June 2026, the hard part of KMS is not turning on encryption — it is choosing the key boundary. AWS managed keys (the aws/<service> aliases) have been a legacy type since 2021; new services default to AWS owned keys, and the active decision in front of most teams is how many customer-managed keys (CMKs) to create and where to draw the lines between them. Get that wrong in a multi-tenant system and you either overpay by thousands of dollars a month in key storage or you throttle your own production traffic. This post is the architecture layer that the “enable default encryption” checklists skip.
This is for platform engineers, security architects, and CTOs designing encryption for a multi-account or multi-tenant AWS estate. We model the cost and throttle math in a downloadable KMS throttle + cost model CSV, ship a key-strategy decision matrix, and include the key-policy templates for the pattern we recommend.
Benchmark pattern (not a cited client) — A composite multi-tenant B2B SaaS: ~3,200 tenants, SSE-KMS on S3 for tenant document storage, ~40M KMS cryptographic requests/month at steady state, peaky 9–6 traffic. Modeled in the cost CSV: one-CMK-per-classification (3 keys) runs ~$123/mo all-in; one-CMK-per-tenant (3,200 keys) runs ~$3,320/mo for the same request volume — the extra ~$3,200/mo is pure key-storage tax. Turning on S3 Bucket Keys drops the request portion from ~$120/mo to ~$1.80/mo by cutting request volume ~99%. None of those three changes the shared 10,000 req/s symmetric quota.
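The benchmark arithmetic can be reproduced in a few lines. This is a rough sketch, not the downloadable CSV model: the $0.03 per 10,000 requests rate is an assumption inferred from the ~$120 figure for 40M requests, the free tier and rotation charges are ignored, and the function and variable names are made up for illustration. Note that the post's ~$1.80 request figure corresponds to a reduction of about 98.5%, which is what the third call uses.

```python
# Illustrative cost model; confirm prices on the AWS KMS pricing page.
KEY_STORAGE_USD = 1.00       # per CMK per month (figure from the post)
REQUEST_USD_PER_10K = 0.03   # assumption: implied by ~$120 for 40M requests


def monthly_cost(num_keys: int, requests: int, request_reduction: float = 0.0) -> dict:
    """Return storage, request, and total monthly cost in USD."""
    billable = requests * (1.0 - request_reduction)
    storage = num_keys * KEY_STORAGE_USD
    request = billable / 10_000 * REQUEST_USD_PER_10K
    return {
        "storage": round(storage, 2),
        "requests": round(request, 2),
        "total": round(storage + request, 2),
    }


per_classification = monthly_cost(3, 40_000_000)
per_tenant = monthly_cost(3_200, 40_000_000)
with_bucket_keys = monthly_cost(3, 40_000_000, request_reduction=0.985)

for label, cost in [
    ("per-classification", per_classification),
    ("per-tenant", per_tenant),
    ("per-classification + Bucket Keys", with_bucket_keys),
]:
    print(label, cost)
```

The only term that changes between the first two rows is storage, which is the point of the benchmark: key count drives the bill, request volume does not care how many keys it is spread across.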
The decision is the key boundary, not the algorithm
KMS gives you three ownership tiers, and the cost/control trade-off is stark:
| Tier | When to use | Cost | You can audit/rotate? |
|---|---|---|---|
| AWS owned key | Encryption-by-default, no need to control or audit the key | $0 | No |
| AWS managed key (aws/<service>) | Legacy; no new ones created since 2021 | $0 storage, per-request billed | View only |
| Customer managed key (CMK) | You need policy, rotation, deletion, audit, or grants | $1/key/mo + rotation + per-request | Yes |
Opinionated take: default to AWS owned keys for convenience workloads, and step up to a CMK only when you can name the control you need — a specific key policy, a rotation cadence a regulator demands, an audit requirement, BYOK, or grant-based delegation. “We should use CMKs everywhere for security” is how teams end up with thousands of unaudited keys that satisfy no actual requirement.
Once you are on CMKs, the boundary question is where the money and the throttling live.
The per-tenant CMK trap
The instinct in multi-tenant systems is one key per tenant — it feels like stronger isolation. Two things make it the wrong default:
- Cost scales linearly with tenant count. Every CMK is $1/month to exist, plus up to $2/month if you rotate it (first and second rotation each add $1; the AWS KMS pricing page has current numbers). At 3,200 tenants that is ~$3,200/month before a single `Decrypt`.
- It does not buy you throughput. This is the part that surprises people.
KMS request-rate quotas are enforced per account + per Region + per key type, not per key. In most Regions the Cryptographic operations (symmetric) request rate is 10,000 requests/second shared across every symmetric CMK in that account and Region, HMAC keys included. (RSA keys share a separate 1,000 req/s pool; ECC/SM2 share their own; all are Region-dependent and adjustable except the CloudHSM key store quota.) AWS’s own example: at a 10,000 req/s symmetric quota, 9,500 `GenerateDataKey` + 1,000 `Encrypt` requests in one second get throttled because together they total 10,500 and exceed the shared pool.
So splitting one CMK into 3,200 per-tenant keys does not give you 3,200 × 10,000 req/s. Every call still draws from the same 10,000 req/s account+Region pool. You bought the storage bill of 3,200 keys and none of the throughput you imagined.
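To make the shared-pool behavior concrete, here is a minimal sketch that sums request rates across every key before comparing against the quota. The function name, the key names, and the per-tenant traffic numbers are hypothetical; the 10,000 req/s figure and the 9,500 + 1,000 example come from the text above.

```python
SYMMETRIC_QUOTA_RPS = 10_000  # shared per account + Region (figure from the post)


def is_throttled(rates_by_key: dict, quota: int = SYMMETRIC_QUOTA_RPS) -> bool:
    """True when combined symmetric traffic across ALL keys exceeds the quota."""
    total = sum(rate for ops in rates_by_key.values() for rate in ops.values())
    return total > quota


# AWS's example on a single key: 9,500 GenerateDataKey + 1,000 Encrypt per second.
one_key = {"shared-cmk": {"GenerateDataKey": 9_500, "Encrypt": 1_000}}

# Hypothetical per-tenant layout: 3,200 keys, each seeing only 4 req/s.
per_tenant = {
    f"tenant-{i}": {"GenerateDataKey": 3, "Decrypt": 1} for i in range(3_200)
}

print(is_throttled(one_key))     # 10,500 req/s against the shared pool
print(is_throttled(per_tenant))  # 12,800 req/s against the same pool
```

No individual tenant key in the second case is busy, yet the account still throttles, because the key count never enters the comparison.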
What broke: A batch re-encryption job (key-rotation hygiene, rewriting ~9M S3 objects over a weekend) ran on a shared classification CMK with no data key caching. It generated ~14,000 `GenerateDataKey` req/s at peak against a 10,000 req/s symmetric quota. KMS returned `ThrottlingException`, the job’s retry logic hammered the quota harder, and SSE-KMS-backed S3 reads for live tenant traffic, which share the same quota, started failing with 503s. Detected via a CloudWatch alarm on KMS `ThrottlingException` count plus a spike in S3 5xx. Fix that worked overnight: throttled the batch job to 6,000 req/s, enabled data key caching for the re-encrypt path (cut `GenerateDataKey` ~90%), and added S3 Bucket Keys to the buckets. The “per-tenant keys would have isolated this” instinct is wrong: per-tenant keys share the same quota; only caching, Bucket Keys, or a quota increase actually move the ceiling.
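The "throttled the batch job to 6,000 req/s" part of that fix is typically a client-side rate limiter in front of the KMS calls. The sketch below is one generic way to do it (a token bucket), not the code from the incident; the class name and the injectable clock are illustrative choices, and the clock is faked here so the behavior is deterministic.

```python
import time


class TokenBucket:
    """Client-side limiter: about `rate` calls per second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def try_acquire(self, n: int = 1) -> bool:
        """Take n tokens if available; on False the caller waits instead of calling KMS."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


# Simulate the incident's 14,000 requests arriving within the same instant.
fake_now = [0.0]
bucket = TokenBucket(rate=6_000, capacity=6_000, clock=lambda: fake_now[0])
granted = sum(bucket.try_acquire() for _ in range(14_000))
print(granted)
```

The design point is that a refused request sleeps on the client rather than retrying against KMS, which is what keeps a batch job from competing with live traffic for the shared quota.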
Tenant isolation without per-tenant keys
You get the same cryptographic isolation guarantee from one classification CMK + encryption context:
- Encrypt every object with `EncryptionContext={"tenant": "<tenant-id>"}`.
- Scope each tenant’s IAM role with a condition on `kms:EncryptionContext:tenant` (pin it to a principal tag so you never rewrite the key policy per tenant).
- A cross-tenant decrypt fails because the context does not match, and every `Decrypt` in CloudTrail records the tenant value, which is a cleaner audit artifact than reconciling policy across thousands of keys.
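The three steps above might look roughly like this as an IAM statement plus the arguments for an encrypt call. Treat it as a sketch under assumptions: the key ARN, the `Sid`, the action list, and the choice of a principal tag named `tenant` are all hypothetical, and the actual templates the post links may differ.

```python
import json


def tenant_kms_statement(key_arn: str) -> dict:
    """IAM statement allowing KMS use only when the 'tenant' encryption
    context equals the caller's 'tenant' principal tag."""
    return {
        "Sid": "TenantScopedKmsUse",
        "Effect": "Allow",
        "Action": ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey"],
        "Resource": key_arn,
        "Condition": {
            "StringEquals": {
                "kms:EncryptionContext:tenant": "${aws:PrincipalTag/tenant}"
            }
        },
    }


def encrypt_params(key_arn: str, tenant_id: str, plaintext: bytes) -> dict:
    """Keyword arguments for a boto3 KMS client's encrypt() call."""
    return {
        "KeyId": key_arn,
        "Plaintext": plaintext,
        "EncryptionContext": {"tenant": tenant_id},
    }


KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE"  # hypothetical ARN
policy = {"Version": "2012-10-17", "Statement": [tenant_kms_statement(KEY_ARN)]}
print(json.dumps(policy, indent=2))
```

Because the condition value is a policy variable rather than a literal tenant ID, one statement serves every tenant role; onboarding a tenant means tagging its role, not editing policy.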
The full key policy for this pattern (plus the per-tenant fallback for when a contract forces it) is in the key-policy templates.
Pick the coarsest boundary that meets the requirement
| Boundary | Keys at scale | Throttle exposure | Use when |
|---|---|---|---|
| One CMK per data classification (pii, financial, default) | 3–6/account | Low | Default. Tenant isolation via encryption context |
| One CMK per account/stage | 1 per account | Low | The account already is the isolation boundary |
| One CMK per tenant | thousands | High | A contract/regulator names per-tenant keys, or true BYOK |
| One CMK per resource | unbounded | Very high | Almost never |
The classification boundary doubles as your DataClass tag value, which keeps your encryption model and your cost-allocation model aligned.
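The matrix reads naturally as a short decision function. The sketch below simply encodes the "Use when" column; the function name, argument names, and return labels are made up for illustration, and per-resource keys are deliberately left out because the table marks them "almost never".

```python
def pick_boundary(
    contract_requires_per_tenant_keys: bool = False,
    byok: bool = False,
    account_is_isolation_boundary: bool = False,
) -> str:
    """Return the coarsest key boundary that still meets the stated requirement."""
    if contract_requires_per_tenant_keys or byok:
        return "per-tenant"          # only when something external forces it
    if account_is_isolation_boundary:
        return "per-account"         # the account already isolates the data
    return "per-classification"      # default: isolate tenants via encryption context


print(pick_boundary())
print(pick_boundary(account_is_isolation_boundary=True))
print(pick_boundary(byok=True))
```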
Cut request cost before you cut control
Two levers move the request bill (and the throttle risk) without weakening the trust model:
- S3 Bucket Keys: turn on for every SSE-KMS bucket. AWS documents up to 99% reduction in KMS request volume by collapsing per-object `GenerateDataKey` calls into per-bucket-key calls. This is the single highest-ROI KMS change for object-storage-heavy workloads, and it is a config flag, not a re-architecture.
- Data key caching (AWS Encryption SDK): reuse a data key across many messages/objects, bounded by a max age and max message count. Cuts the `GenerateDataKey` rate, the operation most likely to throttle, while keeping envelope encryption. The trade-off is a slightly wider blast radius per data key, which you bound with conservative cache limits.
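For the Bucket Keys lever, the "config flag" is the `BucketKeyEnabled` field on the bucket's default-encryption rule. A minimal sketch using boto3's `put_bucket_encryption` follows; the helper names are hypothetical, and boto3 is imported lazily so the config builder can be inspected without AWS credentials.

```python
def bucket_key_encryption_config(kms_key_arn: str) -> dict:
    """Default-encryption config: SSE-KMS with S3 Bucket Keys enabled."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                "BucketKeyEnabled": True,
            }
        ]
    }


def enable_bucket_keys(bucket: str, kms_key_arn: str) -> None:
    """Apply the config to one bucket (requires boto3 and AWS credentials)."""
    import boto3  # lazy import: the builder above has no dependencies

    boto3.client("s3").put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=bucket_key_encryption_config(kms_key_arn),
    )


# Hypothetical key ARN, shown only to inspect the generated config.
config = bucket_key_encryption_config("arn:aws:kms:us-east-1:111122223333:key/EXAMPLE")
print(config["Rules"][0]["BucketKeyEnabled"])
```

One caveat worth planning for: the setting applies to objects written after it is enabled, so existing objects keep their per-object behavior until they are rewritten or copied.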
Do not “reduce costs” by disabling encryption or dropping to SSE-S3 where a CMK is actually required for audit or BYOK — Bucket Keys get you most of the savings without that loss of control.
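The caching lever mentioned above comes down to two bounds: maximum key age and maximum messages per key. The toy cache below illustrates only that bounding logic. It is not the AWS Encryption SDK's caching API, the class and parameter names are invented, and `fake_generate` stands in for a real `GenerateDataKey` call.

```python
import os
import time
from typing import Callable, Optional


class DataKeyCache:
    """Reuse one data key until it is too old or has protected too many messages."""

    def __init__(self, generate: Callable[[], bytes], max_age_s: float,
                 max_messages: int, clock: Callable[[], float] = time.monotonic):
        self.generate = generate
        self.max_age_s = max_age_s
        self.max_messages = max_messages
        self.clock = clock
        self.key: Optional[bytes] = None
        self.created = 0.0
        self.used = 0

    def get(self) -> bytes:
        now = self.clock()
        expired = (
            self.key is None
            or now - self.created >= self.max_age_s
            or self.used >= self.max_messages
        )
        if expired:
            self.key = self.generate()  # the only place a KMS call would happen
            self.created = now
            self.used = 0
        self.used += 1
        return self.key


calls = []


def fake_generate() -> bytes:
    """Stand-in for GenerateDataKey; records each call."""
    calls.append(1)
    return os.urandom(32)


cache = DataKeyCache(fake_generate, max_age_s=300, max_messages=100, clock=lambda: 0.0)
for _ in range(1_000):
    cache.get()
print(len(calls))
```

Lower `max_messages` or `max_age_s` to shrink the blast radius of a single data key; raise them to cut more KMS calls. That dial is the whole trade-off.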
Multi-Region keys and custom key stores — only when the requirement is explicit
- Multi-Region keys: bill per replica ($1/mo each), so a primary + two replicas is $3/mo per logical key. Use them for cross-Region DR of encrypted data or global tables — not as a default.
- CloudHSM key store: a dedicated single-tenant FIPS HSM inside AWS. Its request quota is non-adjustable, so size it deliberately and keep high-volume workloads on native KMS.
- External key store (XKS): key material outside AWS, double encryption, symmetric only. AWS documents poorer latency, durability, and availability than native KMS. Worth it only when a regulator or contract requires keys outside AWS.
What to do this week
- Inventory your CMKs. Count keys per account+Region and tag each with the requirement that justifies it. Any CMK you cannot map to a control (policy/rotation/audit/BYOK/grant) is a candidate to collapse into a shared classification key or downgrade to an AWS owned key.
- Run the quota check script (read-only) to see your symmetric request-rate quota and your recent peak. If your peak is at or above ~70% of the quota, you are one batch job away from throttling production.
- Turn on S3 Bucket Keys on every SSE-KMS bucket. This is usually the biggest single cost and throttle win.
- Replace any per-tenant CMK scheme that exists “for isolation” (not for contract) with one classification CMK + encryption context. Use the key-policy templates.
- Add a CloudWatch alarm on KMS `ThrottlingException` count before your next bulk re-encryption or migration job, not after.
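Once you have the quota and the recent peak from the check in the list above, the headroom test itself is simple arithmetic. This sketch is not the quota check script the post refers to; it assumes you already have both numbers, reads the ~70% guidance as "peak at or above 70% of quota", and uses a made-up function name.

```python
def quota_headroom(peak_rps: float, quota_rps: float, warn_ratio: float = 0.70) -> dict:
    """Compare a measured peak request rate against the shared quota."""
    utilization = peak_rps / quota_rps
    return {
        "utilization": round(utilization, 3),
        "headroom_rps": quota_rps - peak_rps,
        "at_risk": utilization >= warn_ratio,
    }


# Hypothetical peaks against the post's 10,000 req/s symmetric quota.
print(quota_headroom(7_200, 10_000))
print(quota_headroom(4_000, 10_000))
```

The `headroom_rps` value is also a sensible upper bound when choosing a rate cap for a bulk job, since anything above it eats into live traffic.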
What this post doesn’t cover
- Client-side envelope encryption internals and the full AWS Encryption SDK keyring model — referenced here only as a request-rate lever.
- Post-quantum cryptography in KMS (ML-KEM / ML-DSA) — covered separately in AWS KMS post-quantum cryptography.
- Secrets vs keys — when to use Secrets Manager or Parameter Store instead of a KMS key directly; see Secrets Manager vs Parameter Store.
- DynamoDB and RDS at-rest specifics beyond the key-boundary decision.
- Exact current pricing — always confirm per-key and per-request numbers on the AWS KMS pricing page; the figures here are the mid-2026 model.
Related: AWS EBS encryption and KMS key lifecycle · IAM least-privilege access control · Data residency and sovereignty · AWS cloud security
If you only do one thing: Turn on S3 Bucket Keys on every SSE-KMS bucket today, then run the quota check script to see how close your peak request rate is to the shared symmetric ceiling. Those two facts decide whether your encryption architecture is one batch job away from throttling production.
AWS Cloud Architect & AI Expert
AWS-certified cloud architect and AI expert with deep expertise in cloud migrations, cost optimization, and generative AI on AWS.