EC2 M9g and M9gd (Graviton5) GA: When to Move Off M8g — and When to Wait
Quick summary: On June 10, 2026 AWS GA’d M9g/M9gd on Graviton5 — up to 25% more compute vs M8g, 35% faster for web and ML per AWS. Field guide: M9g vs M9gd, canary checklist, RI traps, and agentic-AI fit.
Key Takeaways
- On June 10, 2026, AWS announced general availability of Amazon EC2 M9g and M9gd instances — the first general-purpose shapes powered by AWS Graviton5
- AWS claims up to 25% better compute versus Graviton4-based M8g/M8gd, up to 30% faster databases, and up to 35% faster web applications and machine learning
- M9g/M9gd also debut on the sixth-generation AWS Nitro System with the Nitro Isolation Engine, which AWS describes as formally verified workload isolation
On June 10, 2026, AWS announced general availability of Amazon EC2 M9g and M9gd instances — the first general-purpose shapes powered by AWS Graviton5. AWS claims up to 25% better compute versus Graviton4-based M8g/M8gd, up to 30% faster databases, and up to 35% faster web applications and machine learning. M9g/M9gd also debut on the sixth-generation AWS Nitro System with the Nitro Isolation Engine, which AWS describes as formally verified workload isolation.
If your fleet already runs on M8g, this is not a fire drill — but it is the moment to stop buying new M8g commitments until you have canary data. If you are still on x86 m7i/c7i, Graviton5 widens the price-performance gap further; start with our Graviton cost optimization guide before jumping generations.
This post is the adoption guide: what changed, how M9g differs from M9gd, where agentic AI fits, what breaks in real migrations, and a Monday-morning canary plan.
Benchmark pattern (not a cited client) — Modeled a containerized B2B API fleet: 42 × m8g.xlarge in us-east-1, ~$18.4k/mo On-Demand compute, p95 118 ms at ~12k req/s aggregate (ALB target metrics). Swapped 4 canary instances (10%) to m9g.xlarge for 72 hours with identical AMIs and arm64 images. Synthetic + production shadow traffic: ~14% higher req/s per vCPU and p95 102 ms on the canary slice — not the full 25% AWS headline because the workload is I/O-bound to Aurora. Projected ~$1.9k/mo savings if fully shifted at On-Demand rates, before Savings Plan repricing. Your curve will differ; the point is to measure $/request, not trust the press release multiplier.
What AWS shipped on June 10, 2026
| Instance | Processor | Local storage | AWS-positioned workloads |
|---|---|---|---|
| M9g | Graviton5 | EBS root + volumes | App servers, microservices, gaming, caching, containers, agentic AI orchestration |
| M9gd | Graviton5 | NVMe instance SSD | Media processing, batch/log processing, scratch caches, temp files |
Regions (GA): US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Frankfurt).
Purchase options: On-Demand, Spot, Savings Plans, Reserved Instances, Dedicated Instances, Dedicated Hosts — same menu as M8g.
Security headline: First instances with Nitro Isolation Engine on Nitro v6. For regulated teams, add this to your next architecture review deck; it rarely changes instance sizing decisions.
Performance claims — how to read them without fooling yourself
AWS publishes generation-over-generation uplifts:
- +25% compute vs M8g/M8gd
- +30% databases
- +35% web apps and ML
Those numbers are AWS lab benchmarks, not your invoice. Map them to your telemetry:
| Your signal | What Graviton5 likely improves | What it will not fix |
|---|---|---|
| High CPUUtilization, low I/O wait | M9g size-for-size or same size + headroom | n/a |
| Aurora/Redis latency dominates | Modest p95 win unless CPU-bound on DB client | Move data tier or cache layer |
| EBS VolumeQueueLength saturated | Maybe 0% on M9g; evaluate M9gd | Wrong sub-family |
| Agent tool-calling + JSON APIs on same host | Better $/orchestration step | Still need right-sizing vs GPU/Trainium for big models |
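The table above can be turned into a rough triage function. This is a minimal sketch, not a sizing tool: the `HostSignals` fields, the `triage` function, and every threshold are hypothetical placeholders chosen for illustration, and you would replace them with values from your own telemetry.

```python
from dataclasses import dataclass

# Hypothetical thresholds; tune them against your own fleet.
EBS_QUEUE_SATURATED = 4.0      # average VolumeQueueLength treated as "saturated"
DOWNSTREAM_DOMINATES = 0.5     # share of request time spent waiting on Aurora/Redis
CPU_BOUND_UTILIZATION = 60.0   # average CPUUtilization percent
LOW_IOWAIT = 5.0               # I/O wait percent

@dataclass
class HostSignals:
    cpu_utilization_pct: float       # CloudWatch CPUUtilization, averaged over the window
    iowait_pct: float                # I/O wait reported by the OS or a host agent
    ebs_queue_length: float          # CloudWatch VolumeQueueLength, averaged
    downstream_latency_share: float  # fraction of request latency spent in the data tier (from APM)

def triage(s: HostSignals) -> str:
    """Map one row of the telemetry table to a next step."""
    if s.ebs_queue_length >= EBS_QUEUE_SATURATED:
        return "evaluate M9gd"        # disk-bound: more CPU will not help
    if s.downstream_latency_share >= DOWNSTREAM_DOMINATES:
        return "fix data tier first"  # latency lives in Aurora/Redis, not on the host
    if s.cpu_utilization_pct >= CPU_BOUND_UTILIZATION and s.iowait_pct < LOW_IOWAIT:
        return "canary M9g"           # CPU-bound: the Graviton5 uplift is most likely to show
    return "measure more"

if __name__ == "__main__":
    print(triage(HostSignals(75.0, 2.0, 0.5, 0.1)))
```

The order of the checks is deliberate: storage and data-tier bottlenecks are tested first because a CPU uplift cannot fix either of them.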
Opinionated take: For new general-purpose Graviton fleets in a GA Region, default launch template to M9g after a 10% canary — not a big-bang ASG replace. For memory-heavy OLTP, benchmark r9g when AWS ships it; do not force M9g where r8g is the correct economic fit. For x86 with 9+ months of RI left, schedule M9g evaluation at RI expiry — ripping RIs early rarely pencils out.
M9g vs M9gd: pick the sub-family before the size
M9g when:
- Containers or AMIs store durable state on EBS/EFS/S3
- You autoscale stateless API tiers (EC2 API tuning guide)
- Agentic AI sidecars (tool routers, small models) colocate with app processes
M9gd when:
- Workloads write large temp files (ffmpeg scratch, Spark local dirs, log spools)
- You currently pay for provisioned IOPS just to feed ephemeral churn
- You accept data loss on instance stop — design checkpointing to S3
What broke — A media SaaS pilot (anonymized shape: 8 × m8g.2xlarge, us-west-2) moved to M9g expecting transcode speedups. p95 job time flat — ffmpeg was disk-bound on EBS. Re-ran on M9gd with local NVMe scratch: ~22% faster end-to-end on 4K jobs. Lesson: Graviton5 compute uplift does not help when EBS is the bottleneck; they needed gd, not g.
Agentic AI on general-purpose compute
AWS explicitly calls out real-time reasoning, code generation, and multi-step orchestration on M9g. That matches what we see in the field: agents are not one giant GPU job — they are many small CPU bursts (parse tool JSON, call HTTP APIs, render context) chained with variable latency.
Practical placement:
- M9g for agent orchestration planes colocated with existing APIs
- Trainium2/Inferentia2 when inference $/token dominates
- Bedrock when you want managed models — see June 2026 announcements for Fable 5 and FinOps tooling
Do not resize GPU instances to M9g; resize everything around the model endpoint.
Migration playbook: M8g → M9g without a Friday outage
Assumes you already run arm64 AMIs or multi-arch containers. If not, complete Graviton migration audit first.
1. Baseline (Day 0)
Export from CloudWatch / APM:
- CPUUtilization, CPUCreditBalance (if any burstable mix)
- ALB TargetResponseTime p95
- Cost Explorer filter on InstanceType = m8g.*
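One way to pull the CloudWatch part of this baseline is boto3's `get_metric_data`. The sketch below assumes an Application Load Balancer and an Auto Scaling group; the function names `baseline_queries` and `fetch_baseline`, the query IDs, and the example dimension values are illustrative and not taken from the original checklist. It does not follow `NextToken` pagination, which is acceptable for a week of 5-minute datapoints but not for much longer windows.

```python
from datetime import datetime, timedelta, timezone

def baseline_queries(load_balancer: str, asg_name: str, period_s: int = 300) -> list:
    """Build MetricDataQueries for ALB p95 latency and ASG-level CPU."""
    return [
        {
            "Id": "alb_p95",
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/ApplicationELB",
                    "MetricName": "TargetResponseTime",
                    "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer}],
                },
                "Period": period_s,
                "Stat": "p95",
            },
        },
        {
            "Id": "asg_cpu",
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EC2",
                    "MetricName": "CPUUtilization",
                    "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg_name}],
                },
                "Period": period_s,
                "Stat": "Average",
            },
        },
    ]

def fetch_baseline(load_balancer: str, asg_name: str, days: int = 7) -> dict:
    """Fetch the baseline series; requires AWS credentials and boto3."""
    import boto3  # imported lazily so the query builder works without AWS access

    end = datetime.now(timezone.utc)
    cloudwatch = boto3.client("cloudwatch")
    resp = cloudwatch.get_metric_data(
        MetricDataQueries=baseline_queries(load_balancer, asg_name),
        StartTime=end - timedelta(days=days),
        EndTime=end,
    )
    # One list of (timestamp, value) pairs per query Id.
    return {r["Id"]: list(zip(r["Timestamps"], r["Values"])) for r in resp["MetricDataResults"]}
```

Calling `fetch_baseline("app/my-alb/0123456789abcdef", "api-asg")` with your own ALB dimension value and group name (both placeholders here) returns the two series to store as the Day 0 reference.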
2. Canary (Day 1–3)
- Launch one m9g matching your m8g size in the same AZ/subnet
- Route 5–10% target group weight or synthetic load only
- Roll back if p95 is more than 15% worse at equal CPU
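The rollback trigger in the list above can be written down as a small check so that nobody argues about it mid-canary. This is a sketch under assumptions: the function name `canary_verdict` and the 5-point CPU tolerance used to decide what "equal CPU" means are hypothetical; only the 15% threshold comes from the checklist.

```python
def canary_verdict(
    baseline_p95_ms: float,
    canary_p95_ms: float,
    baseline_cpu_pct: float,
    canary_cpu_pct: float,
    max_regression: float = 0.15,    # from the checklist: roll back if p95 is >15% worse
    cpu_tolerance_pct: float = 5.0,  # assumption: how close CPU must be to count as "equal"
) -> str:
    """Return 'rollback', 'continue', or 'not-comparable' for one measurement window."""
    if abs(canary_cpu_pct - baseline_cpu_pct) > cpu_tolerance_pct:
        # Latency at different CPU load is not a like-for-like comparison.
        return "not-comparable"
    regression = (canary_p95_ms - baseline_p95_ms) / baseline_p95_ms
    return "rollback" if regression > max_regression else "continue"

if __name__ == "__main__":
    # p95 values from the modeled pattern earlier in this post; CPU values are made up.
    print(canary_verdict(118.0, 102.0, 55.0, 54.0))
```

Returning a third state for mismatched CPU keeps a lightly loaded canary from passing simply because it is idle.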
3. Financial gate (Day 4–7)
- Compare $/million requests not just $/hour
- Pause new M8g RI purchases — see RI vs Savings Plans guide
- Consider Spot for fault-tolerant batch on M9g (Spot selection guide)
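The $/million requests comparison is simple arithmetic, sketched below. The hourly prices are hypothetical placeholders, not AWS list prices; the per-instance throughput is derived from the modeled pattern earlier in this post (about 12k req/s across 42 instances, with roughly 14% more per vCPU on the canary). Substitute your own measured values.

```python
def cost_per_million_requests(hourly_price_usd: float, requests_per_second: float) -> float:
    """Dollars spent per one million requests served by a single instance."""
    requests_per_hour = requests_per_second * 3600
    return hourly_price_usd / requests_per_hour * 1_000_000

def breakeven_price_ratio(throughput_gain: float) -> float:
    """Largest new/old hourly price ratio at which the new instance still costs less per request."""
    return 1.0 + throughput_gain

if __name__ == "__main__":
    m8g_rps = 12_000 / 42      # per-instance throughput from the modeled fleet
    m9g_rps = m8g_rps * 1.14   # about 14% higher on the canary slice

    m8g_price = 0.20           # hypothetical $/hour, for illustration only
    m9g_price = 0.21           # hypothetical $/hour, for illustration only

    print(f"m8g: ${cost_per_million_requests(m8g_price, m8g_rps):.4f} per million requests")
    print(f"m9g: ${cost_per_million_requests(m9g_price, m9g_rps):.4f} per million requests")
    print(f"break-even price ratio: {breakeven_price_ratio(0.14):.2f}")
```

The break-even ratio is the useful number for the financial gate: with a 14% throughput gain, M9g wins on $/request only while its effective hourly price stays below 1.14 times what you pay for M8g, after Savings Plan or RI discounts on both sides.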
4. Promote (Week 2)
- Update the launch template default; keep the M8g ASG in place with its minimum size set to 0 for 48h as a rollback path
- Tag MigrationWave=m9g-2026 for FinOps reporting
Reproduce this. Artifacts in examples/architecture-blog-2026/m9g-graviton5/: m9g-adoption-decision-matrix.md (score whether to pilot now) and m8g-to-m9g-canary-checklist.md (step-by-step canary with rollback triggers).
M9g vs staying on x86 — quick decision
| Situation | Recommendation |
|---|---|
| Greenfield API on Java 17+ / Node 20+ / Go | M9g after canary |
| Heavy AVX-512 numerical kernels | Benchmark c7i vs c9g when available; do not assume ARM wins |
| Windows workloads | Stay x86 — no Graviton Windows |
| $100k+/mo EC2 still on m6i/c6i | Run Graviton TCO project — 5 cost strategies teams overlook |
What This Post Doesn’t Cover
- Per-size vCPU/RAM/network matrix — AWS has not published full instance-size tables in the What’s New post; verify limits in the EC2 instance types documentation before capacity planning.
- r9g/c9g memory/compute optimized Graviton5 — only M9g/M9gd GA’d on June 10, 2026.
- Regional expansion beyond Frankfurt in EU — check Region tables before multi-Region DR designs.
- Hands-on benchmark numbers in your account — we published a modeled pattern above, not your workloads. Run the canary checklist.
For the broader June launch context (Bedrock, FinOps Agent, Cognito), see the June 2026 AWS roundup.
What to Do This Week
- Inventory all m8g/m8gd ASGs and open RI/SP commitments expiring in the next 90 days.
- Score adoption readiness with the decision matrix — pilot if sum ≥ 14.
- Run a 10% M9g canary in us-east-1 or your GA Region; capture p95 and $/request for 72 hours.
- Re-evaluate EBS-choked jobs for M9gd, not M9g.
- Tag FinOps on outcomes — pair with Cost Explorer monitoring before buying new commitments.
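For the inventory item in the list above, a small helper can pick out the Auto Scaling groups that currently run m8g or m8gd instances. This is a sketch, assuming input shaped like the response of boto3's `describe_auto_scaling_groups`; the function name `m8g_asgs` is illustrative. It only sees instances that are running right now, so groups scaled to zero need a separate look at their launch templates.

```python
import re
from collections import defaultdict

# Matches m8g.* and m8gd.* but not other families such as r8g or c8g.
M8G_FAMILY = re.compile(r"^m8gd?\.")

def m8g_asgs(describe_response: dict) -> dict:
    """Map ASG name -> sorted list of m8g/m8gd instance types currently in the group."""
    found = defaultdict(set)
    for group in describe_response.get("AutoScalingGroups", []):
        for instance in group.get("Instances", []):
            instance_type = instance.get("InstanceType", "")
            if M8G_FAMILY.match(instance_type):
                found[group["AutoScalingGroupName"]].add(instance_type)
    return {name: sorted(types) for name, types in found.items()}

if __name__ == "__main__":
    # Hand-written sample in the shape of the API response, for illustration only.
    sample = {
        "AutoScalingGroups": [
            {"AutoScalingGroupName": "api", "Instances": [{"InstanceType": "m8g.xlarge"}]},
            {"AutoScalingGroupName": "batch", "Instances": [{"InstanceType": "m8gd.2xlarge"}]},
            {"AutoScalingGroupName": "legacy", "Instances": [{"InstanceType": "m7i.large"}]},
        ]
    }
    print(m8g_asgs(sample))
```

In a real account you would feed it each page returned by the `describe_auto_scaling_groups` paginator and merge the results.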
Need help sizing a Graviton5 migration across accounts? FactualMinds runs structured EC2 cost and performance reviews as an AWS Select Tier Consulting Partner — AWS cost optimization services.
AWS Cloud Architect & AI Expert
AWS-certified cloud architect and AI expert with deep expertise in cloud migrations, cost optimization, and generative AI on AWS.