# M8g → M9g canary checklist

Run in a **non-production** account first. Assumes the current fleet runs Amazon Linux 2023 or Ubuntu 22.04+ on Graviton4 (M8g).

## Pre-flight (30 minutes)

- [ ] Confirm target Region is GA for M9g: `us-east-1`, `us-east-2`, `us-west-2`, or `eu-central-1` (per [June 10, 2026 announcement](https://aws.amazon.com/about-aws/whats-new/2026/06/ec2-m9g-m9gd-instances-graviton5-processors-available/)).
- [ ] Export baseline from CloudWatch: `CPUUtilization`, `EBSByteBalance%`, application p95 latency (ALB target or APM). Add `CPUCreditBalance` only if the fleet also includes burstable T-family instances; M8g does not report it.
- [ ] Snapshot launch template / ASG config; note AMI ID and `uname -m` (`aarch64` expected).
- [ ] Grep container manifests for `platform: linux/amd64` — rebuild multi-arch if found.
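
The manifest grep in the last pre-flight item can be scripted. The following is a minimal sketch, not a definitive tool: the function name `find_amd64_pins`, the file patterns, and the regular expression are all assumptions for illustration, and the patterns will likely need adjusting to the repository layout.

```python
import re
from pathlib import Path

# Assumed pattern: matches `platform: linux/amd64`, `--platform=linux/amd64`,
# and `--platform linux/amd64`, with optional quotes around the value.
AMD64_PIN = re.compile(r"platform\s*[:=]?\s*[\"']?linux/amd64")


def find_amd64_pins(root, patterns=("*.yml", "*.yaml", "Dockerfile*")):
    """Return (path, line_number, line) for every amd64 platform pin under root.

    `patterns` is a hypothetical default; extend it for Helm charts,
    Kustomize overlays, or CI configs as needed.
    """
    # Collect into a set so a file matching two patterns is scanned once.
    paths = sorted(
        {p for pattern in patterns for p in Path(root).rglob(pattern) if p.is_file()}
    )
    hits = []
    for path in paths:
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if AMD64_PIN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_amd64_pins(".")` from the repository root returns an empty list when nothing is pinned to `linux/amd64`; any hit is a candidate for a multi-arch rebuild.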

## Canary deploy (same day)

- [ ] Launch **one** M9g instance matching M8g size (e.g. `m8g.xlarge` → `m9g.xlarge`) in the same VPC/subnet/AZ.
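- [ ] Send the canary **5–10%** of live traffic (for example, register it in its own target group and use a weighted forward action on the ALB listener) OR run synthetic load only (`hey`, `k6`, `vegeta`) against the canary private IP.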
- [ ] Run application smoke tests: auth, DB read/write, any native `.so` / JNI paths.
- [ ] Compare **requests/sec per vCPU** and **p95 latency** vs M8g control for ≥ 1 hour at representative load.
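
The comparison in the last item comes down to two numbers per instance. The sketch below shows one way to compute them, assuming request counts and raw latency samples have already been exported for the same time window; the function names and the dictionary keys (`requests`, `seconds`, `vcpus`, `latencies_ms`) are hypothetical, and the percentile uses the nearest-rank method, which may differ slightly from what an APM tool reports.

```python
def p95(latencies_ms):
    """Nearest-rank 95th percentile of a non-empty list of latency samples."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # ceil(0.95 * n) in integer arithmetic, as a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def rps_per_vcpu(requests, window_seconds, vcpus):
    """Average requests per second, normalized by vCPU count."""
    if window_seconds <= 0 or vcpus <= 0:
        raise ValueError("window_seconds and vcpus must be positive")
    return requests / window_seconds / vcpus


def compare(control, canary):
    """Percent change of the canary relative to the control.

    Each argument is a dict with the assumed keys
    'requests', 'seconds', 'vcpus', and 'latencies_ms'.
    """
    control_rps = rps_per_vcpu(control["requests"], control["seconds"], control["vcpus"])
    canary_rps = rps_per_vcpu(canary["requests"], canary["seconds"], canary["vcpus"])
    control_p95 = p95(control["latencies_ms"])
    canary_p95 = p95(canary["latencies_ms"])
    return {
        # Positive means the canary serves more requests per vCPU.
        "rps_per_vcpu_change_pct": 100.0 * (canary_rps - control_rps) / control_rps,
        # Positive means the canary is slower at p95.
        "p95_change_pct": 100.0 * (canary_p95 - control_p95) / control_p95,
    }
```

Normalizing by vCPU keeps the comparison meaningful if the canary and control ever differ in size; for a like-for-like `xlarge` swap the vCPU counts simply cancel.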

## Rollback triggers (stop canary if any fire)

- [ ] p95 latency **> 15%** worse than M8g control at equal CPU utilization.
- [ ] Unexpected `Illegal instruction` / `SIGILL` in app logs (native code built for CPU features the canary does not support), or `exec format error` (x86 artifact on ARM).
- [ ] EBS throughput saturation that local instance storage on M9gd would relieve. If so, re-evaluate **M9gd** instead of abandoning Graviton5.
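
The first two triggers above are mechanical enough to check in code. This is a rough sketch under stated assumptions: `rollback_reasons` is a hypothetical helper, the p95 values are assumed to have been sampled at matching CPU utilization, and the log scan only looks for the two literal strings named in the checklist. The EBS trigger is left to human judgment because it changes the target instance family rather than stopping the canary.

```python
import re

# Threshold from the checklist: stop if canary p95 is more than 15% worse.
LATENCY_REGRESSION_LIMIT = 0.15

# Literal strings from the checklist; real logs may phrase the crash differently.
SIGILL_PATTERN = re.compile(r"Illegal instruction|SIGILL")


def rollback_reasons(control_p95_ms, canary_p95_ms, log_lines):
    """Return a list of human-readable reasons to stop the canary.

    An empty list means neither automated trigger fired. The caller is
    responsible for sampling both p95 values at equal CPU utilization.
    """
    if control_p95_ms <= 0:
        raise ValueError("control_p95_ms must be positive")
    reasons = []

    regression = (canary_p95_ms - control_p95_ms) / control_p95_ms
    if regression > LATENCY_REGRESSION_LIMIT:
        reasons.append(
            f"p95 latency is {regression:.1%} worse than control "
            f"(limit {LATENCY_REGRESSION_LIMIT:.0%})"
        )

    sigill_hits = [line for line in log_lines if SIGILL_PATTERN.search(line)]
    if sigill_hits:
        reasons.append(f"{len(sigill_hits)} log line(s) mention an illegal instruction")

    return reasons
```

Returning reasons rather than a bare boolean makes the stop decision easy to paste into the outcome notes at the end of the week.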

## Promote or park (end of week)

- [ ] If canary wins: update launch template default to M9g; keep the M8g ASG in place at 0% traffic (zero desired capacity) for 48h as a rollback path.
- [ ] Tag canary instances `MigrationWave=m9g-canary` for Cost Explorer filter.
- [ ] Schedule RI/SP review — do not buy M8g commitments after a successful M9g canary.
- [ ] Document outcome in optimization backlog (see [post-migration FinOps handoff template](../post-migration-finops-handoff/optimization-backlog-template.csv)).
