Do Compute Savings Plans cover SageMaker workloads?

No, and this catches teams routinely. Compute Savings Plans cover EC2, Fargate, and Lambda. SageMaker has its own product-specific plan (SageMaker AI Savings Plans) that must be purchased separately, and neither plan can substitute for the other. An organization with significant SageMaker spend that holds only Compute Savings Plans is paying on-demand rates for all SageMaker compute. Audit the SageMaker line in Cost Explorer; if it is meaningful and currently billed on-demand, evaluate the SageMaker AI Savings Plan break-even.

What SageMaker compute types does the SP cover?

SageMaker Training, Real-Time Inference, Asynchronous Inference, Serverless Inference, Notebook Instances, Studio Notebooks, and Processing Jobs. The plan applies hour-by-hour to qualifying SageMaker usage across all these compute types within the committed dollar amount. SageMaker components that bill on volume rather than compute hours (model registry storage, feature store storage, etc.) are not covered by the SP and continue billing at standard rates.

What is the discount range vs the commitment term?

Up to 64% off the on-demand SageMaker rate for a 3-year All-Upfront commitment. 1-year Partial-Upfront terms typically deliver a 30–45% discount. The exact discount depends on the instance type and term; newer GPU instance families (ml.p5d, ml.p6e) tend to offer slightly smaller discounts than older families. The payment option (All-Upfront, Partial-Upfront, or No-Upfront) shifts the discount by a few percentage points: All-Upfront delivers the best rate, while No-Upfront preserves cash at a slightly lower discount.

How does the commitment work — is it instance-type-specific?

No — SageMaker AI Savings Plans commit to a dollar amount per hour, not to specific instance types. This makes them more flexible than instance-specific RIs. Commit $10/hour for 1 year, and AWS automatically applies that commitment to any qualifying SageMaker compute usage up to the committed rate, then bills on-demand for usage above. The flexibility means you can change training instance types, scale inference endpoints, or shift between real-time and async inference without losing SP coverage as long as the dollar rate stays within the commitment.

Should I commit before or after I have a steady production workload?

After, almost always. SageMaker AI Savings Plans are a commitment to spend over the term. Committing before the workload is stable risks paying for commitment the workload never uses, or under-committing so that most usage still bills at on-demand rates. A plan never limits capacity; usage above the commitment simply bills on-demand. The right pattern: run the workload on-demand for 60–90 days, measure the steady-state hourly rate, then commit to roughly 80% of it. The 80% earns the discount on the predictable base; the remaining 20% stays on-demand and absorbs variability.

How does this compare to running models on Bedrock instead?

Different primitive entirely. Bedrock bills per token or per unit of provisioned throughput; SageMaker bills per instance-hour. For workloads using foundation models offered on Bedrock (Claude, Nova, etc.), Bedrock is operationally simpler and the bill scales predictably with token volume. For workloads using custom models, fine-tuned models, or models not available on Bedrock, SageMaker is the right primitive, and Savings Plans are its cost-optimization mechanism. The choice is determined by model availability, not by pricing.

Are there break-even scenarios where on-demand stays cheaper?

Yes — when SageMaker usage is significantly variable, intermittent, or growing rapidly. Savings Plans deliver discount in exchange for commitment; if the commitment exceeds actual usage, you pay for capacity not used. The rule of thumb: if your hourly SageMaker spend varies by more than 4× across the month (peak vs valley), on-demand for the variable portion is often cheaper than over-committing. Commit to the steady base only; let the variable portion bill on-demand.
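The break-even can be stated as a utilization threshold. The sketch below is a minimal illustration, assuming a single flat discount d applies to everything the plan covers (real discounts vary by instance type). The function names are hypothetical. Because the commitment is charged in full every hour, the plan only beats on-demand when the share of the commitment actually consumed exceeds 1 − d.

```python
HOURS_PER_MONTH = 730  # average month length, used for illustration only


def break_even_utilization(discount: float) -> float:
    """Minimum fraction of the commitment that must be used to match on-demand."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1 - discount


def plan_vs_on_demand(commit_per_hour: float, utilization: float, discount: float):
    """Monthly cost of the plan vs. buying the same usage on-demand.

    commit_per_hour is the hourly commitment at discounted plan rates;
    utilization is the fraction of that commitment actually consumed.
    """
    plan_cost = commit_per_hour * HOURS_PER_MONTH  # charged whether used or not
    used_at_plan_rates = commit_per_hour * utilization
    on_demand_cost = used_at_plan_rates / (1 - discount) * HOURS_PER_MONTH
    return plan_cost, on_demand_cost


if __name__ == "__main__":
    for u in (1.0, 0.7, 0.5):
        plan, od = plan_vs_on_demand(7.0, u, 0.30)
        print(f"utilization {u:.0%}: plan ${plan:,.0f} vs on-demand ${od:,.0f}")
```

With an assumed 30% discount, the plan breaks even at 70% utilization; at 50% utilization the same usage would have cost less on-demand.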

SageMaker AI Savings Plans: Up to 64% Off, Not Compute SP

AWS SageMaker AI Savings Plans are the product-specific commitment-based discount mechanism for SageMaker compute. They deliver up to 64% off on-demand rates for training, inference (real-time, asynchronous, serverless), notebook instances, and processing jobs — in exchange for a 1-year or 3-year hourly-rate commitment. The most consequential fact about them: Compute Savings Plans do not cover SageMaker workloads. Teams with significant SageMaker spend who have only purchased Compute Savings Plans are paying full on-demand rates on every SageMaker line.

SageMaker AI Savings Plans commit to a consistent hourly compute spend for 1 or 3 years in exchange for up to a 64% discount versus on-demand. In us-east-1 (June 2026), a single plan applies across instance families and across compute types (training, inference, notebooks, processing). Most surprises come from assuming a Compute Savings Plan already covers SageMaker, or from buying before the workload shape stabilizes.

Plan type	Term	Flexibility
SageMaker AI SP	1 or 3 yr	Any SageMaker compute: training, inference, notebooks, processing
Compute SP	1 or 3 yr	EC2, Fargate, Lambda (not SageMaker)
On-demand	None	Full flexibility; highest $/hour

This post focuses on the Savings Plan side of SageMaker cost optimization. For the operational side of SageMaker cost — instance selection, training job sizing, spot training — see our SageMaker training cost-efficiency guide.

What SageMaker AI Savings Plans Cover

SageMaker AI Savings Plans — coverage scope

Prices in us-east-1

The SP applies hour-by-hour to qualifying SageMaker compute. Non-compute SageMaker features (model registry storage, feature store, etc.) bill separately at standard rates.

Component	Billing unit	Example workload	SP coverage	Notes
SageMaker Training	Per ml.* instance-hour	ml.p5d.24xlarge training	Up to 64% off	Includes distributed training across instances
Real-Time Inference	Per ml.* instance-hour	Persistent endpoint	Up to 64% off	Auto-scaling supported within commit
Asynchronous Inference	Per ml.* instance-hour	Queue-based inference	Up to 64% off	Scales to zero when idle
Serverless Inference	Per memory-hour	On-demand model serving	Up to 64% off	Cold-start latency trade-off
Processing Jobs	Per ml.* instance-hour	Data prep, model evaluation	Up to 64% off	For pre/post-training data work
Notebook Instances	Per ml.* instance-hour	Persistent dev notebooks	Up to 64% off	SageMaker Studio Notebooks included
Model registry storage	Standard S3 rates	Versioned model artifacts	Not covered by SP	S3 lifecycle for cost control
Feature store online storage	Per GB-month	Real-time feature serving	Not covered by SP	Bills separately

SP coverage spans all of these compute primitives, so you can shift spend between training and inference without losing the discount.

If you audit only one dimension first, make it monthly inference hours, and confirm they are stable before committing: an under-used Savings Plan is still charged in full.

The Two-Plan Trap

The single most common SageMaker cost mistake: assuming Compute Savings Plans cover SageMaker. They don’t.

The Commitment Mechanism

SageMaker AI Savings Plans commit to a dollar amount per hour for the chosen term (1-year or 3-year) with three payment options (All-Upfront, Partial-Upfront, No-Upfront). Higher upfront delivers higher discount; No-Upfront preserves cash at slightly lower discount.

Key flexibility: the commitment is in dollars per hour, not instance types. Commit $10/hour and AWS applies that to any qualifying SageMaker compute usage up to $10/hour, then bills on-demand for usage above. You can shift between training instance types, change inference endpoint configurations, or move between real-time and async inference without losing SP coverage.
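To make the hour-by-hour mechanics concrete, here is a minimal sketch of how a single hour might be split between the plan and on-demand. It assumes the commitment is measured at discounted plan rates (which is how AWS describes Savings Plans in general) and that one flat discount applies to all covered usage. The function name and the returned keys are illustrative, not an AWS API.

```python
def bill_hour(on_demand_usage: float, commitment: float, discount: float) -> dict:
    """Split one hour of SageMaker compute into plan-covered and on-demand parts.

    on_demand_usage: the hour's usage priced at on-demand rates (USD).
    commitment:      hourly Savings Plan commitment (USD, at plan rates).
    discount:        assumed flat discount for covered usage, e.g. 0.30.
    """
    cost_at_plan_rates = on_demand_usage * (1 - discount)
    if cost_at_plan_rates <= commitment:
        # Everything is covered; the commitment is still charged in full.
        return {
            "plan_charge": commitment,
            "on_demand_charge": 0.0,
            "unused_commitment": commitment - cost_at_plan_rates,
        }
    # The plan absorbs usage worth commitment / (1 - discount) at on-demand
    # prices; whatever is left over bills at the on-demand rate.
    covered_on_demand = commitment / (1 - discount)
    return {
        "plan_charge": commitment,
        "on_demand_charge": on_demand_usage - covered_on_demand,
        "unused_commitment": 0.0,
    }


if __name__ == "__main__":
    print(bill_hour(20.0, 10.0, 0.30))  # spike hour: part of the usage bills on-demand
    print(bill_hour(10.0, 10.0, 0.30))  # quiet hour: some commitment goes unused
```

Nothing in the calculation depends on which instance type or compute type generated the usage, which is the flexibility described above.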

Savings Plan tier comparison — illustrative 1-year and 3-year commits

Prices in us-east-1

Longer terms and larger upfront payments deliver larger discounts. The trade-off is being locked into the commitment for longer.

Plan	Discount vs on-demand	Example workload	Monthly cost	Notes
No commitment (on-demand)	None	$10/hr SageMaker spend ($7,300/month baseline)	$7,300	Full flexibility; no discount
1-year No-Upfront	~30%	Same workload	~$5,110	Preserves cash; lowest discount tier
1-year Partial-Upfront	~35%	Same workload	~$4,745	Better discount; partial cash commit
1-year All-Upfront	~38%	Same workload	~$4,526	Maximum 1-year discount
3-year All-Upfront	Up to 64%	Same workload, 3-year commit	~$2,628	Maximum discount; longest commitment

Exact discount percentages vary by instance type. Newer GPU families (ml.p5d, ml.p6e) typically have slightly less aggressive discount tiers.
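The monthly figures in the tier comparison follow from simple arithmetic: hourly on-demand spend × 730 hours × (1 − discount). A short sketch that reproduces them, using the illustrative discounts from the table rather than quoted AWS rates:

```python
HOURS_PER_MONTH = 730  # average hours per month behind the $7,300 baseline

ON_DEMAND_PER_HOUR = 10.0

# Illustrative discount percentages from the comparison table.
DISCOUNT_PCT = {
    "No commitment (on-demand)": 0,
    "1-year No-Upfront": 30,
    "1-year Partial-Upfront": 35,
    "1-year All-Upfront": 38,
    "3-year All-Upfront": 64,
}


def monthly_cost(on_demand_per_hour: float, discount_pct: float) -> float:
    """Effective monthly cost when all usage receives the given discount."""
    return on_demand_per_hour * HOURS_PER_MONTH * (100 - discount_pct) / 100


if __name__ == "__main__":
    for plan, pct in DISCOUNT_PCT.items():
        print(f"{plan}: ${monthly_cost(ON_DEMAND_PER_HOUR, pct):,.0f}")
```

The calculation assumes the whole $10/hour is covered by the plan; any usage above the commitment would bill at the full on-demand rate.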

Commit After, Not Before

The right pattern for purchasing SageMaker AI Savings Plans:

Run the workload on-demand for 60–90 days.
Measure steady-state hourly SageMaker spend via Cost Explorer with hourly granularity.
Commit to roughly 80% of the observed steady-state rate — leave 20% headroom for growth and variability.
Re-evaluate quarterly. As the workload grows, layer additional Savings Plans on top of the existing commitment.
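The sizing step can be sketched in a few lines. This is an illustration under stated assumptions: it treats the median of the observed hourly spend as the "steady-state" rate (one reasonable choice among several, since the median ignores short spikes), and the function name is hypothetical. The result is expressed in on-demand dollars; the commitment entered at purchase is quoted at discounted plan rates, so it would be lower.

```python
from statistics import median


def suggest_commitment(hourly_spend, coverage: float = 0.80):
    """Return (steady_state, target) in on-demand USD per hour.

    hourly_spend: observed hourly SageMaker compute spend over 60-90 days.
    coverage:     share of the steady-state rate to commit to (0.80 per the
                  pattern above, leaving 20% headroom).
    """
    if not hourly_spend:
        raise ValueError("need at least one hourly observation")
    steady_state = median(hourly_spend)  # assumption: median as steady state
    return steady_state, steady_state * coverage


if __name__ == "__main__":
    observed = [8.0, 10.0, 12.0, 10.0, 40.0]  # hypothetical sample with one spike
    steady, target = suggest_commitment(observed)
    print(f"steady-state ${steady:.2f}/hr, commit to cover ${target:.2f}/hr")
```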

The wrong pattern: committing before the workload is stable. SP commitments are obligations to pay for the committed hourly rate whether you use it or not. Over-committing on a workload that turns out to use less than expected wastes the commitment.

How the SP Stack Applies

When you have multiple Savings Plans, AWS applies them in priority order:

EC2 Instance Savings Plans: first to apply for matching EC2 usage.
SageMaker AI Savings Plans: apply to SageMaker usage only.
Compute Savings Plans: apply to remaining qualifying EC2, Fargate, and Lambda usage.
On-demand: billing for any usage above the combined plan coverage.

The implication: SageMaker AI SPs and Compute SPs cover different scopes and stack additively. An organization with significant EC2 + Fargate + SageMaker spend should consider both products.
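A quick way to reason about scope is to treat each plan as a set of services it can cover and check what is left uncovered. The mapping below is a simplified sketch of the scopes described in this post, with hypothetical helper names; it is not an AWS API and ignores finer-grained eligibility rules.

```python
# Simplified scope map, in the priority order listed above.
PLAN_SCOPE = {
    "EC2 Instance Savings Plan": {"EC2"},
    "SageMaker AI Savings Plan": {"SageMaker"},
    "Compute Savings Plan": {"EC2", "Fargate", "Lambda"},
}


def plans_covering(service: str) -> list:
    """Plans whose scope includes the given service, in priority order."""
    return [plan for plan, services in PLAN_SCOPE.items() if service in services]


def uncovered_services(owned_plans, services_in_use) -> list:
    """Services in use that none of the owned plans can discount."""
    covered = set().union(*(PLAN_SCOPE[plan] for plan in owned_plans))
    return sorted(set(services_in_use) - covered)


if __name__ == "__main__":
    print(plans_covering("SageMaker"))
    print(uncovered_services(["Compute Savings Plan"], ["EC2", "Lambda", "SageMaker"]))
```

An account holding only a Compute Savings Plan shows SageMaker as uncovered, which is the two-plan trap in miniature.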

When to Commit and When to Stay On-Demand

Commit on stable production workloads; stay on-demand for variable / new / declining workloads.

Use when

Steady production inference endpoints with predictable traffic over a 12+ month horizon
Stable training pipelines running on a regular cadence (daily, weekly retraining)
Long-running production workloads where 1-year commit clearly fits the workload lifecycle
Mature ML programs with 12+ months of historical usage data to base commitment sizing on
3-year commitment only on workloads with 18+ months of stable history and clear roadmap continuation

Avoid when

New workloads without stable usage history — commit too early and pay for capacity not used
Research / experimentation workloads with intermittent or burstable usage
Workloads with high peak-to-average ratio (>4×) — commit to base only, on-demand for spikes
Workloads expected to decline or migrate to a different platform within the SP term
Workloads on instance types being deprecated by AWS — newer families often deliver better economics

Commit to the stable base of your SageMaker workload, not to peak. The remainder bills on-demand, which keeps it flexible and absorbs variability.
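The base-versus-peak point is easy to check numerically. The sketch below compares three strategies on a hypothetical day with a steady base and a short spike, assuming a flat 30% discount and a commitment measured at plan rates. All names and numbers are illustrative.

```python
def total_cost(hourly_on_demand, commitment: float, discount: float) -> float:
    """Total bill for a series of hours under a given hourly commitment.

    hourly_on_demand: usage per hour priced at on-demand rates (USD).
    commitment:       hourly commitment at plan rates (0 means pure on-demand).
    """
    covered = commitment / (1 - discount)  # on-demand value absorbed each hour
    return sum(commitment + max(0.0, usage - covered) for usage in hourly_on_demand)


def peak_to_average(hourly_on_demand) -> float:
    """Ratio of the highest hourly spend to the mean hourly spend."""
    return max(hourly_on_demand) / (sum(hourly_on_demand) / len(hourly_on_demand))


if __name__ == "__main__":
    usage = [10.0] * 20 + [50.0] * 4  # hypothetical: $10/hr base, 4-hour $50/hr spike
    d = 0.30
    print("peak-to-average:", round(peak_to_average(usage), 2))
    print("on-demand only: ", round(total_cost(usage, 0.0, d), 2))
    print("commit to base: ", round(total_cost(usage, 10.0 * (1 - d), d), 2))
    print("commit to peak: ", round(total_cost(usage, 50.0 * (1 - d), d), 2))
```

For this sample, committing to the base costs roughly $328 for the day, pure on-demand $400, and committing to the peak $840, because the peak-sized commitment is charged in all 24 hours but only used in four.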

A 30-Day SageMaker SP Evaluation Plan

Week 1 — Measure baseline. Pull SageMaker compute spend from Cost Explorer with hourly granularity over the last 90 days. Calculate the average steady-state hourly rate; identify peak-to-average ratio.
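A sketch of the Week 1 pull, written so the request and the parsing can be inspected without AWS credentials. The keyword arguments follow the shape of boto3's Cost Explorer `get_cost_and_usage` call. The service name "Amazon SageMaker" is an assumption, so check how the service is labeled in your own account. To our knowledge, hourly granularity in Cost Explorer is opt-in and only reaches back a limited number of days, so the sketch defaults to daily granularity; a full 90 days of hourly data may need the Cost and Usage Report instead.

```python
def build_request(start: str, end: str, granularity: str = "DAILY") -> dict:
    """Keyword arguments for boto3's Cost Explorer get_cost_and_usage call."""
    return {
        "TimePeriod": {"Start": start, "End": end},  # e.g. "2026-03-01", "2026-06-01"
        "Granularity": granularity,
        "Metrics": ["UnblendedCost"],
        # Assumed service label; verify it against your own billing data.
        "Filter": {"Dimensions": {"Key": "SERVICE", "Values": ["Amazon SageMaker"]}},
    }


def spend_series(response: dict) -> list:
    """Extract one cost figure per time bucket from the response."""
    return [
        float(bucket["Total"]["UnblendedCost"]["Amount"])
        for bucket in response["ResultsByTime"]
    ]


def baseline(series: list) -> dict:
    """Average spend per bucket and the peak-to-average ratio."""
    average = sum(series) / len(series)
    return {"average": average, "peak_to_average": max(series) / average}


# With credentials configured, the live call would look like:
#   import boto3
#   response = boto3.client("ce").get_cost_and_usage(**build_request(start, end))
#   print(baseline(spend_series(response)))
```

Large result sets are paginated, so a production version would also need to follow the response's page token.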

Week 2 — Model the savings. For 1-year and 3-year terms with each payment option, calculate the projected savings at 80% commitment of baseline. Compare against the workload’s planned lifecycle (will this workload still run in 12/36 months?).
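The Week 2 model can be as small as the sketch below. It assumes the illustrative discounts from the tier comparison earlier in this post (not quoted AWS rates), a flat discount on the covered portion, and that the uncovered 20% keeps billing on-demand. The function and dictionary names are hypothetical.

```python
HOURS_PER_MONTH = 730

# (term in months, illustrative discount) per option; not quoted AWS rates.
OPTIONS = {
    "1-year No-Upfront": (12, 0.30),
    "1-year Partial-Upfront": (12, 0.35),
    "1-year All-Upfront": (12, 0.38),
    "3-year All-Upfront": (36, 0.64),
}


def projected_savings(baseline_per_hour: float, discount: float, months: int,
                      coverage: float = 0.80):
    """Return (monthly_savings, savings_over_term) versus pure on-demand.

    Only the covered share of the baseline earns the discount; the rest
    bills on-demand either way and so contributes no savings.
    """
    covered = baseline_per_hour * coverage
    monthly = covered * discount * HOURS_PER_MONTH
    return monthly, monthly * months


if __name__ == "__main__":
    for name, (months, discount) in OPTIONS.items():
        monthly, total = projected_savings(10.0, discount, months)
        print(f"{name}: ${monthly:,.0f}/month, ${total:,.0f} over the term")
```

The term total only materializes if the workload keeps running at or above the covered rate for the whole term, which is why the lifecycle check belongs in the same step.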

Week 3 — Stack with existing SPs. Audit existing Compute Savings Plans (which do NOT cover SageMaker). Plan the SageMaker AI SP as an additive purchase; verify the combined coverage scope makes sense.

Week 4 — Purchase and monitor. Purchase the chosen SP. Monitor SP utilization in the Savings Plans console for the first 30 days. Adjust if actual usage diverges from the projected baseline.
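For the monitoring step, a simple check is to flag days on which a meaningful share of the commitment went unused. The sketch below works on plain numbers copied from the Savings Plans utilization view. The 95% threshold and the tuple layout are assumptions for illustration.

```python
def flag_low_utilization(daily, threshold: float = 0.95) -> list:
    """Return (day, utilization) for days below the utilization threshold.

    daily: iterable of (day, used_commitment, total_commitment) in USD.
    """
    flagged = []
    for day, used, total in daily:
        if total <= 0:
            raise ValueError(f"total commitment must be positive on {day}")
        utilization = used / total
        if utilization < threshold:
            flagged.append((day, round(utilization, 3)))
    return flagged


if __name__ == "__main__":
    sample = [
        ("2026-06-01", 168.0, 168.0),  # fully used
        ("2026-06-02", 120.0, 168.0),  # roughly 71% used
    ]
    print(flag_low_utilization(sample))
```

A run of flagged days in the first month suggests the baseline was overestimated, which is a signal to hold off on layering any further commitment.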

What This Post Doesn’t Cover

SageMaker workload-specific optimization (Spot training, instance right-sizing) — covered in our SageMaker training cost-efficiency guide.
Bedrock pricing comparison in depth — covered in our Bedrock cost optimization.
EC2 Reserved Instances vs Savings Plans decision for non-SageMaker workloads — covered in our RIs vs Savings Plans guide.
Multi-region SP behavior — SPs apply globally; covered in our FinOps content.

If You Only Do One Thing This Week

Audit your SageMaker spend in Cost Explorer. If it is meaningful (above $5K/month), steady, and currently billed on-demand, the SageMaker AI Savings Plan break-even is usually in your favor: typical first-year savings are 30–45% on the steady-state base. Start with a conservative 1-year No-Upfront commitment at 80% of baseline and re-evaluate after 90 days. The SageMaker-specific SP is one of the easiest cost-optimization wins on AWS when SageMaker is a meaningful line, and one of the most commonly missed because teams assume Compute Savings Plans cover it.

For the broader commitment-purchasing strategy across EC2, Fargate, Lambda, RDS, and SageMaker, the Reserved Instances vs Savings Plans decision guide covers the full landscape.
