Prometheus Cardinality Explosion on AWS: AMP, EMF, and Cost-Aware Metrics

DevOps & CI/CD Palaniappan P 2 min read

Quick summary: That `user_id` label on every HTTP metric turns Amazon Managed Prometheus into a five-figure line item. This guide explains cardinality mechanics, EMF vs remote write, and Application Signals defaults worth disabling.

Key Takeaways

  • That `user_id` label on every HTTP metric turns Amazon Managed Prometheus into a five-figure line item
  • Amazon Managed Prometheus (AMP) pricing (June 2026) scales with metrics ingested and stored, so cardinality translates directly into dollars
  • Benchmark pattern, OTel demo workload on EKS: enabling `http.route` with raw URL paths pushed active series from 12k to 890k in 6 h; AMP estimate +$2,400/mo
  • Relabeling to template routes (`/users/{id}`) brought active series back down to 14k
  • See observability beyond CloudWatch for stack wiring

Amazon Managed Prometheus (AMP) pricing (June 2026) scales with metrics ingested and stored, so cardinality translates directly into dollars. A single histogram whose path label includes UUIDs can create millions of active series within hours.

Benchmark pattern, OTel demo workload on EKS: enabling `http.route` with raw URL paths pushed active series from 12k to 890k in 6 h, an estimated +$2,400/mo on AMP. Relabeling to template routes (`/users/{id}`) brought active series back down to 14k. See Observability Beyond CloudWatch for stack wiring.
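To make the route-templating idea concrete, here is a minimal Python sketch of the kind of normalization that turns raw paths into templates. It is not ADOT configuration and not the setup used in the benchmark; the function name `template_route` and the two ID patterns (UUIDs and all-digit segments) are assumptions chosen for illustration.

```python
import re

# Hypothetical ID patterns; adjust to the ID formats your routes actually use.
_UUID = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)
_NUMERIC = re.compile(r"^\d+$")


def template_route(path: str) -> str:
    """Replace ID-like path segments with a fixed {id} placeholder."""
    path = path.split("?", 1)[0]  # query strings never belong in a label
    segments = path.split("/")
    return "/".join(
        "{id}" if (_UUID.match(s) or _NUMERIC.match(s)) else s for s in segments
    )


raw_paths = [
    "/users/42",
    "/users/1337",
    "/users/3f2b8c1e-9d4a-4b6f-8a77-0c5e2d1f9a10",
    "/users/42/orders/7",
    "/healthz",
]
templated = sorted({template_route(p) for p in raw_paths})
print(len(set(raw_paths)), "raw label values ->", len(templated), "templated")
print(templated)
```

Five raw values collapse to three templates here. With real traffic the raw side grows with every new ID, while the templated side stays bounded by the number of routes the service defines.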

Mechanism

Prometheus identifies a time series by its metric name plus its label set. Each unique combination is a separate series that carries its own storage and query cost. High-cardinality labels (IDs) multiply the series count combinatorially with every other label (status, method, pod).
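The multiplication can be checked with a few lines of arithmetic. The label counts below are illustrative assumptions, not measurements from the benchmark, and the product is an upper bound because not every combination necessarily occurs in practice.

```python
from math import prod


def series_count(label_cardinalities: dict[str, int]) -> int:
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(label_cardinalities.values())


# Illustrative numbers only (assumed, not taken from the article's benchmark).
baseline = {"method": 5, "status": 6, "pod": 20}
with_user_id = {**baseline, "user_id": 10_000}

print(series_count(baseline))      # 600
print(series_count(with_user_id))  # 6000000
```

For a classic Prometheus histogram the figure grows further, since each label combination emits one series per bucket plus the `_sum` and `_count` series.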

AWS controls

Approach | Service | Use when
Managed backend | AMP + AMG | EKS/ECS metrics at scale
Embedded metrics | CloudWatch EMF | Lambda/custom apps without scrape
SLO-native | Application Signals | Service golden signals; watch auto-discovered ops
Cost guard | Metric filters + alarms on IncomingLogEvents / AMP workspace limits | FinOps gate

Opinionated take: Relabel at the collector (ADOT) before remote_write. Do not try to fix cardinality in Grafana dashboards, because by the time a series reaches a dashboard it has already been ingested and billed.
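As a rough model of what relabeling before export achieves, the Python sketch below strips forbidden labels and sums the samples that then share a label set. It is a stand-in for collector behavior, not ADOT syntax; `relabel_and_aggregate` and the sample data are hypothetical.

```python
from collections import defaultdict

FORBIDDEN_LABELS = {"user_id", "trace_id"}  # the labels named in this guide


def relabel_and_aggregate(samples, forbidden=FORBIDDEN_LABELS):
    """Drop forbidden labels, then sum values that now share one label set."""
    totals = defaultdict(float)
    for name, labels, value in samples:
        kept = tuple(sorted((k, v) for k, v in labels.items() if k not in forbidden))
        totals[(name, kept)] += value
    return dict(totals)


# Hypothetical input: one request counted per user, 1000 distinct users.
samples = [
    ("http_requests_total", {"method": "GET", "status": "200", "user_id": f"u{i}"}, 1.0)
    for i in range(1000)
]
exported = relabel_and_aggregate(samples)
print(len(samples), "series before ->", len(exported), "after")
```

The aggregation step matters: removing a label from series that were previously distinct leaves several samples with an identical label set, so they have to be combined rather than simply passed through. Not emitting the label at instrumentation time avoids the problem entirely.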

When this advice breaks

  • Short-lived batch jobs: high-churn series may be acceptable if retention is 24h and jobs are few.
  • Debugging incidents: a temporary high-cardinality scrape is OK with a documented TTL and owner.

What to do this week

  1. Export the top 20 labels by series count from AMP, or by label_values sampling in Prometheus.
  2. Add drop/labelmap processors to the ADOT config for forbidden labels (user_id, trace_id).
  3. Set a CloudWatch alarm on AMP DiscardedSamples or on a spike in the workspace ingestion rate.
  4. Pair this with the log sampling guide (part 3 of this track).
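For step 1, one way to rank labels once you have an export of series label sets is sketched below in Python. It counts distinct values per label as a proxy for each label's contribution to series count. The input shape (a list of label dictionaries) and the function name `top_labels` are assumptions; how you obtain the export depends on your tooling.

```python
from collections import defaultdict


def top_labels(series, n=20):
    """Rank label names by number of distinct values seen across the series."""
    values = defaultdict(set)
    for labels in series:
        for name, value in labels.items():
            values[name].add(value)
    ranked = sorted(
        ((name, len(vals)) for name, vals in values.items()),
        key=lambda item: (-item[1], item[0]),  # most values first, then by name
    )
    return ranked[:n]


# Hypothetical export: 50 series from one metric.
series = [
    {"method": "GET", "status": "200", "user_id": f"u{i}", "pod": f"pod-{i % 3}"}
    for i in range(50)
]
for name, distinct in top_labels(series, n=3):
    print(name, distinct)
```

Labels at the top of this ranking with values that look like IDs are the first candidates for the forbidden list in step 2.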

What this guide doesn’t cover

Distributed tracing propagation; see the part 1 OTel guide in this track.

Palaniappan P

AWS Cloud Architect & AI Expert

AWS-certified cloud architect and AI expert with deep expertise in cloud migrations, cost optimization, and generative AI on AWS.


Recommended Reading


Observability Beyond CloudWatch (2026): When to Add Application Signals, ADOT, Managed Prometheus, and Grafana — and When Not To

The reflex to bolt Amazon Managed Prometheus + Grafana onto every workload is how observability bills quietly double. CloudWatch Application Signals now gives you an auto-discovered service map, SLOs, and traces with near-zero setup; AMP only earns its keep when you are PromQL-native or drowning in high-cardinality metrics — where ingestion (not retention) is the cost driver. Here is the decision matrix, an ADOT dual-export config, and the three levers that actually cut the AMP bill.
