Skip to main content

AI & assistant-friendly summary

This section provides structured content for AI assistants and search engines. You can cite or summarize it when referencing this page.

Summary

CloudWatch Logs Insights bills $0.005 per GB scanned and high-cardinality custom metrics multiply costs. Cardinality budgets, sampling rules, and FinOps fixes.

Key Facts

  • CloudWatch Logs Insights bills $0
  • 005 per GB scanned and high-cardinality custom metrics multiply costs
  • On May 8, 2026, Amazon CloudWatch observability bills bite hardest where teams least audit them: Logs Insights at $0
  • 005 per GB scanned (US East list pricing) and custom metrics that explode when a per-user-ID dimension lands on a hot path
  • This guide is the FinOps view of AWS observability — cardinality budgets, sampling rules, and query hygiene that keep CloudWatch and OpenTelemetry costs predictable

Entity Definitions

Lambda
Lambda is an AWS service discussed in this article.
CloudWatch
CloudWatch is an AWS service discussed in this article.
Amazon CloudWatch
Amazon CloudWatch is an AWS service discussed in this article.
cost optimization
cost optimization is a cloud computing concept discussed in this article.
compliance
compliance is a cloud computing concept discussed in this article.

AWS Observability Costs: Cardinality Budgets & FinOps Limits

Quick summary: CloudWatch Logs Insights bills $0.005 per GB scanned and high-cardinality custom metrics multiply costs. Cardinality budgets, sampling rules, and FinOps fixes.

Key Takeaways

  • CloudWatch Logs Insights bills $0
  • 005 per GB scanned and high-cardinality custom metrics multiply costs
  • On May 8, 2026, Amazon CloudWatch observability bills bite hardest where teams least audit them: Logs Insights at $0
  • 005 per GB scanned (US East list pricing) and custom metrics that explode when a per-user-ID dimension lands on a hot path
  • This guide is the FinOps view of AWS observability — cardinality budgets, sampling rules, and query hygiene that keep CloudWatch and OpenTelemetry costs predictable
AWS Observability Costs: Cardinality Budgets & FinOps Limits
Table of Contents

On May 8, 2026, Amazon CloudWatch observability bills bite hardest where teams least audit them: Logs Insights at $0.005 per GB scanned (US East list pricing) and custom metrics that explode when a per-user-ID dimension lands on a hot path. A “cheap” wide query scheduled every minute, or a PutMetricData storm on a high-cardinality dimension, becomes a five-figure monthly line item before finance reads the dashboard.

This guide is the FinOps view of AWS observability — cardinality budgets, sampling rules, and query hygiene that keep CloudWatch and OpenTelemetry costs predictable. For the broader CloudWatch capability surface (metrics, logs, alarms, Application Signals), see CloudWatch observability best practices. For full FinOps remediation across compute and observability, see our AWS cloud cost optimization services.

Reproduce this — Paste starter queries from examples/architecture-blog-2026/observability/logs-insights-queries.txt into Logs Insights; replace log groups and trace IDs.

The three cost surfaces — metrics, logs, traces

  • Custom metrics bill per dimension combination. Adding user_id to a metric with 500k DAU creates 500k time series — that is not a metric, that is a database.
  • Logs Insights bills per GB scanned ($0.005/GB US East, May 2026). A SELECT * across 30 log groups for “the last hour” is bigger than people think.
  • Traces bill per-span ingested in AWS X-Ray; without sampling rules, instrument-everything day one is the most expensive learning experience your team will have.

Cardinality budgets

Cardinality is the only observability concept where a single mis-applied tag silently triples your bill.

  • Set a budget per service: max distinct dimension combinations per metric, max distinct attributes per trace span. Review weekly.
  • Prefer aggregated business metrics with bounded dimensions (region, env, tier) over per-user-ID metrics. Use traces for per-user debugging, not metrics.
  • Use embedded metric format (EMF) carefully — EMF generates metrics from log lines, which means a runaway log can mint runaway metrics.

What broke — A “temporary” debugging query (SELECT * across 30 log groups) landed in a CI cron for weeks. Finance flagged an $18k monthly delta — logs were “cheap” until automation scaled humans out of the loop. Fix: query linting, required group filters, and budget alarms per log product.

Sampling — the lever most teams skip

  • Head-based sampling (sample at trace start) is simple and predictable; default to it before tail-based.
  • Tail-based sampling requires a collector tier (AWS Distro for OpenTelemetry, OTel Collector). Useful for keeping all error traces while dropping normal ones — if the rules do not accidentally keep 100% of “normal” anyway.
  • Validate trace counts against billing weekly. Misconfigured rules look correct in code and silently re-include the world.

OpenTelemetry adoption pacing

Half-instrumented OpenTelemetry can cost more than central AWS X-Ray with known sampling rules. Phase adoption: critical paths first, tail-based sampling second, broad rollout last. For chaos-driven OTel learning, see OTel Demo Game.

Monitoring collects continuously; alerting routes to humans. The cost link: every alarm that fires more than twice a week without an action update is a candidate for deletion or re-tiering. Noisy alarms keep their underlying metric pipelines hot and humans paged.

Opinionated take — If an alert fires more than twice a week without an action checklist update, delete or re-tier the alarm — noisy paging is a security incident waiting for ignored fatigue.

Training wheels: real cost of skipping 24/7 monitoring quantifies the business risk of silent failure. Operational mantra from our distributed debugging guide: alert on metrics, triage with traces, diagnose with logs.

FinOps coupling

Observability spend belongs in the same review as compute. Cross-references: engineering cost ownership and logging cost deep dive.

What This Post Doesn’t Cover

  • Third-party APM pricing negotiations — vendor-specific.
  • Security analytics in Security Lake vs CloudWatch — different retention/compliance story (Security Lake guide when OCSF normalization matters).
  • Capability deep dive on metrics, logs, alarms, Application Signals — that lives in CloudWatch observability best practices.

If You Only Do One Thing

Set a per-service cardinality budget and review it weekly with finance. Every other observability cost lever flows from that one number.

What to Do This Week

  1. Inventory scheduled Logs Insights queries; delete or constrain time windows.
  2. Audit every PutMetricData call for per-user-ID or per-request-ID dimensions; remove or move to traces.
  3. Verify W3C trace context crosses service boundaries in staging (not just Lambda console traces).
  4. Pair every new alarm with a runbook link or delete the alarm — no orphan pages.

When resilience patterns intersect telemetry, read retries, circuits, and graceful shutdown.

PP
Palaniappan P

AWS Cloud Architect & AI Expert

AWS-certified cloud architect and AI expert with deep expertise in cloud migrations, cost optimization, and generative AI on AWS.

AWS ArchitectureCloud MigrationGenAI on AWSCost OptimizationDevOps

Ready to discuss your AWS strategy?

Our certified architects can help you implement these solutions.

Recommended Reading

Explore All Articles »
14 min

How to Migrate to AWS Without Cost Surprises

AWS migration cost estimates are consistently wrong — not because the tools are bad, but because they miss the parallel run period, data transfer during migration, and the operational tax of learning a new environment. Here is what to actually model.

5 min

AWS Global Accelerator vs CloudFront & Route 53 (2026)

Global Accelerator charges about $0.025 per provisioned accelerator per hour—even while disabled—and adds Data Transfer-Premium on top of normal data transfer. Two static Anycast IPv4 addresses (or four addresses in dual-stack: two IPv4 and two IPv6) front ALBs, NLBs, EC2, or EIPs across Regions; that pricing model changes whether you beat CloudFront or Route 53 latency records alone.