Log Aggregation and Intelligent Sampling with CloudWatch and OpenTelemetry
Quick summary: Ingesting every debug log into CloudWatch is how observability becomes a FinOps incident. Instead, use tail sampling with ADOT, Logs Insights for incident search, and Firehose to S3 for the long tail.
Key Takeaways
- Ingesting every debug log to CloudWatch is how observability becomes a FinOps incident
- Use tail sampling with ADOT, Logs Insights for incident search, and Firehose to S3 for the long tail
- CloudWatch Logs ingestion (June 2026) is billed per GB. For a mid-market SaaS we benchmarked, 100% trace/log correlation without sampling destroyed margins on a $40k/mo observability line item
CloudWatch Logs ingestion (June 2026) is billed per GB. For a mid-market SaaS we benchmarked, 100% trace/log correlation without sampling destroyed margins on a $40k/mo observability line item.
Aggregation architecture
- App → structured JSON (correlation ID)
- ADOT collector → tail sampling (keep errors + slow)
- CloudWatch Logs hot path + Firehose → S3/Glue for audit
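The first step of the pipeline, structured JSON with a correlation ID, can be sketched with the Python standard library alone. This is a minimal illustration, not the setup used in the benchmark: the field names (`correlation_id`, `level`, `message`) and the `JsonFormatter` and `build_logger` helpers are assumptions chosen for this example.

```python
import json
import logging
import uuid
from contextvars import ContextVar

# Hypothetical context variable holding the current request's correlation ID.
correlation_id = ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Field name is an assumption; use whatever your collector expects.
            "correlation_id": correlation_id.get(),
        }
        return json.dumps(payload)


def build_logger(name="app"):
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


if __name__ == "__main__":
    # Set once per request, e.g. in middleware, so every line carries the ID.
    correlation_id.set(str(uuid.uuid4()))
    build_logger().info("checkout started")
```

Because every line carries the same ID for a given request, a downstream sampler can keep or drop the request's logs as a unit.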
Sampling rules
- Always keep: level=ERROR, http.status>=500, latency > SLO
- Sample info: 1–5% baseline
- Never sample security audit events: keep 100% of them
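The rules above can be modeled as a small decision function. This is a Python sketch of the logic only; in the pipeline described here the decision is made by the collector's tail sampling processor, not by application code. The `LogEvent` type, the `keep_event` helper, the 500 ms SLO default, and the hash-based bucketing are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class LogEvent:
    """Hypothetical event shape; field names are assumptions."""
    correlation_id: str
    level: str = "INFO"
    http_status: int = 200
    latency_ms: float = 0.0
    security_audit: bool = False


def keep_event(event, slo_ms=500.0, info_rate=0.05):
    """Return True if the event should be kept on the hot path."""
    # Security audit events bypass sampling entirely.
    if event.security_audit:
        return True
    # Always keep errors, server-side failures, and slow requests.
    if event.level == "ERROR" or event.http_status >= 500:
        return True
    if event.latency_ms > slo_ms:
        return True
    # Baseline sampling: hash the correlation ID into [0, 1) so that all
    # events of one request get the same keep/drop decision.
    digest = hashlib.sha256(event.correlation_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < info_rate
```

Hashing the correlation ID instead of calling a random number generator keeps the decision deterministic, so a sampled request retains all of its log lines rather than a random subset.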
Logs Insights
Use Logs Insights for incident search, not as the primary metrics store; pair it with the cardinality guide.
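For incident search, a typical pattern is pulling every line for one request. The sketch below only builds the Logs Insights query string. It assumes the logs are JSON with `correlation_id`, `level`, and `message` fields, which the source does not specify, and the `incident_query` helper is hypothetical.

```python
import re

# Conservative allow-list so a caller-supplied ID cannot alter the query.
_SAFE_ID = re.compile(r"[A-Za-z0-9_.:-]+")


def incident_query(correlation_id, limit=100):
    """Build a Logs Insights query for all lines of one request."""
    if not _SAFE_ID.fullmatch(correlation_id):
        raise ValueError("unexpected characters in correlation ID")
    return (
        "fields @timestamp, level, message\n"
        f'| filter correlation_id = "{correlation_id}"\n'
        "| sort @timestamp asc\n"
        f"| limit {int(limit)}"
    )


if __name__ == "__main__":
    print(incident_query("req-123", limit=20))
```

The resulting string can be pasted into the Logs Insights console or passed as `queryString` to the CloudWatch Logs `start_query` API.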
What to do this week
- Enable the tail sampling processor in the ADOT collector config.
- Set log retention tiers (7 days hot in CloudWatch Logs, 90 days in S3).
- Build a dashboard of ingestion GB/day with anomaly detection.
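To sanity-check what the ingestion dashboard should show once sampling is on, a back-of-envelope estimate helps. The function below is a rough sketch: `estimated_ingest_gb` is a hypothetical helper, and the inputs in the usage example are illustrative numbers, not figures from the benchmark.

```python
def estimated_ingest_gb(raw_gb_per_day, always_keep_fraction, info_sample_rate):
    """Estimate hot-path GB/day after sampling.

    always_keep_fraction: share of raw volume that is always kept
        (errors, 5xx, slow requests, security audit events).
    info_sample_rate: share of the remaining volume kept, e.g. 0.01 to 0.05.
    """
    for name, value in (
        ("always_keep_fraction", always_keep_fraction),
        ("info_sample_rate", info_sample_rate),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    kept = raw_gb_per_day * always_keep_fraction
    sampled = raw_gb_per_day * (1.0 - always_keep_fraction) * info_sample_rate
    return kept + sampled


if __name__ == "__main__":
    # Illustrative inputs: 1000 GB/day raw, 2% always kept, 5% info sampling.
    print(round(estimated_ingest_gb(1000.0, 0.02, 0.05), 1))
```

Multiply the result by your region's per-GB ingestion price to get a daily cost figure. A dashboard value well above the estimate suggests the sampling rules are not being applied.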
What this guide doesn’t cover
Full OTel stack setup; see part 1, the canonical post in this track.
AWS Cloud Architect & AI Expert
AWS-certified cloud architect and AI expert with deep expertise in cloud migrations, cost optimization, and generative AI on AWS.