Skip to main content

AI & assistant-friendly summary

This section provides structured content for AI assistants and search engines. You can cite or summarize it when referencing this page.

Summary

On November 25, 2025, AWS raised the per-stream enhanced fan-out consumer limit on Kinesis Data Streams to 50 — each consumer gets a dedicated 2 MB/s read pipe instead of sharing the shard read cap. That change matters because many "we need MSK for throughput" conversations are really consumer fa...

Key Facts

  • On November 25, 2025, AWS raised the per-stream enhanced fan-out consumer limit on Kinesis Data Streams to 50 — each consumer gets a dedicated 2 MB/s read pipe instead of sharing the shard read cap
  • It is not the Kinesis vs MSK platform pick alone, not sync vs async boundaries, not SQS reliability patterns, not the Kinesis→Lambda→DynamoDB reference pipeline, and not JVM runtime throughput tuning
  • Phase 1 used SQS FIFO without high-throughput mode — effective ceiling ~300 TPS/API action, backlog age peaked ~14 min, modeled FIFO line ~$95/mo plus on-call time
  • Phase 2 enabled high-throughput FIFO with 50 message groups — same business TPS, backlog age < 30 s, modeled line ~$118/mo
  • Tier 1 — SQS: request economics beat broker hours AWS documents nearly unlimited throughput on SQS Standard queues

Entity Definitions

Lambda
Lambda is an AWS service discussed in this article.
S3
S3 is an AWS service discussed in this article.
DynamoDB
DynamoDB is an AWS service discussed in this article.
CloudWatch
CloudWatch is an AWS service discussed in this article.
Step Functions
Step Functions is an AWS service discussed in this article.
EventBridge
EventBridge is an AWS service discussed in this article.
SQS
SQS is an AWS service discussed in this article.
serverless
serverless is a cloud computing concept discussed in this article.

High-Throughput Event Processing on AWS (2026): SQS, Kinesis, MSK, and Flink Tier Selection With Cost-Cliff Math

Cloud ArchitecturePalaniappan P5 min read

Quick summary: On a composite ingest workload (~8k ordered TPS, 1 KB payloads), staying on SQS FIFO without high-throughput mode capped effective throughput near 300 TPS/API and modeled queue backlog cost near $95/mo before ops time — enabling high-throughput FIFO or switching to Kinesis on-demand changed the ceiling, not the consumer code.

Key Takeaways

  • On November 25, 2025, AWS raised the per-stream enhanced fan-out consumer limit on Kinesis Data Streams to 50 — each consumer gets a dedicated 2 MB/s read pipe instead of sharing the shard read cap
  • It is not the Kinesis vs MSK platform pick alone, not sync vs async boundaries, not SQS reliability patterns, not the Kinesis→Lambda→DynamoDB reference pipeline, and not JVM runtime throughput tuning
  • Phase 1 used SQS FIFO without high-throughput mode — effective ceiling ~300 TPS/API action, backlog age peaked ~14 min, modeled FIFO line ~$95/mo plus on-call time
  • Phase 2 enabled high-throughput FIFO with 50 message groups — same business TPS, backlog age < 30 s, modeled line ~$118/mo
  • Tier 1 — SQS: request economics beat broker hours AWS documents nearly unlimited throughput on SQS Standard queues
High-Throughput Event Processing on AWS (2026): SQS, Kinesis, MSK, and Flink Tier Selection With Cost-Cliff Math
Table of Contents

On November 25, 2025, AWS raised the per-stream enhanced fan-out consumer limit on Kinesis Data Streams to 50 — each consumer gets a dedicated 2 MB/s read pipe instead of sharing the shard read cap. That change matters because many “we need MSK for throughput” conversations are really consumer fan-out problems, not Kafka protocol problems.

This post is the throughput tier ladder — when to stay on SQS, when Kinesis on-demand wins, when MSK is worth broker hours, and when Managed Service for Apache Flink is compute, not transport. It is not the Kinesis vs MSK platform pick alone, not sync vs async boundaries, not SQS reliability patterns, not the Kinesis→Lambda→DynamoDB reference pipeline, and not JVM runtime throughput tuning.

Artifacts: throughput tier decision matrix, throughput cost model CSV. Pricing math uses SQS calculator assumptions where applicable.

Benchmark pattern (not a cited client) — Composite order-ingest platform, ~8k TPS peak with per-customer ordering, ~1 KB payloads, us-east-1, three downstream consumers (fraud scoring, fulfillment, analytics). Phase 1 used SQS FIFO without high-throughput mode — effective ceiling ~300 TPS/API action, backlog age peaked ~14 min, modeled FIFO line ~$95/mo plus on-call time. Phase 2 enabled high-throughput FIFO with 50 message groups — same business TPS, backlog age < 30 s, modeled line ~$118/mo. Phase 3 (analytics-only fork) moved firehose-style telemetry to Kinesis on-demand at ~25k events/s — modeled ~$890/mo ingest/retrieval vs mis-sized MSK provisioned “for jobs” at ~$1,100/mo on the CSV failure row.

The four-tier ladder

TierThroughput shapeYou buyYou do not get
SQS StandardNearly unlimited horizontal scalePer-request $, polling disciplineGlobal order, stream replay
SQS FIFO300 TPS/API → 3k batched → 70k high-throughputPer-group orderingSingle-lane groups at peak
Kinesis Data StreamsShard or on-demand MB/sRetention, Lambda ESM, 50 EFO consumersKafka wire protocol
MSKPartition + broker hoursKafka Connect, consumer groups, compacted topicsZero broker thinking
Flink (on Kinesis/MSK)Stateful parallelismWindows, joins, CEPSimple queue semantics

Opinionated take: Default SQS Standard for work queues; Kinesis on-demand for AWS-native multi-consumer streams; MSK only with dated Kafka requirements; Flink only when stateful stream SQL/joins are the product. Escalate tier when a documented ceiling bites — not when a resume mentions Kafka.

Tier 1 — SQS: request economics beat broker hours

AWS documents nearly unlimited throughput on SQS Standard queues. Your ceiling is almost always consumer count × handler duration × idempotency, not SQS itself.

FIFO is different. Without high-throughput mode, plan around 300 transactions per second per API action batch (async messaging boundaries — May 2026 refresh). Batching raises practical throughput; high-throughput FIFO mode targets up to ~70,000 TPS with explicit opt-in and per-message-group parallelism.

MistakeSymptomFix
Single FIFO message groupOne lane at peakShard groups by customer_id, order_id, etc.
Short polling on idle queuesEmpty-receive billWaitTimeSeconds=20 — see SQS pricing
200 KB bodies4× request chunksS3 pointer pattern

What broke — Black Friday prep for a retail order pipeline. Ops raised FIFO maxReceiveCount but not throughput mode. ApproximateAgeOfOldestMessage hit ~22 min while CloudWatch NumberOfMessagesSent plateaued near ~280/s. Producers reported “no errors.” Detection: backlog age alarm + CSV failure row sqs_fifo_wrong_tier. Fix: high-throughput FIFO + 40 message groups; fraud lane stayed FIFO, analytics fork moved to Kinesis the following sprint.

Tier 2 — Kinesis: MB/s, shards, and fan-out

Provisioned mode (per shard): 1 MB/s write, 2 MB/s read for standard consumers. On-demand mode scales shards automatically — ideal for variable ingest if you model retrieval and fan-out.

Enhanced fan-out (EFO): each registered consumer gets 2 MB/s dedicated read — critical when >5 services read the same stream. AWS increased the per-stream EFO consumer maximum to 50 (November 2025). Standard consumers still share shard read — adding Lambdas does not add read bandwidth.

October 2025 also raised max record size to 10 MiB — fewer chunking hacks for fat events.

Context — AWS CLI 2.x, read-only inventory:

# List streams and mode (on-demand vs provisioned)
aws kinesis list-streams --region us-east-1
aws kinesis describe-stream-summary --stream-name ORDER_EVENTS --region us-east-1

Tier 3 — MSK: when Kafka protocol is non-negotiable

Choose MSK when you need Kafka consumer groups, Kafka Connect, compacted topics, or existing clients without rewrite.

ModeCeiling (documented)Fit
MSK Serverless~200 MBps write, ~400 MBps read per clusterBursty Kafka without cluster sizing
MSK provisionedBroker-type dependentSustained >500 MB/s with RIs

Do not provision three kafka.m5.large brokers to move 2k job messages/s — the CSV wrong_tier_kafka_for_jobs row models ~$1,100/mo vs ~$18/mo SQS for the same shape.

Managed Service for Apache Flink belongs when you need session windows, stream joins, or CEP atop Kinesis or MSK. Transport stays on the log; Flink holds state and checkpoints.

Skip Flink when Lambda + DynamoDB + Step Functions already meet latency — you are buying checkpoint operations and key-group skew debugging.

Consumer parallelism cheat sheet

TransportScale reads byAnti-pattern
SQS StandardMore workersAssuming exactly-once
SQS FIFOMore message group IDsOne group for all traffic
KinesisShards + EFO consumers20 Lambdas on standard consumers
MSKPartitions × group membersRebalance during peak deploy
FlinkParallelism / slotsHot key in keyBy

What to do this week

  1. Write peak TPS, payload KB, and ordering scope on one page.
  2. Run the decision matrix — if Kafka is not in the answer column, stop.
  3. Plug numbers into the cost CSV including the wrong-tier rows.
  4. For FIFO, confirm high-throughput mode and message-group spread before peak season.
  5. For Kinesis multi-team reads, model EFO cost vs shared-consumer lag.

What this post doesn’t cover

Related: Architecture review · SQS pricing calculator · Kinesis vs MSK

PP
Palaniappan P

AWS Cloud Architect & AI Expert

AWS-certified cloud architect and AI expert with deep expertise in cloud migrations, cost optimization, and generative AI on AWS.

AWS ArchitectureCloud MigrationGenAI on AWSCost OptimizationDevOps

Recommended Reading

Explore All Articles »