When should we stay on SQS Standard instead of Kinesis or MSK?

Stay on SQS Standard when you need a job queue or work buffer without strict global ordering, sustained ingest is under roughly 50k messages per second with idempotent consumers, and you do not need stream retention or replay semantics beyond DLQ redrive. SQS bills per request with nearly unlimited horizontal throughput on Standard queues — you pay for polling discipline and 64 KB chunking, not broker hours. Escalate only when ordering per entity, stream retention, or MB/s-level fan-out forces a streaming primitive.

When should we NOT choose Amazon MSK for high throughput?

Skip MSK when you have no Kafka clients, Connect jobs, or compacted topics — broker hours for "future Kafka" burn budget on workloads SQS or Kinesis already solve. MSK Serverless caps at roughly 200 MBps write and 400 MBps read per cluster; above sustained hundreds of MB/s, provisioned MSK with Reserved Instances usually wins, but only if Kafka protocol is a hard requirement. If the team cannot operate consumer group rebalances and partition planning, MSK is the wrong first tier regardless of TPS.

What breaks when FIFO throughput limits are ignored?

SQS FIFO without high-throughput mode is commonly planned around 300 transactions per second per API action (about 3,000 messages per second with 10-message batches). Producers and consumers can appear healthy while backlog age grows, because throttling tends to surface as SDK retries and rising latency rather than as obvious errors to every caller. Symptoms: ApproximateAgeOfOldestMessage climbing during peak, downstream lag measured in minutes, and finance seeing FIFO request charges without matching business throughput. Fix: enable high-throughput FIFO (up to roughly 70k TPS per API action, Region dependent), add message group parallelism, or move ordered streams to Kinesis shards.
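The backlog-age symptom above can be turned into an alarm. The following is a minimal sketch that only builds the arguments for CloudWatch `put_metric_alarm`; the queue name `orders.fifo`, the 300-second threshold, and the five-minute evaluation window are illustrative assumptions, not values from this post.

```python
def backlog_age_alarm(queue_name: str, threshold_seconds: int = 300) -> dict:
    """Build keyword arguments for a CloudWatch alarm on SQS backlog age.

    Queue name and threshold are hypothetical; tune them to your own SLO.
    """
    return {
        "AlarmName": f"{queue_name}-backlog-age",
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateAgeOfOldestMessage",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "Statistic": "Maximum",
        "Period": 60,                 # one-minute datapoints
        "EvaluationPeriods": 5,       # must breach for five consecutive minutes
        "Threshold": float(threshold_seconds),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
    }


params = backlog_age_alarm("orders.fifo", threshold_seconds=300)

# With AWS credentials configured, the alarm could be created like this:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Keeping the parameter builder separate from the API call makes the alarm definition easy to review and unit test before it touches an account.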

How does Kinesis enhanced fan-out change consumer scaling?

Standard Kinesis consumers share each shard's 2 MB/s read pipe, so adding consumers does not add read bandwidth. Enhanced fan-out gives each registered consumer a dedicated 2 MB/s HTTP/2 pipe per shard; AWS increased the per-stream fan-out consumer limit to 50 in November 2025. Use fan-out when many independent services read the same stream without competing for shard read capacity. You pay per consumer-shard hour for fan-out, so model it in the cost CSV before enabling it on every microservice.

Where does Managed Service for Apache Flink fit?

Flink is compute for stateful windows, joins, and complex event processing — not a transport replacement for SQS. Put Kinesis or MSK underneath as the durable log; run Flink when aggregations, sessionization, or late-arriving event handling exceed Lambda plus DynamoDB patterns. Do not adopt Flink for simple map-and-write pipelines — operational surface (checkpoints, state backends, parallelism tuning) is justified only when business logic needs continuous state.

What could go wrong after tier escalation?

Hot keys: a single Kinesis partition key or SQS FIFO message group becomes one lane at peak. Consumer rebalance: deploying MSK consumers during peak triggers consumer group rebalances that stall consumption while partitions are reassigned. Cost cliff: Kinesis on-demand is cheap at low MB/s but expensive at sustained high MB/s compared with right-sized provisioned shards or MSK with RIs. Always load-test the new tier with production key distribution, not uniform random keys.
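One way to check key distribution before a load test is to measure how much traffic the hottest key carries in a sample of production keys. This is a minimal sketch; the sample keys are invented for illustration.

```python
from collections import Counter


def hottest_key_share(keys):
    """Return (key, fraction_of_traffic) for the most frequent key in a sample."""
    counts = Counter(keys)
    if not counts:
        return None, 0.0
    key, hits = counts.most_common(1)[0]
    return key, hits / sum(counts.values())


# Hypothetical sample: one customer produces 6 of 10 events.
sample = ["cust-1"] * 6 + ["cust-2"] * 3 + ["cust-3"]
key, share = hottest_key_share(sample)
print(f"{key} carries {share:.0%} of sampled traffic")
```

If one key carries a large share, that key's shard, message group, or Flink subtask is the real ceiling, whatever the aggregate quota says.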

AWS High-Throughput Event Processing 2026: Tier Selection

High-Throughput Event Processing on AWS (2026): SQS, Kinesis, MSK, and Flink Tier Selection With Cost-Cliff Math

Quick summary: On a composite ingest workload (~8k ordered TPS, 1 KB payloads), staying on SQS FIFO without high-throughput mode capped effective throughput near 300 TPS per API action, with a modeled FIFO queue cost near $95/mo before ops time. Enabling high-throughput FIFO or switching to Kinesis on-demand changed the ceiling, not the consumer code.

Key Takeaways

On November 25, 2025, AWS raised the per-stream enhanced fan-out consumer limit on Kinesis Data Streams to 50; each consumer gets a dedicated 2 MB/s read pipe instead of sharing the shard read cap.
This post is the throughput tier ladder. It is not the Kinesis vs MSK platform pick alone, not sync vs async boundaries, not SQS reliability patterns, not the Kinesis→Lambda→DynamoDB reference pipeline, and not JVM runtime throughput tuning.
Phase 1 used SQS FIFO without high-throughput mode: effective ceiling ~300 TPS/API action, backlog age peaked ~14 min, modeled FIFO line ~$95/mo plus on-call time.
Phase 2 enabled high-throughput FIFO with 50 message groups: same business TPS, backlog age < 30 s, modeled line ~$118/mo.
Tier 1, SQS: request economics beat broker hours. AWS documents nearly unlimited throughput on SQS Standard queues.

On November 25, 2025, AWS raised the per-stream enhanced fan-out consumer limit on Kinesis Data Streams to 50 — each consumer gets a dedicated 2 MB/s read pipe instead of sharing the shard read cap. That change matters because many “we need MSK for throughput” conversations are really consumer fan-out problems, not Kafka protocol problems.

This post is the throughput tier ladder — when to stay on SQS, when Kinesis on-demand wins, when MSK is worth broker hours, and when Managed Service for Apache Flink is compute, not transport. It is not the Kinesis vs MSK platform pick alone, not sync vs async boundaries, not SQS reliability patterns, not the Kinesis→Lambda→DynamoDB reference pipeline, and not JVM runtime throughput tuning.

Artifacts: throughput tier decision matrix, throughput cost model CSV. Pricing math uses SQS calculator assumptions where applicable.

Benchmark pattern (not a cited client) — Composite order-ingest platform, ~8k TPS peak with per-customer ordering, ~1 KB payloads, us-east-1, three downstream consumers (fraud scoring, fulfillment, analytics). Phase 1 used SQS FIFO without high-throughput mode — effective ceiling ~300 TPS/API action, backlog age peaked ~14 min, modeled FIFO line ~$95/mo plus on-call time. Phase 2 enabled high-throughput FIFO with 50 message groups — same business TPS, backlog age < 30 s, modeled line ~$118/mo. Phase 3 (analytics-only fork) moved firehose-style telemetry to Kinesis on-demand at ~25k events/s — modeled ~$890/mo ingest/retrieval vs mis-sized MSK provisioned “for jobs” at ~$1,100/mo on the CSV failure row.

The four-tier ladder

Tier	Throughput shape	You buy	You do not get
SQS Standard	Nearly unlimited horizontal scale	Per-request $, polling discipline	Global order, stream replay
SQS FIFO	300 TPS/API → 3k batched → 70k high-throughput	Per-group ordering	Parallelism inside a single message group
Kinesis Data Streams	Shard or on-demand MB/s	Retention, Lambda ESM, 50 EFO consumers	Kafka wire protocol
MSK	Partition + broker hours	Kafka Connect, consumer groups, compacted topics	Zero broker thinking
Flink (on Kinesis/MSK)	Stateful parallelism	Windows, joins, CEP	Simple queue semantics

Opinionated take: Default SQS Standard for work queues; Kinesis on-demand for AWS-native multi-consumer streams; MSK only with dated Kafka requirements; Flink only when stateful stream SQL/joins are the product. Escalate tier when a documented ceiling bites — not when a resume mentions Kafka.
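The ladder can be read as a short decision function. This is a sketch of one reading of the opinionated take, not a complete decision matrix; the function name and flag names are hypothetical.

```python
def pick_tier(*, kafka_required: bool = False,
              stream_replay_or_fanout: bool = False,
              per_entity_ordering: bool = False,
              stateful_windows_or_joins: bool = False) -> str:
    """Return the lowest tier that satisfies the stated requirements."""
    if kafka_required:
        transport = "MSK"
    elif stream_replay_or_fanout:
        transport = "Kinesis on-demand"
    elif per_entity_ordering:
        transport = "SQS FIFO"
    else:
        transport = "SQS Standard"

    if stateful_windows_or_joins:
        # Flink is compute, so it needs a durable log (Kinesis or MSK) underneath.
        if transport.startswith("SQS"):
            transport = "Kinesis on-demand"
        return f"{transport} + Flink"
    return transport


print(pick_tier())                                   # plain work queue
print(pick_tier(per_entity_ordering=True))           # ordered jobs
print(pick_tier(stateful_windows_or_joins=True))     # stateful stream logic
```

The checks run from the most expensive requirement down, so a workload lands on MSK or Flink only when a flag explicitly demands it.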

Tier 1 — SQS: request economics beat broker hours

AWS documents nearly unlimited throughput on SQS Standard queues. Your ceiling is almost always consumer count × handler duration × idempotency, not SQS itself.

FIFO is different. Without high-throughput mode, plan around 300 transactions per second per API action (see async messaging boundaries, May 2026 refresh). Batching up to 10 messages per call raises practical throughput to about 3,000 messages per second; high-throughput FIFO mode targets up to ~70,000 TPS with explicit opt-in and per-message-group parallelism.
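A quick way to sanity-check which FIFO mode a workload needs is to convert peak messages per second into API calls per second and compare against the ceilings quoted above. This is a rough sketch: it checks queue-level quotas only, and the constants simply restate the planning figures in this section.

```python
import math

FIFO_CALLS_PER_SECOND = 300                # per API action, without high-throughput mode
HIGH_THROUGHPUT_CALLS_PER_SECOND = 70_000  # upper target quoted in this post
MAX_BATCH = 10                             # SQS batch calls carry at most 10 messages


def fifo_mode_for(peak_messages_per_second: int, batch_size: int = MAX_BATCH) -> str:
    """Return the FIFO mode whose per-action ceiling covers the peak send rate."""
    if not 1 <= batch_size <= MAX_BATCH:
        raise ValueError("batch_size must be between 1 and 10")
    calls_per_second = math.ceil(peak_messages_per_second / batch_size)
    if calls_per_second <= FIFO_CALLS_PER_SECOND:
        return "standard FIFO"
    if calls_per_second <= HIGH_THROUGHPUT_CALLS_PER_SECOND:
        return "high-throughput FIFO"
    return "beyond FIFO ceilings"


print(fifo_mode_for(8_000))   # the ~8k TPS workload from this post, fully batched
print(fifo_mode_for(2_500))   # fits under 300 calls/s when batched by 10
```

Even when the queue-level number fits, a single message group is still processed in order, so group spread matters as much as the mode.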

Mistake	Symptom	Fix
Single FIFO message group	One lane at peak	Shard groups by `customer_id`, `order_id`, etc.
Short polling on idle queues	Empty-receive bill	`WaitTimeSeconds=20` — see SQS pricing
200 KB bodies	4× request chunks	S3 pointer pattern
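The "4× request chunks" row follows from SQS billing each 64 KB chunk of a payload as one request. A minimal sketch of that arithmetic, treating 64 KB as 65,536 bytes:

```python
CHUNK_BYTES = 64 * 1024  # SQS bills each 64 KB chunk of a payload as one request


def billed_requests(payload_bytes: int) -> int:
    """Number of billed requests for one API call carrying payload_bytes."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    return -(-payload_bytes // CHUNK_BYTES)  # integer ceiling division


print(billed_requests(200 * 1024))  # 200 KB body
print(billed_requests(1 * 1024))    # 1 KB body, or a small S3 pointer message
```

With the S3 pointer pattern the queue message shrinks to a key and some metadata, so each send, receive, and delete drops back to a single billed request.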

What broke — Black Friday prep for a retail order pipeline. Ops raised FIFO maxReceiveCount but not throughput mode. ApproximateAgeOfOldestMessage hit ~22 min while CloudWatch NumberOfMessagesSent plateaued near ~280/s. Producers reported “no errors.” Detection: backlog age alarm + CSV failure row sqs_fifo_wrong_tier. Fix: high-throughput FIFO + 40 message groups; fraud lane stayed FIFO, analytics fork moved to Kinesis the following sprint.
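Spreading traffic across a fixed number of message groups while keeping per-customer order can be sketched as a stable hash of the customer ID. The `orders-` prefix and the helper name are hypothetical; 40 matches the group count in the fix above.

```python
import hashlib


def message_group_id(customer_id: str, group_count: int = 40) -> str:
    """Map a customer to one of group_count FIFO message groups, deterministically.

    hashlib is used instead of the built-in hash(), which is salted per process
    and would send the same customer to different groups on different hosts.
    """
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % group_count
    return f"orders-{bucket:02d}"


ids = {message_group_id(f"customer-{n}") for n in range(1_000)}
print(len(ids), "distinct message groups in use")
```

Because every message for a given customer lands in the same group, per-customer ordering is preserved. When ordering is only needed per customer, using the customer ID itself as the message group ID gives even more lanes.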

Tier 2 — Kinesis: MB/s, shards, and fan-out

Provisioned mode (per shard): 1 MB/s write, 2 MB/s read for standard consumers. On-demand mode scales shards automatically — ideal for variable ingest if you model retrieval and fan-out.
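For provisioned mode, a first-pass shard count comes from whichever write limit binds first: bytes per second or records per second. In this sketch the 1 MB/s figure is taken as 1,000,000 bytes (an assumption about units), alongside the per-shard limit of 1,000 records per second; it adds no headroom and ignores hot keys.

```python
SHARD_WRITE_BYTES_PER_S = 1_000_000   # "1 MB/s" per shard; decimal MB is an assumption
SHARD_WRITE_RECORDS_PER_S = 1_000     # per-shard record-count write limit


def shards_for_ingest(records_per_s: int, avg_record_bytes: int) -> int:
    """Minimum provisioned shards for a steady, evenly keyed ingest rate."""
    by_bytes = -(-(records_per_s * avg_record_bytes) // SHARD_WRITE_BYTES_PER_S)
    by_records = -(-records_per_s // SHARD_WRITE_RECORDS_PER_S)
    return max(1, by_bytes, by_records)


print(shards_for_ingest(25_000, 1_000))   # telemetry fork: ~25k events/s at ~1 KB
print(shards_for_ingest(100, 50_000))     # few but fat records: bytes bind first
```

Real sizing should add headroom for bursts and uneven partition keys; on-demand mode removes this step but not the cost modeling.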

Enhanced fan-out (EFO): each registered consumer gets a dedicated 2 MB/s read pipe per shard, which is critical when >5 services read the same stream. AWS increased the per-stream EFO consumer maximum to 50 (November 2025). Standard consumers still share each shard's read capacity, so adding Lambdas does not add read bandwidth.
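The difference between shared and dedicated reads is easy to quantify. This sketch computes per-shard read bandwidth per consumer and the consumer-shard hours that drive the EFO bill; the 730-hour month is an assumption and no prices are modeled.

```python
SHARD_READ_MBPS = 2.0  # per-shard read capacity quoted in this section


def read_mbps_per_consumer(consumers: int, enhanced_fan_out: bool) -> float:
    """Per-shard read bandwidth each consumer can expect."""
    if consumers < 1:
        raise ValueError("consumers must be at least 1")
    if enhanced_fan_out:
        return SHARD_READ_MBPS           # dedicated pipe per registered consumer
    return SHARD_READ_MBPS / consumers   # standard consumers split one pipe


def consumer_shard_hours(consumers: int, shards: int, hours: int = 730) -> int:
    """Billing units for EFO over a month (730 hours is an assumed month length)."""
    return consumers * shards * hours


print(read_mbps_per_consumer(20, enhanced_fan_out=False))  # 20 Lambdas sharing a shard
print(read_mbps_per_consumer(20, enhanced_fan_out=True))
print(consumer_shard_hours(consumers=3, shards=25))
```

Multiplying the consumer-shard hours by the current regional rate gives the figure to compare against the cost of shared-consumer lag.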

October 2025 also raised max record size to 10 MiB — fewer chunking hacks for fat events.

Context — AWS CLI 2.x, read-only inventory:

# List stream names in the Region
aws kinesis list-streams --region us-east-1
# Show one stream's capacity mode (ON_DEMAND or PROVISIONED) and open shard count
aws kinesis describe-stream-summary --stream-name ORDER_EVENTS --region us-east-1

Tier 3 — MSK: when Kafka protocol is non-negotiable

Choose MSK when you need Kafka consumer groups, Kafka Connect, compacted topics, or existing clients without rewrite.

Mode	Ceiling (documented)	Fit
MSK Serverless	~200 MBps write, ~400 MBps read per cluster	Bursty Kafka without cluster sizing
MSK provisioned	Broker-type dependent	Sustained >500 MB/s with RIs
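The table reduces to a small check: is Kafka actually required, and does the workload fit under the Serverless ceilings? A minimal sketch using the approximate caps from the table; the function name is hypothetical.

```python
SERVERLESS_WRITE_MBPS = 200  # approximate per-cluster caps from the table above
SERVERLESS_READ_MBPS = 400


def msk_mode(write_mbps: float, read_mbps: float, kafka_required: bool) -> str:
    """Pick an MSK mode, or rule MSK out when Kafka is not a hard requirement."""
    if not kafka_required:
        return "not MSK"
    if write_mbps <= SERVERLESS_WRITE_MBPS and read_mbps <= SERVERLESS_READ_MBPS:
        return "MSK Serverless"
    return "MSK provisioned"


print(msk_mode(write_mbps=2, read_mbps=6, kafka_required=False))     # job queue shape
print(msk_mode(write_mbps=50, read_mbps=150, kafka_required=True))
print(msk_mode(write_mbps=600, read_mbps=1200, kafka_required=True))
```

The Kafka requirement is checked first on purpose: throughput alone never selects MSK in this ladder.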

Do not provision three kafka.m5.large brokers to move 2k job messages/s — the CSV wrong_tier_kafka_for_jobs row models ~$1,100/mo vs ~$18/mo SQS for the same shape.

Tier 4 — Flink: compute layer, not queue replacement

Managed Service for Apache Flink belongs when you need session windows, stream joins, or CEP atop Kinesis or MSK. Transport stays on the log; Flink holds state and checkpoints.
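To make "session windows" concrete, here is a plain-Python illustration of the grouping rule: events for one key belong to the same session while the gap between consecutive events stays under a threshold. This is not Flink code. Flink applies the same idea incrementally, per key, with checkpointed state and watermarks for late events, which is the operational surface being paid for.

```python
def sessionize(timestamps, gap_seconds):
    """Group event timestamps (seconds) into sessions separated by idle gaps."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < gap_seconds:
            sessions[-1].append(ts)      # still inside the current session
        else:
            sessions.append([ts])        # idle gap reached: start a new session
    return sessions


events = [0, 10, 25, 100, 110, 400]      # hypothetical click times for one user
print(sessionize(events, gap_seconds=30))
# [[0, 10, 25], [100, 110], [400]]
```

A batch function like this is enough when data is complete and sorted; the case for Flink starts when sessions must be computed continuously over unordered, unbounded input.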

Skip Flink when Lambda + DynamoDB + Step Functions already meet latency — you are buying checkpoint operations and key-group skew debugging.

Consumer parallelism cheat sheet

Transport	Scale reads by	Anti-pattern
SQS Standard	More workers	Assuming exactly-once
SQS FIFO	More message group IDs	One group for all traffic
Kinesis	Shards + EFO consumers	20 Lambdas on standard consumers
MSK	Partitions × group members	Rebalance during peak deploy
Flink	Parallelism / slots	Hot key in `keyBy`

What to do this week

Write peak TPS, payload KB, and ordering scope on one page.
Run the decision matrix — if Kafka is not in the answer column, stop.
Plug numbers into the cost CSV including the wrong-tier rows.
For FIFO, confirm high-throughput mode and message-group spread before peak season.
For Kinesis multi-team reads, model EFO cost vs shared-consumer lag.

What this post doesn’t cover

Amazon MQ / RabbitMQ — see event-driven boundaries.
EventBridge Pipes pricing — see EventBridge pricing.
IoT MQTT ingest — see IoT Core MQTT.
Exactly-once end-to-end proofs — see Kafka partition rebalancing.

Recommended Reading

Amazon Kinesis Data Streams vs MSK: Real-Time Streaming Decision Guide

Event-Driven Boundaries on AWS: Async vs Sync, Amazon MSK vs Amazon MQ (RabbitMQ), and When SQS Wins

How to Build Reliable Queue Systems on AWS (SQS, Kafka, Redis)

Real-Time Data Pipelines on AWS: Kinesis Data Streams + Lambda + DynamoDB
