
Case Study

SaaS Cost Optimization on AWS: From $85k to $58k/Month Without Performance Trade-offs

Cut AWS spend from $85k to $58k per month — a 32% reduction — through rightsizing, Reserved Instance coverage, NAT Gateway elimination, and data transfer optimization. Zero performance impact.


Challenge: AWS Bill Growing Faster Than Revenue

A B2B SaaS platform serving 1,200 enterprise customers had watched its AWS bill grow 180% over two years while revenue grew 120% over the same period. The unit economics were moving in the wrong direction.

The engineering team was building the product — they did not have time for systematic cost analysis. The finance team saw the monthly AWS invoice but lacked the technical context to understand what they were looking at. The result was a growing spend problem with no owner.

Three specific patterns had emerged in the Cost Explorer data but had never been acted on:

Rising data transfer costs. Data transfer charges had grown from $3,200/month two years earlier to $14,800/month, despite the product not changing its data transfer patterns significantly. Nobody knew why.

RDS costs that did not match database utilization. The RDS line item was $28,000/month for a platform that had never seen more than 40% CPU utilization on its primary database instance.

No Reserved Instance coverage. The company had been on AWS for four years and was still paying 100% On-Demand pricing for all infrastructure.

Solution: Three-Track Cost Reduction

The engagement ran an AWS Cost Explorer analysis, CloudWatch metrics review, and architecture audit in parallel before presenting findings. The work proceeded in three tracks based on implementation risk and time to savings.

Track 1: Reserved Instance Coverage (Immediate, Low Risk)

The highest-leverage and lowest-risk intervention was purchasing Reserved Instances for stable compute workloads.

The Cost Explorer analysis identified $38,400/month in EC2 and RDS On-Demand spend for instances that had been running continuously for more than 6 months — clear candidates for Reserved Instance purchasing.

EC2 Convertible Reserved Instances (3-year, no upfront). The SaaS platform runs on a fleet of r6i.large and r6i.xlarge instances for application servers and worker nodes. Purchasing 3-year Convertible RIs for these instance families delivered a 38% discount versus On-Demand pricing. The Convertible RI flexibility allows exchanging for equivalent instance families as the application evolves — important for a platform still making architectural decisions.

RDS Reserved Instances (1-year, all upfront). The primary and replica RDS Aurora instances were covered with 1-year RIs paid all upfront, the highest discount tier available for a 1-year term, at 42% versus On-Demand. The 1-year term matched the company’s architectural planning horizon; Aurora instance types were expected to be stable over that period.

Total savings from RI purchasing: $11,200/month with zero operational changes. This was the fastest win in the engagement — savings began accumulating immediately after purchase.
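As a rough check on these figures, the sketch below computes the effective discount implied by the reported savings against the eligible On-Demand spend. The two dollar amounts come from the text above; the helper names and the `coverage` parameter are illustrative assumptions, since the case study does not state how much of the eligible spend each RI purchase covered.

```python
# Figures taken from the case study text.
ELIGIBLE_ON_DEMAND_SPEND = 38_400  # USD/month of steady-state EC2 + RDS spend
REPORTED_RI_SAVINGS = 11_200       # USD/month saved after the RI purchases


def effective_discount(savings: float, eligible_spend: float) -> float:
    """Return savings as a fraction of the eligible On-Demand spend."""
    if eligible_spend <= 0:
        raise ValueError("eligible_spend must be positive")
    return savings / eligible_spend


def ri_savings(on_demand_spend: float, discount: float, coverage: float = 1.0) -> float:
    """Monthly savings when `coverage` (0..1) of the spend receives `discount` (0..1).

    `coverage` is a hypothetical knob for exploring partial RI coverage.
    """
    return on_demand_spend * coverage * discount


blended = effective_discount(REPORTED_RI_SAVINGS, ELIGIBLE_ON_DEMAND_SPEND)
print(f"Effective discount on eligible spend: {blended:.1%}")
```

The blended figure works out to roughly 29% of the eligible spend, which is lower than either headline discount rate. `ri_savings` can be used to explore which coverage levels would be consistent with that.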

Track 2: Rightsizing Overprovisioned Database

The $28,000/month RDS line item reflected a db.r6g.8xlarge primary instance and two db.r6g.4xlarge read replicas — provisioned two years ago when the company anticipated rapid growth in database load that did not materialize at the expected scale.

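CloudWatch metrics showed that CPU utilization on the primary instance had never exceeded 40%, consistent with the gap between RDS cost and utilization noted earlier.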

The appropriate configuration for this utilization profile:

Primary: db.r6g.2xlarge — provides 4x the compute of the current peak utilization ceiling, with headroom for 2-3x growth before the next rightsizing event.

Replicas: Reduced from 2x db.r6g.4xlarge to 2x db.r6g.xlarge — still providing read scale-out capacity, at the correct size for actual query load.

The migration used Aurora’s blue-green deployment feature — a new cluster at the target instance sizes was provisioned, synchronized with the production cluster via logical replication, and the cutover completed during a low-traffic maintenance window with 45 seconds of application downtime.

Post-migration monitoring: p99 query latency unchanged at 4ms for read queries, 12ms for write queries. CPU utilization on the new primary peaked at 38% during the first promotional event after migration — confirming adequate headroom.
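A headroom check of this kind could be sketched as follows. It builds the parameters for CloudWatch's `get_metric_statistics` call and summarizes the returned datapoints. The instance identifier, the 14-day window, and the 60% ceiling are assumptions for illustration, not values from the engagement.

```python
from datetime import datetime, timedelta, timezone


def cpu_metric_request(db_instance_id: str, days: int = 14) -> dict:
    """Build kwargs for CloudWatch get_metric_statistics (hourly maximum CPU)."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "StartTime": end - timedelta(days=days),
        "EndTime": end,
        "Period": 3600,  # one datapoint per hour
        "Statistics": ["Maximum"],
    }


def peak_cpu(datapoints: list[dict]) -> float:
    """Return the highest hourly maximum, or 0.0 if there are no datapoints."""
    return max((point["Maximum"] for point in datapoints), default=0.0)


def has_headroom(peak: float, ceiling: float = 60.0) -> bool:
    """True if the observed peak stays below the (hypothetical) ceiling."""
    return peak < ceiling


# With AWS credentials configured, the request could be sent like this:
#   import boto3
#   response = boto3.client("cloudwatch").get_metric_statistics(
#       **cpu_metric_request("example-db-instance"))
#   print(has_headroom(peak_cpu(response["Datapoints"])))
```

Keeping the request builder separate from the summary functions means the threshold logic can be tested without AWS access.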

Database savings: $9,800/month.

Track 3: Eliminating NAT Gateway Data Transfer Waste

The $14,800/month data transfer line item required the most investigation. AWS Cost Explorer showed NAT Gateway as the source, but did not explain what traffic was flowing through it.

VPC Flow Log analysis (run via Athena query against 30 days of flow logs in S3) revealed the breakdown:

| Traffic Type | Monthly Volume | Monthly Cost |
| --- | --- | --- |
| S3 API calls (GetObject, PutObject) | 4.2 TB | $4,800 |
| DynamoDB read/write operations | 2.8 TB | $3,200 |
| ECR container image pulls | 1.1 TB | $1,300 |
| SQS/SNS API calls | 0.9 TB | $1,000 |
| External API calls (legitimate) | 3.6 TB | $4,100 |
| Total | 12.6 TB | $14,400 |
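The engagement ran this analysis in Athena. As an offline stand-in, the sketch below shows the same idea in Python: parse flow log records in the default (version 2) format and sum bytes by destination category. The CIDR blocks are documentation placeholders, not real AWS ranges; a real run would need an actual mapping, for example one built from AWS's published IP range list.

```python
import ipaddress
from collections import defaultdict

# Placeholder CIDR-to-label map (documentation ranges, NOT real AWS prefixes).
SERVICE_RANGES = {
    "s3": [ipaddress.ip_network("198.51.100.0/24")],
    "dynamodb": [ipaddress.ip_network("203.0.113.0/24")],
}


def classify(dst: str) -> str:
    """Label a destination address by service, defaulting to 'external'."""
    addr = ipaddress.ip_address(dst)
    for label, networks in SERVICE_RANGES.items():
        if any(addr in net for net in networks):
            return label
    return "external"


def bytes_by_category(lines) -> dict:
    """Sum the bytes field per destination category.

    Default format fields: version account-id interface-id srcaddr dstaddr
    srcport dstport protocol packets bytes start end action log-status
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        if len(fields) < 14 or fields[13] != "OK":
            continue  # skip header rows and NODATA/SKIPDATA records
        totals[classify(fields[4])] += int(fields[9])
    return dict(totals)


sample = [
    "2 123456789012 eni-0abc 10.0.1.5 198.51.100.7 44321 443 6 10 5000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.1.5 203.0.113.9 44322 443 6 4 1200 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.1.5 192.0.2.44 44323 443 6 8 3000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc - - - - - - - 1700000000 1700000060 - NODATA",
]
print(bytes_by_category(sample))
```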

The first four categories — S3, DynamoDB, ECR, and SQS/SNS — were AWS service calls routing through the NAT Gateway because no VPC Endpoints were deployed. This traffic should cost nothing to route once Gateway and Interface VPC Endpoints are in place.

Gateway VPC Endpoints (free): Deployed for S3 and DynamoDB. Traffic to these services now routes over the AWS private network without the NAT Gateway, at no data transfer cost.

Interface VPC Endpoints: Deployed for ECR (both ecr.api and ecr.dkr), SQS, and SNS. These four Interface Endpoints carry an hourly cost (~$85/month in total across three AZs) but eliminate the $2,300/month in NAT Gateway data transfer charges attributed to ECR ($1,300) and SQS/SNS ($1,000) in the table above.
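One way to script the endpoint rollout is sketched below. It builds the keyword arguments for the EC2 `create_vpc_endpoint` API for the two Gateway services and the Interface services named above (ecr.api, ecr.dkr, SQS, SNS). The region, VPC, route table, subnet, and security group IDs are placeholders, and the engagement may well have used infrastructure-as-code tooling instead.

```python
GATEWAY_SERVICES = ("s3", "dynamodb")
INTERFACE_SERVICES = ("ecr.api", "ecr.dkr", "sqs", "sns")


def endpoint_requests(region, vpc_id, route_table_ids, subnet_ids, security_group_ids):
    """Return one create_vpc_endpoint kwargs dict per service."""
    requests = []
    for service in GATEWAY_SERVICES:
        # Gateway endpoints attach to route tables.
        requests.append({
            "VpcEndpointType": "Gateway",
            "VpcId": vpc_id,
            "ServiceName": f"com.amazonaws.{region}.{service}",
            "RouteTableIds": list(route_table_ids),
        })
    for service in INTERFACE_SERVICES:
        # Interface endpoints place network interfaces in the given subnets.
        requests.append({
            "VpcEndpointType": "Interface",
            "VpcId": vpc_id,
            "ServiceName": f"com.amazonaws.{region}.{service}",
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
            "PrivateDnsEnabled": True,
        })
    return requests


def create_endpoints(requests):
    """Send the requests to AWS (needs boto3 and configured credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    ec2 = boto3.client("ec2")
    return [ec2.create_vpc_endpoint(**request) for request in requests]


# Placeholder identifiers for illustration only.
planned = endpoint_requests(
    "us-east-1", "vpc-EXAMPLE", ["rtb-EXAMPLE"],
    ["subnet-a", "subnet-b", "subnet-c"], ["sg-EXAMPLE"],
)
print(len(planned), "endpoint requests prepared")
```

Enabling private DNS on the Interface endpoints lets existing SDK calls resolve to the endpoint without application changes. Separating the request builder from the AWS call makes the plan reviewable before anything is created.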

The $4,100/month in external API traffic (webhooks, payment processor calls, third-party integrations) is legitimate NAT Gateway usage that cannot be eliminated — but it is now the only traffic using the NAT Gateway.

Data transfer savings: $10,300/month before the ~$85/month in Interface Endpoint hourly costs (about $10,215/month net).
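The data transfer figure can be reconciled against the traffic breakdown table with a few lines of arithmetic. The sketch below uses only the dollar amounts reported in this section; the variable names are illustrative.

```python
# Monthly NAT Gateway cost by traffic type, from the breakdown table (USD).
nat_cost = {
    "s3": 4_800,
    "dynamodb": 3_200,
    "ecr": 1_300,
    "sqs_sns": 1_000,
    "external": 4_100,
}
ENDPOINT_HOURLY_COST = 85  # approximate monthly Interface Endpoint cost (USD)

gateway_avoidable = nat_cost["s3"] + nat_cost["dynamodb"]      # via Gateway endpoints
interface_avoidable = nat_cost["ecr"] + nat_cost["sqs_sns"]    # via Interface endpoints
avoidable = gateway_avoidable + interface_avoidable
remaining = nat_cost["external"]                               # legitimate NAT usage
net = avoidable - ENDPOINT_HOURLY_COST

print(f"Avoidable: ${avoidable:,}  Remaining: ${remaining:,}  Net: ${net:,}")
```

The avoidable total is $10,300, of which $8,000 comes from the free Gateway endpoints. The net figure of $10,215 matches how the results table itemizes the endpoint cost as a separate line.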

Results: $27,000/Month Saved, Engineering Team Unblocked

The three-track approach delivered cumulative savings across a 9-week implementation:

| Track | Monthly Savings | Implementation Time |
| --- | --- | --- |
| Reserved Instance Coverage | $11,200 | Week 2 |
| Database Rightsizing | $9,800 | Week 6 |
| NAT Gateway Elimination | $10,300 | Week 4 |
| VPC Endpoint costs | -$85 | Week 4 |
| Total | $31,215 | |

The final realized savings landed at $27,000/month after accounting for additional monitoring costs, enhanced logging for the VPC flow log analysis infrastructure, and a modest increase in Support plan tier.
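The relationship between the per-track total and the realized figure can be checked directly. The sketch below uses only numbers reported in this case study; the $4,215 difference is derived here as the sum of the offsetting costs listed above and is not itemized in the source.

```python
# Monthly savings per track, from the results table (USD).
track_savings = {
    "Reserved Instance Coverage": 11_200,
    "Database Rightsizing": 9_800,
    "NAT Gateway Elimination": 10_300,
    "VPC Endpoint costs": -85,
}
REALIZED_MONTHLY_SAVINGS = 27_000
STARTING_MONTHLY_SPEND = 85_000

gross = sum(track_savings.values())
offsetting_costs = gross - REALIZED_MONTHLY_SAVINGS  # monitoring, logging, Support tier
annual_savings = REALIZED_MONTHLY_SAVINGS * 12
reduction = REALIZED_MONTHLY_SAVINGS / STARTING_MONTHLY_SPEND

print(f"Gross track savings:   ${gross:,}/month")
print(f"Offsetting new costs:  ${offsetting_costs:,}/month")
print(f"Annualized savings:    ${annual_savings:,}")
print(f"Reduction in spend:    {reduction:.1%}")
```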

No performance regression. The database rightsizing was the highest-risk intervention and the one the engineering team was most concerned about. Post-migration monitoring across 8 weeks showed no measurable change in application latency, error rates, or database query performance. The application now runs on infrastructure sized appropriately for its current load, not undersized.

Unit economics reversed. AWS spend as a percentage of revenue dropped from 18.4% to 12.5% — closer to the 10-12% benchmark for mature SaaS businesses at this scale. The trajectory is now flat as revenue grows, rather than growing ahead of revenue.

Engineering team focus restored. The SRE team that had been periodically pulled into cost conversations was provided a monthly RI coverage report and a VPC endpoint health dashboard. Cost review is now 2 hours per month, not an ongoing distraction.


This case study describes a composite engagement based on anonymized client work. All identifying details have been removed or modified.

Results

Monthly AWS Spend: $85k → $58k
Monthly Savings: $27k ($324k/year)
Reduction Percentage: 32%
Performance Impact: None (p99 unchanged)

Cut Your AWS Bill Without Cutting Performance

We audit AWS environments and identify concrete savings opportunities — typically 20-35% of spend — with no performance trade-offs.