How to Design MongoDB for Scalable, Cost-Efficient Workloads on AWS
Quick summary: MongoDB Atlas and self-hosted EC2 deployments have very different cost profiles at different scales. This guide covers TCO comparison, sharding strategies, index design for memory efficiency, and the edge cases that cause MongoDB costs to spiral.
Key Takeaways
- MongoDB Atlas and self-hosted EC2 deployments have very different cost profiles at different scales
- Atlas typically wins on true TCO below the M40–M50 tier once engineering time is counted; self-managed EC2 wins at larger scale with dedicated database operations
- Shard key selection, index audits, and oplog sizing are the highest-leverage levers for keeping MongoDB costs under control

MongoDB is deceptively easy to get started with and deceptively expensive to run at scale without deliberate design. A schema-less document model lets teams ship fast, but the same flexibility that accelerates development — arbitrary document shapes, ad-hoc indexing, implicit collections — becomes a cost liability when workloads grow past a single replica set.
This guide covers the decisions that determine whether your MongoDB deployment on AWS is cost-efficient or burning money: where Atlas vs EC2 each wins, how shard key selection cascades into your entire cost structure, and the operational edge cases that quietly triple your instance size requirements.
Atlas vs EC2: Total Cost of Ownership
The Atlas-vs-EC2 decision is not about compute unit cost. It is about whether you are paying Atlas’s premium to offload operational work that would otherwise consume engineering time.
Atlas Pricing Model
Atlas charges by instance tier. Pricing for dedicated clusters (M10 and above) on AWS us-east-1:
| Tier | vCPU | RAM | Storage | Price/hr | Monthly |
|---|---|---|---|---|---|
| M10 | 2 | 2 GB | 10 GB | $0.09 | ~$65 |
| M20 | 2 | 4 GB | 20 GB | $0.20 | ~$144 |
| M30 | 2 | 8 GB | 40 GB | $0.54 | ~$390 |
| M40 | 4 | 16 GB | 80 GB | $1.04 | ~$749 |
| M50 | 8 | 32 GB | 160 GB | $2.00 | ~$1,440 |
| M60 | 16 | 64 GB | 320 GB | $3.95 | ~$2,844 |
Atlas M30 ($390/month) is a replica set (primary + 2 secondaries) with 8 GB RAM per node. The $390 covers compute for 3 nodes, storage replication across 3 AZs, automated backups (continuous PITR), monitoring, alerting, and managed patching.
EC2 Self-Managed TCO
For equivalent specs to Atlas M30 on EC2 (3-node replica set, r7g.large = 2 vCPU, 16 GB RAM):
- EC2 r7g.large × 3 nodes: $0.126/hr × 3 = $0.378/hr = ~$276/month (on-demand), ~$170/month (1-year reserved)
- EBS gp3 storage × 3 (40 GB each): $0.08/GB × 40 × 3 = $9.60/month
- EBS snapshots (daily, ~30% daily change): ~$15/month
- Data transfer (inter-AZ replication): ~$10/month
- Pure infrastructure: ~$205/month reserved
Engineering costs (conservative estimate for a team without a dedicated DBA):
- Backup monitoring and testing: 1 hr/month
- Patching and upgrades: 2 hrs/month
- Incident response and monitoring setup: 2 hrs/month
- Security configuration (auth, TLS, network): 1 hr/month (amortized)
- Total ops time: 6 hours/month × $150/hr = $900/month
True EC2 TCO: $205 + $900 = ~$1,105/month
Atlas M30: $390/month
Atlas is $715/month cheaper than self-managed EC2 at M30 scale when you account for engineering time.
When EC2 Beats Atlas
The math inverts at larger scale with a dedicated DBA team. At Atlas M60 equivalents (3 × r7g.4xlarge on EC2):
- EC2 r7g.4xlarge × 3: $1.008/hr × 3 = ~$2,197/month on-demand, ~$1,250/month reserved
- Storage (320 GB × 3): ~$77/month
- Atlas M60: $2,844/month (3-node replica set)
With a dedicated DBA who manages multiple clusters, the engineering overhead per cluster is minimal — perhaps 4 hours/month. At 4 hrs × $150 = $600/month, the EC2 total is ~$1,927/month vs Atlas $2,844/month. EC2 saves $917/month.
Practical breakeven: Atlas wins until your cluster reaches M40–M50 tier AND you have dedicated database operations capability. Below that, Atlas almost always wins on true TCO.
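The breakeven arithmetic above can be condensed into a quick calculator. A minimal sketch in plain JavaScript, using the illustrative rates from this section (not live AWS or Atlas pricing); the function name and parameter structure are mine:

```javascript
// Rough monthly TCO for a 3-node self-managed replica set.
// All inputs are the illustrative figures from this guide, not live pricing.
function selfManagedTco({ hourlyPerNode, nodes = 3, storageMonthly,
                          snapshotsMonthly, transferMonthly,
                          opsHoursMonthly, engineerHourlyRate }) {
  const compute = hourlyPerNode * nodes * 730 // ~730 hours per month
  const infra = compute + storageMonthly + snapshotsMonthly + transferMonthly
  const ops = opsHoursMonthly * engineerHourlyRate
  return { infra: Math.round(infra), ops, total: Math.round(infra + ops) }
}

// M30-equivalent: r7g.large 1-year reserved (~$170/month across 3 nodes)
const m30 = selfManagedTco({
  hourlyPerNode: 0.0776, // ~$170/mo ÷ 3 nodes ÷ 730 hrs
  storageMonthly: 9.6,
  snapshotsMonthly: 15,
  transferMonthly: 10,
  opsHoursMonthly: 6,
  engineerHourlyRate: 150,
})
console.log(m30) // infra ~$205, total ~$1,105 vs Atlas M30 at $390/month
```

Swap in your own reserved-instance rates and ops-hour estimates; the comparison flips as soon as ops hours per cluster drop, which is exactly the dedicated-DBA scenario described above.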
Sharding Strategy: Getting the Shard Key Right
Sharding is MongoDB’s horizontal scaling mechanism. Done correctly, it distributes writes and reads across multiple replica sets. Done incorrectly, it concentrates traffic on one shard (hot shard) while others sit idle — paying for capacity you do not use.
When to Shard
Do not shard preemptively. Sharding adds complexity and cost (mongos routers, config servers, 2+ shard replica sets). Shard when you hit a concrete bottleneck:
- Write saturation: Primary CPU consistently >80% from write operations after vertical scaling to M50/M60
- Working set overflow: Data + indexes exceed instance RAM, causing frequent disk reads (check the WiredTiger cache eviction rate in Atlas metrics: wiredTiger.cache.pages evicted because they exceeded the in-memory maximum)
- Storage scale: Single replica set >5 TB makes backup windows and storage operations unwieldy
Shard Key Selection Rules
The shard key determines which shard receives each document and cannot be changed after collection creation. A bad shard key is permanent until you reshard (MongoDB 6.0+ supports online resharding, but it is expensive in I/O).
Rule 1: High cardinality. The shard key must have enough distinct values to allow MongoDB to create many chunks. A boolean field (2 values) means 2 chunks maximum — only 2 shards can be used meaningfully.
Rule 2: Even write distribution. Monotonically increasing keys (ObjectId, timestamps, auto-increment integers) cause all new writes to land on the shard holding the highest chunk. Use hashed sharding to distribute these:
// BAD: ranged shard key on a monotonically increasing _id
// All new documents land on the shard holding the highest chunk
sh.shardCollection("mydb.events", { _id: 1 })
// GOOD: Hashed sharding distributes _id writes evenly
db.events.createIndex({ _id: "hashed" })
sh.shardCollection("mydb.events", { _id: "hashed" })
// Tradeoff: range queries on _id are scatter-gather across all shards
Rule 3: Query isolation. A shard key that matches common query patterns allows targeted queries (hitting one shard) rather than scatter-gather queries (hitting all shards). A scatter-gather query multiplies latency by the number of shards.
Compound shard key example (user_id + created_at):
// Compound shard key: user_id (high cardinality) + created_at (range)
sh.shardCollection("mydb.events", { user_id: 1, created_at: 1 })
// Benefits:
// - user_id distributes load across shards
// - created_at allows range queries per user to be targeted
// - Query for a single user's events hits only 1 shard
// Query pattern that uses the shard key efficiently:
db.events.find({
user_id: "user_123",
created_at: { $gte: ISODate("2026-01-01") }
})
// Targeted to 1 shard — fast, cheap
Chunk Balancing
MongoDB automatically balances chunks across shards. The balancer runs in the background and migrates chunks when shard data imbalance exceeds a threshold. Chunk migrations consume I/O and can impact performance.
Schedule balancer windows during off-peak hours:
// Restrict balancer to maintenance window (2 AM - 4 AM UTC)
use config
db.settings.updateOne(
{ _id: "balancer" },
{
$set: {
activeWindow: {
start: "02:00",
stop: "04:00"
}
}
},
{ upsert: true }
)
// Check balancer status
sh.getBalancerState()
sh.isBalancerRunning()
// Check chunk distribution per shard
db.adminCommand({ listShards: 1 })
use config
db.chunks.aggregate([
{ $group: { _id: "$shard", count: { $sum: 1 } } },
{ $sort: { count: -1 } }
])
Index Design for Memory Efficiency
Every MongoDB index occupies WiredTiger cache space. By default the cache is the larger of 256 MB or 50% of (RAM − 1 GB), and it is configurable. On an M30 (8 GB RAM), the default cache is about 3.5 GB. The working set — documents and indexes accessed by recent queries — must fit in this cache for memory-speed performance.
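To reason about whether a given tier is large enough, compute the default cache and compare it against data-plus-index size. A rough sketch (the helper names are mine; real inputs would come from db.stats() and per-collection index sizes):

```javascript
// Default WiredTiger cache: the larger of 256 MB or 50% of (RAM - 1 GB).
function wiredTigerCacheGB(ramGB) {
  return Math.max(0.256, (ramGB - 1) * 0.5)
}

// Rough fit check: working set = hot documents + all indexes.
// hotDataGB and indexGB are estimates taken from db.stats() in practice.
function workingSetFits(ramGB, hotDataGB, indexGB) {
  return hotDataGB + indexGB <= wiredTigerCacheGB(ramGB)
}

console.log(wiredTigerCacheGB(8))            // M30 with 8 GB RAM: 3.5 GB cache
console.log(workingSetFits(8, 2.0, 1.2))     // 3.2 GB fits in 3.5 GB
console.log(workingSetFits(8, 3.0, 1.2))     // 4.2 GB does not fit
```

This is only a model: the true working set depends on access patterns, not total data size, which is why the eviction-rate and cache-usage metrics discussed later are the authoritative signal.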
Compound Index Field Order: ESR Rule
Compound index field order follows the ESR (Equality, Sort, Range) pattern: equality fields first, sort fields second, range fields third.
// Query pattern:
db.orders.find({
status: "active", // Equality
user_id: "user_123" // Equality
}).sort({ created_at: -1 }) // Sort
.limit(20)
// CORRECT compound index: equality fields first, sort field last
db.orders.createIndex({ status: 1, user_id: 1, created_at: -1 })
// WHY: MongoDB uses the equality fields to narrow to a small set,
// then traverses in sort order — no in-memory sort required.
// WRONG: sort field first forces a full collection scan with sort
db.orders.createIndex({ created_at: -1, status: 1, user_id: 1 })
// This index cannot efficiently serve the equality + sort pattern above
Index Explosion: The Hidden Memory Cost
Index explosion occurs when a collection accumulates many indexes over time as developers add indexes to fix slow queries without removing old ones. A collection with 50 indexes may have 30 redundant ones.
// Audit index usage (requires MongoDB 3.2+)
// Run for at least 24 hours of production traffic
db.orders.aggregate([
{ $indexStats: {} },
{
$project: {
name: 1,
accesses: "$accesses.ops",
since: "$accesses.since",
key: 1
}
},
{ $sort: { accesses: 1 } }
])
// Indexes with accesses: 0 or very low counts are candidates for removal
// Always verify during a full weekly traffic cycle before dropping
// Check index sizes in memory
db.orders.stats({ indexDetails: true }).indexSizes
// Example output:
// { "_id_": 2048000, "status_1_user_id_1_created_at_-1": 8192000, "old_unused_idx": 15360000 }
TTL Indexes for Automatic Expiry
TTL (Time To Live) indexes automatically delete documents after a specified duration. For session data, temporary tokens, or time-limited events, TTL indexes eliminate the need for a separate cleanup job (and its associated compute cost).
// Expire documents 7 days after created_at
db.sessions.createIndex(
{ created_at: 1 },
{ expireAfterSeconds: 604800 } // 7 days
)
// Expire at a specific time stored in the document
// Document must contain: { expireAt: ISODate("2026-04-01T00:00:00Z") }
db.temp_tokens.createIndex(
{ expireAt: 1 },
{ expireAfterSeconds: 0 }
)
// TTL background task runs every 60 seconds (not real-time)
// For applications needing sub-minute precision, supplement with application-level checks
Partial Indexes to Reduce Working Set
A partial index only indexes documents matching a filter expression. This is the MongoDB equivalent of PostgreSQL’s partial index — it dramatically reduces index size when a small subset of documents is frequently queried.
// Only index active orders (~2% of the total collection)
// The other ~98% of documents (completed, cancelled) never need this index
db.orders.createIndex(
{ user_id: 1, created_at: -1 },
{
partialFilterExpression: {
status: { $in: ["active", "pending"] }
}
}
)
// Sparse index: only index documents where field exists
// Useful for optional fields to avoid null entries in index
db.users.createIndex(
{ premium_expires_at: 1 },
{ sparse: true } // Only indexes users with premium_expires_at field
)
// Memory impact: index with 50k active orders (2% of 2.5M total)
// vs full index on all 2.5M documents — roughly 50x smaller
Memory and WiredTiger Cache Sizing
WiredTiger is MongoDB’s default storage engine. Its performance is fundamentally governed by how much of the working set fits in cache.
Cache Configuration
The default cache size is the larger of 256 MB or 50% of available RAM minus 1 GB. For a 16 GB instance:
max(256MB, (16GB - 1GB) × 0.5) = max(256MB, 7.5GB) = 7.5GB
Adjust cache size for workloads where MongoDB shares an EC2 instance with other processes, or where you want to reserve more OS buffer cache for filesystem operations:
# mongod.conf for production EC2 deployment
storage:
dbPath: /var/lib/mongodb
journal:
enabled: true
wiredTiger:
engineConfig:
cacheSizeGB: 12 # Explicit cache: 12 GB of 16 GB total RAM
# Leave 4 GB for OS, filesystem cache, and connection overhead
collectionConfig:
blockCompressor: snappy # snappy: best CPU/compression tradeoff
indexConfig:
prefixCompression: true # Prefix compression for index keys (30-50% savings)
net:
port: 27017
bindIp: 0.0.0.0
tls:
mode: requireTLS
certificateKeyFile: /etc/ssl/mongodb/server.pem
CAFile: /etc/ssl/mongodb/ca.pem
security:
authorization: enabled
keyFile: /etc/mongodb/keyfile # For replica set internal auth
replication:
replSetName: "rs0"
operationProfiling:
mode: slowOp
slowOpThresholdMs: 100
setParameter:
enableLocalhostAuthBypass: 0
When to Scale Up vs Scale Out
Scale up (larger instance) when:
- Working set (actively queried documents + all indexes) can fit on a larger instance
- Write operations are single-document (benefit from faster CPU on larger instance)
- Sharding complexity outweighs the instance cost difference
Scale out (sharding) when:
- Write throughput genuinely saturates the primary on the largest viable instance
- Working set exceeds ~400 GB (requires M200+ tier on Atlas, extremely expensive)
- Compliance requires data residency per region/shard
A useful working set sizing query:
// Check WiredTiger cache hit ratio (should be >95%)
db.serverStatus().wiredTiger.cache["pages read into cache"]
db.serverStatus().wiredTiger.cache["pages requested from the cache"]
// Atlas metrics: look for "Cache Usage" — if consistently >90% of max cache,
// your working set exceeds cache size and you need a larger instance
// Estimate working set size
let stats = db.runCommand({ dbStats: 1, scale: 1048576 }) // in MB
print("Data size: " + stats.dataSize + " MB")
print("Index size: " + stats.indexSize + " MB")
print("Total working set estimate: " + (stats.dataSize + stats.indexSize) + " MB")
Write-Heavy vs Read-Heavy Tuning
MongoDB’s replica set architecture supports different consistency and performance tradeoffs depending on whether your workload is write-dominated or read-dominated.
Write Concern Tuning
Write concern controls how many nodes must acknowledge a write before it is considered successful.
// w: 1 — Acknowledge from primary only
// Fastest, ~1-2ms latency, but risk of data loss if primary fails before replication
db.events.insertOne(
{ type: "page_view", user_id: "u123", ts: new Date() },
{ writeConcern: { w: 1, j: false } } // No journal flush either
)
// Use for: high-volume analytics events, logs where occasional loss is acceptable
// w: "majority" — Acknowledge from majority of replica set nodes
// Slower, ~5-15ms latency (round trip to secondary), zero data loss guarantee
db.orders.insertOne(
{ total: 99.99, status: "pending", user_id: "u123" },
{ writeConcern: { w: "majority", j: true } } // With journal flush
)
// Use for: financial transactions, order records, user data
// Cost implication: w:"majority" adds ~10ms latency per write
// At 10,000 writes/second that is ~100 seconds of cumulative blocked time
// per wall-clock second across concurrent operations, which on
// high-concurrency workloads may require a larger instance or write batching
Read Preference for Secondary Routing
Routing reads to secondaries reduces primary load without adding instances:
// Route analytics reads to secondary (allows eventual consistency)
const analyticsDb = client.db("mydb", {
readPreference: "secondaryPreferred",
readConcern: { level: "local" } // Accept slightly stale data
})
// Route OLTP reads to primary (requires consistent data)
const transactionDb = client.db("mydb", {
readPreference: "primary",
readConcern: { level: "majority" }
})
// Tag-based routing: send reporting queries to tagged secondary
// Configure replica set member tags in mongod.conf:
// replication.members[2].tags: { use: "reporting" }
const reportingDb = client.db("mydb", {
readPreference: new ReadPreference("secondary", [{ use: "reporting" }])
})
Change Streams for Event-Driven Patterns
Change streams provide a real-time feed of changes to a collection. For event-driven architectures, using change streams avoids polling queries that run on a schedule and consume unnecessary read capacity.
// Instead of: SELECT * FROM events WHERE processed = false (polling every 5 seconds)
// Use change streams to react immediately to new inserts
const changeStream = db.collection("orders").watch([
{ $match: { operationType: "insert" } },
{
$match: {
"fullDocument.status": "pending",
"fullDocument.total": { $gt: 100 }
}
}
])
changeStream.on("change", async (change) => {
await processOrder(change.fullDocument)
})
// Change streams require replica set or sharded cluster
// They use the oplog — ensure oplog is sized for expected replication lag windowAggregation Pipeline Optimization
Aggregation pipelines can be efficient or catastrophically expensive depending on stage ordering. The rule: push filtering and projection as early as possible to reduce document volume in subsequent stages.
Optimized Aggregation Pipeline
// Scenario: Find top-10 products by revenue in the last 30 days
// for US customers with order total > $50
// UNOPTIMIZED: $match late, processes all documents through $lookup
db.orders.aggregate([
{
$lookup: {
from: "customers",
localField: "customer_id",
foreignField: "_id",
as: "customer"
}
},
{ $unwind: "$customer" },
{
$match: {
"customer.country": "US",
created_at: { $gte: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000) },
total: { $gt: 50 }
}
},
{
$group: {
_id: "$product_id",
revenue: { $sum: "$total" }
}
},
{ $sort: { revenue: -1 } },
{ $limit: 10 }
])
// OPTIMIZED: $match first, $project early to reduce document size,
// $limit after $group to cap work
db.orders.aggregate([
// Stage 1: Filter early — only 30-day US orders > $50
// This uses index { created_at: 1, total: 1 } efficiently
{
$match: {
created_at: { $gte: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000) },
total: { $gt: 50 }
}
},
// Stage 2: Project only needed fields before lookup (reduces data in pipeline)
{
$project: {
customer_id: 1,
product_id: 1,
total: 1,
_id: 0
}
},
// Stage 3: Lookup only needed customer data
{
$lookup: {
from: "customers",
localField: "customer_id",
foreignField: "_id",
as: "customer",
pipeline: [
{ $match: { country: "US" } }, // Filter within lookup
{ $project: { country: 1 } } // Only return needed field
]
}
},
// Stage 4: Remove non-US results (small set after lookup filter)
{ $match: { "customer.0": { $exists: true } } },
// Stage 5: Group and sum
{
$group: {
_id: "$product_id",
revenue: { $sum: "$total" }
}
},
{ $sort: { revenue: -1 } },
{ $limit: 10 }
])
// Verify with explain
db.orders.explain("executionStats").aggregate([/* pipeline */])
// Check: nReturned vs totalKeysExamined ratio should be close to 1:1
// High totalKeysExamined with low nReturned = missing or wrong index
Edge Cases That Cause MongoDB Costs to Spiral
Hot Shard Detection and Remediation
A hot shard in a sharded cluster is the most insidious cost problem. You pay for N shards but one shard does 70%+ of the work, requiring a much larger instance to keep up — while other shards sit mostly idle.
# Detect hot shards using mongostat (run from mongos)
mongostat --host mongos.myapp.internal:27017 \
--username admin \
--password "${MONGO_ADMIN_PASSWORD}" \
--authenticationDatabase admin \
--discover \
--rowcount 5 \
2>/dev/null | grep -v "^$"
# Look for: one shard with insert/update/delete counts 5x higher than others
# Example output showing shard imbalance:
# shard01: insert: 850, update: 1200, delete: 300
# shard02: insert: 40, update: 60, delete: 15
# shard03: insert: 38, update: 55, delete: 12
When you detect a hot shard, the fix depends on the cause:
// Cause: Monotonically increasing shard key (e.g., ObjectId, timestamp)
// Fix: Add hashed field to shard key (requires resharding in MongoDB 6.0+)
// MongoDB 6.0+ online resharding:
db.adminCommand({
reshardCollection: "mydb.events",
key: { user_id: "hashed", created_at: 1 } // Distribute by user_id hash
})
// Check resharding progress
db.getSiblingDB("admin").aggregate([
{ $currentOp: { allUsers: true, idleConnections: false } },
{ $match: { type: "op", "originatingCommand.reshardCollection": { $exists: true } } }
])
// For MongoDB < 6.0: no online resharding exists.
// Must migrate collection to new sharded collection with correct shard key.
// Use mongomirror or mongodump/mongorestore with background migration.
Oplog Overflow: Silent Data Integrity Risk
The oplog (operations log) is a capped collection that records all write operations. Replica set secondaries replicate by reading the oplog. If a secondary falls behind the primary (replication lag) and the oplog window is smaller than the lag, the secondary cannot resume replication — it requires a full resync.
A full resync copies the entire database from primary, consuming significant bandwidth, I/O, and time. During resync, the resyncing node is unavailable.
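Before inspecting the live oplog, you can estimate how large it needs to be for a target retention window. A back-of-envelope sketch, assuming you have measured your average oplog churn rate (for example by sampling db.oplog.rs.stats().size over an interval); the helper and its inputs are illustrative:

```javascript
// Required oplog size (GB) for a target retention window.
// churnBytesPerSec: average oplog bytes written per second (measure in prod).
// safetyFactor: headroom for write bursts and batch jobs.
function requiredOplogGB(churnBytesPerSec, targetWindowHours, safetyFactor = 2) {
  const bytes = churnBytesPerSec * targetWindowHours * 3600 * safetyFactor
  return bytes / 1024 ** 3
}

// Example: 500 KB/s of oplog churn, 24-hour window, 2x safety margin
const gb = requiredOplogGB(500 * 1024, 24, 2)
console.log(gb.toFixed(1) + " GB") // round up when calling replSetResizeOplog
```

If the result exceeds your current oplog size, resize before a lag incident forces a full resync, not after.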
// Check current oplog size and estimated window
use local
db.oplog.rs.stats().maxSize // Max oplog size in bytes
db.oplog.rs.stats().size // Current oplog size in bytes
// Oplog window: time range covered by current oplog entries
db.adminCommand({ replSetGetStatus: 1 }).members.forEach(member => {
print(`Member: ${member.name}`);
print(` State: ${member.stateStr}`);
print(` Optime: ${member.optimeDate}`);
if (member.optimeDurable) {
print(` Optime Durable: ${member.optimeDurableDate}`);
}
})
// Calculate oplog window duration
const firstEntry = db.oplog.rs.find().sort({ $natural: 1 }).limit(1).next()
const lastEntry = db.oplog.rs.find().sort({ $natural: -1 }).limit(1).next()
// BSON Timestamps store seconds in .t — compare those rather than getTime()
const windowHours = (lastEntry.ts.t - firstEntry.ts.t) / 3600
print(`Oplog window: ${windowHours.toFixed(1)} hours`)
// Should be at least 2x your expected maximum replication lag
// Recommendation: 24+ hours for production
Increase oplog size on Atlas via the Atlas UI (Cluster → Edit → Advanced Options → Oplog Size). On self-managed EC2:
// Increase oplog size to 50 GB (51,200 MB)
// Run on primary
db.adminCommand({
replSetResizeOplog: 1,
size: 51200 // MB
})
Replication Lag Under Sustained Write Pressure
When write throughput on the primary consistently exceeds secondary replication capacity, lag grows monotonically until the oplog window is exhausted. Causes include: large batch inserts, index builds on the primary (replicated to secondaries), or secondary hardware that is slower than the primary.
// Monitor replication lag per member
db.adminCommand({ replSetGetStatus: 1 }).members.filter(m => m.stateStr === "SECONDARY").forEach(m => {
const lagSeconds = (new Date() - m.optimeDate) / 1000
print(`${m.name}: lag = ${lagSeconds.toFixed(1)}s`)
})
// If secondary consistently lags > 10 seconds on an M30, consider:
// 1. Check if secondary has same instance type as primary (Atlas auto-provisions same tier)
// 2. Check for index builds — these replicate to secondaries and can cause lag spikes
// 3. Throttle batch write operations:
// Instead of: insertMany(10000 docs at once)
// Use: batch inserts with artificial throttle
async function throttledBatchInsert(docs, batchSize = 100, delayMs = 50) {
for (let i = 0; i < docs.length; i += batchSize) {
const batch = docs.slice(i, i + batchSize)
await collection.insertMany(batch, { ordered: false })
if (i + batchSize < docs.length) {
await new Promise(resolve => setTimeout(resolve, delayMs))
}
}
}
Atlas Terraform Configuration
Terraform with the Atlas provider manages cluster creation, scaling, and configuration as code.
# main.tf
terraform {
required_providers {
mongodbatlas = {
source = "mongodb/mongodbatlas"
version = "~> 1.15"
}
}
}
provider "mongodbatlas" {
public_key = var.atlas_public_key
private_key = var.atlas_private_key
}
resource "mongodbatlas_project" "myapp" {
name = "myapp-production"
org_id = var.atlas_org_id
}
resource "mongodbatlas_cluster" "primary" {
project_id = mongodbatlas_project.myapp.id
name = "myapp-production"
# M30 tier: 2 vCPU, 8 GB RAM
provider_name = "AWS"
provider_region_name = "US_EAST_1"
provider_instance_size_name = "M30"
cloud_backup = true # Continuous cloud backup (PITR)
# MongoDB version
mongo_db_major_version = "7.0"
# Replication factor (3 = 1 primary + 2 secondaries)
replication_factor = 3
# Auto-scaling: scale up when CPU > 75%, scale down when < 10%
auto_scaling_compute_enabled = true
auto_scaling_compute_scale_down_enabled = true
provider_auto_scaling_compute_min_instance_size = "M30"
provider_auto_scaling_compute_max_instance_size = "M60"
# Storage auto-scaling
auto_scaling_disk_gb_enabled = true
# Advanced configuration
advanced_configuration {
javascript_enabled = false # Disable server-side JS for security
minimum_enabled_tls_protocol = "TLS1_2"
no_table_scan = false
oplog_size_mb = 51200 # 50 GB oplog
sample_refresh_interval_bi_connector = 300
transaction_lifetime_limit_seconds = 60
}
labels {
key = "environment"
value = "production"
}
labels {
key = "team"
value = "platform"
}
}
# Database user with least-privilege access
resource "mongodbatlas_database_user" "app_user" {
username = "myapp_service"
password = var.mongodb_app_password
project_id = mongodbatlas_project.myapp.id
auth_database_name = "admin"
roles {
role_name = "readWrite"
database_name = "myapp"
}
# Restrict to specific collections for extra security
roles {
role_name = "read"
database_name = "myapp"
collection_name = "audit_log"
}
scopes {
name = mongodbatlas_cluster.primary.name
type = "CLUSTER"
}
}
# Network access: restrict to VPC CIDR
resource "mongodbatlas_project_ip_access_list" "vpc" {
project_id = mongodbatlas_project.myapp.id
cidr_block = var.vpc_cidr # e.g., "10.0.0.0/16"
comment = "VPC private subnets"
}
# Atlas VPC peering with AWS VPC
resource "mongodbatlas_network_peering" "aws_peer" {
project_id = mongodbatlas_project.myapp.id
accepter_region_name = "us-east-1"
provider_name = "AWS"
route_table_cidr_block = var.vpc_cidr
vpc_id = var.vpc_id
aws_account_id = var.aws_account_id
container_id = mongodbatlas_cluster.primary.container_id
}
output "connection_string" {
value = mongodbatlas_cluster.primary.connection_strings[0].standard_srv
sensitive = true
}Cost Optimization Quick Reference
For a production MongoDB deployment on AWS, these are the highest-leverage actions ordered by effort-to-impact:
| Action | Effort | Impact |
|---|---|---|
| Enable Atlas auto-scaling (M30→M60 range) | 15 min | Avoid over-provisioning by 40% |
| Audit and drop zero-use indexes | 2 hours | Reduce working set, defer instance upgrade |
| Add TTL indexes for temporary data | 1 hour | Eliminate cleanup jobs, reduce storage |
| Move to partial indexes for filtered queries | 3 hours | Reduce cache pressure on high-volume collections |
| Add hashed prefix to shard key if hot shard detected | 1 day | Rebalance write load across shards |
| Switch to w:1 for non-critical high-volume writes | 2 hours | Reduce write latency, lower primary CPU |
| Resize oplog to 24-hour window | 30 min | Prevent full resync incidents |
For teams evaluating MongoDB alongside DynamoDB for new workloads, see our DynamoDB single-table design guide for a direct comparison of when each fits best. For the broader cost reduction framework across all AWS services, the AWS cost control architecture playbook covers cross-service patterns.
AWS Cloud Architect & AI Expert
AWS-certified cloud architect and AI expert with deep expertise in cloud migrations, cost optimization, and generative AI on AWS.



