NoSQL & Vector Database
MongoDB with AWS
MongoDB Atlas on AWS: document database, native vector search for RAG, stream processing, and dedicated Search Nodes — with AWS KMS and PrivateLink throughout.
Last updated: April 29, 2026 | Author: FactualMinds Cloud Integration Team | Reviewed by: FactualMinds AWS-certified architects (Solutions Architect – Professional)
## MongoDB Atlas on AWS
MongoDB Atlas is the officially supported way to run MongoDB in the cloud. Hosted on AWS (as well as Azure and GCP), Atlas removes the operational overhead of running replica sets and sharded clusters while adding a developer platform: Atlas Search, Atlas Vector Search, Atlas Stream Processing, Atlas Edge Server, and Queryable Encryption. For AWS customers, the standard 2026 integration is PrivateLink plus AWS KMS customer-managed keys plus IAM authentication — no Atlas credentials ever live on disk.
## What's new for MongoDB on AWS in 2026
- **MongoDB 8.0** — roughly 32% faster write performance on mixed workloads versus 7.0, improved time-series collections, block-level compression defaults, and a new Bulk Write API.
- **Atlas Vector Search GA** — hybrid queries combining vector similarity with MongoDB filters in one aggregation stage; HNSW index type; integrates with Amazon Bedrock (Titan Embeddings v2, Cohere Embed) and any OpenAI-compatible embedding API.
- **Search Nodes GA** — dedicated nodes that isolate search/vector workloads from operational reads and writes; scale independently.
- **Atlas Stream Processing GA** — stream processing in the MongoDB aggregation framework; sources from Kafka, MSK, and Atlas change streams.
- **Queryable Encryption GA**: equality queries on client-side-encrypted fields; range queries, previously in public preview, are generally available as of MongoDB 8.0.
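- **Atlas Edge Server**: disconnected-capable local MongoDB that syncs with Atlas for retail, manufacturing, and healthcare edge workloads. Confirm availability before designing around it: MongoDB announced the deprecation of Edge Server, together with Atlas Device Sync, in September 2024.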
- **Atlas Data Lake / Online Archive** — tier cold data to S3 automatically; query both warm and cold data through the same connection.
- **AWS PrivateLink improvements** — lower-latency paths and support for additional regions; cross-region PrivateLink for multi-region Atlas deployments.
- **IAM authentication** — assume-role flow means apps running on EC2, ECS, EKS, or Lambda authenticate to Atlas without a static credential in AWS Secrets Manager.
## Why MongoDB for AWS applications
**Flexible schema + rich querying**
- Documents store nested data naturally; aggregation pipeline is expressive and index-aware.
- Official drivers for Node.js, Python, Java, Go, Rust, .NET, and Ruby, among other languages.
**AI-ready out of the box**
- Atlas Vector Search hosts embeddings alongside the source documents — fewer moving parts than a separate vector DB.
- Native integrations with Bedrock, LangChain, LlamaIndex, Haystack, and Semantic Kernel.
- Hybrid search (vector + text + metadata) in one query.
**Operational maturity**
- Multi-region replica sets; automatic failover within ~30 seconds.
- Continuous oplog-based backup with up to 35-day point-in-time recovery.
- Sharded clusters for horizontal scale into multi-TB/PB ranges.
## Atlas Vector Search — the decision that actually matters
Atlas Vector Search is usually the right call when:
- The source-of-truth documents already live in MongoDB.
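- The working set of vectors fits in memory at an instance size you can afford (rule of thumb: ~1.5–2 KB per vector for float32 at roughly 384–512 dimensions, plus HNSW graph overhead; float32 costs 4 bytes per dimension, so higher-dimensional embeddings scale up proportionally).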
- You need hybrid filters (e.g., `userId = X AND vector similarity > 0.8`) in one query.
- You are using Amazon Bedrock for embeddings and want minimal pipeline glue.
Use **Amazon OpenSearch Service k-NN** when lexical BM25 search matters as much as vector similarity, you need fine-grained HNSW/IVF tuning, or you are already an OpenSearch shop.
Use **pgvector on Amazon RDS / Aurora Postgres** when the workload is predominantly relational and vector search is a secondary capability — no new database, no new ops surface.
Use **Amazon S3 Vectors** (new in 2025) for extremely large, cold vector corpora where query latency budget is in the hundreds of ms and storage cost dominates.
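The memory rule of thumb above can be turned into a quick estimate. The sketch below is a back-of-envelope calculation, not an Atlas sizing tool: the function names are hypothetical, the 1.5 overhead factor is the HNSW multiplier this page uses in its failure-modes section, and the 1,024-dimension figure is an assumed embedding size that you should replace with your model's actual output.

```javascript
// Rough memory estimate for an in-RAM HNSW vector index.
// overheadFactor = 1.5 is this page's rule of thumb, not a measured value.
function estimateVectorIndexBytes({
  numVectors,
  dimensions,
  bytesPerDimension = 4, // float32 stores 4 bytes per dimension
  overheadFactor = 1.5,
}) {
  const rawBytesPerVector = dimensions * bytesPerDimension;
  return numVectors * rawBytesPerVector * overheadFactor;
}

function toGiB(bytes) {
  return bytes / 1024 ** 3;
}

// Assumed workload: 5 million chunks embedded at 1,024 dimensions.
const estimate = estimateVectorIndexBytes({ numVectors: 5_000_000, dimensions: 1024 });
console.log(`${toGiB(estimate).toFixed(1)} GiB`); // prints "28.6 GiB"
```

If the result is close to or above the RAM of the tier you planned, that is the signal to look at a larger tier, dedicated Search Nodes, or a lower-dimensional embedding.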
## Atlas Search Nodes
Search Nodes are dedicated Atlas instances that host Atlas Search and Atlas Vector Search indexes, isolated from the base cluster. Deploy them when:
- Search or vector workloads are bursty enough to steal memory and IOPS from operational queries.
- Memory requirements for the vector index diverge from operational node sizing.
- You want independent scaling — scale search without resizing the base tier.
Below ~M30 or with light search traffic, the base nodes are typically enough.
## Atlas Stream Processing
Atlas Stream Processing (ASP) runs stream-processing pipelines written in the MongoDB aggregation framework over Kafka, MSK, or Atlas change streams. Outputs can land in collections, Kafka topics, or HTTP webhooks.
Use ASP when:
- Source-of-truth data is already in MongoDB and transformations are JSON-document-shaped.
- You want to avoid a Kinesis Data Streams + Firehose + Lambda + output store stack for a simple pipeline.
Use **Amazon Kinesis Data Streams or MSK with Amazon Managed Service for Apache Flink** when you need cross-source joins, exactly-once semantics at scale, or a full event-sourcing architecture.
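To make the "simple pipeline" case concrete, here is a minimal sketch of a stream processor definition that reads a Kafka topic, filters it, and writes to an Atlas collection. The connection names, topic, and field names are hypothetical and would need to exist in your stream processing connection registry; the stage shapes follow the aggregation-style syntax described above and should be checked against the current Atlas Stream Processing reference.

```javascript
// Hypothetical connection names registered for the stream processing instance.
const KAFKA_CONNECTION = 'mskOrders';
const ATLAS_CONNECTION = 'atlasOrders';

const paidOrdersPipeline = [
  // Read events from a Kafka/MSK topic.
  { $source: { connectionName: KAFKA_CONNECTION, topic: 'orders' } },
  // Keep only paid orders.
  { $match: { status: 'PAID' } },
  // Trim each event to the fields the downstream collection needs.
  { $project: { _id: 0, orderId: 1, customerId: 1, total: 1 } },
  // Write the results into an Atlas collection.
  { $merge: { into: { connectionName: ATLAS_CONNECTION, db: 'orders', coll: 'paid_orders' } } },
];

// In mongosh connected to the stream processing instance (not run here):
//   sp.createStreamProcessor('paidOrders', paidOrdersPipeline);
//   sp.paidOrders.start();
const stageNames = paidOrdersPipeline.map((stage) => Object.keys(stage)[0]);
console.log(stageNames.join(' -> ')); // prints "$source -> $match -> $project -> $merge"
```

Because the pipeline is ordinary aggregation syntax, the `$match` and `$project` stages can be prototyped against a regular collection before being attached to a stream source.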
## Queryable Encryption
Queryable Encryption encrypts fields on the client before they leave the application; the server sees only ciphertext, yet still supports equality queries and, as of MongoDB 8.0, range queries.
- Ideal for SSNs, tax IDs, email addresses under HIPAA or PCI scope.
- Pair with AWS KMS customer-managed keys as the encryption-key provider.
- Compare to client-side field encryption with the AWS Database Encryption SDK for DynamoDB (formerly the DynamoDB Encryption Client): similar intent, different query surface.
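As an illustration of what "queryable" means at the schema level, the sketch below shows an encrypted-fields definition with two equality-queryable fields and one field that is encrypted but not queryable. The field names and the KMS key placeholder are hypothetical, and data encryption key IDs are deliberately left out: each field also needs a key, which the driver's encryption helpers can create. Treat this as a shape to compare against the driver documentation rather than a complete setup.

```javascript
// Hypothetical field names for a patient-records collection.
const encryptedFields = {
  fields: [
    { path: 'ssn', bsonType: 'string', queries: { queryType: 'equality' } },
    { path: 'taxId', bsonType: 'string', queries: { queryType: 'equality' } },
    { path: 'notes', bsonType: 'string' }, // encrypted, but not queryable
  ],
};

// AWS KMS customer-managed key that wraps the data encryption keys.
// The ARN is a placeholder and must be replaced with your own key.
const awsMasterKey = { key: '<your-kms-key-arn>', region: 'us-east-1' };

// Helper: list the fields an application may filter on.
function queryablePaths(schema) {
  return schema.fields.filter((field) => field.queries).map((field) => field.path);
}

console.log(queryablePaths(encryptedFields)); // prints [ 'ssn', 'taxId' ]
```

Only fields that declare a `queries` entry can appear in a filter, so deciding which fields need to be searchable is a schema-design step that happens before the collection is created.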
## Connectivity patterns on AWS
**AWS PrivateLink (preferred)**
- Atlas cluster exposed through a VPC endpoint in your VPC; traffic stays on AWS backbone.
- Single-region and cross-region options.
- Interface endpoint charges (hourly plus per-GB) are typically offset by the NAT gateway data-processing costs you avoid.
**VPC peering**
- Supported but less flexible; prefer PrivateLink for new deployments.
**Public access + IP allow list**
- Acceptable for dev and staging; avoid in production.
**Authentication**
- **IAM auth to Atlas**: EC2/ECS/EKS/Lambda assumes a role and Atlas validates against the role's ARN.
- SCRAM-SHA-256 username/password (stored in AWS Secrets Manager) as fallback.
- X.509 client certs for machine-to-machine auth.
## MongoDB vs DynamoDB vs RDS — when to pick which
| Consideration | MongoDB Atlas | DynamoDB | RDS / Aurora Postgres |
| --------------------- | -------------------------------------- | -------------------------------------- | -------------------------- |
| Data model | Flexible document | Key-value / wide column | Relational |
| Schema | Dynamic | Fixed on primary keys | Enforced |
| Vector search | Atlas Vector Search | Not native | pgvector extension |
| Complex querying | Aggregation pipeline | Limited; use OpenSearch/Athena to join | Full SQL + JSONB |
| Operational overhead | Managed by MongoDB Inc. | Fully managed by AWS | Managed by AWS |
| Geo-replication | Global Clusters / Multi-region | Global Tables (strong in AWS) | Aurora Global Database |
| Typical starting cost | ~$57/month M10 | <$25/month light usage | ~$30/month db.t4g.micro |
| Best-fit AWS persona | Teams already on MongoDB, RAG builders | AWS-native high-scale KV workloads | Relational-first / fintech |
## Reference architecture (2026 default)
```
AWS VPC                               MongoDB Atlas project (single region)
─────────                             ─────────────────────────────────
Lambda / ECS / EKS                    Atlas replica set (3 nodes, 3 AZ)
   │  IAM role (assumeRole)             ├── Primary
   │                                    ├── Secondary → driver readPreference
   │                                    └── Secondary
   ▼                                          │
PrivateLink VPC endpoint  ─────►      Atlas PrivateLink endpoint
  com.amazonaws.<region>.mongodb      (per-region; cross-region available)
                                              │
                                      Search Nodes (dedicated)
                                        └── Atlas Vector Search index
                                        └── Atlas Search (BM25)
                                              │
                                      Online Archive ──► S3 (cold tier)
                                              │
                                      Backup snapshot store (Atlas-managed)
                                        └── 35-day PITR window
```
**Encryption.** TLS client → Atlas (mandatory; Atlas defaults to a TLS 1.2 minimum, so confirm the cluster setting if you require 1.3); at-rest with AWS KMS customer-managed key (configured per cluster); Queryable Encryption for client-side field encryption with KMS-wrapped DEKs.
## Implementation
**Minimal — Node.js + IAM role auth (Lambda, ECS, EKS):**
```javascript
import { MongoClient } from 'mongodb';

const uri =
  'mongodb+srv://cluster0.mongodb.net/?' +
  'authSource=%24external&authMechanism=MONGODB-AWS&retryWrites=true&w=majority';

const client = new MongoClient(uri, {
  maxPoolSize: 50,
  minPoolSize: 5,
  serverSelectionTimeoutMS: 5000,
  socketTimeoutMS: 45000,
  retryWrites: true,
  retryReads: true,
});

export const handler = async (event) => {
  // Reuse the connection across Lambda invocations (lifetime of the execution context)
  await client.connect();
  const db = client.db('orders');
  const result = await db.collection('orders').findOne({ _id: event.id });
  return result;
};
```
**Production — pooled connections, structured retry, vector search:**
```javascript
import { MongoClient, ReadPreference } from 'mongodb';

const client = new MongoClient(process.env.MONGODB_URI, {
  maxPoolSize: 100,
  minPoolSize: 10,
  maxIdleTimeMS: 60_000,
  serverSelectionTimeoutMS: 5_000,
  socketTimeoutMS: 30_000,
  retryWrites: true,
  retryReads: true,
  readPreference: ReadPreference.SECONDARY_PREFERRED,
  readConcern: { level: 'majority' },
  writeConcern: { w: 'majority', wtimeoutMS: 10_000 },
  appName: 'orders-api', // shows up in Atlas Performance Advisor
});

await client.connect();

// Returns the 10 nearest documents for one tenant, with similarity scores.
async function vectorSearch(queryEmbedding, tenantId) {
  return client
    .db('rag')
    .collection('documents')
    .aggregate([
      {
        $vectorSearch: {
          index: 'embeddings_idx',
          queryVector: queryEmbedding,
          path: 'embedding',
          numCandidates: 200,
          limit: 10,
          // tenantId must be declared as a filter field in the vector index definition
          filter: { tenantId },
        },
      },
      { $project: { _id: 1, text: 1, score: { $meta: 'vectorSearchScore' } } },
    ])
    .toArray();
}
```
For Lambda, declare the client outside the handler so the connection pool survives across invocations within an execution context. Use Provisioned Concurrency for latency-sensitive paths to avoid cold-start handshake (TLS + replica-set discovery is ~200–400 ms).
## Failure modes & resilience
**1. Replica-set primary failover (~30s).** The replica set elects a new primary; the driver detects the topology change and routes writes to it. With `retryWrites: true` (the default in current official drivers) most application writes are transparently retried. Mitigation: keep `serverSelectionTimeoutMS` tight (5 s) so failover surfaces quickly, accepting that a write fails if the election outlasts the timeout; alarm on `connections.failed` spikes.
**2. Oplog window starvation.** A long-running migration or analytics scan that lags secondary replication can cause the secondary to fall outside the oplog window (default ~24h on M30). Recovery: re-sync the secondary (Atlas handles automatically, but takes hours on TB-scale). Prevention: monitor `oplogWindow.hours` — alarm if `< 24`.
**3. Vector-index memory pressure.** Vector indexes load into RAM. A working set that exceeds node memory causes evictions and p99 latency to climb 10–100×. Prevention: size with the rule of thumb `bytes per vector × number of vectors × 1.5 (HNSW overhead)`, where bytes per vector is 4 × dimensions for float32 (the ~1.5–2 KB figure corresponds to roughly 384–512 dimensions); move to dedicated Search Nodes when the math says so.
**4. PrivateLink endpoint exhaustion.** Each Atlas PrivateLink endpoint supports a finite number of concurrent connections per AZ. Symptom: connection timeouts under burst load while CPU is idle. Mitigation: scale to multiple endpoints per region; ensure connection pool `maxPoolSize` × number-of-Lambda-concurrency ≤ endpoint capacity.
**5. IAM role token expiry mid-query.** STS credentials expire (default 1h, max 12h). Long-running aggregations can outlive the token. Mitigation: use connection lifetime ≤ token TTL, or set `maxIdleTimeMS` to force pool refresh; for batch jobs, raise the role's maximum session duration (`MaxSessionDuration`, up to 12 hours) and request it with `DurationSeconds` when assuming the role.
**6. Multi-region read-preference traps.** `readPreference: secondary` against a multi-region cluster can route reads to a remote secondary, adding 100+ ms latency. Use `readPreferenceTags` to pin reads to the local region's secondary; Atlas's predefined replica set tags write AWS regions in the form `region: "US_EAST_1"`, so check the tag values on your cluster before relying on them.
**7. Driver version drift.** Atlas server features (vector search aggregation stages, time-series collections, queryable encryption) require minimum driver versions. Pin and bump deliberately; the official driver release notes document server-version compatibility.
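The connection arithmetic in failure mode 4 is easy to check before a launch. The sketch below is a minimal calculation under stated assumptions: `endpointCapacity` is a hypothetical number that you would need to confirm for your own deployment, and the function name is illustrative rather than part of any driver or AWS API.

```javascript
// Worst-case connection count for a Lambda fleet behind one PrivateLink endpoint.
// endpointCapacity is a hypothetical limit; confirm the real figure for your deployment.
function connectionBudget({ maxPoolSize, lambdaConcurrency, endpointCapacity }) {
  const worstCase = maxPoolSize * lambdaConcurrency;
  return {
    worstCase,
    withinBudget: worstCase <= endpointCapacity,
    // Largest per-function pool that keeps the fleet under the limit.
    maxSafePoolSize: Math.max(1, Math.floor(endpointCapacity / lambdaConcurrency)),
  };
}

const budget = connectionBudget({
  maxPoolSize: 50,
  lambdaConcurrency: 200,
  endpointCapacity: 5000,
});
console.log(budget); // { worstCase: 10000, withinBudget: false, maxSafePoolSize: 25 }
```

Each Lambda execution environment handles one invocation at a time, so a pool far smaller than the server-style defaults is usually enough there; the larger pool sizes make more sense on ECS or EKS, where one process serves many concurrent requests.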
## Observability runbook
**Atlas-side alarms (configure via Atlas UI or Atlas Admin API):**
| Alarm | Threshold | Action |
| ----------------- | ---------------------- | ---------------------------------------------------------- |
| `Replication Lag` | `> 10s` for 5 min | Check secondary CPU / IOPS; recent schema migration? |
| `Oplog Window` | `< 24 hours` | Slow secondaries; consider larger oplog or fewer batch ops |
| `Cache Used` | `> 90%` for 15 min | Working set > RAM; resize tier or shard |
| `Connections` | `> 90%` of cluster max | Pool sizing; check for connection leaks |
| `Slow Operations` | sustained `> baseline` | Performance Advisor → suggested indexes |
**Datadog / CloudWatch correlation alarms:**
| Alarm | First action |
| ------------------------------------- | --------------------------------------------------------------------------- |
| App p99 query latency `> 500ms` | Atlas Performance Advisor → missing index? Consider Search Nodes for vector |
| Driver `MongoNetworkError` rate spike | Confirm PrivateLink endpoint health; check VPC route tables |
| Lambda `Init Duration` `> 1s` | Connection pool not reused — verify client outside handler |
| Vector search latency `> 200ms` | Index memory pressure; shard or move to dedicated Search Nodes |
**Debug path: "writes intermittently failing":**
1. Atlas → Cluster → Real-time → confirm primary is stable (no recent elections).
2. Driver logs: look for `MongoServerSelectionError` vs `MongoWriteConcernError`. Former = topology problem; latter = write didn't ack to majority.
3. Application: confirm `retryWrites: true` and `w: "majority"`. Without retryWrites, transient primary changes surface as user errors.
4. Multi-region: verify the region is listed in `priority` config; a non-priority region cannot accept writes after failover.
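Step 2 of the debug path can be encoded as a small triage helper for log processing. This is a sketch, not driver code: it matches on the error's `name` property, which the Node.js driver sets to the class name, and the suggested next steps simply restate the runbook above. In application code that imports the driver, `instanceof` checks against the exported error classes are the sturdier choice because they also catch subclasses.

```javascript
// Map a driver error to the runbook branch it belongs to.
function classifyWriteFailure(error) {
  switch (error?.name) {
    case 'MongoServerSelectionError':
      return { kind: 'topology', nextStep: 'Check for recent elections and PrivateLink health' };
    case 'MongoWriteConcernError':
      return { kind: 'write-concern', nextStep: 'Check replication lag; the write did not reach a majority' };
    case 'MongoNetworkError':
      return { kind: 'network', nextStep: 'Check VPC endpoint and route tables' };
    default:
      return { kind: 'unknown', nextStep: 'Inspect the full error and driver logs' };
  }
}

// Plain objects stand in for driver errors in this example.
console.log(classifyWriteFailure({ name: 'MongoServerSelectionError' }).kind); // prints "topology"
console.log(classifyWriteFailure({ name: 'MongoWriteConcernError' }).kind); // prints "write-concern"
```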
## Online Archive — the cost lever most teams miss
For collections with append-mostly time-series or audit data, configure Atlas Online Archive to tier records older than N days to S3-backed storage. Archived data stays queryable through the connection strings Atlas provides for the archive alone or for the cluster and archive unioned together. Typical impact: 60–80% cluster storage reduction on logs/audit/event data; queries on cold data cost more (per-GB scan), so partition Online Archive by date for predictable economics.
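A date-based archive rule with date-first partitioning might be described like this. The helper name is hypothetical, and the payload field names (`criteria`, `dateField`, `expireAfterDays`, `partitionFields`) reflect one reading of the Atlas Admin API's Online Archive resource; treat them as assumptions and check the current API reference before sending anything.

```javascript
// Build an archive rule that tiers documents older than N days,
// partitioned by the same date field so cold queries scan less data.
function buildArchiveRule({ dbName, collName, dateField, expireAfterDays }) {
  if (!Number.isInteger(expireAfterDays) || expireAfterDays < 1) {
    throw new RangeError('expireAfterDays must be a positive integer');
  }
  return {
    dbName,
    collName,
    criteria: { type: 'DATE', dateField, dateFormat: 'ISODATE', expireAfterDays },
    partitionFields: [{ fieldName: dateField, order: 0 }],
  };
}

// Hypothetical audit collection: archive events after 90 days.
const auditRule = buildArchiveRule({
  dbName: 'audit',
  collName: 'events',
  dateField: 'createdAt',
  expireAfterDays: 90,
});
console.log(JSON.stringify(auditRule, null, 2));
```

Putting the date field first in the partition list matches the advice above: queries that filter on a date range touch fewer archived files, which keeps per-GB scan costs predictable.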
## When MongoDB Atlas is NOT the right call
- You have an AWS-only workload with simple key-value access patterns and strict AWS-native governance — **DynamoDB** is cheaper and has zero cross-vendor review overhead.
- You need strict SQL and relational integrity — **Aurora Postgres** (plus pgvector if you need vectors) is the better fit.
- Your compliance team will not approve an external data controller — **Amazon DocumentDB (with MongoDB compatibility)** covers the API surface while keeping data inside AWS (note: not 100% feature-compatible with modern MongoDB).
## Pricing on AWS (2026 ballparks)
Verify current pricing at [mongodb.com/pricing](https://www.mongodb.com/pricing).
- **M10** (dev / small prod): ~$57/month
- **M30** (growing prod): ~$400/month
- **M50+** (high-throughput): $1,000+/month
- **Search Nodes**: billed separately; sized independently.
- **Serverless instances**: per-operation + storage; good for low-traffic or bursty apps.
- **Data transfer**: in-region same-cloud free; cross-region and cross-cloud chargeable.
## Best practices
**Data modeling**
- Embed small, high-affinity sub-documents; reference for large sub-entities or shared data.
- Avoid unbounded arrays; use a separate collection with a compound index instead.
- Time-series collections for IoT and telemetry.
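The embed-versus-reference guidance can be illustrated with plain documents. Everything here is hypothetical (collection shapes, field names, IDs): line items are small and always read with the order, so they are embedded, while the unbounded event history lives in its own collection with a compound index. The lookup function imitates in memory what that index would serve.

```javascript
// Embedded: bounded, high-affinity data read together with the parent.
const order = {
  _id: 'order-1001',
  customerId: 'cust-42',
  items: [
    { sku: 'SKU-1', qty: 2, price: 19.99 },
    { sku: 'SKU-2', qty: 1, price: 5.0 },
  ],
};

// Referenced: unbounded history, one document per event in a separate collection.
const orderEvents = [
  { orderId: 'order-1001', at: new Date('2026-01-05T10:00:00Z'), type: 'CREATED' },
  { orderId: 'order-1001', at: new Date('2026-01-06T09:30:00Z'), type: 'SHIPPED' },
  { orderId: 'order-2002', at: new Date('2026-01-06T11:00:00Z'), type: 'CREATED' },
];

// Compound index spec supporting "events for one order, newest first".
const orderEventsIndex = { orderId: 1, at: -1 };

// In-memory stand-in for find({ orderId }).sort({ at: -1 }).
function eventsForOrder(events, orderId) {
  return events
    .filter((event) => event.orderId === orderId)
    .sort((a, b) => b.at - a.at);
}

console.log(eventsForOrder(orderEvents, order._id).map((event) => event.type)); // prints [ 'SHIPPED', 'CREATED' ]
```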
**Performance**
- Create indexes guided by Performance Advisor.
- Use the aggregation pipeline; avoid fetching and processing client-side.
- Cache read-heavy endpoints with ElastiCache Redis for hot keys.
**Security**
- PrivateLink + IAM auth + customer-managed KMS keys for encryption-at-rest.
- Atlas Database Access with minimum-privilege roles.
- Queryable Encryption for regulated PII fields.
**Observability**
- Atlas built-in metrics, plus Datadog MongoDB integration or Prometheus exporter for unified dashboards with your AWS workload.
- Alert on replication lag, primary election count, oplog window, working-set-to-memory ratio.
## Related reading
- [Amazon S3 Vectors: native vector storage](/blog/amazon-s3-vectors-native-vector-storage/)
- [Amazon MemoryDB vector search](/blog/amazon-memorydb-vector-search/)
- [Amazon Bedrock Data Automation](/blog/amazon-bedrock-data-automation/)
## Related services
- [AWS Data Analytics](/services/aws-data-analytics/)
- [Generative AI on AWS](/services/generative-ai-on-aws/)
- [AWS Application Modernization](/services/aws-application-modernization/) MongoDB Atlas on AWS
MongoDB Atlas is the officially supported way to run MongoDB in the cloud. Hosted on AWS (as well as Azure and GCP), Atlas removes the operational overhead of running replica sets and sharded clusters while adding a developer platform: Atlas Search, Atlas Vector Search, Atlas Stream Processing, Atlas Edge Server, and Queryable Encryption. For AWS customers, the standard 2026 integration is PrivateLink plus AWS KMS customer-managed keys plus IAM authentication — no Atlas credentials ever live on disk.
What’s new for MongoDB on AWS in 2026
- MongoDB 8.0 — roughly 32% faster write performance on mixed workloads versus 7.0, improved time-series collections, block-level compression defaults, and a new Bulk Write API.
- Atlas Vector Search GA — hybrid queries combining vector similarity with MongoDB filters in one aggregation stage; HNSW index type; integrates with Amazon Bedrock (Titan Embeddings v2, Cohere Embed) and any OpenAI-compatible embedding API.
- Search Nodes GA — dedicated nodes that isolate search/vector workloads from operational reads and writes; scale independently.
- Atlas Stream Processing GA — stream processing in the MongoDB aggregation framework; sources from Kafka, MSK, and Atlas change streams.
- Queryable Encryption GA — equality queries on client-side-encrypted fields; range queries in public preview (2025).
- Atlas Edge Server — disconnected-capable local MongoDB that syncs with Atlas for retail, manufacturing, and healthcare edge workloads.
- Atlas Data Lake / Online Archive — tier cold data to S3 automatically; query both warm and cold data through the same connection.
- AWS PrivateLink improvements — lower-latency paths and support for additional regions; cross-region PrivateLink for multi-region Atlas deployments.
- IAM authentication — assume-role flow means apps running on EC2, ECS, EKS, or Lambda authenticate to Atlas without a static credential in AWS Secrets Manager.
Why MongoDB for AWS applications
Flexible schema + rich querying
- Documents store nested data naturally; aggregation pipeline is expressive and index-aware.
- Rich drivers for every major language; official Node, Python, Java, Go, Rust, .NET, Ruby drivers.
AI-ready out of the box
- Atlas Vector Search hosts embeddings alongside the source documents — fewer moving parts than a separate vector DB.
- Native integrations with Bedrock, LangChain, LlamaIndex, Haystack, and Semantic Kernel.
- Hybrid search (vector + text + metadata) in one query.
Operational maturity
- Multi-region replica sets; automatic failover within ~30 seconds.
- Continuous oplog-based backup with up to 35-day point-in-time recovery.
- Sharded clusters for horizontal scale into multi-TB/PB ranges.
Atlas Vector Search — the decision that actually matters
Atlas Vector Search is usually the right call when:
- The source-of-truth documents already live in MongoDB.
- The working-set of vectors fits in memory at an instance size you can afford (rule of thumb: ~1.5–2 KB per vector for float32 + HNSW graph overhead).
- You need hybrid filters (e.g.,
userId = X AND vector similarity > 0.8) in one query. - You are using Amazon Bedrock for embeddings and want minimal pipeline glue.
Use Amazon OpenSearch Service k-NN when lexical BM25 search matters as much as vector similarity, you need fine-grained HNSW/IVF tuning, or you are already an OpenSearch shop.
Use pgvector on Amazon RDS / Aurora Postgres when the workload is predominantly relational and vector search is a secondary capability — no new database, no new ops surface.
Use Amazon S3 Vectors (new in 2025) for extremely large, cold vector corpora where query latency budget is in the hundreds of ms and storage cost dominates.
Atlas Search Nodes
Search Nodes are dedicated Atlas instances that host Atlas Search and Atlas Vector Search indexes, isolated from the base cluster. Deploy them when:
- Search or vector workloads are bursty enough to steal memory and IOPS from operational queries.
- Memory requirements for the vector index diverge from operational node sizing.
- You want independent scaling — scale search without resizing the base tier.
Below ~M30 or with light search traffic, the base nodes are typically enough.
Atlas Stream Processing
Atlas Stream Processing (ASP) runs stream-processing pipelines written in the MongoDB aggregation framework over Kafka, MSK, or Atlas change streams. Outputs can land in collections, Kafka topics, or HTTP webhooks.
Use ASP when:
- Source-of-truth data is already in MongoDB and transformations are JSON-document-shaped.
- You want to avoid a Kinesis Data Streams + Firehose + Lambda + output store stack for a simple pipeline.
Use Amazon Kinesis Data Streams + MSK + Amazon Managed Service for Apache Flink when you need cross-source joins, exactly-once semantics at scale, or a full event-sourcing architecture.
Queryable Encryption
Queryable Encryption encrypts fields on the client before they leave the application; the server sees only ciphertext, yet still supports equality queries (GA) and range queries (public preview).
- Ideal for SSNs, tax IDs, email addresses under HIPAA or PCI scope.
- Pair with AWS KMS customer-managed keys as the encryption-key provider.
- Compare to field-level encryption in DynamoDB Encryption SDK — similar intent, different query surface.
Connectivity patterns on AWS
AWS PrivateLink (preferred)
- Atlas cluster exposed through a VPC endpoint in your VPC; traffic stays on AWS backbone.
- Single-region and cross-region options.
- Gateway fee offset by avoided NAT data-transfer costs.
VPC peering
- Supported but less flexible; prefer PrivateLink for new deployments.
Public access + IP allow list
- Acceptable for dev and staging; avoid in production.
Authentication
- IAM auth to Atlas — EC2/ECS/EKS/Lambda assumes a role and Atlas validates against the role”s ARN.
- SCRAM-SHA-256 username/password (stored in AWS Secrets Manager) as fallback.
- X.509 client certs for machine-to-machine auth.
MongoDB vs DynamoDB vs RDS — when to pick which
| Consideration | MongoDB Atlas | DynamoDB | RDS / Aurora Postgres |
|---|---|---|---|
| Data model | Flexible document | Key-value / wide column | Relational |
| Schema | Dynamic | Fixed on primary keys | Enforced |
| Vector search | Atlas Vector Search | Not native | pgvector extension |
| Complex querying | Aggregation pipeline | Limited; use OpenSearch/Athena to join | Full SQL + JSONB |
| Operational overhead | Managed by MongoDB Inc. | Fully managed by AWS | Managed by AWS |
| Geo-replication | Global Clusters / Multi-region | Global Tables (strong in AWS) | Aurora Global Database |
| Typical starting cost | ~$57/month M10 | <$25/month light usage | ~$30/month db.t4g.micro |
| Best-fit AWS persona | Teams already on MongoDB, RAG builders | AWS-native high-scale KV workloads | Relational-first / fintech |
Reference architecture (2026 default)
AWS VPC MongoDB Atlas project (single region)
───────── ─────────────────────────────────
Lambda / ECS / EKS Atlas replica set (3 nodes, 3 AZ)
│ IAM role (assumeRole) ├── Primary
│ ├── Secondary → driver readPreference
│ └── Secondary
▼ │
PrivateLink VPC endpoint ─────► Atlas PrivateLink endpoint
com.amazonaws.<region>.mongodb (per-region; cross-region available)
│
Search Nodes (dedicated)
└── Atlas Vector Search index
└── Atlas Search (BM25)
│
Online Archive ──► S3 (cold tier)
│
Backup snapshot store (Atlas-managed)
└── 35-day PITR window
Encryption. TLS 1.3 client → Atlas (mandatory); at-rest with AWS KMS customer-managed key (configured per cluster); Queryable Encryption for client-side field encryption with KMS-wrapped DEKs.
Implementation
Minimal — Node.js + IAM role auth (Lambda, ECS, EKS):
import { MongoClient } from 'mongodb';
const uri =
'mongodb+srv://cluster0.mongodb.net/?' +
'authSource=%24external&authMechanism=MONGODB-AWS&retryWrites=true&w=majority';
const client = new MongoClient(uri, {
maxPoolSize: 50,
minPoolSize: 5,
serverSelectionTimeoutMS: 5000,
socketTimeoutMS: 45000,
retryWrites: true,
retryReads: true,
});
export const handler = async (event) => {
// Reuse the connection across Lambda invocations (lifetime of the execution context)
await client.connect();
const db = client.db('orders');
const result = await db.collection('orders').findOne({ _id: event.id });
return result;
};
Production — pooled connections, structured retry, vector search:
import { MongoClient, ReadPreference } from 'mongodb';
const client = new MongoClient(process.env.MONGODB_URI, {
maxPoolSize: 100,
minPoolSize: 10,
maxIdleTimeMS: 60_000,
serverSelectionTimeoutMS: 5_000,
socketTimeoutMS: 30_000,
retryWrites: true,
retryReads: true,
readPreference: ReadPreference.SECONDARY_PREFERRED,
readConcern: { level: 'majority' },
writeConcern: { w: 'majority', wtimeoutMS: 10_000 },
appName: 'orders-api', // shows up in Atlas Performance Advisor
});
await client.connect();
async function vectorSearch(queryEmbedding, userId) {
return client
.db('rag')
.collection('documents')
.aggregate([
{
$vectorSearch: {
index: 'embeddings_idx',
queryVector: queryEmbedding,
path: 'embedding',
numCandidates: 200,
limit: 10,
filter: { tenantId: userId },
},
},
{ $project: { _id: 1, text: 1, score: { $meta: 'vectorSearchScore' } } },
])
.toArray();
}
For Lambda, declare the client outside the handler so the connection pool survives across invocations within an execution context. Use Provisioned Concurrency for latency-sensitive paths to avoid cold-start handshake (TLS + replica-set discovery is ~200–400 ms).
Failure modes & resilience
1. Replica-set primary failover (~30s). The driver detects, re-elects, and routes writes to the new primary. With retryWrites: true (default 5.x+) most application writes are transparently retried. Mitigation: keep serverSelectionTimeoutMS tight (5 s) so failover surfaces quickly; alarm on connections.failed spikes.
2. Oplog window starvation. A long-running migration or analytics scan that lags secondary replication can cause the secondary to fall outside the oplog window (default ~24h on M30). Recovery: re-sync the secondary (Atlas handles automatically, but takes hours on TB-scale). Prevention: monitor oplogWindow.hours — alarm if < 24.
3. Vector-index memory pressure. Vector indexes load into RAM. A working set that exceeds node memory causes evictions and p99 latency to climb 10–100×. Prevention: size with rule of thumb ~1.5–2 KB / vector × number of vectors × 1.5 (HNSW overhead); move to dedicated Search Nodes when the math says so.
4. PrivateLink endpoint exhaustion. Each Atlas PrivateLink endpoint supports a finite number of concurrent connections per AZ. Symptom: connection timeouts under burst load while CPU is idle. Mitigation: scale to multiple endpoints per region; ensure connection pool maxPoolSize × number-of-Lambda-concurrency ≤ endpoint capacity.
5. IAM role token expiry mid-query. STS credentials expire (default 1h, max 12h). Long-running aggregations can outlive the token. Mitigation: use connection lifetime ≤ token TTL, or set maxIdleTimeMS to force pool refresh; use assumeRoleSessionDuration of 12h for batch jobs.
6. Multi-region read-preference traps. readPreference: secondary against a multi-region cluster can route reads to a remote secondary, adding 100+ ms latency. Use readPreferenceTags: { region: "us-east-1" } to pin reads to the local region’s secondary.
7. Driver version drift. Atlas server features (vector search aggregation stages, time-series collections, queryable encryption) require minimum driver versions. Pin and bump deliberately; the official driver release notes document server-version compatibility.
Observability runbook
Atlas-side alarms (configure via Atlas UI or Atlas Admin API):
| Alarm | Threshold | Action |
|---|---|---|
Replication Lag | > 10s for 5 min | Check secondary CPU / IOPS; recent schema migration? |
Oplog Window | < 24 hours | Slow secondaries; consider larger oplog or fewer batch ops |
Cache Used | > 90% for 15 min | Working set > RAM; resize tier or shard |
Connections | > 90% of cluster max | Pool sizing; check for connection leaks |
Slow Operations | sustained > baseline | Performance Advisor → suggested indexes |
Datadog / CloudWatch correlation alarms:
| Alarm | First action |
|---|---|
App p99 query latency > 500ms | Atlas Performance Advisor → missing index? Consider Search Nodes for vector |
Driver MongoNetworkError rate spike | Confirm PrivateLink endpoint health; check VPC route tables |
Lambda Init Duration > 1s | Connection pool not reused — verify client outside handler |
Vector search latency > 200ms | Index memory pressure; shard or move to dedicated Search Nodes |
Debug path: “writes intermittently failing”:
- Atlas → Cluster → Real-time → confirm primary is stable (no recent elections).
- Driver logs: look for
MongoServerSelectionErrorvsMongoWriteConcernError. Former = topology problem; latter = write didn’t ack to majority. - Application: confirm
retryWrites: trueandw: "majority". Without retryWrites, transient primary changes surface as user errors. - Multi-region: verify the region is listed in
priorityconfig; a non-priority region cannot accept writes after failover.
Online Archive — the cost lever most teams miss
For collections with append-mostly time-series or audit data, configure Atlas Online Archive to tier records older than N days to S3-backed storage. Queryable through the same connection string with archiveOnly or unioned mode. Typical impact: 60–80% cluster storage reduction on logs/audit/event data; queries on cold data cost more (per-GB scan), so partition Online Archive by date for predictable economics.
When MongoDB Atlas is NOT the right call
- You have an AWS-only workload with simple key-value access patterns and strict AWS-native governance — DynamoDB is cheaper and has zero cross-vendor review overhead.
- You need strict SQL and relational integrity — Aurora Postgres (plus pgvector if you need vectors) is the better fit.
- Your compliance team will not approve an external data controller — Amazon DocumentDB (with MongoDB compatibility) covers the API surface while keeping data inside AWS (note: not 100% feature-compatible with modern MongoDB).
Pricing on AWS (2026 ballparks)
Verify current pricing at mongodb.com/pricing.
- M10 (dev / small prod): ~$57/month
- M30 (growing prod): ~$400/month
- M50+ (high-throughput): $1,000+/month
- Search Nodes: billed separately; sized independently.
- Serverless instances: per-operation + storage; good for low-traffic or bursty apps.
- Data transfer: in-region same-cloud free; cross-region and cross-cloud chargeable.
Best practices
Data modeling
- Embed small, high-affinity sub-documents; reference for large sub-entities or shared data.
- Avoid unbounded arrays; use a separate collection with a compound index instead.
- Time-series collections for IoT and telemetry.
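The modeling rules above can be made concrete with plain document shapes. The sketch below uses hypothetical collection and field names (`posts`, `comments`, `post_id`, `created_at`, `ts`, `device`). It shows an embedded sub-document, an unbounded array split into its own collection, and the index and time-series options you would pass to the driver. No driver is imported, so the index and options are shown as data only.

```python
from typing import Any, Dict, List, Tuple

# Embed: a small sub-document that is always read with its parent.
order = {
    "_id": "order-1001",
    "customer_id": "cust-42",  # reference: customers live in their own collection
    "shipping_address": {"city": "Pune", "postcode": "411001"},
}


def split_out_array(
    doc: Dict[str, Any], array_field: str, fk_field: str
) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Move an unbounded embedded array into child documents keyed by parent _id."""
    parent = {k: v for k, v in doc.items() if k != array_field}
    children = [{fk_field: doc["_id"], **item} for item in doc.get(array_field, [])]
    return parent, children


# Compound index for the child collection, in the (field, direction) list form
# that PyMongo's create_index accepts: newest comments per post come first.
COMMENTS_INDEX = [("post_id", 1), ("created_at", -1)]

# Options for a time-series collection holding IoT or telemetry readings.
TIMESERIES_OPTIONS = {"timeField": "ts", "metaField": "device", "granularity": "minutes"}
```

With this shape, a post that accumulates thousands of comments never grows toward the document size limit, and the compound index serves the common "latest comments for this post" query.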
Performance
- Create indexes guided by Performance Advisor.
- Use the aggregation pipeline; avoid fetching and processing client-side.
- Cache read-heavy endpoints with ElastiCache Redis for hot keys.
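To illustrate pushing work into the aggregation pipeline instead of fetching documents and processing them client-side, here is a minimal sketch that builds a pipeline as plain data. The collection shape (`status`, `created_at`, `customer_id`, `amount`) is hypothetical; the resulting list is what you would pass to the driver's `aggregate` call.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


def top_customers_pipeline(since: datetime, limit: int = 10) -> List[Dict[str, Any]]:
    """Build a pipeline that ranks customers by paid order total since a date."""
    return [
        # Filter first so later stages touch as few documents as possible;
        # an index on (status, created_at) would support this stage.
        {"$match": {"status": "paid", "created_at": {"$gte": since}}},
        # Aggregate on the server instead of shipping every order to the client.
        {
            "$group": {
                "_id": "$customer_id",
                "total": {"$sum": "$amount"},
                "orders": {"$sum": 1},
            }
        },
        {"$sort": {"total": -1}},
        {"$limit": limit},
    ]


pipeline = top_customers_pipeline(datetime(2026, 1, 1, tzinfo=timezone.utc), limit=5)
```

Only the top five grouped rows cross the network, rather than every matching order document.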
Security
- PrivateLink + IAM auth + customer-managed KMS keys for encryption-at-rest.
- Atlas Database Access with minimum-privilege roles.
- Queryable Encryption for regulated PII fields.
Observability
- Atlas built-in metrics, plus Datadog MongoDB integration or Prometheus exporter for unified dashboards with your AWS workload.
- Alert on replication lag, primary election count, oplog window, working-set-to-memory ratio.
Related Reading
- Amazon S3 Vectors: Native Vector Storage Without a Separate Vector Database
Amazon S3 Vectors eliminates the dedicated vector database for many RAG workloads. We compare it to OpenSearch Serverless and MemoryDB and show when each wins.
- Amazon MemoryDB with Vector Search: Durable Redis-Compatible Storage for AI Workloads
ElastiCache loses your AI chatbot's session memory at every node replacement. MemoryDB doesn't. A decision framework for when to pick MemoryDB over ElastiCache, OpenSearch Serverless, and S3 Vectors for AI workloads — with the latency math and the failure mode that forces the switch.
- Amazon Bedrock Data Automation: Intelligent Document and Media Processing at Scale
Amazon Bedrock Data Automation replaces fragmented Textract + Comprehend + Lambda pipelines with a managed intelligent document processing service. Production guide.