---
title: Amazon S3 Vectors
description: S3 Vectors is the AWS native vector store — purpose-built vector storage on S3 with up to 90% lower cost than dedicated vector databases for RAG workloads.
url: https://www.factualminds.com/glossary/s3-vectors/
publishDate: 2026-06-17
updateDate: 2026-06-17
---

# Amazon S3 Vectors

> S3 Vectors is the AWS native vector store — purpose-built vector storage on S3 with up to 90% lower cost than dedicated vector databases for RAG workloads.

## Definition

Amazon **S3 Vectors** is a native vector storage tier on S3 for embeddings and similarity search. **Vector buckets** hold vector indexes, which store high-dimensional vectors with filterable metadata; each index is created with a fixed dimension and a distance metric (cosine or Euclidean). **S3 Vectors reached GA** in 2025 and is available as a Bedrock Knowledge Bases vector store option alongside OpenSearch Serverless, Aurora pgvector, and partner engines. It targets RAG and semantic search workloads where **storage cost** matters more than the OCU-hours of OpenSearch or the pod-hours of a dedicated vector database.

As of **June 16, 2026**, `QueryVectors` returns up to **10,000** similarity search results per query (100× the prior 100-result limit), with **paginated** responses via `nextToken`. Query **data-processed charges** on indexes with **more than 10 million vectors** dropped **up to 80%** automatically. Large result sets may incur **data-returned** fees beyond the first **512 KB** per query — see the [S3 pricing page](https://aws.amazon.com/s3/pricing/).
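
A minimal sketch of paging through `QueryVectors` results, assuming `client` behaves like `boto3.client("s3vectors")`. The request fields (`vectorBucketName`, `indexName`, `queryVector`, `topK`, `filter`, `returnDistance`, `returnMetadata`) follow the boto3 `query_vectors` call. Sending `nextToken` back as a request parameter is an assumption based on the usual AWS pagination pattern, so confirm it against your SDK version. The helper name `iter_query_pages` is hypothetical.

```python
from typing import Any, Dict, Iterator, List, Optional


def iter_query_pages(
    client: Any,
    *,
    bucket: str,
    index: str,
    query_vector: List[float],
    top_k: int,
    metadata_filter: Optional[Dict[str, Any]] = None,
) -> Iterator[List[Dict[str, Any]]]:
    """Yield one page of QueryVectors results at a time.

    `client` is assumed to behave like boto3.client("s3vectors").
    Echoing `nextToken` back in the request is an assumption, not a
    confirmed API detail.
    """
    request: Dict[str, Any] = {
        "vectorBucketName": bucket,
        "indexName": index,
        "queryVector": {"float32": query_vector},
        "topK": top_k,
        "returnDistance": True,
        "returnMetadata": True,
    }
    if metadata_filter is not None:
        request["filter"] = metadata_filter

    while True:
        response = client.query_vectors(**request)
        # Hand the page to the caller before asking for the next one,
        # so results are never buffered all at once.
        yield response.get("vectors", [])
        token = response.get("nextToken")
        if not token:
            break
        request["nextToken"] = token
```

Because the function is a generator, a caller can rerank or discard each page as it arrives. A tenant-scoped call might pass `metadata_filter={"tenant": {"$eq": "acme"}}`, where `tenant` is a hypothetical filterable metadata key.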

The trade-off is latency: expect roughly **sub-100ms to low hundreds of ms** query times suitable for batch retrieval, wide recall + rerank pipelines, and many chat RAG flows — not sub-10ms agent loops at thousands of QPS.

| Store (illustrative)  | Cost driver         | Latency profile               | Max topK (June 2026) |
| --------------------- | ------------------- | ----------------------------- | -------------------- |
| OpenSearch Serverless | OCU-hours + storage | Lower p99 on small indexes    | Index-dependent      |
| Dedicated vector SaaS | Pod/replica hours   | Tunable, vendor-specific      | Vendor-specific      |
| S3 Vectors            | Storage + per-query | Higher tail, lowest storage $ | 10,000 (paginated)   |

## When to use it

- **Bedrock Knowledge Bases** RAG with large corpora (10M+ chunks) where OpenSearch baseline OCUs inflate monthly cost — especially after the June 2026 large-index query discount.
- **Multi-stage retrieval** — wide `topK` recall, client-side rerank, dedup by `document_id` — now practical without sharding workarounds for the old 100-result cap.
- Multi-tenant SaaS needing **S3-native isolation** (vector index or vector bucket per tenant) with metadata filters at retrieval.
- Archival or **long-tail knowledge** sets queried occasionally but stored durably for compliance.
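
The multi-stage retrieval pattern above (wide recall, dedup by `document_id`, client-side rerank) can be sketched as follows. This is an illustration under assumptions: each candidate is a dict with a `distance` and a `metadata["document_id"]` field, smaller distance means more similar, and `score` stands in for whatever reranker you use. The function name `dedup_and_rerank` is hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, List


def dedup_and_rerank(
    candidates: Iterable[Dict[str, Any]],
    score: Callable[[Dict[str, Any]], float],
    top_n: int,
) -> List[Dict[str, Any]]:
    """Keep the closest chunk per document_id, then rerank and cut to top_n."""
    best: Dict[str, Dict[str, Any]] = {}
    for cand in candidates:
        doc_id = cand["metadata"]["document_id"]
        kept = best.get(doc_id)
        # Assumes a distance metric where smaller means more similar.
        if kept is None or cand["distance"] < kept["distance"]:
            best[doc_id] = cand
    # Higher rerank score is better; `score` is supplied by the caller.
    return sorted(best.values(), key=score, reverse=True)[:top_n]


candidates = [
    {"key": "c1", "distance": 0.30, "metadata": {"document_id": "d1"}},
    {"key": "c2", "distance": 0.10, "metadata": {"document_id": "d1"}},
    {"key": "c3", "distance": 0.20, "metadata": {"document_id": "d2"}},
    {"key": "c4", "distance": 0.40, "metadata": {"document_id": "d3"}},
]
# Stand-in reranker: negative distance. A real pipeline would call a
# cross-encoder or a rerank model here.
top = dedup_and_rerank(candidates, lambda c: -c["distance"], top_n=2)
```

Deduplicating before reranking keeps the rerank stage from spending its budget on near-identical chunks of the same document.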

## When not to use it

- Agentic workflows requiring **sub-50ms retrieval** inside tight tool-call loops at high QPS — OpenSearch Serverless or in-memory caches win.
- Defaulting to **topK=10,000** for simple chat RAG: if only five chunks reach the LLM, wide recall adds latency and data-returned fees for no gain.
- Hybrid lexical + vector search as a single managed engine — OpenSearch hybrid or Kendra may fit better.
- Graph-heavy relationship traversal — **Neptune Analytics** combines graph and vector where edges matter.

## Tips

- Design **metadata fields for mandatory filters** (tenant, ACL, doc version) before first ingest — re-indexing billion-vector buckets is painful.
- On wide recall passes, set **`returnMetadata=True`** and **`returnData=False`**; fetch chunk text only for post-rerank top-N.
- **Paginate** `QueryVectors` with `nextToken` — process the first page while fetching the next; do not buffer thousands of payloads in Lambda memory.
- Upgrade to **AWS SDK** releases published after June 16, 2026 to get pagination support on `QueryVectors`.
- Run **recall@k benchmarks** before raising `topK`; the cheapest store is worthless if reranked quality does not improve.
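
For the recall@k tip, a small helper is enough to compare `topK` settings on a labeled query set. This sketch assumes you already have, per query, the ranked list of retrieved vector keys and a hand-labeled set of relevant keys; the function name `recall_at_k` is illustrative.

```python
from typing import Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant keys found in the first k retrieved keys."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


# Ranked keys returned for one query, and the keys a reviewer marked relevant.
retrieved_keys = ["a", "b", "c", "d"]
relevant_keys = {"b", "d", "z"}

narrow = recall_at_k(retrieved_keys, relevant_keys, k=2)  # only "b" is found
wide = recall_at_k(retrieved_keys, relevant_keys, k=4)    # "b" and "d" are found
```

If recall stops improving between two `topK` values across your query set, the larger value only adds latency and data-returned cost.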

## Gotchas

- **Serious:** Raising `topK` to thousands with **`returnData=True`** and no pagination: Lambda can run out of memory, and data-returned charges apply past the first 512 KB per query.
- **Serious:** Using S3 Vectors for **real-time agent tool retrieval** without load testing — tail latency spikes under concurrent sessions frustrate users.
- **Serious:** **Stale embeddings** when source documents change but sync jobs fail silently — pair with document version metadata and health alarms on sync lag.
- **Regular:** Assuming **hybrid keyword search** exists natively — you may still need OpenSearch or Athena on structured fields for keyword-heavy queries.
- **Regular:** Cross-region **inference in Bedrock** that reads vectors from another region adds data transfer cost and latency; colocate vector buckets with the Knowledge Base and model region.
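
One way to catch the stale-embedding gotcha is to compare the version recorded in vector metadata against the current version of each source document. The sketch below assumes hypothetical metadata keys `document_id` and `doc_version` written at ingest time, and a `source_versions` mapping you build from your document store; it is an illustration, not a built-in S3 Vectors or Knowledge Bases feature.

```python
from typing import Any, Dict, Iterable, List, Set


def find_stale_documents(
    source_versions: Dict[str, str],
    vectors: Iterable[Dict[str, Any]],
) -> List[str]:
    """Return document_ids that are missing from the index or whose
    indexed chunks do not all carry the current source version."""
    indexed: Dict[str, Set[str]] = {}
    for vec in vectors:
        meta = vec["metadata"]
        indexed.setdefault(meta["document_id"], set()).add(meta["doc_version"])

    stale = [
        doc_id
        for doc_id, version in source_versions.items()
        # Fresh only if every indexed chunk has exactly the current version.
        if indexed.get(doc_id) != {version}
    ]
    return sorted(stale)


source = {"a": "v2", "b": "v1", "c": "v1"}
indexed_vectors = [
    {"key": "a#0", "metadata": {"document_id": "a", "doc_version": "v1"}},
    {"key": "a#1", "metadata": {"document_id": "a", "doc_version": "v2"}},
    {"key": "b#0", "metadata": {"document_id": "b", "doc_version": "v1"}},
]
stale_ids = find_stale_documents(source, indexed_vectors)
```

Here `a` is flagged because an old chunk is still indexed next to the new one, and `c` is flagged because it was never ingested. Publishing the length of this list as a metric gives the sync-lag alarm something to fire on.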

## Official references

- [Querying vectors](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-query.html) — QueryVectors, filters, recall testing.
- [Create a vector index](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-create-index.html) — index types and limits.
- [Knowledge Bases data source sync](https://docs.aws.amazon.com/bedrock/latest/userguide/kb-data-source-sync.html) — supported ingestion paths.

## Related FactualMinds content

- [Amazon S3 Vectors: 10,000 Results per Query (June 2026)](/blog/amazon-s3-vectors-native-vector-storage/)
- [Amazon Bedrock Consulting](/services/aws-bedrock/)
- [Generative AI on AWS](/services/generative-ai-on-aws/)
- [Amazon S3](/glossary/amazon-s3/)
- [Generative AI RAG on Bedrock Pattern](/patterns/generative-ai-rag-on-bedrock/)

## Related AWS Services

- aws-bedrock
- generative-ai-on-aws

## Related Posts

- amazon-s3-vectors-native-vector-storage

---

*Source: https://www.factualminds.com/glossary/s3-vectors/*
