AWS Glossary

RAG Pipeline

Retrieval-Augmented Generation: combining document retrieval with AI models to answer questions based on specific data.

Definition

A RAG (Retrieval-Augmented Generation) pipeline combines document retrieval with large language models to ground AI responses in specific data. Instead of relying solely on the model's training data, RAG retrieves relevant documents from a knowledge base and passes them to the model as context for answering the question. This reduces hallucinations (the model stating false information as fact) and keeps responses grounded in your proprietary data.

How RAG Works on AWS

Step 1: Document Ingestion

Step 2: Embedding & Storage

Step 3: Query & Retrieval

Step 4: Generation
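
The four steps above can be sketched end to end in a few lines. This is a toy illustration under loud assumptions, not an AWS implementation: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector store, and the final step only builds the prompt a generation model would receive. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ingest(documents):
    """Steps 1-2: take raw texts, embed each one, keep (text, vector) pairs."""
    return [(doc, embed(doc)) for doc in documents]


def retrieve(index, question, top_k=2):
    """Step 3: embed the question and return the top_k most similar texts."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question, passages):
    """Step 4 (input side): the grounded prompt a generation model would get."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


index = ingest([
    "Invoices are archived after 90 days.",
    "Support tickets are answered within one business day.",
    "The office is closed on public holidays.",
])
question = "When are invoices archived?"
passages = retrieve(index, question, top_k=1)
prompt = build_prompt(question, passages)
```

In a real pipeline the prompt would then be sent to a language model; that call is left out here because it depends on the model and service chosen.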

Managed RAG on AWS: Bedrock Knowledge Bases

Amazon Bedrock Knowledge Bases (generally available since late 2023) is the fully managed RAG solution on AWS. It handles Steps 1–3 automatically.

Use Bedrock Knowledge Bases for new RAG projects. Build a custom pipeline only when you need chunking strategies or retrieval logic that Bedrock doesn’t support.
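
As a rough sketch of what querying a knowledge base can look like, the helper below builds the request for the `retrieve_and_generate` operation of the boto3 `bedrock-agent-runtime` client. The function names, the knowledge base ID, and the model ARN are placeholders assumed for illustration; the client is passed in so the code does not need AWS credentials to be read or tested.

```python
def build_kb_request(question, knowledge_base_id, model_arn):
    """Build keyword arguments for a RetrieveAndGenerate call."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,  # placeholder value
                "modelArn": model_arn,                 # placeholder value
            },
        },
    }


def ask_knowledge_base(client, question, knowledge_base_id, model_arn):
    """Send the question and return only the generated answer text."""
    request = build_kb_request(question, knowledge_base_id, model_arn)
    response = client.retrieve_and_generate(**request)
    return response["output"]["text"]


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   answer = ask_knowledge_base(client, "What is our refund policy?",
#                               "YOUR_KB_ID", "YOUR_MODEL_ARN")
```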

Common Mistakes

Mistake 1: Using RAG on disorganized, low-quality documents. Retrieval can only surface what the knowledge base contains, so if the source documents are poor, RAG outputs will be poor.

Mistake 2: Embedding entire documents at once instead of chunking. Large documents dilute the relevance signal; use 300–500 word chunks with 10–20% overlap.
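
A minimal sketch of word-based chunking with overlap, assuming plain whitespace tokenization. The function name and defaults are illustrative: 400 words per chunk with a 60-word overlap (15%) sits inside the ranges suggested above.

```python
def chunk_words(text, chunk_size=400, overlap=60):
    """Split text into chunks of chunk_size words, repeating the last
    `overlap` words of each chunk at the start of the next one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(1000))
chunks = chunk_words(sample)
```

Production pipelines often split on sentence or section boundaries instead of raw word counts, so that a chunk does not cut a thought in half; the fixed-size version here only shows the overlap mechanics.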

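Mistake 3: Not re-embedding when documents change. Stale embeddings retrieve outdated or wrong passages. Bedrock Knowledge Bases re-syncs incrementally each time its data source is synced, processing only added, changed, or deleted documents; custom pipelines must implement change detection and re-embedding themselves.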
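
For a custom pipeline, one common way to detect changes is to store a content hash next to each embedding and compare it on every run. The sketch below assumes documents are available as a `{doc_id: text}` mapping and that the hashes from the last embedding run are kept somewhere durable; both names and the storage layout are hypothetical.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reembedding(documents, stored_hashes):
    """Compare current documents with hashes saved at the last embedding run.

    documents:     {doc_id: current text}
    stored_hashes: {doc_id: hash recorded when the doc was last embedded}
    Returns (to_embed, to_delete) as sorted lists of doc ids.
    """
    to_embed = sorted(
        doc_id
        for doc_id, text in documents.items()
        if stored_hashes.get(doc_id) != content_hash(text)  # new or changed
    )
    to_delete = sorted(
        doc_id for doc_id in stored_hashes if doc_id not in documents
    )
    return to_embed, to_delete


stored = {
    "a": content_hash("old text"),
    "b": content_hash("unchanged"),
    "c": content_hash("removed"),
}
current = {"a": "new text", "b": "unchanged", "d": "brand new"}
to_embed, to_delete = plan_reembedding(current, stored)
```

Only the documents in `to_embed` need new embeddings, and vectors for the ids in `to_delete` should be removed from the vector store so they can no longer be retrieved.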

Mistake 4: Defaulting to OpenSearch for all use cases. Amazon S3 Vectors is significantly cheaper and sufficient for most RAG workloads under 2 billion vectors.
