Advanced RAG Improvements
Advanced RAG refers to systems that address the limitations of Naive RAG by optimizing both retrieval and generation with additional strategies applied before and after the retrieval step.
Pre-Retrieval Enhancements
These are techniques applied before fetching documents:
- Query Rewriting
  - Rephrases or expands the user query to better match stored documents.
  - Example: Rewriting “LLM” → “large language model”.
- Query Expansion
  - Adds additional terms or synonyms to improve recall.
  - Useful in domain-specific vocabularies or ambiguous queries.
- Multi-query Retrieval
  - Splits the user query into several sub-queries, retrieves separately, and merges results.
  - Helps cover more ground and improve completeness.
- Use of Metadata Filters
  - Filters retrieval using metadata (e.g., document_type = "FAQ").
  - Increases precision by eliminating irrelevant sections.
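The four pre-retrieval techniques can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the `Doc` class, the `ABBREVIATIONS` and `SYNONYMS` tables, and the keyword-overlap `retrieve` function are all hypothetical stand-ins. A real system would typically use an LLM or a domain thesaurus for rewriting and expansion, and a vector store for retrieval.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical lookup tables, used here only for illustration.
ABBREVIATIONS = {"llm": "large language model", "rag": "retrieval-augmented generation"}
SYNONYMS = {"fix": ["repair", "resolve"], "error": ["failure", "bug"]}


@dataclass
class Doc:
    doc_id: str
    text: str
    metadata: dict = field(default_factory=dict)


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rewrite_query(query: str) -> str:
    """Query rewriting: replace known abbreviations with their long form."""
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokenize(query))


def expand_query(query: str) -> str:
    """Query expansion: append synonyms of the query terms to improve recall."""
    tokens = tokenize(query)
    extra = [syn for tok in tokens for syn in SYNONYMS.get(tok, [])]
    return " ".join(tokens + extra)


def retrieve(query: str, corpus: list, filters: Optional[dict] = None, k: int = 3) -> list:
    """Toy keyword-overlap retriever with an optional metadata filter."""
    filters = filters or {}
    query_terms = set(tokenize(query))
    scored = []
    for doc in corpus:
        # Metadata filter: skip documents that fail any filter condition.
        if any(doc.metadata.get(key) != value for key, value in filters.items()):
            continue
        overlap = len(query_terms & set(tokenize(doc.text)))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def multi_query_retrieve(sub_queries: list, corpus: list,
                         filters: Optional[dict] = None, k: int = 3) -> list:
    """Multi-query retrieval: retrieve per sub-query, then merge and de-duplicate by id."""
    merged = {}
    for sub_query in sub_queries:
        for doc in retrieve(sub_query, corpus, filters, k):
            merged.setdefault(doc.doc_id, doc)
    return list(merged.values())
```

With a small corpus, `retrieve("fix error", corpus)` can miss a document that says "repair a failure", while `retrieve(expand_query("fix error"), corpus)` finds it. Passing `filters={"document_type": "FAQ"}` drops non-FAQ documents before scoring, which is where the precision gain comes from.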
Post-Retrieval Enhancements
These are applied after initial document retrieval, before the retrieved documents are passed to the generator:
- Reranking
  - Uses a secondary model to reorder retrieved documents by semantic relevance.
  - Can be done using cross-encoders or reranking LLMs.
- Context Compression
  - Reduces retrieved content using summarization or token pruning.
  - Helps fit more content within LLM token limits.
- Repacking and Ordering
  - Rearranges documents to improve logical flow.
  - Example: Grouping by source or time, sorting by importance.
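The post-retrieval steps can be chained in the same way. The sketch below is illustrative only and rests on several assumptions: `overlap_score` is a toy stand-in for a cross-encoder or reranking LLM, compression is done by sentence pruning rather than summarization, and documents are plain dictionaries with hypothetical `source`, `date`, and `text` keys.

```python
import re


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query: str, passage: str) -> float:
    """Toy relevance score: fraction of query terms that appear in the passage.
    Stands in for a cross-encoder or reranking LLM (assumption)."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(tokenize(passage))) / len(query_terms)


def rerank(query: str, docs: list, score_fn=overlap_score, top_k: int = 3) -> list:
    """Reranking: reorder retrieved documents by a secondary relevance score."""
    ranked = sorted(docs, key=lambda d: score_fn(query, d["text"]), reverse=True)
    return ranked[:top_k]


def compress(query: str, text: str, max_sentences: int = 1) -> str:
    """Context compression by pruning: keep the most relevant sentences, in original order."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    best = sorted(
        range(len(sentences)),
        key=lambda i: overlap_score(query, sentences[i]),
        reverse=True,
    )[:max_sentences]
    return " ".join(sentences[i] for i in sorted(best))


def repack(docs: list) -> list:
    """Repacking: group documents by source, then order by date within each source."""
    return sorted(docs, key=lambda d: (d["source"], d["date"]))


def build_context(query: str, docs: list, top_k: int = 3, max_sentences: int = 1) -> str:
    """Rerank, compress, and repack documents into one context string for the generator."""
    selected = rerank(query, docs, top_k=top_k)
    compressed = [{**d, "text": compress(query, d["text"], max_sentences)} for d in selected]
    return "\n\n".join(d["text"] for d in repack(compressed))
```

Keeping `score_fn` as a parameter means the toy scorer can be swapped for a real model without touching the rest of the pipeline. ISO-formatted date strings are assumed so that plain string comparison orders them chronologically.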