Core Components
- Retriever
  - Searches a database for documents relevant to the user query.
  - Uses similarity search (e.g., cosine similarity between vector embeddings).
  - Types:
    - Sparse retrievers: Traditional term-matching methods (e.g., BM25).
    - Dense retrievers: Embedding-based methods (e.g., dual-encoder models).
    - Hybrid: Combines sparse and dense retrieval.
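
To make the retriever concrete, here is a minimal sketch of hybrid scoring under some loud assumptions: the document vectors are hand-made rather than produced by an embedding model, `keyword_overlap` is a toy stand-in for BM25 (it ignores term rarity and document length), and the names `hybrid_search` and `alpha` are illustrative, not from any particular library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def keyword_overlap(query, text):
    """Toy sparse score: fraction of distinct query terms found in the text.
    Assumption: a simplified stand-in for BM25, not BM25 itself."""
    q_terms = set(query.lower().split())
    d_terms = set(text.lower().split())
    if not q_terms:
        return 0.0
    return len(q_terms & d_terms) / len(q_terms)

def hybrid_search(query, query_vec, docs, alpha=0.5, top_k=2):
    """Rank docs by alpha * dense score + (1 - alpha) * sparse score."""
    scored = []
    for doc in docs:
        dense = cosine_similarity(query_vec, doc["vector"])
        sparse = keyword_overlap(query, doc["text"])
        scored.append((alpha * dense + (1 - alpha) * sparse, doc["id"]))
    scored.sort(reverse=True)  # highest combined score first
    return [doc_id for _, doc_id in scored[:top_k]]

# Hand-made 3-dimensional vectors; a real system would get these from an embedding model.
docs = [
    {"id": "a", "text": "cats purr when happy", "vector": [1.0, 0.0, 0.0]},
    {"id": "b", "text": "dogs bark at strangers", "vector": [0.0, 1.0, 0.0]},
    {"id": "c", "text": "happy cats and dogs", "vector": [0.7, 0.7, 0.0]},
]
results = hybrid_search("happy cats", [1.0, 0.1, 0.0], docs)
```

Setting `alpha=1.0` reduces this to a purely dense retriever and `alpha=0.0` to a purely sparse one, which is one simple way to compare the three retriever types on the same data.
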
- Generator
  - Receives the query and the retrieved content.
  - Produces a final answer in natural language.
  - Usually a pretrained language model such as GPT, LLaMA, or T5.
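
The generator step can be sketched as prompt assembly followed by a model call. This is only an illustration: the prompt wording, `build_prompt`, `answer`, and the `generate` callable are assumptions, and `echo_generate` is a stub so the example runs without a real model.

```python
def build_prompt(query, chunks):
    """Combine retrieved chunks and the user query into one prompt string."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )

def answer(query, chunks, generate):
    """`generate` is any callable mapping a prompt string to generated text.
    Assumption: a hypothetical stand-in for a call to GPT, LLaMA, T5, or similar."""
    return generate(build_prompt(query, chunks))

def echo_generate(prompt):
    """Stub generator that reports the prompt size instead of calling a model."""
    return f"(model output for a {len(prompt)}-character prompt)"

question = "What do cats do when happy?"
retrieved = ["Cats purr when happy."]
prompt = build_prompt(question, retrieved)
reply = answer(question, retrieved, echo_generate)
```

Numbering the chunks in the prompt is a design choice that makes it easier to ask the model to cite which passage supports its answer.
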
- Knowledge Base (Corpus)
  - The source of truth (e.g., documents, web pages, PDFs).
  - Preprocessed into small “chunks” and embedded as vectors.
  - Stored in vector databases for efficient similarity search.
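
As a rough sketch of the preprocessing described above, the following splits text into overlapping word windows and stores each chunk with a vector. Everything here is illustrative: the chunk size and overlap are arbitrary, `toy_embed` is a placeholder and not a real embedding model, and the plain Python list merely stands in for a vector database.

```python
def chunk_words(text, chunk_size=5, overlap=2):
    """Split text into windows of `chunk_size` words, each sharing
    `overlap` words with the previous window."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

def toy_embed(text, dims=8):
    """Placeholder embedding: counts words by (length mod dims).
    Assumption: for illustration only, carries no semantic meaning."""
    vec = [0.0] * dims
    for word in text.split():
        vec[len(word) % dims] += 1.0
    return vec

def build_index(documents, embed=toy_embed):
    """Return a list of chunk records, standing in for rows in a vector database."""
    index = []
    for doc_id, text in documents.items():
        for n, chunk in enumerate(chunk_words(text)):
            index.append(
                {"doc_id": doc_id, "chunk_no": n, "text": chunk, "vector": embed(chunk)}
            )
    return index

chunks = chunk_words("one two three four five six seven eight")
index = build_index({"doc1": "one two three four five six seven eight"})
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of storing some text twice.
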
- Augmentation Layer
  - Optional enhancements to improve quality:
    - Query rewriting
    - Reranking
    - Chunk compression or repacking
    - Metadata filtering
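
Two of the enhancements listed above, metadata filtering and reranking, might look like the sketch below. The record layout, the function names, and the `term_overlap` scorer are all assumptions made for illustration; in practice the reranking score would typically come from a dedicated model rather than word overlap.

```python
def filter_by_metadata(candidates, **required):
    """Keep candidates whose metadata matches every required key/value pair."""
    return [
        c for c in candidates
        if all(c["metadata"].get(key) == value for key, value in required.items())
    ]

def rerank(query, candidates, score_fn, top_k=2):
    """Re-score candidates with `score_fn(query, text)` and keep the best `top_k`."""
    ranked = sorted(candidates, key=lambda c: score_fn(query, c["text"]), reverse=True)
    return ranked[:top_k]

def term_overlap(query, text):
    """Toy relevance score: number of distinct query words present in the text.
    Assumption: a hypothetical stand-in for a real reranking model."""
    return len(set(query.lower().split()) & set(text.lower().split()))

candidates = [
    {"id": 1, "text": "refund policy for online orders", "metadata": {"lang": "en", "year": 2024}},
    {"id": 2, "text": "shipping times for online orders", "metadata": {"lang": "en", "year": 2024}},
    {"id": 3, "text": "refund policy for online orders", "metadata": {"lang": "en", "year": 2019}},
    {"id": 4, "text": "store opening hours", "metadata": {"lang": "en", "year": 2024}},
]

# Filter first so the (usually more expensive) reranker sees fewer candidates.
recent = filter_by_metadata(candidates, year=2024)
top = rerank("refund policy online", recent, term_overlap)
top_ids = [c["id"] for c in top]
```
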