What is RAG?

Definition

Retrieval-Augmented Generation (RAG) is a method that extends large language models (LLMs) by retrieving relevant information from an external knowledge base before generating a response. Instead of relying solely on what the model learned during training, a RAG system combines retrieval (searching for relevant documents) with generation (producing a response conditioned on those documents).


Key Points:

  • Retriever: A search component that identifies relevant content from a corpus.

  • Generator: An LLM that takes the query and retrieved documents to produce a final answer.

  • Real-time adaptability: Updating the knowledge base lets the system answer with current information without retraining the model.

  • Fact grounding: Answers are conditioned on retrieved documents, which reduces (but does not eliminate) hallucination.
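
The retriever and generator roles described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the retriever here is a toy keyword-overlap ranker (real systems typically use BM25 or embedding search), the generator is any callable passed in as `llm`, and the names `retrieve`, `build_prompt`, and `rag_answer` are hypothetical helpers invented for this example.

```python
import re
from collections import Counter
from typing import Callable


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy retriever (assumption): score each document by how often
    the query's tokens occur in it, and return the top-k matches."""
    query_tokens = set(tokenize(query))
    scored = []
    for doc in corpus:
        counts = Counter(tokenize(doc))
        score = sum(counts[token] for token in query_tokens)
        if score > 0:
            scored.append((score, doc))
    # Stable sort: ties keep their original corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(query: str, documents: list[str]) -> str:
    """Place the retrieved documents in front of the question."""
    context = "\n".join(f"- {doc}" for doc in documents)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
    )


def rag_answer(query: str, corpus: list[str],
               llm: Callable[[str], str], k: int = 2) -> str:
    """Retrieve first, then hand the grounded prompt to the generator."""
    documents = retrieve(query, corpus, k)
    return llm(build_prompt(query, documents))


if __name__ == "__main__":
    corpus = [
        "The Eiffel Tower is in Paris.",
        "Python is a programming language.",
    ]

    # Hypothetical stand-in for a real LLM call: it echoes the prompt,
    # which makes the retrieved context visible.
    def echo_llm(prompt: str) -> str:
        return prompt

    print(rag_answer("Where is the Eiffel Tower?", corpus, echo_llm, k=1))

    # Adaptability: adding a document changes what can be retrieved,
    # with no change to the generator. (Made-up example sentence.)
    corpus.append("The Acme widget manual was revised this week.")
    print(retrieve("What happened to the Acme widget manual?", corpus, k=1))
```

In this sketch the generator never changes; only the corpus does. That separation is what allows the knowledge base to be updated independently of the model.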