This repository takes a clear, hands-on approach to Retrieval-Augmented Generation (RAG), breaking down advanced techniques into straightforward, understandable implementations. Instead of relying on frameworks like `LangChain` or `FAISS`, everything here is built using familiar Python libraries: `openai`, `numpy`, `matplotlib`, and a few others.
The goal is simple: provide code that is readable, modifiable, and educational. By focusing on the fundamentals, this project helps demystify RAG and makes it easier to understand how it really works.
- (20-Mar-2025) Added a new notebook on RAG with Reinforcement Learning.
- (07-Mar-2025) Added 20 RAG techniques to the repository.
This repository contains a collection of Jupyter Notebooks, each focusing on a specific RAG technique. Each notebook provides:
- A concise explanation of the technique.
- A step-by-step implementation from scratch.
- Clear code examples with inline comments.
- Evaluations and comparisons to demonstrate the technique's effectiveness.
- Visualizations of the results.
Here's a glimpse of the techniques covered:
| Notebook | Description |
|---|---|
| 1. Simple RAG | A basic RAG implementation. A great starting point! |
| 2. Semantic Chunking | Splits text based on semantic similarity for more meaningful chunks. |
| 3. Chunk Size Selector | Explores the impact of different chunk sizes on retrieval performance. |
| 4. Context Enriched RAG | Retrieves neighboring chunks to provide more context. |
| 5. Contextual Chunk Headers | Prepends descriptive headers to each chunk before embedding. |
| 6. Document Augmentation RAG | Generates questions from text chunks to augment the retrieval process. |
| 7. Query Transform | Rewrites, expands, or decomposes queries to improve retrieval. Includes Step-back Prompting and Sub-query Decomposition. |
| 8. Reranker | Re-ranks initially retrieved results using an LLM for better relevance. |
| 9. RSE | Relevant Segment Extraction: identifies and reconstructs continuous segments of text, preserving context. |
| 10. Contextual Compression | Implements contextual compression to filter and compress retrieved chunks, maximizing relevant information. |
| 11. Feedback Loop RAG | Incorporates user feedback to learn and improve the RAG system over time. |
| 12. Adaptive RAG | Dynamically selects the best retrieval strategy based on query type. |
| 13. Self RAG | Implements Self-RAG, which dynamically decides when and how to retrieve, evaluates relevance, and assesses support and utility. |
| 14. Proposition Chunking | Breaks down documents into atomic, factual statements for precise retrieval. |
| 15. Multimodal RAG | Combines text and images for retrieval, generating captions for images using LLaVA. |
| 16. Fusion RAG | Combines vector search with keyword-based (BM25) retrieval for improved results. |
| 17. Graph RAG | Organizes knowledge as a graph, enabling traversal of related concepts. |
| 18. Hierarchy RAG | Builds hierarchical indices (summaries + detailed chunks) for efficient retrieval. |
| 19. HyDE RAG | Uses Hypothetical Document Embeddings to improve semantic matching. |
| 20. CRAG | Corrective RAG: dynamically evaluates retrieval quality and uses web search as a fallback. |
| 21. RAG with RL | Maximizes the reward of the RAG pipeline using Reinforcement Learning. |
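Several of the techniques above (Semantic Chunking, Chunk Size Selector, Proposition Chunking) build on a basic text splitter. As a minimal sketch of the idea — the function name and default parameters here are illustrative, not taken from the notebooks — a fixed-size chunker with overlap needs nothing beyond the standard library:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_size characters.

    Overlap keeps context that straddles a chunk boundary retrievable
    from both neighboring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

The notebooks explore more sophisticated strategies (semantic boundaries, atomic propositions), but they all address the same trade-off this sketch exposes: larger chunks carry more context, smaller chunks give more precise retrieval.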
```
fareedkhan-dev-all-rag-techniques/
├── README.md                          <- You are here!
├── 1_simple_rag.ipynb
├── 2_semantic_chunking.ipynb
├── 3_chunk_size_selector.ipynb
├── 4_context_enriched_rag.ipynb
├── 5_contextual_chunk_headers_rag.ipynb
├── 6_doc_augmentation_rag.ipynb
├── 7_query_transform.ipynb
├── 8_reranker.ipynb
├── 9_rse.ipynb
├── 10_contextual_compression.ipynb
├── 11_feedback_loop_rag.ipynb
├── 12_adaptive_rag.ipynb
├── 13_self_rag.ipynb
├── 14_proposition_chunking.ipynb
├── 15_multimodel_rag.ipynb
├── 16_fusion_rag.ipynb
├── 17_graph_rag.ipynb
├── 18_hierarchy_rag.ipynb
├── 19_HyDE_rag.ipynb
├── 20_crag.ipynb
├── 21_rag_with_rl.ipynb
├── requirements.txt                   <- Python dependencies
└── data/
    ├── val.json                       <- Sample validation data (queries and answers)
    ├── AI_Information.pdf             <- A sample PDF document for testing
    └── attention_is_all_you_need.pdf  <- A sample PDF document for testing (Multi-Modal RAG)
```
1. Clone the repository:

   ```bash
   git clone https://github.com/FareedKhan-dev/all-rag-techniques.git
   cd all-rag-techniques
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Set up your OpenAI API key:

   - Obtain an API key from Nebius AI.
   - Set the API key as an environment variable:

     ```bash
     export OPENAI_API_KEY='YOUR_NEBIUS_AI_API_KEY'
     ```

     or, on Windows:

     ```bash
     setx OPENAI_API_KEY "YOUR_NEBIUS_AI_API_KEY"
     ```

     or, within your Python script/notebook:

     ```python
     import os
     os.environ["OPENAI_API_KEY"] = "YOUR_NEBIUS_AI_API_KEY"
     ```
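Because the notebooks call Nebius AI through an OpenAI-compatible endpoint, the OpenAI client is typically constructed with a custom `base_url`. The URL below is an assumption for illustration — check Nebius AI's documentation for the current endpoint:

```python
import os
from openai import OpenAI

# Illustrative base URL -- verify against Nebius AI's documentation.
client = OpenAI(
    base_url="https://api.studio.nebius.ai/v1/",
    api_key=os.environ["OPENAI_API_KEY"],
)
```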
4. Run the notebooks:

   Open any of the Jupyter Notebooks (`.ipynb` files) using Jupyter Notebook or JupyterLab. Each notebook is self-contained and can be run independently; within each file, the cells are designed to be executed sequentially.

   Note: The `data/AI_Information.pdf` file provides a sample document for testing; you can replace it with your own PDF. The `data/val.json` file contains sample queries and ideal answers for evaluation. The `attention_is_all_you_need.pdf` file is for testing the Multi-Modal RAG notebook.
- Embeddings: Numerical representations of text that capture semantic meaning. We use Nebius AI's embedding API and, in many notebooks, the `BAAI/bge-en-icl` embedding model.
- Vector Store: A simple database to store and search embeddings. We create our own `SimpleVectorStore` class using NumPy for efficient similarity calculations.
- Cosine Similarity: A measure of similarity between two vectors; higher values indicate greater similarity.
- Chunking: Dividing text into smaller, manageable pieces. We explore various chunking strategies.
- Retrieval: The process of finding the most relevant text chunks for a given query.
- Generation: Using a Large Language Model (LLM) to create a response based on the retrieved context and the user's query. We use the `meta-llama/Llama-3.2-3B-Instruct` model via Nebius AI's API.
- Evaluation: Assessing the quality of the RAG system's responses, often by comparing them to a reference answer or using an LLM to score relevance.
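To tie the vector-store and cosine-similarity concepts together, here is a minimal sketch in the spirit of the `SimpleVectorStore` described above. The class and method names mirror the description but the exact interface in the notebooks may differ:

```python
import numpy as np

class SimpleVectorStore:
    """Minimal in-memory vector store with cosine-similarity search."""

    def __init__(self):
        self.vectors = []  # one embedding per stored chunk
        self.texts = []    # the chunk text each embedding represents

    def add(self, text: str, embedding) -> None:
        """Store a text chunk alongside its embedding vector."""
        self.texts.append(text)
        self.vectors.append(np.asarray(embedding, dtype=float))

    def search(self, query_embedding, k: int = 3):
        """Return the top-k (text, score) pairs by cosine similarity."""
        q = np.asarray(query_embedding, dtype=float)
        matrix = np.stack(self.vectors)  # shape (n_chunks, dim)
        # Cosine similarity: dot product over the product of vector norms.
        sims = matrix @ q / (
            np.linalg.norm(matrix, axis=1) * np.linalg.norm(q) + 1e-10
        )
        top = np.argsort(sims)[::-1][:k]  # indices of the k highest scores
        return [(self.texts[i], float(sims[i])) for i in top]
```

In a real notebook run, `add` would receive embeddings from Nebius AI's embedding API; here any numeric vectors work, which is what keeps the store easy to test and inspect.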
Contributions are welcome!