A modular framework for building and deploying Retrieval-Augmented Generation (RAG) systems with built-in evaluation and monitoring.

feld-m/rag_blueprint


RAG Blueprint

A comprehensive open-source framework for building production-ready Retrieval-Augmented Generation (RAG) systems. This blueprint simplifies the development of RAG applications while providing full control over performance, resource usage, and evaluation capabilities.

While building or buying RAG systems has become increasingly accessible, deploying them as production-ready data products remains challenging. Our framework bridges this gap by providing a streamlined development experience with easy configuration and customization options, while maintaining complete oversight of performance and resource usage.

It comes with built-in monitoring and observability tools for better troubleshooting, integrated LLM-based metrics for evaluation, and human feedback collection capabilities. Whether you're building a lightweight knowledge base or an enterprise-grade application, this blueprint offers the flexibility and scalability needed for production deployments.

🚀 Features

  • Multiple Knowledge Base Integration: Seamless extraction from several data sources (Confluence, Notion, PDF)
  • Broad Model Support: Works with a range of embedding and language models
  • Vector Search: Efficient similarity search using vector stores
  • Interactive Chat: User-friendly Chainlit interface for querying the knowledge base
  • Performance Monitoring: Query and response tracking with Langfuse
  • Evaluation: Comprehensive evaluation metrics using RAGAS
  • Setup Flexibility: Easy, configurable setup of the pipeline

🛠️ Tech Stack

Core

Python • LlamaIndex • Chainlit • Langfuse • RAGAS


Data Sources

Notion • Confluence • PDF files


Embedding Models

VoyageAI • OpenAI • Hugging Face


Language Models

OpenAI • Any OpenAI-compatible API models


Vector Stores

Qdrant • Chroma • PGVector


Infrastructure

PostgreSQL • Docker

🚀 Quickstart

See the detailed Quickstart Setup.

🏗️ Architecture

Data Flow

  1. Extraction:

    • Fetches pages from each data source through its API
    • Handles rate limiting and retries
    • Extracts metadata (title, creation time, URLs, etc.)
  2. Processing:

    • Markdown-aware chunking using LlamaIndex's MarkdownNodeParser
    • Embedding generation using the selected embedding model
    • Vector storage in Qdrant
  3. Retrieval & Generation:

    • Context-aware retrieval with configurable filters
    • LLM-powered response generation
    • Human feedback collection
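
The three stages above can be illustrated with a small, self-contained sketch. It is not the project's code and makes several loud assumptions: it uses only the Python standard library, a heading-based splitter stands in for LlamaIndex's MarkdownNodeParser, a toy bag-of-words vector stands in for a real embedding model, and an in-memory list stands in for the vector store. All names (`fetch_with_retry`, `split_markdown`, `embed`, `build_index`, `retrieve`) are hypothetical.

```python
import math
import re
import time
from collections import Counter


def fetch_with_retry(fetch, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch() and retry with exponential backoff on failure (hypothetical helper)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts:
                raise
            # Wait 0.5s, 1s, 2s, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


def split_markdown(text):
    """Split a Markdown document into one chunk per heading section.

    Simplified stand-in for Markdown-aware chunking, not MarkdownNodeParser itself.
    """
    chunks, current = [], []
    for line in text.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


def embed(text):
    """Toy bag-of-words 'embedding'; a real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(document):
    """Chunk a document and store each chunk with its vector (in-memory stand-in for a vector store)."""
    return [{"text": chunk, "vector": embed(chunk)} for chunk in split_markdown(document)]


def retrieve(query, index, top_k=1):
    """Return the top_k chunk texts most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item["vector"]), reverse=True)
    return [item["text"] for item in ranked[:top_k]]


# Usage: extract (with retries), process, then retrieve.
doc = "# Setup\nInstall Docker and PostgreSQL.\n\n# Evaluation\nMetrics are computed with RAGAS."
index = build_index(fetch_with_retry(lambda: doc))
print(retrieve("How are metrics computed?", index))
```

In this toy run the query shares terms only with the second section, so the Evaluation chunk is returned. In a real deployment the retrieved chunks would then be passed to the language model as context for response generation.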

Evaluation

The system includes comprehensive evaluation capabilities:

  • Automated Metrics (via RAGAS):

    • Faithfulness • Answer Relevancy • Context Precision • Context Recall • Harmfulness
  • Human Feedback:

    • Integrated feedback collection through Chainlit
    • Automatic dataset creation from positive feedback
    • Manual expert feedback support
  • Observability:

    • Full tracing and monitoring with Langfuse
    • Separate traces for chat completion and deployment evaluation
    • Integration between Chainlit and Langfuse for comprehensive tracking
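
As a rough illustration of the "automatic dataset creation from positive feedback" idea listed above, the sketch below filters chat turns down to those a user rated positively and turns them into evaluation items. The `FeedbackRecord` shape, the score convention (1 for thumbs up, 0 for thumbs down), and the `build_eval_dataset` function are assumptions made for this example; they are not the project's, Chainlit's, or Langfuse's actual data model.

```python
from dataclasses import dataclass


@dataclass
class FeedbackRecord:
    """Hypothetical shape of one chat turn together with its user feedback."""
    question: str
    answer: str
    contexts: list[str]
    score: int  # assumed convention: 1 = thumbs up, 0 = thumbs down


def build_eval_dataset(records):
    """Keep positively rated turns and drop repeated questions (first one wins)."""
    seen, dataset = set(), []
    for record in records:
        if record.score != 1 or record.question in seen:
            continue
        seen.add(record.question)
        dataset.append(
            {
                "question": record.question,
                "reference_answer": record.answer,
                "contexts": record.contexts,
            }
        )
    return dataset


# Usage: only the first, positively rated turn ends up in the dataset.
records = [
    FeedbackRecord("What is RAG?", "Retrieval-Augmented Generation.", ["intro page"], 1),
    FeedbackRecord("Who maintains this?", "Unknown.", [], 0),
    FeedbackRecord("What is RAG?", "A duplicate answer.", ["intro page"], 1),
]
print(build_eval_dataset(records))
```

Treating an approved answer as the reference answer is a design choice of this sketch; expert-reviewed feedback could replace or correct such entries before they are used for evaluation.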

📁 Project Structure

.
├── build/            # Build and deployment scripts
│   └── workstation/  # Build scripts for workstation setup
├── configurations/   # Configuration and secrets files
├── res/              # Assets
├── src/              # Source code
│   ├── augmentation/   # Retrieval and UI components
│   ├── common/         # Shared utilities
│   ├── embedding/      # Data extraction and embedding
│   └── evaluate/       # Evaluation system
└── tests/            # Unit tests

📚 Documentation

For detailed documentation on setup, configuration, and development: