ragl is a local-first RAG (retrieval-augmented generation) pipeline that uses ChromaDB for vector storage and Ollama for LLM inference. It supports document ingestion, chunking, vector search, and querying via a FastAPI interface.
- Document ingestion with metadata
- Vector search using ChromaDB
- Flexible embeddings (Ollama, SentenceTransformers, or custom models; see the sketch below)
- FastAPI interface for querying
- Dockerized for easy setup and deployment
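On the flexible-embeddings point: ChromaDB accepts any object implementing its `EmbeddingFunction` interface when a collection is created, which is how alternative backends plug in. Below is a minimal sketch of an Ollama-backed embedder; the collection name and embedding model are illustrative assumptions, not ragl defaults.

```python
import chromadb
from chromadb import Documents, EmbeddingFunction, Embeddings
import ollama


class OllamaEmbedder(EmbeddingFunction):
    """Embed documents by calling a local Ollama embedding model."""

    def __init__(self, model: str = "nomic-embed-text"):  # assumed model name
        self.model = model

    def __call__(self, input: Documents) -> Embeddings:
        # One Ollama call per document; batch these if throughput matters.
        return [
            ollama.embeddings(model=self.model, prompt=text)["embedding"]
            for text in input
        ]


client = chromadb.PersistentClient(path="/app/chroma_db")
collection = client.get_or_create_collection(
    name="docs",  # hypothetical collection name
    embedding_function=OllamaEmbedder(),
)
```

Swapping in SentenceTransformers (or any custom model) just means returning its vectors from `__call__` instead.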
Clone the repository:

```bash
git clone https://github.com/jth500/ragl.git
cd ragl
```
Ensure Docker is installed, then build and run the container:

```bash
docker-compose up --build -d
```
Copy CSV files into the container:

```bash
docker cp my_data.csv ragl:/app/data/my_data.csv
```
Enter the Docker container shell:

```bash
docker exec -it ragl /bin/sh
```
Run the ingestion script; the arguments are the data directory, the name of the CSV column containing the document text, and the ChromaDB storage path:

```bash
python -m ragl.scripts.ingest_csv /app/data text_col_name /app/chroma_db
```
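For orientation, the core of a CSV ingestion flow like this is: read the rows, chunk each row's text, and write the chunks plus metadata into a Chroma collection. The sketch below shows that shape; the chunking scheme, collection name, and ID format are illustrative guesses rather than what `ragl.scripts.ingest_csv` actually does.

```python
import chromadb
import pandas as pd


def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Naive fixed-width character chunking with overlap (illustrative)."""
    step = size - overlap
    return [text[i : i + size] for i in range(0, max(len(text), 1), step)]


df = pd.read_csv("/app/data/my_data.csv")
client = chromadb.PersistentClient(path="/app/chroma_db")
collection = client.get_or_create_collection("docs")  # hypothetical name

for row_idx, text in enumerate(df["text_col_name"].dropna().astype(str)):
    chunks = chunk(text)
    collection.add(
        ids=[f"row{row_idx}-chunk{j}" for j in range(len(chunks))],
        documents=chunks,
        # Per-chunk metadata, per the "document ingestion with metadata" feature.
        metadatas=[{"source": "my_data.csv", "row": row_idx} for _ in chunks],
    )
```

Chroma embeds the documents with the collection's embedding function (its built-in default here) as they are added.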
Once the container is running, query the API:

```bash
curl -X POST "http://127.0.0.1:8000/ask" \
  -H "Content-Type: application/json" \
  -d '{"query": "What do we think of Tottenham?", "top_k": 3, "model": "llama3.2"}'
```
Planned improvements:

- Support for additional document formats (PDF, JSON, etc.)
- Cleaner data ingestion