A retrieval-augmented generation (RAG) platform built on open-source Large Language Models (LLMs) that keeps your data local instead of sending it to online LLM providers.
Overview
This project enables local LLM deployment with RAG capabilities, specifically designed for edge computing environments. It uses Ollama for LLM serving and supports various document types for context-aware conversations.
Features
- 🤖 Local LLM deployment using Ollama
- 📚 RAG (Retrieval Augmented Generation) support
- 🔒 Privacy-focused (all data stays local)
- 🌐 Multiple input sources:
- GitHub repositories
- Web pages
- Local documents
- 🖥️ Containerized deployment with Podman
- 🎯 Optimized for edge devices
System Requirements
- Python 3.10+
- CUDA-capable GPU (optional but recommended; a quick availability check is shown after this list)
- 8GB RAM minimum (16GB recommended)
- Ubuntu/openSUSE system
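To check whether a CUDA GPU is actually visible to Python, the following snippet can help. It assumes PyTorch is present in your environment (it is typically pulled in by the HuggingFace embedding stack); this is an optional sanity check, not part of the project itself.
# Optional GPU check -- assumes PyTorch is available in your environment
import torch

if torch.cuda.is_available():
    print(f"CUDA GPU detected: {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA GPU detected; embeddings and inference will run on CPU.")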
Required Software
- Ollama Installation
curl -fsSL https://ollama.com/install.sh | sh
- Podman Setup (Ubuntu)
# Run the setup script
chmod +x setup-ubuntu-host.sh
./setup-ubuntu-host.sh
- Python Dependencies
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Quick Start
1. Clone the Repository
git clone https://github.com/rudrakshkarpe/Edge-GenAI-Workloads-openSUSE.git
cd Edge-GenAI-Workloads-openSUSE
2. Configure Environment
# Setup networking
chmod +x fix-network.sh
./fix-network.sh
# Configure Podman
chmod +x fix-podman-config.sh
./fix-podman-config.sh
3. Start the Application
# Start containers
chmod +x start-containers.sh
./start-containers.sh
# Access the application
xdg-open http://localhost:8502
Directory Layout
Edge-GenAI-Workloads-openSUSE/
├── components/ # UI components
├── modules/ # Core modules
├── utils/ # Utility functions
├── docs/ # Documentation
└── scripts/ # Shell scripts
Setting up Embeddings
The project uses HuggingFace embeddings for document processing; a minimal setup sketch is included below.
Key features:
- Automatic GPU detection
- Configurable embedding models
- Caching for better performance
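A minimal setup sketch, assuming LangChain's HuggingFace embeddings wrapper; the model name, cache path, and wrapper actually used by the project may differ.
# Illustrative embedding setup -- model name and cache folder are placeholders,
# not necessarily what the project ships with.
import torch
from langchain_community.embeddings import HuggingFaceEmbeddings

device = "cuda" if torch.cuda.is_available() else "cpu"  # automatic GPU detection

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",  # configurable embedding model
    model_kwargs={"device": device},
    cache_folder="./model-cache",                          # cache weights for faster restarts
)

vector = embeddings.embed_query("What does this project do?")
print(len(vector))  # embedding dimensionality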
Using Ollama Models
- Pull a model:
ollama pull mistral
- The Ollama integration is handled within the application code (a minimal Python example follows this list)
- Models are automatically detected and listed in the UI
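As a reference point, a minimal way to query a locally pulled model from Python is the official ollama client. This is a sketch only and may not mirror the project's own integration layer.
# Sketch only -- assumes the `ollama` Python client and the model pulled above.
import ollama

# ollama.list() returns the locally installed models, which is how a UI
# can auto-detect what is available.
response = ollama.chat(
    model="mistral",
    messages=[{"role": "user", "content": "Summarize what RAG is in one sentence."}],
)
print(response["message"]["content"])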
Document Processing
The system supports multiple document sources: GitHub repositories, web pages, and local documents. A loading sketch is shown below.
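A rough sketch of loading each source type with LangChain document loaders; the URLs and file paths are placeholders, and the project's own loader modules may differ.
# Illustrative loaders for the three supported source types; paths and URLs are placeholders.
from langchain_community.document_loaders import GitLoader, TextLoader, WebBaseLoader

web_docs = WebBaseLoader("https://example.com/article").load()   # web page
local_docs = TextLoader("docs/notes.txt").load()                  # local document
repo_docs = GitLoader(                                            # GitHub repository
    clone_url="https://github.com/rudrakshkarpe/Edge-GenAI-Workloads-openSUSE.git",
    repo_path="./repo-cache",
    branch="main",
).load()

print(len(web_docs), len(local_docs), len(repo_docs))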
Local Development Setup
- Create a virtual environment
- Install dependencies
- Run Streamlit locally:
streamlit run main.py
Container Setup
The project uses Podman for containerization:
- Start containers:
./start-containers.sh
How to Contribute
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
Please follow the project's coding standards and include tests for new features.
License
This project is licensed under the MIT License; see the LICENSE file for details.