
Edge-GenAI-Workloads-openSUSE 🚀

A retrieval-augmented generation (RAG) platform built on open-source Large Language Models (LLMs) that keeps your data local rather than exposing it to online LLM providers.

Table of Contents

  • Overview
  • Features
  • Prerequisites
  • Installation
  • Project Structure
  • Usage
  • Development
  • Containerization
  • Contributing
  • License

Overview

This project enables local LLM deployment with RAG capabilities and is designed specifically for edge computing environments. It uses Ollama for LLM serving and supports multiple document types for context-aware conversations.

Features

  • 🤖 Local LLM deployment using Ollama
  • 📚 RAG (Retrieval Augmented Generation) support
  • 🔒 Privacy-focused (all data stays local)
  • 🌐 Multiple input sources:
    • GitHub repositories
    • Web pages
    • Local documents
  • 🖥️ Containerized deployment with Podman
  • 🎯 Optimized for edge devices

Prerequisites

System Requirements
  • Python 3.10+
  • CUDA-capable GPU (optional, but recommended)
  • 8GB RAM minimum (16GB recommended)
  • Ubuntu/openSUSE system
Required Software
  1. Ollama Installation
curl -fsSL https://ollama.com/install.sh | sh
  2. Podman Setup (Ubuntu)
# Run the setup script
chmod +x setup-ubuntu-host.sh
./setup-ubuntu-host.sh
  3. Python Dependencies
# Run these from inside the cloned repository (see Installation below)
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Installation

  1. Clone the Repository
git clone https://github.com/rudrakshkarpe/Edge-GenAI-Workloads-openSUSE.git
cd Edge-GenAI-Workloads-openSUSE
  2. Configure Environment
# Setup networking
chmod +x fix-network.sh
./fix-network.sh

# Configure Podman
chmod +x fix-podman-config.sh
./fix-podman-config.sh
  3. Start the Application
# Start containers
chmod +x start-containers.sh
./start-containers.sh

# Access the application in a browser
xdg-open http://localhost:8502

Project Structure

Directory Layout
Edge-GenAI-Workloads-openSUSE/
├── components/   # UI components
├── modules/      # Core modules
├── utils/        # Utility functions
├── docs/         # Documentation
└── scripts/      # Shell scripts

Usage

Setting up Embeddings

The project uses HuggingFace embeddings for document processing. The embedding setup lives in the project's embedding module (lines 21–44); a sketch of the approach follows the feature list below.

Key features:

  • Automatic GPU detection
  • Configurable embedding models
  • Caching for better performance
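
A minimal sketch of what this setup can look like, assuming LangChain's HuggingFaceEmbeddings wrapper (the wrapper choice, model name, and cache path are illustrative assumptions, not the project's actual code):

# Illustrative sketch only; wrapper, model name, and cache path are assumptions
import torch
from langchain_community.embeddings import HuggingFaceEmbeddings

def build_embeddings(model_name: str = "sentence-transformers/all-MiniLM-L6-v2"):
    # Automatic GPU detection: use CUDA when available, otherwise fall back to CPU
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return HuggingFaceEmbeddings(
        model_name=model_name,            # configurable embedding model
        model_kwargs={"device": device},
        cache_folder="./model_cache",     # cache weights locally for reuse
    )

embeddings = build_embeddings()
vector = embeddings.embed_query("What does this repository do?")
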
Using Ollama Models
  1. Pull a model:
ollama pull mistral
  2. The Ollama integration is managed in the project's Ollama module (lines 1–66).
  3. Models are automatically detected and listed in the UI, as in the sketch below.
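
For illustration, both model discovery and generation can be driven through Ollama's local REST API. The snippet below is a hedged sketch of how such an integration could look (it assumes Ollama's default port 11434), not the project's actual code:

import requests

OLLAMA_URL = "http://localhost:11434"

# List locally pulled models -- this is how a UI can auto-detect them
tags = requests.get(f"{OLLAMA_URL}/api/tags").json()
print([m["name"] for m in tags["models"]])

# Non-streaming generation request against a pulled model
resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={"model": "mistral", "prompt": "Summarize RAG in one sentence.", "stream": False},
)
print(resp.json()["response"])
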
Document Processing

The system supports multiple document sources: GitHub repositories, web pages, and local documents.
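
A hedged sketch of loading each source type, assuming LangChain's community document loaders (the loader choices, URLs, paths, and branch name below are illustrative assumptions):

from langchain_community.document_loaders import WebBaseLoader, GitLoader, PyPDFLoader

# Web page
web_docs = WebBaseLoader("https://en.opensuse.org/Portal:Documentation").load()

# GitHub repository: clone locally, then parse file by file
repo_docs = GitLoader(
    clone_url="https://github.com/rudrakshkarpe/Edge-GenAI-Workloads-openSUSE.git",
    repo_path="./repo_checkout",
    branch="main",  # assumed branch name
).load()

# Local document (a PDF in this example)
local_docs = PyPDFLoader("./docs/example.pdf").load()
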

Development

Local Development Setup
  1. Create a virtual environment
  2. Install dependencies
  3. Run Streamlit locally:
streamlit run main.py
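
For orientation only, a minimal hypothetical Streamlit chat loop is sketched below; it shows the general shape of such an app, not the actual main.py:

import streamlit as st

st.title("Edge GenAI RAG chat")

# Keep the conversation across Streamlit reruns
if "messages" not in st.session_state:
    st.session_state.messages = []

# Replay prior turns
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.write(msg["content"])

if prompt := st.chat_input("Ask about your documents"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.write(prompt)
    # Placeholder: a real app would retrieve context and call the local LLM here
    answer = f"(model response to: {prompt})"
    st.session_state.messages.append({"role": "assistant", "content": answer})
    with st.chat_message("assistant"):
        st.write(answer)
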

Containerization

Container Setup

The project uses Podman for containerization. Start the containers with:

./start-containers.sh

Contributing

How to Contribute
  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Submit a pull request

Please follow the project's coding standards and include tests for new features.

License

This project is licensed under the MIT License - see the LICENSE file for details.


Google Summer of Code 2024 @ openSUSE Project
