A retrieval-augmented generation (RAG) platform built on open-source Large Language Models (LLMs) that keeps your data local instead of sending it to online LLM providers.
Overview
This project enables local LLM deployment with RAG capabilities, specifically designed for edge computing environments. It uses Ollama for LLM serving and supports various document types for context-aware conversations.
Features
- 🤖 Local LLM deployment using Ollama
- 📚 RAG (Retrieval Augmented Generation) support
- 🔒 Privacy-focused (all data stays local)
- 🌐 Multiple input sources:
- GitHub repositories
- Web pages
- Local documents
- 🖥️ Containerized deployment with Podman
- 🎯 Optimized for edge devices
System Requirements
- Python 3.10+
- CUDA-capable GPU (optional but recommended; a quick availability check is shown after this list)
- 8GB RAM minimum (16GB recommended)
- Ubuntu/openSUSE system
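To check whether a CUDA GPU is actually visible to Python, the following snippet can help. It assumes PyTorch is present in your environment (it is typically pulled in by the HuggingFace embedding stack); this is an optional sanity check, not part of the project itself.
# Optional GPU check -- assumes PyTorch is available in your environment
import torch

if torch.cuda.is_available():
    print(f"CUDA GPU detected: {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA GPU detected; embeddings and inference will run on CPU.")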
Required Software
- Ollama Installation
curl -fsSL https://ollama.com/install.sh | sh
- Podman Setup (Ubuntu)
# Run the setup script
chmod +x setup-ubuntu-host.sh
./setup-ubuntu-host.sh
- Python Dependencies
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Quick Start
1. Clone the Repository
git clone https://github.com/rudrakshkarpe/Edge-GenAI-Workloads-openSUSE.git
cd Edge-GenAI-Workloads-openSUSE
2. Configure Environment
# Setup networking
chmod +x fix-network.sh
./fix-network.sh
# Configure Podman
chmod +x fix-podman-config.sh
./fix-podman-config.sh
3. Start the Application
# Start containers
chmod +x start-containers.sh
./start-containers.sh
# Access the application
xdg-open http://localhost:8502
Directory Layout
Edge-GenAI-Workloads-openSUSE/
├── components/ # UI components
├── modules/ # Core modules
├── utils/ # Utility functions
├── docs/ # Documentation
└── scripts/ # Shell scripts
Setting up Embeddings
The project uses HuggingFace embeddings for document processing; a minimal setup sketch is included below.
Key features:
- Automatic GPU detection
- Configurable embedding models
- Caching for better performance
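A minimal setup sketch, assuming LangChain's HuggingFace embeddings wrapper; the model name, cache path, and wrapper actually used by the project may differ.
# Illustrative embedding setup -- model name and cache folder are placeholders,
# not necessarily what the project ships with.
import torch
from langchain_community.embeddings import HuggingFaceEmbeddings

device = "cuda" if torch.cuda.is_available() else "cpu"  # automatic GPU detection

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",  # configurable embedding model
    model_kwargs={"device": device},
    cache_folder="./model-cache",                          # cache weights for faster restarts
)

vector = embeddings.embed_query("What does this project do?")
print(len(vector))  # embedding dimensionality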
Using Ollama Models
- Pull a model:
ollama pull mistral
- The Ollama integration is handled within the application code (a minimal Python example follows this list)
- Models are automatically detected and listed in the UI
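As a reference point, a minimal way to query a locally pulled model from Python is the official ollama client. This is a sketch only and may not mirror the project's own integration layer.
# Sketch only -- assumes the `ollama` Python client and the model pulled above.
import ollama

# ollama.list() returns the locally installed models, which is how a UI
# can auto-detect what is available.
response = ollama.chat(
    model="mistral",
    messages=[{"role": "user", "content": "Summarize what RAG is in one sentence."}],
)
print(response["message"]["content"])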
Document Processing
The system supports multiple document sources: GitHub repositories, web pages, and local documents. A loading sketch is shown below.
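A rough sketch of loading each source type with LangChain document loaders; the URLs and file paths are placeholders, and the project's own loader modules may differ.
# Illustrative loaders for the three supported source types; paths and URLs are placeholders.
from langchain_community.document_loaders import GitLoader, TextLoader, WebBaseLoader

web_docs = WebBaseLoader("https://example.com/article").load()   # web page
local_docs = TextLoader("docs/notes.txt").load()                  # local document
repo_docs = GitLoader(                                            # GitHub repository
    clone_url="https://github.com/rudrakshkarpe/Edge-GenAI-Workloads-openSUSE.git",
    repo_path="./repo-cache",
    branch="main",
).load()

print(len(web_docs), len(local_docs), len(repo_docs))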
Local Development Setup
- Create a virtual environment
- Install dependencies
- Run Streamlit locally:
streamlit run main.py
Container Setup
The project uses Podman for containerization:
- Start containers:
./start-containers.sh
How to Contribute
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
Please follow the project's coding standards and include tests for new features.
License
This project is licensed under the MIT License; see the LICENSE file for details.