A comprehensive platform for deploying and managing Generative AI applications on OpenShift, featuring multiple model serving runtimes, vector databases, object storage, API gateways, web interfaces, monitoring, and testing suites.
This project provides a complete, production-ready infrastructure for deploying Generative AI applications on OpenShift. It includes environment preparation, multiple LLM serving backends, vector databases for embeddings, S3-compatible storage, API gateways for unified access, user-friendly GUIs, comprehensive monitoring stacks, and extensive load/performance testing capabilities.
- Environment Preparation: Automated setup and cleanup scripts for OpenShift environments
- GitOps Integration: ArgoCD configurations for continuous deployment
- Multiple LLM Backends: Support for Ollama (CPU), vLLM (CPU/GPU), and NVIDIA NIM (GPU)
- Vector Databases: Milvus for high-performance vector similarity search
- Object Storage: MinIO S3-compatible storage for models and data
- API Gateways: LiteLLM for unified model API access
- Web GUIs: AnythingLLM and OpenWebUI for intuitive AI interaction and document management
- Monitoring Stack: Grafana dashboards and Prometheus metrics for GPU and model performance
- Load Testing: Comprehensive test suite including smoke, stress, spike, and performance tests
- LLM Performance Testing: Specialized benchmarks for model inference and throughput
- Infrastructure Automation: Scripts for automated deployment and resource management
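To give a feel for what the load- and performance-testing features above measure, here is a minimal latency-benchmark sketch. It is purely illustrative (the real suites live under `tests/`); the function names and iteration count are invented for this example.

```python
import statistics
import time


def benchmark(fn, iterations=20):
    """Call fn repeatedly and return simple per-call latency stats in seconds.

    This mirrors the basic shape of a smoke/performance test: run a request
    many times, then summarize mean and tail latency.
    """
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    ordered = sorted(samples)
    return {
        "mean": statistics.mean(samples),
        "p95": ordered[int(0.95 * (len(ordered) - 1))],
    }


if __name__ == "__main__":
    # In a real test, fn would issue an inference request to a model endpoint.
    stats = benchmark(lambda: sum(range(1000)))
    print(f"mean={stats['mean']:.6f}s p95={stats['p95']:.6f}s")
```

In the actual suites, the measured callable would be an HTTP request against a deployed model endpoint rather than a local function.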
- Ollama: Lightweight runtime for CPU-based model serving
- vLLM: High-performance serving runtime with GPU acceleration
- NVIDIA NIM: Optimized microservices for NVIDIA GPU deployments
- Milvus: Cloud-native vector database for similarity search and embeddings
- MinIO: S3-compatible object storage for models, documents, and artifacts
- LiteLLM: Unified API gateway for accessing multiple LLM providers
- AnythingLLM: Web-based GUI for document management and AI chat interactions
- OpenWebUI: Alternative web interface for AI model interactions
- Grafana: Dashboards for monitoring GPU usage, model performance, and system metrics
- Prometheus: Metrics collection and alerting system
- Load Testing Suite: Smoke, stress, spike, and performance tests
- LLM Performance Testing: Specialized benchmarks for model inference and throughput
- Benchmarking Tools: Model performance and throughput testing
- ArgoCD Configurations: GitOps manifests for continuous deployment
- Infrastructure Scripts: Automated setup and cleanup utilities
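Because LiteLLM exposes an OpenAI-compatible API, clients talk to every backend (Ollama, vLLM, NVIDIA NIM) through one request format. The sketch below builds such a request payload; the gateway URL, model name, and API key are placeholders, not values from this repository.

```python
import json

# Hypothetical in-cluster route to the LiteLLM gateway -- substitute your own.
GATEWAY_URL = "http://litellm.example.svc:4000/v1/chat/completions"


def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completion payload accepted by LiteLLM."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


if __name__ == "__main__":
    payload = build_chat_request("ollama/llama3", "Summarize this document.")
    print(json.dumps(payload, indent=2))
    # To actually send it (requires a running gateway and a valid key):
    # import urllib.request
    # req = urllib.request.Request(
    #     GATEWAY_URL,
    #     data=json.dumps(payload).encode(),
    #     headers={"Content-Type": "application/json",
    #              "Authorization": "Bearer <your-litellm-key>"},
    # )
    # print(urllib.request.urlopen(req).read().decode())
```

Switching between backends then amounts to changing the `model` string, while the request shape stays the same.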
- Kubernetes Cluster: OpenShift 4.x+ (preferred) or vanilla Kubernetes
- CLI Tools: `kubectl` or `oc` installed and configured
- Storage: Sufficient persistent storage for models and data
- Compute Resources: CPU or GPU nodes depending on deployment type
- Access: Cluster admin access for namespace and resource creation
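The CLI prerequisites above can be verified with a short check before starting a deployment. This is a sketch, not part of the repository's automation; adapt the tool list to your environment.

```python
import shutil


def check_cli_tools(tools=("oc", "kubectl")) -> dict:
    """Return a mapping of tool name -> True if the tool is found on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}


if __name__ == "__main__":
    for tool, found in check_cli_tools().items():
        print(f"{tool}: {'found' if found else 'MISSING'}")
```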
- NVIDIA GPUs with appropriate drivers
- NGC API key for NVIDIA NIM models
- NVIDIA Developer Program membership
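NIM deployments typically read the NGC API key from a Kubernetes Secret. The helper below builds such a Secret manifest; the namespace, secret name, and key name are assumptions for illustration and must match whatever your NIM manifests actually reference.

```python
import base64


def ngc_secret_manifest(api_key: str, namespace: str = "nim") -> dict:
    """Build a Kubernetes Secret manifest carrying an NGC API key.

    Secret names and the data key here are hypothetical examples.
    """
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "ngc-api-key", "namespace": namespace},
        "type": "Opaque",
        "data": {
            # Kubernetes requires Secret data values to be base64-encoded.
            "NGC_API_KEY": base64.b64encode(api_key.encode()).decode(),
        },
    }
```

Serialized to YAML or JSON, the result can be applied with `oc apply -f -` (or `kubectl apply -f -`).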
```
genai-application/
├── docs/                     # Documentation and images
│   └── images/               # Documentation images and diagrams
├── env_preparation/          # Environment setup and cleanup scripts
├── gitops/                   # GitOps configurations (ArgoCD)
├── models/                   # LLM model deployments
│   ├── nvidia_nim/           # NVIDIA NIM GPU models
│   ├── ollama/               # Ollama CPU models
│   └── vllm/                 # vLLM CPU/GPU models
├── monitoring_alerting/      # Monitoring stack and alerting rules
├── rag_usecase/              # RAG-specific configurations (example use case)
├── s3_storage/               # S3-compatible storage deployments
│   └── minio_on_openshift/   # MinIO storage deployment
├── tests/                    # Testing suites and performance benchmarks
│   ├── last_und_performance/ # Load and performance tests
│   └── llm_performance/      # LLM-specific performance testing
├── vectordb/                 # Vector database deployments
│   └── milvus/               # Milvus vector database
├── web_interfaces/           # Web GUI deployments
│   ├── anythingllm/          # AnythingLLM GUI deployment
│   └── openwebui/            # OpenWebUI interface
├── gpu_deployment.md         # GPU deployment guide
├── infra_preparation_auto.sh # Infrastructure automation script
├── LICENSE                   # Apache 2.0 License
├── README.md                 # This file
└── ROADMAP.md                # Project roadmap
```
We welcome contributions! Please see our roadmap for planned features.
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
- Follow Kubernetes best practices for manifests
- Include documentation for new components
- Add tests for new functionality
- Update the roadmap for significant changes
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
This project builds upon and includes components from the projects listed above, including Ollama, vLLM, NVIDIA NIM, Milvus, MinIO, LiteLLM, AnythingLLM, OpenWebUI, Grafana, and Prometheus.
For issues and questions:
- Check existing issues
- Create a new issue with detailed information
- Review component-specific READMEs for troubleshooting
Note: This platform is optimized for OpenShift clusters and provides a foundation for various Generative AI use cases including RAG, chatbots, content generation, and more. Support for other Kubernetes distributions may require modifications.
