
Commit 14fdb00

[ISSUE-26] Implement ZenML orchestration (#27)
* chore(infra): add zenml-server service and volume (#26)

  Integrates the self-hosted ZenML server into the local infrastructure stack. This change adds the service to docker-compose and defines a persistent volume for MLOps metadata.

* refactor(a-rag): replace ingest script with ZenML pipeline

  This commit finalizes the transition from the manual script to the new orchestrated ZenML pipeline.

  - The script and its associated README have been marked as deprecated to eliminate redundancy and prevent misuse. All data ingestion should now be performed exclusively through the new ZenML pipeline.

  Refs #26

* The script and its associated README have been removed to eliminate redundancy and prevent misuse.

* fix: mermaid diagram syntax error

* feat(mlops): implement ZenML pipeline for data ingestion

  This commit introduces a foundational MLOps capability by refactoring the manual `ingest.py` script into a formal, orchestrated ZenML pipeline. This provides reproducibility, observability, and a clear path for future automation.

  Key Changes:

  - A self-hosted ZenML server has been added to `docker-compose.infra.yml` to act as a central orchestrator.
  - A new `pipelines/` directory structure has been created within the `a-rag` service to house all ZenML-related code, separating it from the online API logic.
  - The ingestion process has been decomposed into three distinct, reusable steps: `load_documents`, `ensure_vector_store_exists`, and `index_documents`.
  - A CLI entry point `pipelines/run_pipeline.py` is created for standardized pipeline execution.
  - The old `ingest.py` script is now deprecated and has been removed to ensure a single source of truth for data ingestion.

  Closes #26
1 parent a3196da commit 14fdb00

File tree

13 files changed: +923 additions, −557 deletions


MLOPS_README.md

Lines changed: 138 additions & 0 deletions
@@ -0,0 +1,138 @@
# MLOps Pipelines Guide for TGB-MicroSuite

Welcome to the MLOps guide for the `a-rag` service. This document explains the architecture, purpose, and usage of our automated Machine Learning pipelines, orchestrated by **ZenML**.

## 🎯 Philosophy: From Scripts to Pipelines

Our core principle is to treat ML processes not as one-off scripts, but as versioned, reproducible, and automated software components. The previous manual `ingest.py` script was brittle and lacked observability. By migrating to ZenML, we gain:

- **Reproducibility:** Every pipeline run is tracked, including the code version, parameters, and inputs/outputs (artifacts).
- **Observability:** A central dashboard (`http://localhost:8237`) provides a complete history of all runs, logs for each step, and a visualization of the pipeline structure (DAG).
- **Automation:** These pipelines are the foundation for our future CI/CD/CT (Continuous Integration/Delivery/Training) workflows.
- **Modularity & Reusability:** Each step in a pipeline is an independent, reusable function that can be composed into different pipelines.

## 🏗️ MLOps Architecture

Our MLOps capabilities are integrated directly into the `a-rag` microservice and orchestrated by a self-hosted ZenML server managed via `docker-compose.infra.yml`.
```mermaid
graph TD
    subgraph "Developer & CI-CD"
        A["1. Trigger Run: uv run python pipelines/run_pipeline.py"]
    end

    subgraph "ZenML Server (Docker Container)"
        B["2. ZenML Orchestrator"]
    end

    subgraph "Execution Logic (within a-rag service)"
        C["@pipeline: feature_ingestion_pipeline"]
        D["@step: load_documents"]
        E["@step: ensure_vector_store_exists"]
        F["@step: index_documents"]
    end

    subgraph "External Infrastructure"
        G["Source Docs on Disk"]
        H["ChromaDB (Docker Container)"]
    end

    A --> B
    B -->|Executes Pipeline| C
    C --> D
    D -->|Documents| F
    C --> E
    E -->|VectorStore Client| F

    D -->|Reads from| G
    E -->|Connects to| H
    F -->|Writes to| H
```

**Workflow Explanation:**

1. A developer or a CI/CD job triggers a pipeline run via the central CLI entry point (`pipelines/run_pipeline.py`).
2. The ZenML client communicates the request to the ZenML Server, which begins orchestrating the pipeline.
3. The pipeline definition (`feature_ingestion_pipeline`) dictates the execution order of the steps.
4. Each step (`@step`) is executed as a tracked job (see the sketch after this list):
   - `load_documents` reads files from the local volume.
   - `ensure_vector_store_exists` connects to the existing ChromaDB container, reusing connection settings from `src/core/config.py`.
   - `index_documents` takes the loaded documents and the vector store client, performs embedding, and ingests the data.
5. All results, logs, and artifacts are tracked by the ZenML Server and visible in the UI.
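For orientation, here is a minimal sketch of what `pipelines/steps/data_processing.py` might look like. The step names and signatures are taken from `feature_pipeline.py`; the internals (the LlamaIndex reader, the Chroma client, and its connection settings) are illustrative assumptions, not the actual implementation:

```python
# Hypothetical sketch; the real steps live in pipelines/steps/data_processing.py.
from pathlib import Path
from typing import List

import chromadb
from llama_index.core import Document, SimpleDirectoryReader
from zenml import step


@step
def load_documents(source_dir: Path) -> List[Document]:
    """Load raw documents (.md, .txt, ...) from the source directory."""
    return SimpleDirectoryReader(input_dir=str(source_dir)).load_data()


@step
def ensure_vector_store_exists(collection_name: str) -> str:
    """Check that ChromaDB is reachable and the collection exists.

    Returns the validated collection name (a simple, serializable type)
    rather than a client object, so ZenML can track it as an artifact.
    """
    client = chromadb.HttpClient(host="localhost", port=8000)  # assumed settings
    client.get_or_create_collection(collection_name)
    return collection_name


@step
def index_documents(documents: List[Document], collection_name: str) -> None:
    """Embed the documents and ingest them into the target collection."""
    # Embedding + ingestion logic (e.g., a LlamaIndex VectorStoreIndex
    # backed by a Chroma vector store) would go here.
    ...
```

Note how only simple, serializable values cross step boundaries; this matches the pipeline's stated design of avoiding non-serializable objects between steps.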
## 🛠️ Setup & Configuration

### Step 1: Launch Core Infrastructure

From the project root, ensure all services are running. This command starts ChromaDB, Redis, and our ZenML Server.

```bash
docker-compose -f docker-compose.infra.yml up -d
```

### Step 2: Set Up the a-rag Service Environment

Navigate to the a-rag service directory; all subsequent commands should be run from there. Then activate its virtual environment and install dependencies.

```bash
cd services/a-rag

# Create/activate virtual environment
uv venv
source .venv/bin/activate

# Install dependencies
uv pip sync pyproject.toml
```
### Step 3: Connect Your Local Client to the ZenML Server (One-Time Setup)

**This is a critical one-time setup step.** You must connect your local ZenML client (which you just installed) to the ZenML Server running in Docker. This tells your client where to send all pipeline information.

Once this is done, the configuration is saved locally, and all future pipeline runs will automatically be sent to and tracked by your local server.

```bash
# Ensure your (a-rag) venv is active
(a-rag) $ zenml connect --url http://127.0.0.1:8237 --username default
```

> [!NOTE]
> The `zenml connect` command is being deprecated, and you may see a warning suggesting `zenml login` instead. In recent versions, the command may automatically open a browser window for authentication; simply follow the on-screen instructions. A successful connection is the end goal.

You should see a confirmation message like: `✅ Successfully connected to ZenML server.`

## ▶️ Running the Feature Ingestion Pipeline

This pipeline replaces the old `ingest.py` script. It loads documents from a directory and indexes them into ChromaDB.

To run the pipeline, use the `pipelines/run_pipeline.py` script from within the `services/a-rag` directory:

```bash
uv run python -m pipelines.run_pipeline --source-dir ../../volumes/rag-source-docs --collection rag_documentation_docker
```

**Arguments:**

- `--source-dir` (required): Path to the directory containing your source documents (e.g., `.md`, `.txt` files). The path should be relative to the a-rag service root.
- `--collection` (optional): The name of the ChromaDB collection to create or use. Defaults to `rag_documentation_v2`.
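Internally, the entry point is little more than an argument parser that invokes the pipeline. A minimal sketch (hypothetical; only the argument names and the default collection come from the documentation above) might be:

```python
# Hypothetical sketch of pipelines/run_pipeline.py.
import argparse
from pathlib import Path

from pipelines.feature_pipeline import feature_ingestion_pipeline


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Run the RAG feature ingestion pipeline."
    )
    parser.add_argument(
        "--source-dir",
        type=Path,
        required=True,
        help="Directory containing source documents.",
    )
    parser.add_argument(
        "--collection",
        default="rag_documentation_v2",
        help="ChromaDB collection to create or use.",
    )
    args = parser.parse_args()

    # Calling the @pipeline-decorated function triggers a tracked run
    # on whichever ZenML server the local client is connected to.
    feature_ingestion_pipeline(
        source_dir=args.source_dir, collection_name=args.collection
    )


if __name__ == "__main__":
    main()
```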
**Monitoring the Pipeline**

After triggering a run, you can monitor its progress in real time:

1. Open your browser and go to `http://localhost:8237`.
2. Navigate to the **Pipelines -> All Runs** tab.
3. You will see your `rag_feature_ingestion_pipeline` run. Click on it to see the graph, check the status of each step, and view detailed logs.

This setup provides a robust, professional framework for managing our ML workflows.

README.md

Lines changed: 31 additions & 12 deletions
````diff
@@ -3,11 +3,10 @@
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![UV Package Manager](https://img.shields.io/badge/PackageManager-UV-purple.svg)](https://pypi.org/project/uv/)
 [![Python Version](https://img.shields.io/badge/Python-3.12-blue.svg?logo=python&logoColor=white)](https://www.python.org/)
+[![ZenML](https://img.shields.io/badge/Orchestration-ZenML-8207e4.svg?logo=zenml&logoColor=white)](https://zenml.io/)
 [![LlamaIndex](https://img.shields.io/badge/LlamaIndex-%F0%9F%90%AC%20llama--index-blue.svg)](https://llamaindex.ai/)
 [![llama.cpp](https://img.shields.io/badge/llama.cpp-%F0%9F%90%8E%20C%2B%2B-green.svg)](https://github.com/ggerganov/llama.cpp)
-[![asyncio](https://img.shields.io/badge/asyncio-3.11-blue.svg)](https://docs.python.org/3/library/asyncio.html)
-[![NumPy](https://img.shields.io/badge/NumPy-v1.21-blue.svg?logo=numpy&logoColor=white)](https://numpy.org/)
-[![OpenCV](https://img.shields.io/badge/OpenCV-v4.5.1-blue.svg?logo=opencv&logoColor=white)](https://opencv.org/)[
+[![asyncio](https://img.shields.io/badge/asyncio-3.12-blue.svg)](https://docs.python.org/3/library/asyncio.html)
 [![Docker Ready](https://img.shields.io/badge/Docker-Ready-blue.svg?logo=docker&logoColor=white)](https://www.docker.com/)
 [![SQLite](https://img.shields.io/badge/SQLite-3.x-green.svg)](https://www.sqlite.org/)
 [![SQLAlchemy](https://img.shields.io/badge/SQLAlchemy-3.x-blue.svg)](https://www.sqlalchemy.org/)
@@ -18,7 +17,6 @@
 [![npm](https://img.shields.io/badge/npm-v11.4.1-CB3837.svg?logo=npm&logoColor=white)](https://www.npmjs.com/)
 [![Aiogram](https://img.shields.io/badge/Aiogram-3.x-brightgreen.svg?logo=telegram&logoColor=white)](https://aiogram.dev/)
 [![Telegram](https://img.shields.io/badge/Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white)](https://telegram.org/)
-[![Telegram API](https://img.shields.io/badge/Telegram%20API-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white)](https://core.telegram.org/bots/api)
 
 ## About this repository
 
@@ -35,8 +33,9 @@ This project is not just a collection of code; it's an implementation of a profe
 
 - **Microservices Architecture:** The system is decomposed into small, independent, and loosely-coupled services. This allows for independent development, deployment, and scaling of each component.
 - **Clean & Scalable Code:** We adhere to principles like **Feature-Sliced Design (FSD)** on the frontend and a clear service-layer separation on the backend. This ensures the codebase remains predictable and maintainable as it grows.
-- **Infrastructure as Code (IaC):** The entire application stack, including inter-service networking, is defined declaratively in a `docker-compose.yml` file. This guarantees a reproducible environment for both development and production.
+- **Infrastructure as Code (IaC):** The entire application stack, including inter-service networking, is defined declaratively in a `docker-compose.infra.yml` file. This guarantees a reproducible environment for both development and production.
 - **Type Safety:** We use **TypeScript** on the frontend and Python type hints with Pydantic on the backend to eliminate entire classes of runtime errors and make the code self-documenting.
+- **MLOps First:** We treat ML processes not as ad-hoc scripts, but as versioned, reproducible, and automated pipelines managed by an orchestrator.
 
 ---
 
@@ -50,9 +49,10 @@ graph TD
     User["User's Browser"]
     TelegramAPI["Telegram API"]
     Proxy[("Reverse Proxy (Nginx)")]
-    Dashboard["llm-dashboard<br>(React UI + Nginx)"]
-    API["llm-api<br>(FastAPI)"]
-    Gateway["bot-gateway<br>(Aiogram)"]
+    Dashboard["rag-admin<br>(React UI)"]
+    API["a-rag<br>(FastAPI)"]
+    Gateway["tg-gateway<br>(Aiogram)"]
+    ZenML[("ZenML Server")]
 
     %% 2. Group nodes into subgraphs
     subgraph "External World"
@@ -65,6 +65,7 @@ graph TD
         Dashboard
         API
         Gateway
+        ZenML
     end
 
     %% 3. Define all connections between nodes
@@ -75,15 +76,24 @@ graph TD
 
     TelegramAPI -- "Webhook Events" --> Gateway
     Gateway -- "Internal API Calls / Events" --> API
+    API -- "Executes & Logs" --> ZenML
 ```
 
-1. bot-gateway (Formerly TGramBot): The entry point for all interactions from the Telegram API. This service is responsible for receiving messages and forwarding them for processing.
+1. tg-gateway: The entry point for all interactions from the Telegram API.
+2. a-rag (The Core ML Service): The "brain" of the system. It handles business logic, RAG pipelines, and interacts with the database. It also contains the MLOps pipelines.
+3. rag-admin (The Management Frontend): A modern React (SPA) application for system management.
 
-2. llm-api (The LLM Backend): The core "brain" of the system. It handles business logic, interacts with the database, and processes tasks from the bot-gateway.
+4. reverse-proxy: A central Nginx instance that acts as the single entry point for all external traffic.
 
-3. llm-dashboard (The Management Frontend): A modern React (SPA) application for managing the system, viewing data, and configuring API keys. Served by a dedicated Nginx container.
+5. zenml-server: A self-hosted MLOps orchestrator that manages, tracks, and versions all ML pipelines.
+
+## 📦 MLOps & Orchestration
+
+To move beyond manual scripts and embrace professional ML engineering, we use ZenML as our MLOps orchestrator. This allows us to define our data processing, model evaluation, and future training tasks as formal, reproducible pipelines.
+
+> [!NOTE]
+> For a detailed explanation of our MLOps strategy, pipeline structure, and how to run them, please see our dedicated [MLOps README](MLOPS_README.md).
 
-4. reverse-proxy (The System's Front Door): A central Nginx instance that acts as the single entry point for all external traffic. It intelligently routes requests to the appropriate service (llm-dashboard or llm-api), handles CORS, and is responsible for SSL termination in a production environment.
 
 ## 📂 Project Structure
 
@@ -181,6 +191,15 @@ Follow these steps to get your local environment up and running:
    docker compose -f docker-compose.infra.yml down
   ```
 
+6. **Running MLOps Pipelines (Data Ingestion):**
+   To populate the RAG knowledge base, you need to run the data ingestion pipeline. This is managed by ZenML. For detailed instructions, see **[MLOPS_README.md](./MLOPS_README.md)**.
+
+   A typical command to run the ingestion pipeline (executed from `services/a-rag`):
+   ```bash
+   # (Requires one-time setup described in MLOPS_README.md)
+   uv run python pipelines/run_pipeline.py --source-dir ../../volumes/rag-source-docs
+   ```
+
 ## ☕ Support My Work
 
 [![Buy me a coffee](https://img.shields.io/badge/Buy%20me%20a%20coffee-yellow?logo=kofi)](https://buymeacoffee.com/max.v.zaikin)
````

docker-compose.infra.yml

Lines changed: 25 additions & 2 deletions
```diff
@@ -20,11 +20,34 @@ services:
     ports:
       - "127.0.0.1:8000:8000"
     volumes:
-      - ./volumes/rag-db:/chroma/.chroma/index
+      # --- [ISSUE-20] Start of changes: Long-Term Memory ---
+      # This volume mapping supports the ChromaDB instance for RAG context storage.
+      # https://docs.trychroma.com/production/containers/docker
+      - ./volumes/rag-db:/data
+      # --- [ISSUE-20] End of changes: Long-Term Memory ---
+
     environment:
       - IS_PERSISTENT=TRUE
     restart: always
 
+  # --- [ISSUE-26] Start of changes: MLOps Orchestration with ZenML ---
+  # ZenML Server for orchestrating, tracking, and versioning ML pipelines.
+  # This is a self-hosted instance using the latest stable official GHCR image.
+  zenml-server:
+    image: zenmldocker/zenml-server:0.83.1
+    container_name: tgb-local-zenml
+    ports:
+      - "127.0.0.1:8237:8080"
+    volumes:
+      # https://docs.zenml.io/deploying-zenml/deploying-zenml/deploy-with-docker
+      - ./volumes/zenml-db:/zenml/.zenconfig/local_stores/default_zen_store
+
+    restart: always
+  # --- [ISSUE-26] End of changes: MLOps Orchestration with ZenML ---
 volumes:
   redis-db:
-  chroma-db:
+  rag-db:
+  # --- [ISSUE-26] Start of changes: MLOps Orchestration with ZenML ---
+  # Named volume for ZenML Server data persistence.
+  zenml-db:
+  # --- [ISSUE-26] End of changes: MLOps Orchestration with ZenML ---
```
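
Once the stack is up, a quick way to confirm the ZenML server started cleanly is with standard Compose commands, using the service name defined above:

```bash
# Confirm the server container is running, then inspect its startup logs
docker compose -f docker-compose.infra.yml ps zenml-server
docker compose -f docker-compose.infra.yml logs -f zenml-server
```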

services/a-rag/pipelines/__init__.py

Whitespace-only changes.
services/a-rag/pipelines/feature_pipeline.py

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
1+
"""
2+
file: services/a-rag/pipelines/feature_pipeline.py
3+
4+
# --- [ISSUE-26] Implement ZenML for MLOps Pipeline Management ---
5+
6+
Defines the ZenML pipeline for document ingestion and feature creation.
7+
8+
This pipeline orchestrates the steps defined in `pipelines.steps.*` to
9+
create a reproducible and trackable workflow for populating our RAG
10+
knowledge base.
11+
"""
12+
from pathlib import Path
13+
from zenml import pipeline
14+
15+
# Import the steps from our steps module using explicit relative imports.
16+
from .steps.data_processing import (
17+
ensure_vector_store_exists,
18+
index_documents,
19+
load_documents,
20+
)
21+
22+
23+
@pipeline(name="rag_feature_ingestion_pipeline")
24+
def feature_ingestion_pipeline(source_dir: Path, collection_name: str):
25+
"""
26+
The feature ingestion pipeline for our RAG system.
27+
28+
This version is more robust, passing simple data types between steps
29+
instead of complex, non-serializable objects.
30+
31+
Args:
32+
source_dir: Path to the source directory containing documents.
33+
collection_name: Name of the ChromaDB collection to use.
34+
"""
35+
# Each function call here corresponds to a step in the pipeline.
36+
# ZenML automatically handles passing the output of one step as input
37+
# to the next, based on the function signatures.
38+
39+
# Step 1: Load documents from the source directory.
40+
documents = load_documents(source_dir=source_dir)
41+
42+
# Step 2: Ensure the vector database is available and the collection exists.
43+
# This step must complete before indexing can begin.
44+
validated_collection_name = ensure_vector_store_exists(
45+
collection_name=collection_name
46+
)
47+
48+
# Step 3: Index the documents into the validated collection.
49+
# This step depends on the outputs of `load_documents` and `ensure_vector_store_exists`.
50+
# ZenML understands this dependency graph automatically.
51+
index_documents(
52+
documents=documents, collection_name=validated_collection_name
53+
)
