
Commit 14fdb00

[ISSUE-26] Implement ZenML orchestration (#27)
* chore(infra): add zenml-server service and volume (#26)

  Integrates the self-hosted ZenML server into the local infrastructure stack. This change adds the service to docker-compose and defines a persistent volume for MLOps metadata.

* refactor(a-rag): replace ingest script with ZenML pipeline

  This commit finalizes the transition from the manual script to the new orchestrated ZenML pipeline.

  - The script and its associated README have been marked as deprecated to eliminate redundancy and prevent misuse. All data ingestion should now be performed exclusively through the new ZenML pipeline.

  Refs #26

* The script and its associated README have been removed to eliminate redundancy and prevent misuse.

* fix: mermaid diagram syntax error

* feat(mlops): implement ZenML pipeline for data ingestion

  This commit introduces a foundational MLOps capability by refactoring the manual `ingest.py` script into a formal, orchestrated ZenML pipeline. This provides reproducibility, observability, and a clear path for future automation.

  Key Changes:

  - A self-hosted ZenML server has been added to `docker-compose.infra.yml` to act as a central orchestrator.
  - A new `pipelines/` directory structure has been created within the `a-rag` service to house all ZenML-related code, separating it from the online API logic.
  - The ingestion process has been decomposed into three distinct, reusable steps: `load_documents`, `ensure_vector_store_exists`, and `index_documents`.
  - A CLI entry point `pipelines/run_pipeline.py` is created for standardized pipeline execution.
  - The old `ingest.py` script is now deprecated and has been removed to ensure a single source of truth for data ingestion.

  Closes #26
1 parent a3196da commit 14fdb00

File tree

13 files changed: +923 additions, −557 deletions


MLOPS_README.md

Lines changed: 138 additions & 0 deletions
@@ -0,0 +1,138 @@
# MLOps Pipelines Guide for TGB-MicroSuite

Welcome to the MLOps guide for the `a-rag` service. This document explains the architecture, purpose, and usage of our automated Machine Learning pipelines, orchestrated by **ZenML**.

## 🎯 Philosophy: From Scripts to Pipelines

Our core principle is to treat ML processes not as one-off scripts, but as versioned, reproducible, and automated software components. The previous manual `ingest.py` script was brittle and lacked observability. By migrating to ZenML, we gain:

- **Reproducibility:** Every pipeline run is tracked, including the code version, parameters, and inputs/outputs (artifacts).
- **Observability:** A central dashboard (`http://localhost:8237`) provides a complete history of all runs, logs for each step, and a visualization of the pipeline structure (DAG).
- **Automation:** These pipelines are the foundation for our future CI/CD/CT (Continuous Integration/Delivery/Training) workflows.
- **Modularity & Reusability:** Each step in a pipeline is an independent, reusable function that can be composed into different pipelines.

## 🏗️ MLOps Architecture

Our MLOps capabilities are integrated directly into the `a-rag` microservice and orchestrated by a self-hosted ZenML server managed via `docker-compose.infra.yml`.
```mermaid
graph TD
    subgraph "Developer & CI-CD"
        A["1. Trigger Run: uv run python pipelines/run_pipeline.py"]
    end

    subgraph "ZenML Server (Docker Container)"
        B["2. ZenML Orchestrator"]
    end

    subgraph "Execution Logic (within a-rag service)"
        C["@pipeline: feature_ingestion_pipeline"]
        D["@step: load_documents"]
        E["@step: ensure_vector_store_exists"]
        F["@step: index_documents"]
    end

    subgraph "External Infrastructure"
        G["Source Docs on Disk"]
        H["ChromaDB (Docker Container)"]
    end

    A --> B
    B -->|Executes Pipeline| C
    C --> D
    D -->|Documents| F
    C --> E
    E -->|VectorStore Client| F

    D -->|Reads from| G
    E -->|Connects to| H
    F -->|Writes to| H
```

**Workflow Explanation:**

1. A developer or a CI/CD job triggers a pipeline run via the central CLI entry point (`pipelines/run_pipeline.py`).
2. The ZenML client communicates the request to the ZenML Server, which begins orchestrating the pipeline.
3. The pipeline definition (`feature_ingestion_pipeline`) dictates the execution order of the steps.
4. Each step (`@step`) is executed as a tracked job (see the sketch after this list):
   - `load_documents` reads files from the local volume.
   - `ensure_vector_store_exists` connects to the existing ChromaDB container, reusing connection settings from `src/core/config.py`.
   - `index_documents` takes the loaded documents and the vector store client, performs embedding, and ingests the data.
5. All results, logs, and artifacts are tracked by the ZenML Server and visible in the UI.
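For orientation, here is a minimal sketch of what `pipelines/steps/data_processing.py` might look like. The step names and signatures are taken from `feature_pipeline.py`; the internals (the LlamaIndex reader, the Chroma client, and its connection settings) are illustrative assumptions, not the actual implementation:

```python
# Hypothetical sketch; the real steps live in pipelines/steps/data_processing.py.
from pathlib import Path
from typing import List

import chromadb
from llama_index.core import Document, SimpleDirectoryReader
from zenml import step


@step
def load_documents(source_dir: Path) -> List[Document]:
    """Load raw documents (.md, .txt, ...) from the source directory."""
    return SimpleDirectoryReader(input_dir=str(source_dir)).load_data()


@step
def ensure_vector_store_exists(collection_name: str) -> str:
    """Check that ChromaDB is reachable and the collection exists.

    Returns the validated collection name (a simple, serializable type)
    rather than a client object, so ZenML can track it as an artifact.
    """
    client = chromadb.HttpClient(host="localhost", port=8000)  # assumed settings
    client.get_or_create_collection(collection_name)
    return collection_name


@step
def index_documents(documents: List[Document], collection_name: str) -> None:
    """Embed the documents and ingest them into the target collection."""
    # Embedding + ingestion logic (e.g., a LlamaIndex VectorStoreIndex
    # backed by a Chroma vector store) would go here.
    ...
```

Note how only simple, serializable values cross step boundaries; this matches the pipeline's stated design of avoiding non-serializable objects between steps.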
## 🛠️ Setup & Configuration

### Step 1: Launch Core Infrastructure

From the project root, ensure all services are running. This command starts ChromaDB, Redis, and our ZenML Server.

```bash
docker-compose -f docker-compose.infra.yml up -d
```

### Step 2: Set Up the a-rag Service Environment

Navigate to the a-rag service directory; all subsequent commands should be run from there. Then activate its virtual environment and install dependencies.

```bash
cd services/a-rag

# Create/activate virtual environment
uv venv
source .venv/bin/activate

# Install dependencies
uv pip sync pyproject.toml
```
### Step 3: Connect Your Local Client to the ZenML Server (One-Time Setup)

**This is a critical one-time setup step.** You must connect your local ZenML client (which you just installed) to the ZenML Server running in Docker. This tells your client where to send all pipeline information.

Once this is done, the configuration is saved locally, and all future pipeline runs will automatically be sent to and tracked by your local server.

```bash
# Ensure your (a-rag) venv is active
(a-rag) $ zenml connect --url http://127.0.0.1:8237 --username default
```

> [!NOTE]
> The `zenml connect` command is being deprecated, and you may see a warning suggesting `zenml login` instead. In recent versions, the command may automatically open a browser window for authentication; simply follow the on-screen instructions. A successful connection is the end goal.

You should see a confirmation message like: `✅ Successfully connected to ZenML server.`

## ▶️ Running the Feature Ingestion Pipeline

This pipeline replaces the old `ingest.py` script. It loads documents from a directory and indexes them into ChromaDB.

To run the pipeline, use the `pipelines/run_pipeline.py` script from within the `services/a-rag` directory:

```bash
uv run python -m pipelines.run_pipeline --source-dir ../../volumes/rag-source-docs --collection rag_documentation_docker
```

**Arguments:**

- `--source-dir` (required): Path to the directory containing your source documents (e.g., `.md`, `.txt` files). The path should be relative to the a-rag service root.
- `--collection` (optional): The name of the ChromaDB collection to create or use. Defaults to `rag_documentation_v2`.
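Internally, the entry point is little more than an argument parser that invokes the pipeline. A minimal sketch (hypothetical; only the argument names and the default collection come from the documentation above) might be:

```python
# Hypothetical sketch of pipelines/run_pipeline.py.
import argparse
from pathlib import Path

from pipelines.feature_pipeline import feature_ingestion_pipeline


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Run the RAG feature ingestion pipeline."
    )
    parser.add_argument(
        "--source-dir",
        type=Path,
        required=True,
        help="Directory containing source documents.",
    )
    parser.add_argument(
        "--collection",
        default="rag_documentation_v2",
        help="ChromaDB collection to create or use.",
    )
    args = parser.parse_args()

    # Calling the @pipeline-decorated function triggers a tracked run
    # on whichever ZenML server the local client is connected to.
    feature_ingestion_pipeline(
        source_dir=args.source_dir, collection_name=args.collection
    )


if __name__ == "__main__":
    main()
```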
**Monitoring the Pipeline**

After triggering a run, you can monitor its progress in real time:

1. Open your browser and go to `http://localhost:8237`.
2. Navigate to the **Pipelines -> All Runs** tab.
3. You will see your `rag_feature_ingestion_pipeline` run. Click on it to see the graph, check the status of each step, and view detailed logs.

This setup provides a robust, professional framework for managing our ML workflows.

README.md

Lines changed: 31 additions & 12 deletions
````diff
@@ -3,11 +3,10 @@
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![UV Package Manager](https://img.shields.io/badge/PackageManager-UV-purple.svg)](https://pypi.org/project/uv/)
 [![Python Version](https://img.shields.io/badge/Python-3.12-blue.svg?logo=python&logoColor=white)](https://www.python.org/)
+[![ZenML](https://img.shields.io/badge/Orchestration-ZenML-8207e4.svg?logo=zenml&logoColor=white)](https://zenml.io/)
 [![LlamaIndex](https://img.shields.io/badge/LlamaIndex-%F0%9F%90%AC%20llama--index-blue.svg)](https://llamaindex.ai/)
 [![llama.cpp](https://img.shields.io/badge/llama.cpp-%F0%9F%90%8E%20C%2B%2B-green.svg)](https://github.com/ggerganov/llama.cpp)
-[![asyncio](https://img.shields.io/badge/asyncio-3.11-blue.svg)](https://docs.python.org/3/library/asyncio.html)
-[![NumPy](https://img.shields.io/badge/NumPy-v1.21-blue.svg?logo=numpy&logoColor=white)](https://numpy.org/)
-[![OpenCV](https://img.shields.io/badge/OpenCV-v4.5.1-blue.svg?logo=opencv&logoColor=white)](https://opencv.org/)[
+[![asyncio](https://img.shields.io/badge/asyncio-3.12-blue.svg)](https://docs.python.org/3/library/asyncio.html)
 [![Docker Ready](https://img.shields.io/badge/Docker-Ready-blue.svg?logo=docker&logoColor=white)](https://www.docker.com/)
 [![SQLite](https://img.shields.io/badge/SQLite-3.x-green.svg)](https://www.sqlite.org/)
 [![SQLAlchemy](https://img.shields.io/badge/SQLAlchemy-3.x-blue.svg)](https://www.sqlalchemy.org/)
@@ -18,7 +17,6 @@
 [![npm](https://img.shields.io/badge/npm-v11.4.1-CB3837.svg?logo=npm&logoColor=white)](https://www.npmjs.com/)
 [![Aiogram](https://img.shields.io/badge/Aiogram-3.x-brightgreen.svg?logo=telegram&logoColor=white)](https://aiogram.dev/)
 [![Telegram](https://img.shields.io/badge/Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white)](https://telegram.org/)
-[![Telegram API](https://img.shields.io/badge/Telegram%20API-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white)](https://core.telegram.org/bots/api)
 
 ## About this repository
 
@@ -35,8 +33,9 @@ This project is not just a collection of code; it's an implementation of a profe
 
 - **Microservices Architecture:** The system is decomposed into small, independent, and loosely-coupled services. This allows for independent development, deployment, and scaling of each component.
 - **Clean & Scalable Code:** We adhere to principles like **Feature-Sliced Design (FSD)** on the frontend and a clear service-layer separation on the backend. This ensures the codebase remains predictable and maintainable as it grows.
-- **Infrastructure as Code (IaC):** The entire application stack, including inter-service networking, is defined declaratively in a `docker-compose.yml` file. This guarantees a reproducible environment for both development and production.
+- **Infrastructure as Code (IaC):** The entire application stack, including inter-service networking, is defined declaratively in a `docker-compose.infra.yml` file. This guarantees a reproducible environment for both development and production.
 - **Type Safety:** We use **TypeScript** on the frontend and Python type hints with Pydantic on the backend to eliminate entire classes of runtime errors and make the code self-documenting.
+- **MLOps First:** We treat ML processes not as ad-hoc scripts, but as versioned, reproducible, and automated pipelines managed by an orchestrator.
 
 ---
 
@@ -50,9 +49,10 @@ graph TD
     User["User's Browser"]
     TelegramAPI["Telegram API"]
     Proxy[("Reverse Proxy (Nginx)")]
-    Dashboard["llm-dashboard<br>(React UI + Nginx)"]
-    API["llm-api<br>(FastAPI)"]
-    Gateway["bot-gateway<br>(Aiogram)"]
+    Dashboard["rag-admin<br>(React UI)"]
+    API["a-rag<br>(FastAPI)"]
+    Gateway["tg-gateway<br>(Aiogram)"]
+    ZenML[("ZenML Server")]
 
     %% 2. Group nodes into subgraphs
     subgraph "External World"
@@ -65,6 +65,7 @@ graph TD
         Dashboard
         API
         Gateway
+        ZenML
     end
 
     %% 3. Define all connections between nodes
@@ -75,15 +76,24 @@ graph TD
 
     TelegramAPI -- "Webhook Events" --> Gateway
     Gateway -- "Internal API Calls / Events" --> API
+    API -- "Executes & Logs" --> ZenML
 ```
 
-1. bot-gateway (Formerly TGramBot): The entry point for all interactions from the Telegram API. This service is responsible for receiving messages and forwarding them for processing.
+1. tg-gateway: The entry point for all interactions from the Telegram API.
+2. a-rag (The Core ML Service): The "brain" of the system. It handles business logic, RAG pipelines, and interacts with the database. It also contains the MLOps pipelines.
+3. rag-admin (The Management Frontend): A modern React (SPA) application for system management.
 
-2. llm-api (The LLM Backend): The core "brain" of the system. It handles business logic, interacts with the database, and processes tasks from the bot-gateway.
+4. reverse-proxy: A central Nginx instance that acts as the single entry point for all external traffic.
 
-3. llm-dashboard (The Management Frontend): A modern React (SPA) application for managing the system, viewing data, and configuring API keys. Served by a dedicated Nginx container.
+5. zenml-server: A self-hosted MLOps orchestrator that manages, tracks, and versions all ML pipelines.
+
+## 📦 MLOps & Orchestration
+
+To move beyond manual scripts and embrace professional ML engineering, we use ZenML as our MLOps orchestrator. This allows us to define our data processing, model evaluation, and future training tasks as formal, reproducible pipelines.
+
+> [!NOTE]
+> For a detailed explanation of our MLOps strategy, pipeline structure, and how to run them, please see our dedicated [MLOps README](MLOPS_README.md).
 
-4. reverse-proxy (The System's Front Door): A central Nginx instance that acts as the single entry point for all external traffic. It intelligently routes requests to the appropriate service (llm-dashboard or llm-api), handles CORS, and is responsible for SSL termination in a production environment.
 
 ## 📂 Project Structure
 
@@ -181,6 +191,15 @@ Follow these steps to get your local environment up and running:
    docker compose -f docker-compose.infra.yml down
   ```
 
+6. **Running MLOps Pipelines (Data Ingestion):**
+   To populate the RAG knowledge base, you need to run the data ingestion pipeline. This is managed by ZenML. For detailed instructions, see **[MLOPS_README.md](./MLOPS_README.md)**.
+
+   A typical command to run the ingestion pipeline (executed from `services/a-rag`):
+   ```bash
+   # (Requires one-time setup described in MLOPS_README.md)
+   uv run python pipelines/run_pipeline.py --source-dir ../../volumes/rag-source-docs
+   ```
+
 ## ☕ Support My Work
 
 [![Buy me a coffee](https://img.shields.io/badge/Buy%20me%20a%20coffee-yellow?logo=kofi)](https://buymeacoffee.com/max.v.zaikin)
````

docker-compose.infra.yml

Lines changed: 25 additions & 2 deletions
```diff
@@ -20,11 +20,34 @@ services:
     ports:
       - "127.0.0.1:8000:8000"
     volumes:
-      - ./volumes/rag-db:/chroma/.chroma/index
+      # --- [ISSUE-20] Start of changes: Long-Term Memory ---
+      # This volume mapping supports the ChromaDB instance for RAG context storage.
+      # https://docs.trychroma.com/production/containers/docker
+      - ./volumes/rag-db:/data
+      # --- [ISSUE-20] End of changes: Long-Term Memory ---
+
     environment:
       - IS_PERSISTENT=TRUE
     restart: always
 
+  # --- [ISSUE-26] Start of changes: MLOps Orchestration with ZenML ---
+  # ZenML Server for orchestrating, tracking, and versioning ML pipelines.
+  # This is a self-hosted instance using the latest stable official GHCR image.
+  zenml-server:
+    image: zenmldocker/zenml-server:0.83.1
+    container_name: tgb-local-zenml
+    ports:
+      - "127.0.0.1:8237:8080"
+    volumes:
+      # https://docs.zenml.io/deploying-zenml/deploying-zenml/deploy-with-docker
+      - ./volumes/zenml-db:/zenml/.zenconfig/local_stores/default_zen_store
+
+    restart: always
+  # --- [ISSUE-26] End of changes: MLOps Orchestration with ZenML ---
 volumes:
   redis-db:
-  chroma-db:
+  rag-db:
+  # --- [ISSUE-26] Start of changes: MLOps Orchestration with ZenML ---
+  # Named volume for ZenML Server data persistence.
+  zenml-db:
+  # --- [ISSUE-26] End of changes: MLOps Orchestration with ZenML ---
```
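
Once the stack is up, a quick way to confirm the ZenML server started cleanly is with standard Compose commands, using the service name defined above:

```bash
# Confirm the server container is running, then inspect its startup logs
docker compose -f docker-compose.infra.yml ps zenml-server
docker compose -f docker-compose.infra.yml logs -f zenml-server
```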

services/a-rag/pipelines/__init__.py

Whitespace-only changes.
services/a-rag/pipelines/feature_pipeline.py

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
1+
"""
2+
file: services/a-rag/pipelines/feature_pipeline.py
3+
4+
# --- [ISSUE-26] Implement ZenML for MLOps Pipeline Management ---
5+
6+
Defines the ZenML pipeline for document ingestion and feature creation.
7+
8+
This pipeline orchestrates the steps defined in `pipelines.steps.*` to
9+
create a reproducible and trackable workflow for populating our RAG
10+
knowledge base.
11+
"""
12+
from pathlib import Path
13+
from zenml import pipeline
14+
15+
# Import the steps from our steps module using explicit relative imports.
16+
from .steps.data_processing import (
17+
ensure_vector_store_exists,
18+
index_documents,
19+
load_documents,
20+
)
21+
22+
23+
@pipeline(name="rag_feature_ingestion_pipeline")
24+
def feature_ingestion_pipeline(source_dir: Path, collection_name: str):
25+
"""
26+
The feature ingestion pipeline for our RAG system.
27+
28+
This version is more robust, passing simple data types between steps
29+
instead of complex, non-serializable objects.
30+
31+
Args:
32+
source_dir: Path to the source directory containing documents.
33+
collection_name: Name of the ChromaDB collection to use.
34+
"""
35+
# Each function call here corresponds to a step in the pipeline.
36+
# ZenML automatically handles passing the output of one step as input
37+
# to the next, based on the function signatures.
38+
39+
# Step 1: Load documents from the source directory.
40+
documents = load_documents(source_dir=source_dir)
41+
42+
# Step 2: Ensure the vector database is available and the collection exists.
43+
# This step must complete before indexing can begin.
44+
validated_collection_name = ensure_vector_store_exists(
45+
collection_name=collection_name
46+
)
47+
48+
# Step 3: Index the documents into the validated collection.
49+
# This step depends on the outputs of `load_documents` and `ensure_vector_store_exists`.
50+
# ZenML understands this dependency graph automatically.
51+
index_documents(
52+
documents=documents, collection_name=validated_collection_name
53+
)
