Commit 2a673ec

bcdurak, htahir1, and strickvl committed
Documentation revamp (#3524)
* trying it out * removed duplicated entries * one more * commenting out the new redirects * hello world * removing some stuff * code repos * checkpoint * new things * new * new changes * logging changes * new changes * new * minor * docs * new * removed the client * new stuff * GITBOOK-46: No subject * toc * GITBOOK-48: No subject * GITBOOK-49: No subject * new structure * better example * advanced features * configuration with yaml * artifacts * a few changes * changing the order * new changes * adding the pages back in * develop changes * toc changes * GITBOOK-50: No subject * trying something new * new changes * GITBOOK-1: No subject * changed the order * renaming * new deploy section * trying redirects * trying * trying direct link * tocs * lets try this * new hello world * hello * value * rename * more changes to tocs * GITBOOK-2: No subject * new changes * secrets * trying something else * trying again * one more test * removed the extension * one more try * new try * new trial * trying something * lol * ... 
* try * what is happening * lol * trying one final thing * removing tests * GITBOOK-51: No subject * adding back missing redirects * asset clean up * old redirrect * red * some final changes * removing another page * finishing touches * fixing links so far * logging * checkpoint * checkpoint * manage * new structure * headers * renaming * some minor changes * GITBOOK-3: No subject * new page * new toc.md * GITBOOK-4: No subject * first commit (cherry picked from commit 36ab57c) * fixing a few bugs (cherry picked from commit d3aac85) * nice checkpoint * one more checkpoint * new fixes * one final bug * boooom * GITBOOK-1: No subject * Update image path for Templates in documentation * Improve link extraction and broken link detection * Update API reference links in connection documentation * Improve hyper-parameter tuning documentation * Add example usage of ZenML Client fetching pipeline runs * Install ZenML package in installation guide * modified redirects * GITBOOK-52: No subject * removed extra assets * Refactor code for running remote notebooks * cursor fixing some of the links * trying something * try * second try * Use export-requirements for the integrations page * disabling some redirects * Update tutorial on fetching pipelines and running remote notebooks * Add handling for broken-reference links in check_broken_links --------- Co-authored-by: Hamza Tahir <[email protected]> Co-authored-by: Hamza Tahir <[email protected]> Co-authored-by: Alex Strick van Linschoten <[email protected]> (cherry picked from commit c92a527)
1 parent 1be19a6 commit 2a673ec

126 files changed (+9179, -3222 lines)

.gitbook.yaml

Lines changed: 319 additions & 242 deletions
Large diffs are not rendered by default.
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+---
+name: GitBook Redirect Checks
+on:
+  pull_request:
+    types: [opened, synchronize]
+    paths: [docs/**, .gitbook.yaml]
+jobs:
+  check_gitbook:
+    if: github.event.pull_request.draft == false
+    runs-on: ubuntu-latest
+    steps:
+      # Setup Python
+      - name: Setup Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.10'
+      # Install dependencies
+      - name: Install dependencies
+        run: pip install pyyaml
+
+      # Checkout target branch
+      - name: Checkout target branch
+        uses: actions/checkout@v3
+        with:
+          ref: ${{ github.base_ref }}
+
+      # Setup temp folders for target branch
+      - name: Setup temp folders for target branch
+        run: |
+          # Create temp directories
+          mkdir -p $RUNNER_TEMP/gitbook_base
+
+          # Set up the directory from the target branch
+          python scripts/setup_gitbook_dirs.py . $RUNNER_TEMP/gitbook_base
+
+      # Checkout PR branch
+      - name: Checkout PR branch
+        uses: actions/checkout@v3
+        with:
+          ref: ${{ github.head_ref }}
+
+      # Setup temp folders for PR branch
+      - name: Setup temp folders for PR branch
+        run: |
+          # Create temp directories
+          mkdir -p $RUNNER_TEMP/gitbook_head
+
+          # Set up the directory from the PR branch
+          python scripts/setup_gitbook_dirs.py . $RUNNER_TEMP/gitbook_head
+
+      # Run GitBook Redirect Check Script
+      - name: Run GitBook Redirect Check Script
+        run: |-
+          python scripts/gitbook_redirect_check.py $RUNNER_TEMP/gitbook_base $RUNNER_TEMP/gitbook_head --pr "${{ github.event.pull_request.number }}"
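The workflow above hands `scripts/gitbook_redirect_check.py` two directories, one prepared from the target branch and one from the PR branch. The script itself is not rendered in this part of the commit, so the following is only a sketch of the kind of comparison such a check might perform. It assumes the redirects of each branch have already been parsed into plain dictionaries mapping an old path to its new target; the function name `compare_redirects` and the sample paths are hypothetical.

```python
from typing import Dict, List


def compare_redirects(base: Dict[str, str], head: Dict[str, str]) -> List[str]:
    """Report redirects that exist on the base branch but were removed or changed on head."""
    problems = []
    for source, target in sorted(base.items()):
        if source not in head:
            problems.append(f"removed redirect: {source} -> {target}")
        elif head[source] != target:
            problems.append(
                f"changed redirect: {source} now points to {head[source]} (was {target})"
            )
    return problems


# Hypothetical redirect tables, as they might look after parsing each branch's .gitbook.yaml.
base_redirects = {
    "how-to/old-page": "how-to/new-page.md",
    "getting-started/intro": "introduction.md",
}
head_redirects = {"how-to/old-page": "how-to/renamed-page.md"}

for problem in compare_redirects(base_redirects, head_redirects):
    print(problem)
```

A real check would presumably load the mappings from YAML first (the workflow installs `pyyaml`) and might post its findings to the pull request identified by `--pr`; both steps are left out of this sketch.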

docs/book/api-docs/.gitbook.yaml

Lines changed: 0 additions & 266 deletions
Large diffs are not rendered by default.

docs/book/api-docs/pro-api/pro-api/getting-started.md

Lines changed: 2 additions & 2 deletions
@@ -55,10 +55,10 @@ To generate a new API token for the ZenML Pro API:
 1. Navigate to the organization settings page in your ZenML Pro dashboard
 2. Select "API Tokens" from the left sidebar
 
-![API Tokens](../../.gitbook/assets/zenml-pro-api-token-01.png)
+![API Tokens](../../../.gitbook/assets/zenml-pro-api-token-01.png)
 3. Click the "Create new token" button. Once generated, you'll see a dialog showing your new API token.
 
-![API Tokens](../../.gitbook/assets/zenml-pro-api-token-02.png)
+![API Tokens](../../../.gitbook/assets/zenml-pro-api-token-02.png)
 4. Simply use the API token as the bearer token in your HTTP requests. For example, you can use the following command to check your current user:
 * using curl:
 
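Step 4 in the hunk above says the token is passed as a bearer token, and the page goes on to show a curl command that is cut off here. As a rough Python equivalent, the sketch below builds such a request with the standard library only. The base URL `https://pro-api.example.com`, the `/users/me` path, and the helper name `build_authorized_request` are placeholders chosen for illustration, not values taken from the ZenML Pro documentation.

```python
import urllib.request


def build_authorized_request(url: str, api_token: str) -> urllib.request.Request:
    """Build a GET request that carries the API token as a bearer token."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Accept": "application/json",
        },
        method="GET",
    )


# Placeholder values: substitute your real API URL and the token from the dashboard.
request = build_authorized_request(
    "https://pro-api.example.com/users/me", "YOUR_API_TOKEN"
)
print(request.get_header("Authorization"))
# To actually send the request: urllib.request.urlopen(request)
```

The request is only constructed, not sent, so the snippet runs without network access.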

docs/book/component-guide/.gitbook.yaml

Lines changed: 80 additions & 265 deletions
Large diffs are not rendered by default.

docs/book/component-guide/component-guide.md

Lines changed: 67 additions & 4 deletions
@@ -1,26 +1,89 @@
 ---
 description: Overview of categories of MLOps components and third-party integrations.
+icon: magnifying-glass
 ---
 
 # Overview
 
 If you are new to the world of MLOps, it is often daunting to be immediately faced with a sea of tools that seemingly all promise and do the same things. It is useful in this case to try to categorize tools in various groups in order to understand their value in your toolchain in a more precise manner.
 
-ZenML tackles this problem by introducing the concept of [**Stacks and Stack Components**](https://docs.zenml.io/user-guides/production-guide/understand-stacks). These stack components represent categories, each of which has a particular function in your MLOps pipeline. ZenML realizes these stack components as base abstractions that standardize the entire workflow for your team. In order to then realize the benefit, one can write a concrete implementation of the [abstraction](https://docs.zenml.io/how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component), or use one of the many built-in integrations that implement these abstractions for you.
+## What is a stack?
 
-## Essential Components
+The [stack](https://docs.zenml.io/user-guides/production-guide/understand-stacks) is a fundamental component of the ZenML framework. Put simply, a stack represents the configuration of the infrastructure and tooling that defines where and how a pipeline executes.
+
+A stack comprises different stack components, where each component is responsible for a specific task. For example, a stack might have a [container registry](https://docs.zenml.io/stacks/container-registries), a [Kubernetes cluster](https://docs.zenml.io/stacks/orchestrators/kubernetes) as an [orchestrator](https://docs.zenml.io/stacks/orchestrators), an [artifact store](https://docs.zenml.io/stacks/artifact-stores), an [experiment tracker](https://docs.zenml.io/stacks/experiment-trackers) like MLflow, and so on.
 
 Each pipeline run that you execute with ZenML will require a **stack** and each **stack** will be required to include at least an **orchestrator** and an **artifact store**. Apart from these two, the other components are optional and to be added as your pipeline evolves in MLOps maturity.
 
+## Stacks as a way to organize your execution environment
+
+With ZenML, you can run your pipelines on more than one stack with ease. This pattern helps you test your code across different environments effortlessly.
+
+This enables a workflow like the following: a data scientist starts experimenting locally and, once satisfied, moves to a cloud environment in your staging cloud account to test more advanced features of the pipeline. Finally, when everything looks good, they mark the pipeline as ready for production and run it on a production-grade stack in your production cloud account.
+
+![Stacks as a way to organize your execution environment](../.gitbook/assets/stack_envs.png)
+
+Having separate stacks for these environments helps:
+
+* avoid accidentally deploying your staging pipeline to production
+* curb costs by running less powerful resources in staging and testing locally first
+* control access to environments by granting permissions for only certain stacks to certain users
+
+## How to manage credentials for your stacks
+
+Most stack components require some form of credentials to interact with the underlying infrastructure. For example, a container registry needs to be authenticated to push and pull images, a Kubernetes cluster needs to be authenticated to deploy models as a web service, and so on.
+
+The preferred way to handle credentials in ZenML is to use [Service Connectors](https://docs.zenml.io/how-to/infrastructure-deployment/auth-management/service-connectors-guide). Service Connectors are a powerful ZenML feature that lets you abstract away credentials and sensitive information from your team.
+
+![Service Connectors abstract away complexity and implement security best practices](../.gitbook/assets/ConnectorsDiagram.png)
+
+### Recommended roles
+
+Ideally, only the people who manage and have direct access to your cloud resources should be able to create Service Connectors. This is useful for a few reasons:
+
+* **Less chance of credentials leaking**: the more people who have access to your cloud resources, the higher the chance that credentials will be leaked.
+* **Instant revocation of compromised credentials**: folks who have direct access to your cloud resources can revoke the credentials instantly if they are compromised, making this a much more secure setup.
+* **Easier auditing**: you can have a much easier time auditing and tracking who did what if you have a clear separation between the people who can create Service Connectors (who have direct access to your cloud resources) and those who can only use them.
+
+### Recommended workflow
+
+![Recommended workflow for managing credentials](../.gitbook/assets/service_con_workflow.png)
+
+Here's an approach you can take that is a good balance between convenience and security:
+
+* Have a limited set of people who have permissions to create Service Connectors. These are ideally people who have access to your cloud accounts and know what credentials to use.
+* You can create one connector for your development or staging environment and let your data scientists use that to register their stack components.
+* When you are ready to go to production, you can create another connector with permissions for your production environment and create stacks that use it. This way you can ensure that your production resources are not accidentally used for development or staging.
+
+If you follow this approach, you free your data scientists from figuring out the best authentication mechanism for each cloud service and from managing credentials locally, and you keep your cloud accounts safe, while still giving them the freedom to run their experiments in the cloud.
+
+{% hint style="info" %}
+Please note that restricting permissions for users through roles is a ZenML Pro feature. You can read more about it [here](https://docs.zenml.io/pro/core-concepts/roles). Sign up for a free trial here: https://cloud.zenml.io/.
+{% endhint %}
+
+## How to deploy and manage stacks
+
+Deploying and managing an MLOps stack is tricky.
+
+* Each tool comes with a certain set of requirements. For example, a [Kubeflow installation](https://www.kubeflow.org/docs/started/installing-kubeflow/) will require you to have a Kubernetes cluster, and so would a **Seldon Core deployment**.
+* Figuring out the defaults for infra parameters is not easy. Even if you have identified the backing infra that you need for a stack component, setting up reasonable defaults for parameters like instance size, CPU, memory, etc., takes a lot of experimentation.
+* Many times, standard tool installations don't work out of the box. For example, to run a custom pipeline in [Vertex AI](https://cloud.google.com/vertex-ai), it is not enough to just run an imported pipeline. You might also need a custom service account that is configured to perform tasks like reading secrets from your secret store or talking to other GCP services that your pipeline might need.
+* Some tools need an additional layer of installations to enable a more secure, production-grade setup. For example, a standard **MLflow tracking server** deployment comes without an authentication frontend, which might expose all of your tracking data to the world if deployed as-is.
+* All the components that you deploy must have the right permissions to be able to talk to each other. For example, your workloads running in a Kubernetes cluster might require access to the container registry or the code repository, and so on.
+* Cleaning up your resources after you're done with your experiments is super important yet very challenging. For example, if your Kubernetes cluster has made use of [Load Balancers](https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer), you might still have one lying around in your account even after deleting the cluster, costing you money and frustration.
+
+All of these points make taking your pipelines to production a more difficult task than it should be. We believe that the expertise in setting up these often-complex stacks shouldn't be a prerequisite to running your ML pipelines.
+
+This section of the docs contains information that makes it easier to provision, configure, and extend stacks and components in ZenML.
+
 ## Stack Components Guide
 
 Here is a full list of all stack components currently supported in ZenML, with a description of the role of that component in the MLOps process:
 
<table data-view="cards"><thead><tr><th></th><th></th><th data-hidden data-card-cover data-type="files"></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td><strong>Orchestrator</strong></td><td>Orchestrating the runs of your pipeline</td><td><a href=".gitbook/assets/orchestrator.png">orchestrator.png</a></td><td><a href="orchestrators/">orchestrators</a></td></tr><tr><td><strong>Artifact Store</strong></td><td>Storage for the artifacts created by your pipelines</td><td><a href=".gitbook/assets/artifact-store.png">artifact-store.png</a></td><td><a href="artifact-stores/">artifact-stores</a></td></tr><tr><td><strong>Container Registry</strong></td><td>Store for your containers</td><td><a href=".gitbook/assets/container-registry.png">container-registry.png</a></td><td><a href="container-registries/">container-registries</a></td></tr><tr><td><strong>Data Validator</strong></td><td>Data and model validation</td><td><a href=".gitbook/assets/data-validator.png">data-validator.png</a></td><td><a href="data-validators/">data-validators</a></td></tr><tr><td><strong>Experiment Tracker</strong></td><td>Tracking your ML experiments</td><td><a href=".gitbook/assets/experiment-tracker.png">experiment-tracker.png</a></td><td><a href="experiment-trackers/">experiment-trackers</a></td></tr><tr><td><strong>Model Deployer</strong></td><td>Services/platforms responsible for online model serving</td><td><a href=".gitbook/assets/model-deployer.png">model-deployer.png</a></td><td><a href="model-deployers/">model-deployers</a></td></tr><tr><td><strong>Step Operator</strong></td><td>Execution of individual steps in specialized runtime environments</td><td><a href=".gitbook/assets/step-operator.png">step-operator.png</a></td><td><a href="step-operators/">step-operators</a></td></tr><tr><td><strong>Alerter</strong></td><td>Sending alerts through specified channels</td><td><a href=".gitbook/assets/alerter.png">alerter.png</a></td><td><a 
href="alerters/">alerters</a></td></tr><tr><td><strong>Image Builder</strong></td><td>Builds container images.</td><td><a href=".gitbook/assets/image-builder.png">image-builder.png</a></td><td><a href="image-builders/">image-builders</a></td></tr><tr><td><strong>Annotator</strong></td><td>Labeling and annotating data</td><td><a href=".gitbook/assets/annotator.png">annotator.png</a></td><td><a href="annotators/">annotators</a></td></tr><tr><td><strong>Model Registry</strong></td><td>Manage and interact with ML Models</td><td><a href=".gitbook/assets/model-registry.png">model-registry.png</a></td><td><a href="model-registries/">model-registries</a></td></tr><tr><td><strong>Feature Store</strong></td><td>Management of your data/features</td><td><a href=".gitbook/assets/feature-store.png">feature-store.png</a></td><td><a href="feature-stores/">feature-stores</a></td></tr></tbody></table>
 
-## Writing custom component flavors
+## Custom Implementations
 
 You can take control of how ZenML behaves by creating your own components. This is done by writing custom component `flavors`.
 
<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th><th data-hidden data-card-cover data-type="files"></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td><strong>Component Flavors</strong></td><td>How to write a custom stack component flavor</td><td><a href=".gitbook/assets/flavors.png">flavors.png</a></td><td><a href="https://app.gitbook.com/s/5aBlTJNbVDkrxJp7J1J9/how-to/infrastructure-deployment/stack-deployment/implement-a-custom-stack-component">Implement a custom stack component</a></td></tr><tr><td><strong>Custom orchestrator guide</strong></td><td>Learn how to develop a custom orchestrator</td><td><a href=".gitbook/assets/custom-orchestrator.png">custom-orchestrator.png</a></td><td><a href="orchestrators/custom.md">custom.md</a></td></tr></tbody></table>
-
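The component guide in the hunk above states that every stack must include at least an orchestrator and an artifact store, with all other components optional, and that separate stacks can represent local, staging, and production environments. The sketch below models that rule with a plain dataclass. It deliberately does not use the ZenML API: `StackSketch`, `REQUIRED_COMPONENT_TYPES`, and the component names are hypothetical stand-ins for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The two component types the guide describes as mandatory for every stack.
REQUIRED_COMPONENT_TYPES = ("orchestrator", "artifact_store")


@dataclass
class StackSketch:
    """Toy stand-in for a stack: a name plus a mapping of component type to component name."""

    name: str
    components: Dict[str, str] = field(default_factory=dict)

    def missing_required(self) -> List[str]:
        """Return the required component types this stack does not define."""
        return [t for t in REQUIRED_COMPONENT_TYPES if t not in self.components]


# Illustrative stacks for different environments; the component names are made up.
local = StackSketch("local", {"orchestrator": "local", "artifact_store": "local"})
staging = StackSketch(
    "staging",
    {
        "orchestrator": "kubernetes",
        "artifact_store": "s3",
        "container_registry": "ecr",
        "experiment_tracker": "mlflow",
    },
)
incomplete = StackSketch("incomplete", {"orchestrator": "kubernetes"})

for stack in (local, staging, incomplete):
    missing = stack.missing_required()
    print(stack.name, "ok" if not missing else f"missing: {missing}")
```

Keeping one such definition per environment mirrors the local, staging, and production split described in the guide; in ZenML itself, stacks are registered and validated by the framework rather than by user code like this.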
