Commit def2fca

document system-level environment variables for file-based pipeline nodes at level Jupyterlab, KFP or Airflow runtime, or both
Signed-off-by: shalberd <[email protected]>
1 parent ddb269a commit def2fca

2 files changed

+66
-0
lines changed

docs/source/index.rst

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ Elyra is a set of AI-centric extensions to JupyterLab Notebooks.
    user_guide/pipeline-components.md
    user_guide/best-practices-custom-pipeline-components
    user_guide/best-practices-file-based-nodes.md
+   user_guide/env-variables-file-based-nodes.md
    user_guide/enhanced-script-support.md
    user_guide/code-snippets.md

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
1+
<!--
2+
{% comment %}
3+
Copyright 2018-2023 Elyra Authors
4+
5+
Licensed under the Apache License, Version 2.0 (the "License");
6+
you may not use this file except in compliance with the License.
7+
You may obtain a copy of the License at
8+
9+
http://www.apache.org/licenses/LICENSE-2.0
10+
11+
Unless required by applicable law or agreed to in writing, software
12+
distributed under the License is distributed on an "AS IS" BASIS,
13+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14+
See the License for the specific language governing permissions and
15+
limitations under the License.
16+
{% endcomment %}
17+
-->
18+
## System-level environment variables used in file-based pipeline nodes
19+
20+
[Generic pipelines and typed pipelines](pipelines.md) natively support file-based nodes for Jupyter notebooks, Python scripts, and R scripts. To support heterogeneous execution, that is, to make these nodes run according to your requirements in any runtime environment (JupyterLab, Kubeflow Pipelines, and Apache Airflow), follow the documentation on the environment variables listed below.
21+
22+
There are system-level environment variables for two types of scopes:
23+
- JupyterLab pipeline generation and validation (PipelineProcessor)
24+
- Runtime image task (Airflow) or component (KFP) execution of file-based node Jupyter notebooks, Python scripts, and R scripts (bootstrapper pipeline run)
25+
26+
This page lists the environment variables together with their scope, default, and background.
27+
28+
### `ELYRA_ENABLE_PIPELINE_INFO`
29+
30+
Scope: JupyterLab PipelineProcessor and runtime image task execution in the runtime environment
31+
Impact: Produces a formatted log INFO message used entirely for support purposes.
32+
Single-line log entries (no embedded newlines) that contain the pipeline name, operation name, action, and duration make it easy to cross-evaluate logs across log files.
33+
34+
Background: Used while JupyterLab processes a pipeline, i.e. before execution: when logging pipeline info while submitting the pipeline, processing pipeline operation dependencies,
35+
submitting the pipeline to Git, and exporting the pipeline as KFP Python or YAML or as Airflow DAG Python code (not needed with local / LocalPipelineProcessor).
36+
37+
Also used in the runtime-specific container environment by the `bootstrapper.py` Python code, which logs execution info for KFP pipeline components and Airflow DAG tasks
38+
when execution of the script starts, when dependencies are processed, and when the script execution ends.
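
The single-line requirement described above can be illustrated with a small sketch. The function name `format_pipeline_info` and the exact message layout are hypothetical and chosen only for illustration; Elyra's actual log format may differ.

```python
def format_pipeline_info(pipeline_name, operation_name, action, duration_secs=None):
    """Build one log line from pipeline name, operation name, action, and duration.

    Hypothetical layout for illustration only, not Elyra's actual format.
    """
    parts = [f"'{pipeline_name}'"]
    if operation_name:
        parts.append(f"'{operation_name}'")
    parts.append(action)
    line = ":".join(parts)
    if duration_secs is not None:
        line += f" ({duration_secs:.3f} secs)"
    # Collapse all whitespace runs, including embedded newlines, into single
    # spaces so that every entry occupies exactly one line in the log file.
    return " ".join(line.split())


if __name__ == "__main__":
    print(format_pipeline_info("demo-pipeline", "load_data", "dependencies processed", 1.25))
```

Keeping each entry on one line means that tools such as `grep` can match a pipeline or operation name and return complete entries.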
39+
40+
Default: `true`. We recommend leaving this at its default, i.e. there is no need to set this environment variable explicitly.
41+
If you want to set `ELYRA_ENABLE_PIPELINE_INFO` to `false`, you can do so in any of the following places:
42+
- JupyterLab at runtime
43+
- Statically baked into the JupyterLab container definition for use in the JupyterLab container build
44+
- Pipeline Editor at Pipeline Properties - Generic Node Defaults - Environment Variables or at Node Properties - Additional Properties - Environment Variables
45+
- Statically baked into the runtime image container definition for use in the KFP or Airflow runtime image container build
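
As a rough sketch of how a default-`true` flag like this can gate the INFO message: the helper names below are hypothetical, and the parsing rule (case-insensitive, anything other than `true` disables logging) is an assumption rather than a statement about Elyra's implementation.

```python
import logging
import os

ENV_NAME = "ELYRA_ENABLE_PIPELINE_INFO"


def pipeline_info_enabled(environ=None):
    """Return True when the variable is unset or set to 'true' (any case).

    Assumption for this sketch: any other value disables the message.
    """
    environ = os.environ if environ is None else environ
    return environ.get(ENV_NAME, "true").strip().lower() == "true"


def log_pipeline_info(logger, message, environ=None):
    """Emit the INFO message only if the flag is enabled; return whether it was emitted."""
    if not pipeline_info_enabled(environ):
        return False
    logger.info(message)
    return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("pipeline-info-sketch")
    log_pipeline_info(log, "'demo-pipeline':'load_data':starting operation")
    # Simulates the variable being set to "false" at node or container level.
    log_pipeline_info(log, "this entry is suppressed", {ENV_NAME: "false"})
```

Passing the environment as a mapping keeps the sketch testable without modifying the real process environment.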
46+
47+
### `ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3`
48+
49+
Scope: Runtime image task (Airflow) or component (KFP) execution of file-based node Jupyter notebooks, Python scripts, and R scripts (bootstrapper pipeline run). Relevant for pipeline runs in KFP components or Airflow DAGs.
50+
Background:
51+
- Puts script execution output / STDOUT into a .log file for Python and R scripts.
52+
- Puts notebook execution output / STDOUT into a notebookname-output.ipynb and a notebookname-Output.html file for Jupyter notebooks.
53+
54+
Impact: Controls whether these files are then uploaded to the Elyra S3 bucket. They are uploaded if this environment variable is not set at pipeline, node, or runtime container level.
55+
56+
Default: `true` if not specified, i.e. no explicit setting of this environment variable necessary.
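
The behavior described above can be sketched as follows: the output file is always written locally, and only the upload step depends on the flag. The function names, the `<script stem>.log` naming, and the `upload` callable are hypothetical stand-ins, not Elyra's bootstrapper API, and the parsing rule is an assumption.

```python
import os
import tempfile
from pathlib import Path

ENV_NAME = "ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3"


def output_to_s3_enabled(environ=None):
    """Default is true; assumption: only the value 'true' (any case) enables upload."""
    environ = os.environ if environ is None else environ
    return environ.get(ENV_NAME, "true").strip().lower() == "true"


def store_script_output(script_name, stdout_text, workdir, upload, environ=None):
    """Write STDOUT to a .log file, then upload it only if the flag is enabled."""
    log_file = Path(workdir) / f"{Path(script_name).stem}.log"
    log_file.write_text(stdout_text)
    uploaded = False
    if output_to_s3_enabled(environ):
        upload(log_file)  # stand-in for an S3-compatible object storage upload
        uploaded = True
    return log_file, uploaded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        _, uploaded = store_script_output(
            "train.py", "epoch 1 done\n", workdir, upload=print,
            environ={ENV_NAME: "false"},
        )
        print("uploaded:", uploaded)
```

With the flag set to `false`, the run output stays in the container, where central log collection can pick it up, while the object storage is used only for files passed between pipeline steps.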
57+
58+
Background:
59+
If you prefer to use S3-compatible storage for transfer of files between pipeline steps only and **not for logging information / run output of R, Python and Jupyter Notebook files**,
60+
for example because you capture and store logs with central KFP, Airflow, or Kubernetes / OpenShift mechanisms,
61+
set env var **`ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3`** to **`false`**.
62+
63+
If you want to set `ELYRA_GENERIC_NODES_ENABLE_SCRIPT_OUTPUT_TO_S3` to `false`, you can do so in either of the following places:
64+
- Pipeline Editor at Pipeline Properties - Generic Node Defaults - Environment Variables or at Node Properties - Additional Properties - Environment Variables
65+
- Statically baked into the runtime image container definition for use in the KFP or Airflow runtime image container build
