Commit 0b96daf

Store client-side pipeline run logs in the artifact store (#3498)
* Store client-side pipeline logs in the artifact store
* Update endpoint to return pipeline logs
* Add project scope to pipeline logs endpoint
* Add pipeline log settings and pipeline run logs endpoint
* Fix linter errors
* Use enable_pipeline_logs instead of enable_step_logs
* Apply code review suggestions and add docs
* Optimised images with calibre/image-actions
* Fix linter errors
* Use the enable_pipeline_logs in the pipeline configuration
* Fix deadlock bug and remaining unit tests
* Remove redundant Kubernetes orchestrator logs

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent 31db412 commit 0b96daf

23 files changed: +317 −77 lines


docs/book/how-to/control-logging/enable-or-disable-logs-storing.md

Lines changed: 30 additions & 3 deletions

````diff
@@ -1,6 +1,11 @@
 # Enable or disable logs storing
 
-By default, ZenML uses a logging handler to capture the logs that occur during the execution of a step. Users are free to use the default python logging module or print statements, and ZenML's logging handler will catch these logs and store them.
+By default, ZenML uses a logging handler to capture two types of logs:
+
+* the logs collected from your ZenML client while triggering and waiting for a pipeline to run. These logs cover everything that happens client-side: building and pushing container images, triggering the pipeline, waiting for it to start, and waiting for it to finish. Note that these logs are not available for scheduled pipelines and the logs might not be available for pipeline runs that fail while building or pushing the container images.
+* the logs collected from the execution of a step. These logs only cover what happens during the execution of a single step and originate mostly from the user-provided step code and the libraries it calls.
+
+For step logs, users are free to use the default python logging module or print statements, and ZenML's logging handler will catch these logs and store them.
 
 ```python
 import logging
@@ -13,16 +18,38 @@ def my_step() -> None:
     print("World.")  # You can utilize `print` statements as well.
 ```
 
-These logs are stored within the respective artifact store of your stack. You can display the logs in the dashboard as follows:
+All these logs are stored within the respective artifact store of your stack. You can visualize the pipeline run logs and step logs in the dashboard as follows:
 
+![Displaying pipeline run logs on the dashboard](../../.gitbook/assets/zenml_pipeline_run_logs.png)
 ![Displaying step logs on the dashboard](../../.gitbook/assets/zenml_step_logs.png)
 
 {% hint style="warning" %}
 Note that if you are not connected to a cloud artifact store with a service connector configured then you will not
 be able to view your logs in the dashboard. Read more [here](./view-logs-on-the-dasbhoard.md).
 {% endhint %}
 
-If you do not want to store the logs in your artifact store, you can:
+If you do not want to store the logs in your artifact store, you can do the following:
+
+For pipeline run logs:
+
+1. Disable it by using the `enable_pipeline_logs` parameter with your `@pipeline` decorator:
+
+    ```python
+    from zenml import pipeline, step
+
+    @pipeline(enable_pipeline_logs=False)  # disables logging for pipeline
+    def my_pipeline():
+        ...
+    ```
+
+2. Disable it by using the environmental variable `ZENML_DISABLE_PIPELINE_LOGS_STORAGE` and setting it to `true`. This environmental variable takes precedence over the parameter mentioned above. Note this environmental variable needs to be set on the [execution environment](../pipeline-development/configure-python-environments/README.md#execution-environments), i.e., on the orchestrator level:
+
+    ```shell
+    export ZENML_DISABLE_PIPELINE_LOGS_STORAGE=true
+
+    python my_pipeline.py
+    ```
+
+For step logs:
 
 1. Disable it by using the `enable_step_logs` parameter either with your `@pipeline` or `@step` decorator:
 
````

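The precedence rule in step 2 above (the environment variable overrides the decorator parameter) can be sketched in plain Python. This is a simplified stand-in, not ZenML's actual implementation; the helper name `pipeline_logs_enabled` is invented for illustration:

```python
import os

ENV_ZENML_DISABLE_PIPELINE_LOGS_STORAGE = "ZENML_DISABLE_PIPELINE_LOGS_STORAGE"


def pipeline_logs_enabled(enable_pipeline_logs: bool = True) -> bool:
    """Hypothetical helper: decide whether client-side pipeline logs are stored.

    A truthy ZENML_DISABLE_PIPELINE_LOGS_STORAGE takes precedence over the
    `enable_pipeline_logs` decorator parameter.
    """
    env_value = os.environ.get(ENV_ZENML_DISABLE_PIPELINE_LOGS_STORAGE, "").lower()
    if env_value in ("1", "true", "yes", "on"):
        return False
    return enable_pipeline_logs


# With the variable unset, the decorator parameter decides on its own.
print(pipeline_logs_enabled(enable_pipeline_logs=False))  # False

# Once the variable is set, it disables storage regardless of the parameter.
os.environ[ENV_ZENML_DISABLE_PIPELINE_LOGS_STORAGE] = "true"
print(pipeline_logs_enabled(enable_pipeline_logs=True))  # False
```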
src/zenml/config/compiler.py

Lines changed: 1 addition & 0 deletions

```diff
@@ -210,6 +210,7 @@ def _apply_run_configuration(
             enable_artifact_metadata=config.enable_artifact_metadata,
             enable_artifact_visualization=config.enable_artifact_visualization,
             enable_step_logs=config.enable_step_logs,
+            enable_pipeline_logs=config.enable_pipeline_logs,
             settings=config.settings,
             tags=config.tags,
             extra=config.extra,
```

src/zenml/config/pipeline_configurations.py

Lines changed: 1 addition & 0 deletions

```diff
@@ -41,6 +41,7 @@ class PipelineConfigurationUpdate(StrictBaseModel):
     enable_artifact_metadata: Optional[bool] = None
     enable_artifact_visualization: Optional[bool] = None
     enable_step_logs: Optional[bool] = None
+    enable_pipeline_logs: Optional[bool] = None
     settings: Dict[str, SerializeAsAny[BaseSettings]] = {}
     tags: Optional[List[Union[str, "Tag"]]] = None
     extra: Dict[str, Any] = {}
```

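The new `enable_pipeline_logs: Optional[bool] = None` field follows the same tri-state pattern as the neighboring `enable_*` flags: `None` means "not configured here, defer to a lower-precedence default". A minimal sketch of that semantics, using a plain dataclass rather than ZenML's actual Pydantic model (the `effective_flag` helper is invented for illustration):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipelineConfigurationUpdate:
    """Simplified stand-in for ZenML's Pydantic model of the same name."""

    enable_step_logs: Optional[bool] = None
    enable_pipeline_logs: Optional[bool] = None


def effective_flag(update_value: Optional[bool], default: bool = True) -> bool:
    # None means "not set in this update": fall back to the default.
    return default if update_value is None else update_value


update = PipelineConfigurationUpdate(enable_pipeline_logs=False)
print(effective_flag(update.enable_pipeline_logs))  # False: explicitly disabled
print(effective_flag(update.enable_step_logs))      # True: unset, defaults to on
```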
src/zenml/config/pipeline_run_configuration.py

Lines changed: 1 addition & 0 deletions

```diff
@@ -40,6 +40,7 @@ class PipelineRunConfiguration(
     enable_artifact_metadata: Optional[bool] = None
     enable_artifact_visualization: Optional[bool] = None
     enable_step_logs: Optional[bool] = None
+    enable_pipeline_logs: Optional[bool] = None
     schedule: Optional[Schedule] = None
     build: Union[PipelineBuildBase, UUID, None] = Field(
         default=None, union_mode="left_to_right"
```

src/zenml/constants.py

Lines changed: 1 addition & 0 deletions

```diff
@@ -168,6 +168,7 @@ def handle_int_env_var(var: str, default: int = 0) -> int:
 ENV_ZENML_SERVER = "ZENML_SERVER"
 ENV_ZENML_ENFORCE_TYPE_ANNOTATIONS = "ZENML_ENFORCE_TYPE_ANNOTATIONS"
 ENV_ZENML_ENABLE_IMPLICIT_AUTH_METHODS = "ZENML_ENABLE_IMPLICIT_AUTH_METHODS"
+ENV_ZENML_DISABLE_PIPELINE_LOGS_STORAGE = "ZENML_DISABLE_PIPELINE_LOGS_STORAGE"
 ENV_ZENML_DISABLE_STEP_LOGS_STORAGE = "ZENML_DISABLE_STEP_LOGS_STORAGE"
 ENV_ZENML_DISABLE_STEP_NAMES_IN_LOGS = "ZENML_DISABLE_STEP_NAMES_IN_LOGS"
 ENV_ZENML_IGNORE_FAILURE_HOOK = "ZENML_IGNORE_FAILURE_HOOK"
```

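The hunk context shows a `handle_int_env_var` helper; a boolean sibling in the same spirit would parse flags like the one added here. This is a hedged sketch with an assumed name (`handle_bool_env_var`); ZenML's real helper may differ in name and in the spellings it accepts:

```python
import os


def handle_bool_env_var(var: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag.

    Hypothetical counterpart to the `handle_int_env_var` visible in
    the hunk context above; not taken verbatim from ZenML's source.
    """
    value = os.environ.get(var)
    if value is None:
        return default
    return value.strip().lower() in ("1", "y", "yes", "t", "true", "on")


os.environ["ZENML_DISABLE_PIPELINE_LOGS_STORAGE"] = "True"
print(handle_bool_env_var("ZENML_DISABLE_PIPELINE_LOGS_STORAGE"))  # True
```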
src/zenml/integrations/kubernetes/orchestrators/kubernetes_orchestrator.py

Lines changed: 0 additions & 1 deletion

```diff
@@ -543,7 +543,6 @@ def prepare_or_run_pipeline(
             mount_local_stores=self.config.is_local,
         )
 
-        logger.info("Waiting for Kubernetes orchestrator pod to start...")
         kube_utils.create_and_wait_for_pod_to_start(
             core_api=self._k8s_core_api,
             pod_display_name="Kubernetes orchestrator pod",
```

src/zenml/integrations/kubernetes/orchestrators/kubernetes_orchestrator_entrypoint.py

Lines changed: 0 additions & 1 deletion

```diff
@@ -173,7 +173,6 @@ def run_step_on_kubernetes(step_name: str) -> None:
             mount_local_stores=mount_local_stores,
         )
 
-        logger.info(f"Waiting for pod of step `{step_name}` to start...")
         kube_utils.create_and_wait_for_pod_to_start(
             core_api=core_api,
             pod_display_name=f"pod for step `{step_name}`",
```

src/zenml/integrations/kubernetes/step_operators/kubernetes_step_operator.py

Lines changed: 0 additions & 3 deletions

```diff
@@ -218,9 +218,6 @@ def launch(
             mount_local_stores=False,
         )
 
-        logger.info(
-            "Waiting for pod of step `%s` to start...", info.pipeline_step_name
-        )
         kube_utils.create_and_wait_for_pod_to_start(
             core_api=self._k8s_core_api,
             pod_display_name=f"pod of step `{info.pipeline_step_name}`",
```

src/zenml/logging/step_logging.py

Lines changed: 28 additions & 14 deletions

```diff
@@ -13,6 +13,7 @@
 # permissions and limitations under the License.
 """ZenML logging handler."""
 
+import logging
 import os
 import re
 import sys
@@ -48,6 +49,7 @@
 redirected: ContextVar[bool] = ContextVar("redirected", default=False)
 
 LOGS_EXTENSION = ".log"
+PIPELINE_RUN_LOGS_FOLDER = "pipeline_runs"
 
 
 def remove_ansi_escape_codes(text: str) -> str:
@@ -65,14 +67,14 @@ def remove_ansi_escape_codes(text: str) -> str:
 
 def prepare_logs_uri(
     artifact_store: "BaseArtifactStore",
-    step_name: str,
+    step_name: Optional[str] = None,
     log_key: Optional[str] = None,
 ) -> str:
     """Generates and prepares a URI for the log file or folder for a step.
 
     Args:
         artifact_store: The artifact store on which the artifact will be stored.
-        step_name: Name of the step.
+        step_name: Name of the step. Skipped for global pipeline run logs.
         log_key: The unique identification key of the log file.
 
     Returns:
@@ -81,11 +83,8 @@
     if log_key is None:
         log_key = str(uuid4())
 
-    logs_base_uri = os.path.join(
-        artifact_store.path,
-        step_name,
-        "logs",
-    )
+    subfolder = step_name or PIPELINE_RUN_LOGS_FOLDER
+    logs_base_uri = os.path.join(artifact_store.path, subfolder, "logs")
 
     # Create the dir
     if not artifact_store.exists(logs_base_uri):
@@ -210,7 +209,7 @@ def _read_file(
             artifact_store.cleanup()
 
 
-class StepLogsStorage:
+class PipelineLogsStorage:
     """Helper class which buffers and stores logs to a given URI."""
 
     def __init__(
@@ -324,6 +323,18 @@ def save_to_file(self, force: bool = False) -> None:
         self.disabled = True
 
         try:
+            # The configured logging handler uses a lock to ensure that
+            # logs generated by different threads are not interleaved.
+            # Given that most artifact stores are based on fsspec, which
+            # use a separate thread for async operations, it may happen that
+            # the fsspec library itself will log something, which will end
+            # up in a deadlock.
+            # To avoid this, we temporarily disable the lock in the logging
+            # handler while writing to the file.
+            logging_handler = logging.getLogger().handlers[0]
+            logging_lock = logging_handler.lock
+            logging_handler.lock = None
+
             if self.buffer:
                 if self.artifact_store.config.IS_IMMUTABLE_FILESYSTEM:
                     _logs_uri = self._get_timestamped_filename()
@@ -353,6 +364,9 @@ def save_to_file(self, force: bool = False) -> None:
             # I/O errors.
             logger.error(f"Error while trying to write logs: {e}")
         finally:
+            # Restore the original logging handler lock
+            logging_handler.lock = logging_lock
+
             self.buffer = []
             self.last_save_time = time.time()
 
@@ -418,8 +432,8 @@ def merge_log_files(self, merge_all_files: bool = False) -> None:
         )
 
 
-class StepLogsStorageContext:
-    """Context manager which patches stdout and stderr during step execution."""
+class PipelineLogsStorageContext:
+    """Context manager which patches stdout and stderr during pipeline run execution."""
 
     def __init__(
         self, logs_uri: str, artifact_store: "BaseArtifactStore"
@@ -428,17 +442,17 @@ def __init__(
 
         Args:
             logs_uri: the URI of the logs file.
-            artifact_store: Artifact Store from the current step context.
+            artifact_store: Artifact Store from the current pipeline run context.
         """
-        self.storage = StepLogsStorage(
+        self.storage = PipelineLogsStorage(
             logs_uri=logs_uri, artifact_store=artifact_store
         )
 
-    def __enter__(self) -> "StepLogsStorageContext":
+    def __enter__(self) -> "PipelineLogsStorageContext":
        """Enter condition of the context manager.
 
         Wraps the `write` method of both stderr and stdout, so each incoming
-        message gets stored in the step logs storage.
+        message gets stored in the pipeline logs storage.
 
         Returns:
             self
```
