python API for retrieving outputs from previous runs #4488

dpeng817 · 2021-08-10T20:13:54Z

Use Case

Dagster has multiple execution entry points, and it's reasonable that someone might execute in one entry point and want to use the results of that execution in another. For example, I might execute a job in Dagit but want to observe or validate the intermediate results of that execution in Python. We don't currently have a Python API that would allow this.

This has the broader use case of observing outputs from runs that were launched or submitted externally, rather than run locally.

There is also the memoization use case, where an output that I am using / assuming exists in the current run was created in a previous run. It's reasonable to want to observe previously memoized outputs in addition to those populated by this run.

Ideas of Implementation

We would need access to the dagster instance for the run history, and also the pipeline itself to reconstruct the IO managers and retrieve results.

Result retrieval would basically just call load_input on the IO manager with a properly populated InputContext.
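As a rough illustration of the idea above, here is a minimal pure-Python sketch. It does not use Dagster itself: `OutputRef`, `SketchInputContext`, `InMemoryIOManager`, and `retrieve_output` are all hypothetical names invented for this example, and the in-memory store stands in for whatever storage a real IO manager would use. It only shows the shape of the proposal: build a context that identifies an output of a previous run, then hand it to the IO manager's `load_input`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class OutputRef:
    """Hypothetical identifier for one output of one step in one run."""
    run_id: str
    step_key: str
    output_name: str = "result"


@dataclass(frozen=True)
class SketchInputContext:
    """Hypothetical stand-in for an input context; it only carries the
    reference to the upstream output that should be loaded."""
    upstream_output: OutputRef


class InMemoryIOManager:
    """Toy IO manager (an assumption for illustration) that keeps outputs
    in a dict keyed by (run_id, step_key, output_name)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str, str], Any] = {}

    def handle_output(self, ref: OutputRef, value: Any) -> None:
        # Called while a run executes, to persist a step's output.
        self._store[(ref.run_id, ref.step_key, ref.output_name)] = value

    def load_input(self, context: SketchInputContext) -> Any:
        # Called later, possibly from another process or entry point.
        ref = context.upstream_output
        return self._store[(ref.run_id, ref.step_key, ref.output_name)]


def retrieve_output(
    io_manager: InMemoryIOManager,
    run_id: str,
    step_key: str,
    output_name: str = "result",
) -> Any:
    """Populate a context for a previous run's output and load it."""
    context = SketchInputContext(OutputRef(run_id, step_key, output_name))
    return io_manager.load_input(context)


# Usage: an earlier "run" stores an output, and it is read back afterwards.
io_manager = InMemoryIOManager()
io_manager.handle_output(OutputRef("run-1", "clean_data"), [1, 2, 3])
previous_value = retrieve_output(io_manager, "run-1", "clean_data")
```

In a real implementation, the run ID would presumably come from the instance's run history and the IO manager would be reconstructed from the pipeline definition, as described above; this sketch skips both steps.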

User Requests

https://dagster.slack.com/archives/C01U954MEER/p1658141980750219

Message from the maintainers:

Excited about this feature? Give it a 👍. We factor engagement into prioritization.


sryza · 2022-10-19T00:25:12Z

For those using software-defined assets, load_asset_value now enables this: https://docs.dagster.io/concepts/assets/software-defined-assets#loading-asset-values-outside-of-dagster-runs

zhh210 · 2024-05-08T15:41:44Z

Hey @sryza, is load_asset_value removed from the legacy dagster version? Searching for the keyword returns nothing in the latest docs. Also, load_asset_value keeps ignoring the resources I specified in Definitions and complains that resource_config is missing.

garethbrickman · 2024-05-08T17:03:59Z

@zhh210 load_asset_value is documented here. If you need help troubleshooting please create a new issue.

zhh210 · 2024-05-08T18:13:57Z

> @zhh210 load_asset_value is documented here. If you need help troubleshooting please create a new issue.

Thanks @garethbrickman, I created a separate ticket for the issue. It seems the resource_config passed to load_asset_value() is not the same as the context.resource_config used in a typical custom IO manager.

dpeng817 added type: feature-request practitioner labels Aug 10, 2021

dpeng817 added this to Backlog in Practitioner Aug 11, 2021

yuhan removed the practitioner label Sep 30, 2021

garethbrickman closed this as completed May 8, 2024
