Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rfc] Implement Python API for launching asset backfills #26731

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

deepyaman
Copy link
Contributor

@deepyaman deepyaman commented Dec 27, 2024

Summary & Motivation

Background

@smackesey has interacted with several users on the following scenario:

  • There is some computation (run or backfill) X that should only be launched under certain conditions
  • The conditions under which X should be launched are non-trivial to compute
  • It is desired to manually trigger conditional launch of X-- that is, on clicking a button, the conditions under which X should be launched are computed and X is launched if those conditions are satisfied.

Problem Statement

We do not have a good solution to the above scenario.

  • Launch job Y that conditionally launches X
    • If X is a job, execute_job will work.
    • If X is a backfill, then there is no good public API to launch it. User must use the private GQL API.
  • Manually trigger sensor that conditionally launches X
    • The way you manually trigger a sensor is by pressing the "Test Sensor" button, which is non-ideal and kinda buried.
    • You then have to pass each generated RunRequest through to the launchpad and manually launch.
      • This works OK if X is a job.
      • This is unworkable if X is a backfill, because you have to manually launch each RunRequest

So if X is a backfill, basically your only option is to use a job that then uses the private GQL API.

Objective

The purpose of this enhancement is to improve user experience. This will allow users to not have to resort to workarounds to perform a manual trigger a conditional launch of X.

The manifestation of this is to expose a public Python API for creating and launching asset backfills.

How I Tested These Changes

Added unit tests mirroring a couple of existing backfill tests, but using the newly-added Python API.

TODO

I wanted to get feedback/validation that I'm on the right track before proceeding much further, but I understand that the following things probably still need doing:

  • Test validation of arguments/argument combinations
  • Add additional execution tests?
  • Improve docstring (with examples, etc.)
  • Add public-facing docs?

Changelog

Exposed a Python API for creating and launching asset backfills.

@deepyaman deepyaman force-pushed the deepyaman/feat/python-backfill-api branch 2 times, most recently from cddb72a to f7133d7 Compare December 27, 2024 14:14
) and partitions_by_assets:
raise DagsterInvariantViolationError(
"partitions_by_assets cannot be used together with asset_selection, selector, or"
" partitionNames"
"partitionsByAssets cannot be used together with assetSelection, selector, or"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the very least, I feel like the capitalization should be consistent; the camel-casing is aligned with the GraphQL API names.

"partitions_by_assets cannot be used together with asset_selection, selector, or"
" partitionNames"
"partitionsByAssets cannot be used together with assetSelection, selector, or"
" partitionNames or if allPartitions is True"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to validate allPartitions, too, if checking partitionNames; neither can be used with partitionsByAssets.

I would be happy to split this and the above change into a separate (stacked?) PR, since they're not directly related to creating the Python API; just thought I'd do some housekeeping as I was learning about it for the Python API implementation.

asset_graph=asset_graph,
backfill_timestamp=backfill_timestamp,
tags=tags,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Minimal argument order alignment across calls.

@@ -2260,7 +2262,7 @@ def get_event_tags_for_asset(
@public
@traced
def wipe_assets(self, asset_keys: Sequence[AssetKey]) -> None:
"""Wipes asset event history from the event log for the given asset keys.
"""Wipe asset event history from the event log for the given asset keys.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: All of the other public methods in this file use imperative mood.

@public
def launch_backfill(
self,
asset_graph: "BaseAssetGraph",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to directly ask for the asset graph as an argument, since there are already tutorials that ask the user to pass asset graph. I didn't really see any reason to ask for a broader Definitions object.

Returns:
str: The ID of the backfill.
"""
# TODO(deepyaman): Abstract logic shared with `dagster-graphql`.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# TODO(deepyaman): Abstract logic shared with `dagster-graphql`.

Actually, as I got to implementing more of this, maybe they're sufficiently different that this isn't worth it.

Comment on lines 3171 to 3173
partitions_by_assets: Optional[
Sequence[Tuple[AssetKey, Optional[Sequence[Tuple[str, str]]]]]
] = None,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This avoids exposing any additional objects, but I suppose an explanation/examples will be necessary for anybody unfamiliar with Dagster internals already.

@deepyaman deepyaman changed the title Implement Python API for launching asset backfills [rfc] Implement Python API for launching asset backfills Dec 28, 2024
)

tags = {t["key"]: t["value"] for t in backfill_params.get("tags", [])}

tags = {**tags, **graphene_info.context.get_viewer_tags()}

title = check_valid_title(backfill_params.get("title"))
Copy link
Contributor Author

@deepyaman deepyaman Dec 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems reasonable to use the existing utility. Also, I think there's no reason it should only be happening for job backfills.

This is another one I'd be happy to move into a separate PR, if so desired.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doing it for only job backfills is 100% just a goof on my part. we should be doing it for all types of backfills - thanks for catching!

switching to the check version of the fn is also fine w me. I think originally we were considering re-raising as a GrapheneError so that it had a more immediate failure for the user. But we've kind of abandoned adding titles/descriptions to backfills for now anyway, so keeping it as the generic python error is fine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamiedemaria Great! I raised a new PR just addressing this (tagged you, if you can take a quick look), since it's agreed upon, while I go about changing this broader PR.

)

# Check that the partition keys can be found in the asset graph.
# TODO(deepyaman): Abstract logic shared with `dagster-graphql`.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one can definitely be deduplicated; probably can be done in a subsequent PR?

@deepyaman deepyaman marked this pull request as ready for review December 28, 2024 02:07
@deepyaman deepyaman force-pushed the deepyaman/feat/python-backfill-api branch from ab28602 to 4423290 Compare December 31, 2024 04:19
Copy link
Contributor

@OwenKephart OwenKephart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After sleeping on this, I feel like putting this logic on the DagsterInstance itself isn't quite right.

At the core, there are two things we want to be able to do here:

  1. Express some intent to backfill some arbitrary set of partitions
  2. Send a message to the instance to act on that intent

We already have an existing method for (2) (instance.add_backfill), and so the question is what is the best way of exposing (1) to the user.

We already have the PartitionBackfill object, and so putting this layer of indirection for constructing the partition backfill feels off, especially because none of the code in the added method really interacts with the core instance database outside of a single call to self.add_backfill().

Right now, this single method supports a wide array of different parameter combinations, with only some subset of them being valid, which is pretty confusing for a user, even if we add a ton of examples of valid behavior.

I think the better way of factoring this behavior would be to just expose PartitionBackfill more directly, and make the existing static constructors public (with a potentially a bit of light refactoring of the input types, e.g. backfill_id could be an optional parameter -- if not provided, we'd create one for the user).

To start, we could just expose .from_asset_partitions() as public, as I think that's the most common case, and its existing implementation is less subject to change than the from_partitions_by_assets (which has extra custom types)

Copy link
Contributor

++1 I think exposing the constructors on PartitionBackfill is a lot cleaner and gives us more flexibility in the future

Copy link
Contributor

@jamiedemaria jamiedemaria left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have there been discussions in the build team about the implications of allowing users to launch backfills (and other jobs) from runs? I know in the past we've shied away from allowing/promoting this behavior and instead pointed users to sensors. Would love to get up to date on what you are all thinking in terms of best practices here

)

tags = {t["key"]: t["value"] for t in backfill_params.get("tags", [])}

tags = {**tags, **graphene_info.context.get_viewer_tags()}

title = check_valid_title(backfill_params.get("title"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doing it for only job backfills is 100% just a goof on my part. we should be doing it for all types of backfills - thanks for catching!

switching to the check version of the fn is also fine w me. I think originally we were considering re-raising as a GrapheneError so that it had a more immediate failure for the user. But we've kind of abandoned adding titles/descriptions to backfills for now anyway, so keeping it as the generic python error is fine

deepyaman added a commit that referenced this pull request Dec 31, 2024
## Summary & Motivation

Fix a mistaken inconsistency; see
#26731 (comment).

## How I Tested These Changes

BK

## Changelog

Added validation of `title` for asset backfills (not just for job
backfills).
@deepyaman deepyaman force-pushed the deepyaman/feat/python-backfill-api branch from 4423290 to 2186ef0 Compare January 1, 2025 00:17
@gibsondan
Copy link
Member

We should make sure to consider how this will work in cloud (where the instance api needs to communicate over graphql) and what the exact api boundaries will be that we need to preserve backwards compatibility for

@deepyaman
Copy link
Contributor Author

deepyaman commented Jan 2, 2025

After sleeping on this, I feel like putting this logic on the DagsterInstance itself isn't quite right.

At the core, there are two things we want to be able to do here:

  1. Express some intent to backfill some arbitrary set of partitions
  2. Send a message to the instance to act on that intent

We already have an existing method for (2) (instance.add_backfill), and so the question is what is the best way of exposing (1) to the user.

We already have the PartitionBackfill object, and so putting this layer of indirection for constructing the partition backfill feels off, especially because none of the code in the added method really interacts with the core instance database outside of a single call to self.add_backfill().

Right now, this single method supports a wide array of different parameter combinations, with only some subset of them being valid, which is pretty confusing for a user, even if we add a ton of examples of valid behavior.

I think the better way of factoring this behavior would be to just expose PartitionBackfill more directly, and make the existing static constructors public (with a potentially a bit of light refactoring of the input types, e.g. backfill_id could be an optional parameter -- if not provided, we'd create one for the user).

To start, we could just expose .from_asset_partitions() as public, as I think that's the most common case, and its existing implementation is less subject to change than the from_partitions_by_assets (which has extra custom types)

I should have a separate PR up later today just exposing PartitionBackfill.from_asset_partitions()! 🤞

@OwenKephart @jamiedemaria I went ahead and created #26786 as an alternative to this; I'll switch this back to draft for now (I think there are some other minor changes I want to try to get into separate PRs, but that can wait on the Python API to land).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants