Create docs for best practices in Kedro pipeline deployment #2712
Comments
I would really like to push this one, as we get asked about it often, and even more so now with the deployment plugins. Cc @marrrcin |
How do you want to proceed with this one, @noklam? |
To start with, I want to create documentation that guides users through tackling these common challenges. As the Kedro team, we may build specific plugins for certain platforms (e.g. Databricks). After that, the next step may be #2058 or something else. What are the remaining challenges when deploying a Kedro pipeline? I don't have much real-world experience and need more of your input here. I'd focus on the pipeline and performance here. There is a lot of I/O overhead when people simply do a 1:1 mapping. We have been repeating answers like this on Slack: "You should deploy a modular pipeline to a What do you need to do differently when moving a Kedro pipeline to Azure? How different is it if it is SageMaker instead? |
@marrrcin We just had some discussion on this in backlog grooming. In short, we need to:
I think step 1 is especially important, because I think the recommendation above largely comes from my push (and maybe @datajoely's? I can't remember), without much chance, for me at least, to implement it in practice. And now I'm not totally sure this is ideal, and I think you may have better thoughts (as in https://kedro-org.slack.com/archives/C03RKPCLYGY/p1688043383962069?thread_ts=1687854961.990649&cid=C03RKPCLYGY). We should also get input from the broader team, and it would be a good topic for tech design to reach alignment on what we consider "best practice". So... would you be willing to lead a tech design session on this? :) |
Sure, but not this week. |
Description
Is your feature request related to a problem? A clear and concise description of what the problem is: "I'm always frustrated when ..."
Related:
This was mentioned at the end of last year; this ticket is the follow-up action.
The core of this is explaining that the mapping from a Kedro pipeline to a deployment shouldn't be 1:1 (node to task). In the mid to long term we can provide better tooling. Meanwhile, we should recommend mapping a modular pipeline to an Airflow task, a Prefect task, or a compute node on AWS/GCP/Azure.
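To make the recommendation concrete, here is a minimal sketch of grouping nodes into one deployment task per modular pipeline. It deliberately does not use the Kedro API: the `Node` dataclass, the `group_by_namespace` and `task_dependencies` helpers, and the example node and dataset names are all hypothetical stand-ins, and it assumes each modular pipeline is identified by the top-level part of a node's namespace.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a pipeline node; NOT the Kedro API."""
    name: str
    namespace: str   # assumed to identify the modular pipeline
    inputs: tuple    # dataset names the node reads
    outputs: tuple   # dataset names the node writes


def top_level(namespace):
    """Use the first namespace segment as the deployment task name."""
    return namespace.split(".")[0]


def group_by_namespace(nodes):
    """Map each deployment task to the names of the nodes it would run."""
    groups = defaultdict(list)
    for node in nodes:
        groups[top_level(node.namespace)].append(node.name)
    return dict(groups)


def task_dependencies(nodes):
    """Task B depends on task A if a node in B reads a dataset written in A."""
    producer = {}
    for node in nodes:
        for dataset in node.outputs:
            producer[dataset] = top_level(node.namespace)

    deps = defaultdict(set)
    for node in nodes:
        task = top_level(node.namespace)
        deps[task]  # make sure tasks without upstream tasks still appear
        for dataset in node.inputs:
            upstream = producer.get(dataset)
            if upstream is not None and upstream != task:
                deps[task].add(upstream)
    return {task: sorted(upstream) for task, upstream in deps.items()}


# Illustrative nodes only; names are made up for this sketch.
nodes = [
    Node("clean", "data_processing", ("raw",), ("cleaned",)),
    Node("join", "data_processing", ("cleaned",), ("model_input",)),
    Node("split", "data_science", ("model_input",), ("train", "test")),
    Node("fit", "data_science", ("train",), ("model",)),
]

tasks = group_by_namespace(nodes)
deps = task_dependencies(nodes)
```

With a 1:1 mapping these four nodes would become four orchestrator tasks, and every intermediate dataset would have to be persisted between them. Grouped this way there are two tasks, and only `model_input` crosses a task boundary, so `cleaned` and `train` could in principle stay in memory within their task.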
Context
Why is this change important to you? How would you use it? How can it benefit other users?
Possible Implementation
(Optional) Suggest an idea for implementing the addition or change.
Possible Alternatives
(Optional) Describe any alternative solutions or features you've considered.