Proposal: IDE Working Group #808

Open
ederign opened this issue Jan 29, 2025 · 9 comments
@ederign
Member

ederign commented Jan 29, 2025

Hi everyone! 👋

After brainstorming with some community members about how to improve the Kubeflow user and developer experience for data scientists and ML practitioners, I decided to go one step further: start a formal discussion and propose a new IDE Working Group along with its initial roadmap.

The IDE Working Group (potentially, Kubeflow Jupyter Extension WG) will be responsible for developing and integrating IDE-based tools and extensions to provide a streamlined user experience to data scientists and machine learning practitioners on Kubeflow.

WG IDE Charter

The IDE Working Group is responsible for developing and integrating IDE-based tools and extensions to provide a streamlined user experience to data scientists and machine learning practitioners on Kubeflow.

This charter adheres to the conventions, roles, and organization management outlined in wg-governance.

Scope

The IDE Working Group focuses on developing, maintaining, and improving tools and extensions that support the workflows of data science and machine learning practitioners within Kubeflow. The group is dedicated to delivering a high-level, seamless experience integrated with the IDE of choice across multiple Kubeflow components.

In scope

Code, Binaries, and Services

  1. Development of Kubeflow JupyterLab extensions that provide simple abstractions and UX to interact with the most common Kubeflow components (e.g., pipelines, hyperparameter tuning) and shorten the time to value for practitioners comfortable with Jupyter. These extensions will focus on the most used Kubeflow components, such as:

    • Pipelines;
    • Training Operator & Katib;
    • Model Registry;
    • Model Serving (KServe);
    • Feast
  2. Promote the reusability of UI components from other Kubeflow UIs into the IDE (e.g., rendering a pipeline graph inside the JupyterLab environment) by establishing a shared contract between the IDE WG and the wider Kubeflow community. 

  3. Develop a Python SDK to simplify operationalization across Kubeflow components and provide a "one-stop-shop" for practitioners who want easy access to Kubeflow services. The SDK also provides the groundwork for the IDE extension automation and workflows.

    • Create a single installation and configuration layer for users interacting programmatically with the Kubeflow ecosystem via SDKs.
    • The "common" SDK is not meant to replace individual components' SDKs but rather to offer a unified access layer to simplify dependency management and shared configuration (like authorization).
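
To make the "unified access layer" idea concrete, here is a minimal sketch of one possible shape for it. It is an illustration only, not a proposed API: `KubeflowConfig`, `ComponentClient`, `KubeflowClient`, and every field and method on them are hypothetical names, and `ComponentClient` is a stand-in for a real component SDK client. The point it shows is that configuration (host, namespace, authorization) is defined once and handed to each component client, while the component clients themselves remain separate.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class KubeflowConfig:
    """Shared settings for all component clients (hypothetical fields)."""
    host: str
    namespace: str = "kubeflow"
    token: str = ""

    def auth_headers(self) -> Dict[str, str]:
        # Single place where authorization headers are built.
        return {"Authorization": f"Bearer {self.token}"} if self.token else {}


@dataclass
class ComponentClient:
    """Stand-in for a real component SDK client (assumption, not a real class)."""
    name: str
    config: KubeflowConfig

    def endpoint(self) -> str:
        # Derive the component endpoint from the shared host.
        return f"{self.config.host.rstrip('/')}/{self.name}"


class KubeflowClient:
    """Facade that hands out component clients sharing one configuration."""

    def __init__(self, config: KubeflowConfig) -> None:
        self._config = config
        self._factories: Dict[str, Callable[[KubeflowConfig], ComponentClient]] = {}
        self._cache: Dict[str, ComponentClient] = {}

    def register(self, name: str,
                 factory: Callable[[KubeflowConfig], ComponentClient]) -> None:
        # Each component SDK plugs in through a factory; it is not replaced.
        self._factories[name] = factory

    def component(self, name: str) -> ComponentClient:
        if name not in self._factories:
            raise KeyError(f"no component registered under {name!r}")
        if name not in self._cache:
            # Build lazily so unused components cost nothing.
            self._cache[name] = self._factories[name](self._config)
        return self._cache[name]


if __name__ == "__main__":
    config = KubeflowConfig(host="https://kubeflow.example.com/", token="dummy-token")
    client = KubeflowClient(config)
    for component_name in ("pipelines", "training", "model-registry"):
        client.register(component_name,
                        lambda cfg, n=component_name: ComponentClient(n, cfg))
    print(client.component("pipelines").endpoint())
```

In this sketch the facade only wires configuration into per-component factories, which matches the stated intent that the common SDK sits in front of the individual SDKs without replacing them.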

Guiding Principles

  • Synergy among Kubeflow Working Groups: Collaborate with other WGs to promote reusability of UI components from other Kubeflow UIs and create a single UX between the Jupyter IDE and the Kubeflow Central Dashboard;
  • Collaboration with other open-source IDE projects (like Jupyter and VSCode) to promote the creation and reusability of open standards for AI/ML tools (protocols, communication exchange, file formats, etc.) and plugins. The aim of this group is to actively participate in the development of these standards to include Kubeflow in a broader ecosystem of interoperable tools.

Cross-cutting and Externally Facing Processes

  • Collaboration with other Kubeflow WGs, including WG Notebooks, WG Pipelines, WG Training, and WG Serving, ensures that IDE tools are interoperable across different stages of the ML lifecycle.
  • Coordination with the release teams to align updates in IDE tools with broader Kubeflow release schedules.

Out of scope

  • Building and maintaining Notebook/Workspaces images (this falls under the WG Notebooks).

Working Group Roadmap Proposal

Vision

Development of Kubeflow JupyterLab extensions that provide simple abstractions and UX to interact with the most common Kubeflow components (e.g., pipelines, hyperparameter tuning) and shorten the time to value for practitioners comfortable with Jupyter. These extensions will focus on the most used Kubeflow components, such as Pipelines, Training Operator & Katib, Model Registry, Model Serving (KServe), Feast, etc.

Phase 1 - Establish baseline (XX Months)

Goal: Baseline/starting point for Kubeflow IDE Extension

This phase will consist of three main tasks:

  • Work on kubeflow-kale/kale to make it functional with KFP v2. The goal is to demo a successful notebook run with the latest version of KFP.
  • Re-introduce Elyra add-on support in Kubeflow. The goal is to demo visual pipeline authoring compatible with the latest version of KFP.
  • Explore the synergy between the Kubeflow Jupyter Extension and Jupyter Scheduler. We strive to build a close partnership between this working group and Jupyter upstream, and even to consolidate our efforts.

Task breakdown:

Kale:
Note: @StefanoFioravanzo started issue #730, which got great feedback and traction from the community.

  • Create a map of existing features and capabilities.
  • Upgrade dependencies to resolve CVEs and update deprecated modules.
  • Align the internal API with KFP v2.
  • Update the Jupyter notebook Docker images.
  • Demo!

Elyra
Note: This work is already in progress by my group at Red Hat, together with the Elyra community.

Jupyter Scheduler

  • Demonstrate the capability of Jupyter Scheduler extension for Notebook Workflows.
  • Discuss how we can consolidate efforts to build a unified solution for Notebook Workflows.

Phase 2 - Code Migration (XX Months)

Goal: code consolidated within the Kubeflow GitHub organization with proper code structure and naming

Phase 1 focused on establishing a baseline by successfully demoing the Kale and Elyra integrations. In this phase, we want to consolidate the Kale codebase under the Kubeflow organization. This new structure will allow us to work on top of Kale and iteratively build the new IDE experience for Kubeflow. Elyra will continue to be the interim solution for low-code visual pipeline authoring.

  • Migrate kubeflow-kale/kale to kubeflow/XXX (the repository name is to be discussed with the Kubeflow community). This new repository will house everything related to Kubeflow IDE plugins and extensions.

Phase 3 - Enhance IDE extension (XX Months)

Goal: Add visual authoring and runtime pipeline visualization to the Kale baseline. With these new features, Kubeflow can provide both a notebook-based and a visual, drag-and-drop pipeline authoring experience.
We are also planning to provide the same visualization look and feel in both the IDE and the Kubeflow Central Dashboard.

Long-term plan

Goal: Kubeflow JupyterLab Extension MVP will provide a streamlined user experience to data scientists and machine learning practitioners across all components of the Kubeflow ecosystem.

@ederign
Member Author

ederign commented Jan 29, 2025

CC @kubeflow/kubeflow-steering-committee @StefanoFioravanzo @andreyvelich

This proposal submission is a collaboration between @StefanoFioravanzo, @andreyvelich, and myself. We also got helpful feedback from multiple other community members.

@ederign
Member Author

ederign commented Jan 29, 2025

This proposal is also related to the 'SDK discussion' on kubeflow/training-operator#2402 (comment)

@StefanoFioravanzo
Member

@ederign thanks for migrating our notes and creating the issue! Looking forward to starting these efforts, and I can't wait to hear feedback from the community.

@andreyvelich
Member

@lresende
Member

lresende commented Jan 29, 2025

Thanks for the well-written proposal. Some of these align very well with the mission of the Elyra project. Given the synergy, it might be a good idea to explore how we could pursue some of these in the context of Jupyter/Elyra, particularly as we are all projects related to the Linux Foundation. Please let me know if any specific meetings are happening in this area.

cc @caponetto @shalberd @romeokienzler

@ederign
Member Author

ederign commented Jan 29, 2025

@lresende absolutely! We still need to wait for broader feedback from the community about the proposal, but if we agree to proceed, I'll make sure to invite Elyra folks to the discussions.

@Griffin-Sullivan

I think this is a great idea and will enhance the overall UX with Kubeflow! I'd be happy to help out with any of the initiatives.

@milosjava
Member

Really detailed proposal, thank you very much for that! From my experience at PepsiCo, data scientists often struggle to get familiar with Kubeflow, and companies typically need to develop a tool or library to help them use it effectively. Once implemented, this could definitely accelerate adoption.

@tarekabouzeid
Member

I think it's a really great initiative that will improve Kubeflow usability.
And thank you so much for the detailed explanation, great work! I would really like to help in this initiative.
