Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reuse get SSH connection method in Slurm function agent #3189

Merged
merged 3 commits into from
Mar 20, 2025

Conversation

JiangJiaWei1103
Copy link
Contributor

@JiangJiaWei1103 JiangJiaWei1103 commented Mar 13, 2025

Tracking issue

flyteorg/flyte#5634

Why are the changes needed?

Reusing a shared method for obtaining an SSH connection object prevents code duplication and ensures consistency between SlurmShellAgent and SlurmFunctionAgent.

What changes were proposed in this pull request?

Reuse get_ssh_conn method in SlurmFunctionAgent.

How was this patch tested?

This patch is tested by running the following example:

import os

from flytekit import task, workflow
from flytekitplugins.slurm import SlurmFunction 


@task(
    task_config=SlurmFunction(
        ssh_config={
            "host": "aws2",
            "username": "ubuntu",
        },
        sbatch_conf={
            "partition": "debug",
            "job-name": "job3",
            "output": "/home/ubuntu/fn_task.log"
        },
        script="""#!/bin/bash -i

echo [TEST SLURM FN TASK 1] Run the first user-defined task function...

# Setup env vars
export MY_ENV_VAR=123

# Source the virtual env
. /home/ubuntu/.cache/pypoetry/virtualenvs/demo-4A8TrTN7-py3.12/bin/activate 

# Run the user-defined task function
{task.fn}
"""
    )
)
def plus_one(x: int) -> int: 
    print(os.getenv("MY_ENV_VAR"))
    return x + 1


@task(
    task_config=SlurmFunction(
        ssh_config={
            "host": "aws2",
            "username": "ubuntu",
        },
        script="""#!/bin/bash -i

echo [TEST SLURM FN TASK 2] Run the second user-defined task function...

. /home/ubuntu/.cache/pypoetry/virtualenvs/demo-4A8TrTN7-py3.12/bin/activate 
{task.fn}
"""
    )
)
def greet(year: int) -> str:
    return f"Hello {year}!!!"


@workflow
def wf(x: int) -> str:
    x = plus_one(x=x)
    msg = greet(year=x)
    return msg


if __name__ == "__main__":
    from flytekit.clis.sdk_in_container import pyflyte
    from click.testing import CliRunner

    runner = CliRunner()
    path = os.path.realpath(__file__)

    print(f">>> LOCAL EXEC <<<")
    result = runner.invoke(pyflyte.main, ["run", "--raw-output-data-prefix", "s3://my-flyte-slurm-agent", path, "wf", "--x", 2024])
    print(result.output)

Setup process

For agent local test, the setup process is summarized as follows:

git clone https://github.com/flyteorg/flytekit.git
gh pr checkout 3189
make setup && pip install -e .
cd plugins/flytekit-slurm && pip install -e .

Screenshots

  • Local machine (Flyte client)
    Screenshot 2025-03-13 at 9 37 36 PM

  • The first task on the Slurm cluster
    Screenshot 2025-03-13 at 9 38 16 PM

  • The second task on the Slurm cluster
    Screenshot 2025-03-13 at 9 39 23 PM

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

N/A

Summary by Bito

This PR refactors the codebase from an agent-based to a connector-based architecture, focusing on SSH connection handling in the Slurm function agent by leveraging shared methods. It eliminates custom connection logic and redundant classes, improves error handling, enhances modularity, and introduces a geopandas plugin for GeoDataFrame transformations.

Unit tests added: True

Estimated effort to review (1-5, lower is better): 5

@flyte-bot
Copy link
Contributor

flyte-bot commented Mar 13, 2025

Code Review Agent Run #1cb8c9

Actionable Suggestions - 1
  • plugins/flytekit-slurm/flytekitplugins/slurm/function/agent.py - 1
Review Details
  • Files reviewed - 1 · Commit Range: 00c4b12..00c4b12
    • plugins/flytekit-slurm/flytekitplugins/slurm/function/agent.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

Refer to the documentation for additional commands.

Configuration

This repository uses code_review_bito You can customize the agent settings here or contact your Bito workspace admin at [email protected].

Documentation & Help

AI Code Review powered by Bito Logo

@flyte-bot
Copy link
Contributor

flyte-bot commented Mar 13, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change Files Impacted
Feature Improvement - Refactor SSH Connection Handling

connector.py - Removed redundant imports and custom SSH connection methods, replacing them with the shared get_ssh_conn method to improve consistency and reduce code duplication.

Comment on lines 101 to 103
conn = await get_ssh_conn(
ssh_config=resource_meta.ssh_config, slurm_cluster_to_ssh_conn=self.slurm_cluster_to_ssh_conn
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect handling of tuple return value

The return value from get_ssh_conn() is not being properly unpacked. According to the function signature in ssh_utils.py, get_ssh_conn() returns a tuple of (SlurmCluster, SSHClientConnection), but only the connection is being used. Consider updating to unpack the tuple correctly: _, conn = await get_ssh_conn(...).

Code suggestion
Check the AI-generated fix before applying
Suggested change
conn = await get_ssh_conn(
ssh_config=resource_meta.ssh_config, slurm_cluster_to_ssh_conn=self.slurm_cluster_to_ssh_conn
)
slurm_cluster, conn = await get_ssh_conn(
ssh_config=resource_meta.ssh_config, slurm_cluster_to_ssh_conn=self.slurm_cluster_to_ssh_conn
)

Code Review Run #1cb8c9


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Copy link
Member

@Future-Outlier Future-Outlier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you provide issue link in the Tracking Issue section?

Copy link
Member

@Future-Outlier Future-Outlier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the slurm function agent and slurm script agent used the same variable slurm_cluster_to_ssh_conn?

@JiangJiaWei1103
Copy link
Contributor Author

can you provide issue link in the Tracking Issue section?

Added! Apologies for the oversight.

@JiangJiaWei1103
Copy link
Contributor Author

Can the slurm function agent and slurm script agent used the same variable slurm_cluster_to_ssh_conn?

Are you asking if both the SlurmFunctionAgent and SlurmShellAgent can share a single slurm_cluster_to_ssh_conn dictionary?

Future-Outlier
Future-Outlier previously approved these changes Mar 19, 2025
Signed-off-by: JiangJiaWei1103 <[email protected]>
Signed-off-by: JiangJiaWei1103 <[email protected]>
Copy link

codecov bot commented Mar 19, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.20%. Comparing base (45d5531) to head (5fa3419).
Report is 3 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3189      +/-   ##
==========================================
- Coverage   94.35%   93.20%   -1.16%     
==========================================
  Files          64       42      -22     
  Lines        2799     2280     -519     
==========================================
- Hits         2641     2125     -516     
+ Misses        158      155       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@flyte-bot
Copy link
Contributor

flyte-bot commented Mar 19, 2025

Code Review Agent Run #13343d

Actionable Suggestions - 0
Review Details
  • Files reviewed - 73 · Commit Range: 00c4b12..5fa3419
    • flytekit/clis/sdk_in_container/serve.py
    • flytekit/core/array_node_map_task.py
    • flytekit/core/data_persistence.py
    • flytekit/core/python_function_task.py
    • flytekit/core/type_engine.py
    • flytekit/exceptions/system.py
    • flytekit/exceptions/user.py
    • flytekit/extend/backend/base_agent.py
    • flytekit/extend/backend/base_connector.py
    • flytekit/extend/backend/utils.py
    • flytekit/extras/webhook/__init__.py
    • flytekit/extras/webhook/task.py
    • flytekit/image_spec/default_builder.py
    • flytekit/image_spec/image_spec.py
    • flytekit/models/task.py
    • flytekit/sensor/base_sensor.py
    • flytekit/sensor/file_sensor.py
    • flytekit/sensor/sensor_engine.py
    • plugins/flytekit-airflow/flytekitplugins/airflow/__init__.py
    • plugins/flytekit-airflow/flytekitplugins/airflow/task.py
    • plugins/flytekit-aws-sagemaker/flytekitplugins/awssagemaker_inference/__init__.py
    • plugins/flytekit-aws-sagemaker/flytekitplugins/awssagemaker_inference/boto3_mixin.py
    • plugins/flytekit-aws-sagemaker/flytekitplugins/awssagemaker_inference/boto3_task.py
    • plugins/flytekit-aws-sagemaker/flytekitplugins/awssagemaker_inference/task.py
    • plugins/flytekit-aws-sagemaker/flytekitplugins/awssagemaker_inference/workflow.py
    • plugins/flytekit-aws-sagemaker/tests/test_boto3_mixin.py
    • plugins/flytekit-aws-sagemaker/tests/test_inference_task.py
    • plugins/flytekit-aws-sagemaker/tests/test_inference_workflow.py
    • plugins/flytekit-bigquery/flytekitplugins/bigquery/__init__.py
    • plugins/flytekit-bigquery/flytekitplugins/bigquery/task.py
    • plugins/flytekit-geopandas/flytekitplugins/geopandas/__init__.py
    • plugins/flytekit-geopandas/flytekitplugins/geopandas/gdf_transformers.py
    • plugins/flytekit-geopandas/setup.py
    • plugins/flytekit-geopandas/tests/test_geopandas_plugin.py
    • plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/__init__.py
    • plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/k8s/kube_config.py
    • plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/sensor.py
    • plugins/flytekit-k8sdataservice/flytekitplugins/k8sdataservice/task.py
    • plugins/flytekit-k8sdataservice/tests/k8sdataservice/test_task.py
    • plugins/flytekit-mmcloud/flytekitplugins/mmcloud/__init__.py
    • plugins/flytekit-mmcloud/tests/test_mmcloud.py
    • plugins/flytekit-openai/flytekitplugins/openai/__init__.py
    • plugins/flytekit-openai/flytekitplugins/openai/batch/task.py
    • plugins/flytekit-openai/flytekitplugins/openai/chatgpt/task.py
    • plugins/flytekit-openai/tests/chatgpt/test_chatgpt.py
    • plugins/flytekit-perian/flytekitplugins/perian_job/__init__.py
    • plugins/flytekit-perian/flytekitplugins/perian_job/task.py
    • plugins/flytekit-perian/setup.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/__init__.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/function/task.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/script/task.py
    • plugins/flytekit-slurm/flytekitplugins/slurm/ssh_utils.py
    • plugins/flytekit-snowflake/flytekitplugins/snowflake/__init__.py
    • plugins/flytekit-snowflake/flytekitplugins/snowflake/task.py
    • plugins/flytekit-snowflake/tests/test_snowflake.py
    • plugins/flytekit-spark/flytekitplugins/spark/__init__.py
    • plugins/flytekit-spark/flytekitplugins/spark/models.py
    • plugins/flytekit-spark/flytekitplugins/spark/task.py
    • plugins/flytekit-spark/tests/test_spark_task.py
    • plugins/setup.py
    • pydoclint-errors-baseline.txt
    • pyproject.toml
    • tests/flytekit/clis/sdk_in_container/test_serve.py
    • tests/flytekit/unit/bin/test_python_entrypoint.py
    • tests/flytekit/unit/cli/pyflyte/test_serve.py
    • tests/flytekit/unit/core/image_spec/test_image_spec.py
    • tests/flytekit/unit/core/test_array_node_map_task.py
    • tests/flytekit/unit/core/test_eager_cleanup.py
    • tests/flytekit/unit/core/test_partials.py
    • tests/flytekit/unit/extras/webhook/test_end_to_end.py
    • tests/flytekit/unit/sensor/test_file_sensor.py
    • tests/flytekit/unit/sensor/test_sensor_engine.py
    • tests/flytekit/unit/types/structured_dataset/test_snowflake.py
  • Files skipped - 9
    • .github/workflows/build_image.yml - Reason: Filter setting
    • .github/workflows/pythonbuild.yml - Reason: Filter setting
    • .github/workflows/pythonpublish.yml - Reason: Filter setting
    • plugins/flytekit-aws-sagemaker/README.md - Reason: Filter setting
    • plugins/flytekit-geopandas/README.md - Reason: Filter setting
    • plugins/flytekit-mmcloud/README.md - Reason: Filter setting
    • plugins/flytekit-openai/README.md - Reason: Filter setting
    • plugins/flytekit-perian/README.md - Reason: Filter setting
    • plugins/flytekit-slurm/README.md - Reason: Filter setting
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

Refer to the documentation for additional commands.

Configuration

This repository uses code_review_bito You can customize the agent settings here or contact your Bito workspace admin at [email protected].

Documentation & Help

AI Code Review powered by Bito Logo

@Future-Outlier Future-Outlier merged commit ce003ed into flyteorg:master Mar 20, 2025
115 of 116 checks passed
chmod77 pushed a commit to chmod77/flytekit that referenced this pull request Mar 27, 2025
* refactor: Reuse get SSH connection method

Signed-off-by: JiangJiaWei1103 <[email protected]>

* Fix lint

Signed-off-by: JiangJiaWei1103 <[email protected]>

---------

Signed-off-by: JiangJiaWei1103 <[email protected]>
Signed-off-by: chmod77 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants