fix: Workflow is Error but taskset node is not Error when Agent pod failed. Fixes #14200 #14212

Open: wants to merge 4 commits into base: main


Tuilot commented Feb 20, 2025

Fixes #14200

Motivation

  1. When the agent pod fails, the workflow node statuses and the taskset node statuses are all left non-fulfilled.
  2. Fix: mark all non-fulfilled taskset nodes as Error, then continue reconciling the taskset.

Modifications

Skip reconciling taskset when all taskset nodes are fulfilled

Verification

Covered by the unit test TestHTTPTemplateWhenAgentPodFailed.

Documentation

jswxstw (Member) commented Feb 21, 2025

The CI / Unit Tests check is failing. There is a bug in TestReconcileTaskSetWithMemoization:

err = woc.reconcileTaskSet(ctx)

I think the test should call woc.operate(ctx) here instead of woc.reconcileTaskSet(ctx).

Signed-off-by: tuilot <[email protected]>
Tuilot (Author) commented Feb 22, 2025

@jswxstw I fixed it, can you review it?

jswxstw (Member) commented Feb 24, 2025

> @jswxstw I fixed it, can you review it?

I left a comment earlier: I think there is no need to call getWorkflowTaskSet if there is no taskSet to reconcile.

Development

Successfully merging this pull request may close these issues.

Workflow is Error but taskset node is not Error when Agent pod failed
2 participants