Orquesta workflow gets stuck in running state #5029

Closed
StackStorm/orquesta
#213
@igcherkaev

SUMMARY

An Orquesta workflow gets stuck in the running state, and with-items works incorrectly (or at least differently from Mistral, and it is not documented that this is by design). In Mistral, a task with with-items does not transition to another task until all items are processed, regardless of whether any of them fail in the middle of the loop. In Orquesta, if the first item fails, the engine immediately starts the task that's defined under the failed() condition. If all items succeed, Orquesta works exactly like Mistral: it processes all items first, then transitions to the next task. I think this is a bug, which in turn leads to the workflow never reaching a final state and staying stuck in the running state.
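To make the difference concrete, here is a toy model of the two with-items behaviours described above. It is only a sketch of the reported semantics, not Orquesta or Mistral code; the function name `run_items` and the `wait_for_all` flag are hypothetical and exist only for illustration.

```python
# Toy model of the two with-items behaviours described in this issue.
# NOT Orquesta or Mistral code; run_items and wait_for_all are hypothetical names.

def run_items(exit_codes, wait_for_all):
    """Process items one at a time (like concurrency: 1).

    Returns (items_done_when_next_task_starts, task_status).
    """
    done = 0
    any_failed = False
    for code in exit_codes:
        done += 1
        if code != 0:
            any_failed = True
            if not wait_for_all:
                # Behaviour reported here for Orquesta: the transition
                # fires as soon as one item fails.
                return done, "failed"
    # Behaviour described for Mistral: the transition fires only after
    # every item has been processed.
    return done, "failed" if any_failed else "succeeded"


# task_1 from the workflow below uses exit codes 1 and 2.
print(run_items([1, 2], wait_for_all=True))
print(run_items([1, 2], wait_for_all=False))
# task_2 uses exit codes 0 and 0, so both behaviours agree.
print(run_items([0, 0], wait_for_all=False))
```

With `wait_for_all=False` the model hands over to the next task after a single item, which matches what I see for task_1; with all-zero exit codes both modes process every item, which matches task_2.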

STACKSTORM VERSION

Paste the output of st2 --version:

st2 3.2.0, on Python 2.7.5

OS, environment, install method

Post what OS you are running this on, along with any other relevant information.

Steps to reproduce the problem

Meta:

---
pack: "playground"
name: "wf_orquesta_stuck3"
description: "Orquesta workflow gets stuck in running bug, st2 v3.2.0"
runner_type: orquesta
enabled: true
entry_point: "workflows/wf_orquesta_stuck3.yaml"

Workflow:

---
version: '1.0'

tasks:
  init_task:
    action: core.noop
    next:
      - when: <% succeeded() %>
        do:
          - task_1
          - task_2
  task_1:
    with:
      items: i in <% ["1", "2"] %>
      concurrency: 1
    action: core.local
    input:
      cmd: "exit <% item(i) %>"
    next:
      - when: <% succeeded() or failed() %>
        do:
          - run_check_1

  task_2:
    with:
      items: i in <% ["0", "0"] %>
      concurrency: 1
    action: core.local
    input:
      cmd: "exit <% item(i) %>"
    next:
      - when: <% succeeded() or failed() %>
        do:
          - run_check_2

  run_check_1:
    with:
      items: i in <% ["0", "0"] %>
      concurrency: 1
    action: core.local
    input:
      cmd: "exit <% item(i) %>"
    next:
      - when: <% succeeded() %>
        do:
          - all_good
      - when: <% failed() %>
        do:
          - check_failed

  run_check_2:
    with:
      items: i in <% ["0", "0"] %>
      concurrency: 1
    action: core.local
    input:
      cmd: "exit <% item(i) %>"
    next:
      - when: <% succeeded() %>
        do:
          - all_good
      - when: <% failed() %>
        do:
          - check_failed

  all_good:
    join: all
    action: core.noop

  check_failed:
    action: core.noop
    next:
      - do:
          - fail

Expected Results

Both task_1 items are executed and fail, then both run_check_1 items are executed, and finally the workflow completes instead of getting stuck.

Actual Results

[Screenshot of the workflow execution]

As the screenshot shows, right after task_1 failed, the Orquesta engine started the run_check_1 task without waiting for the other item in the list to be processed. In the case of task_2, however, it did wait for both items to be processed before transitioning to run_check_2, which is what I expected in both cases. And this workflow never completes.
