Description
SUMMARY
An Orquesta workflow gets stuck in the running state, and with-items behaves incorrectly (or at least differently from mistral, and it is not documented as being this way by design). In mistral, a task with with-items does not transition to another task until all items are processed, regardless of whether any of them fail in the middle of the loop. In orquesta, if the first item of the task fails, the engine immediately starts the task defined under the failed() condition. If all items succeed, orquesta behaves exactly like mistral: it processes all items first, then transitions to the next task. I think this is a bug, which in turn leads to the workflow never reaching a final state and staying stuck in the running state.
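To make the difference concrete, here is a minimal Python sketch of the two behaviors described above. It is only a model of what I observe, not code from either engine: the function names and the `exit_code_ok` helper are hypothetical, and the second function only records how many items had been processed when the transition fired (it makes no claim about whether the remaining items still run afterwards).

```python
from typing import Callable, List, Tuple


def exit_code_ok(item: str) -> bool:
    # Hypothetical stand-in for `exit <item>`: only exit code 0 is a success.
    return int(item) == 0


def transition_after_all_items(
    items: List[str], run: Callable[[str], bool]
) -> Tuple[str, int]:
    """Behavior described for mistral: process every item, then transition."""
    results = [run(item) for item in items]
    status = "succeeded" if all(results) else "failed"
    return status, len(results)


def transition_on_first_failure(
    items: List[str], run: Callable[[str], bool]
) -> Tuple[str, int]:
    """Behavior observed in orquesta: the failed() transition fires as soon
    as one item fails. Returns the status and the number of items that had
    been processed at the moment the transition fired."""
    processed = 0
    for item in items:
        processed += 1
        if not run(item):
            return "failed", processed
    return "succeeded", processed


if __name__ == "__main__":
    for items in (["1", "2"], ["0", "0"]):
        print(items, transition_after_all_items(items, exit_code_ok))
        print(items, transition_on_first_failure(items, exit_code_ok))
```

With the items from task_1 (`["1", "2"]`) the two models disagree on how many items were processed before the transition; with the items from task_2 (`["0", "0"]`) they agree, which matches what I see.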
STACKSTORM VERSION
Paste the output of st2 --version:
st2 3.2.0, on Python 2.7.5
OS, environment, install method
Post what OS you are running this on, along with any other relevant information.
- CentOS 7, installed using documentation https://docs.stackstorm.com/install/rhel7.html
Steps to reproduce the problem
Meta:
---
pack: "playground"
name: "wf_orquesta_stuck3"
description: "Orquesta workflow gets stuck in running bug, st2 v3.2.0"
runner_type: orquesta
enabled: true
entry_point: "workflows/wf_orquesta_stuck3.yaml"
Workflow:
---
version: '1.0'
tasks:
  init_task:
    action: core.noop
    next:
      - when: <% succeeded() %>
        do:
          - task_1
          - task_2
  task_1:
    with:
      items: i in <% ["1", "2"] %>
      concurrency: 1
    action: core.local
    input:
      cmd: "exit <% item(i) %>"
    next:
      - when: <% succeeded() or failed() %>
        do:
          - run_check_1
  task_2:
    with:
      items: i in <% ["0", "0"] %>
      concurrency: 1
    action: core.local
    input:
      cmd: "exit <% item(i) %>"
    next:
      - when: <% succeeded() or failed() %>
        do:
          - run_check_2
  run_check_1:
    with:
      items: i in <% ["0", "0"] %>
      concurrency: 1
    action: core.local
    input:
      cmd: "exit <% item(i) %>"
    next:
      - when: <% succeeded() %>
        do:
          - all_good
      - when: <% failed() %>
        do:
          - check_failed
  run_check_2:
    with:
      items: i in <% ["0", "0"] %>
      concurrency: 1
    action: core.local
    input:
      cmd: "exit <% item(i) %>"
    next:
      - when: <% succeeded() %>
        do:
          - all_good
      - when: <% failed() %>
        do:
          - check_failed
  all_good:
    join: all
    action: core.noop
  check_failed:
    action: core.noop
    next:
      - do:
          - fail
Expected Results
Both task_1 items are executed and fail, then both run_check_1 items are executed, and the workflow finally completes instead of getting stuck.
Actual Results
Right after the first task_1 item failed, the orquesta engine started the run_check_1 task without waiting for the other item in the list to be processed. For task_2, however, it did wait for both items to be processed before transitioning to run_check_2, which is what is expected in both cases. The workflow never completes.