Python Workflow Definition

Definition

In the Python Workflow Definition (PWD), each node represents a Python function, and the edges define how the outputs of one Python function connect to the inputs of another.

Format

Each workflow consists of three files: a Python module which defines the individual Python functions, a JSON file which defines the connections between the different Python functions, and a conda environment file which defines the software dependencies. The files are not intended to be human readable, but rather serve as a machine-readable exchange format between the different workflow engines to enable interoperability.
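As a minimal sketch of the three-file layout, the helper below reports which of the expected files are missing from a workflow directory. The helper name `missing_workflow_files` is hypothetical, and the file name `environment.yml` for the conda environment file is an assumption; this README only names `workflow.py` and `workflow.json` in its example.

```python
from pathlib import Path

# Assumed file names: workflow.py and workflow.json appear in the example
# in this README; environment.yml is an assumption for the conda file.
EXPECTED_FILES = ("workflow.py", "workflow.json", "environment.yml")


def missing_workflow_files(directory):
    """Return the expected file names that do not exist in `directory`."""
    directory = Path(directory)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]
```

An empty list means all three files are present, for example when calling `missing_workflow_files("example_workflows/arithmetic")` from a checkout of the repository.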

Installation

The Python Workflow Definition can be installed either from PyPI or via conda. For the PyPI installation use:

pip install python-workflow-definition

For the conda installation via the conda-forge community channel use:

conda install conda-forge::python-workflow-definition
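After installing, one way to confirm that the package is visible to the current Python environment is to query its distribution metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if the installation succeeded, otherwise None.
print(installed_version("python-workflow-definition"))
```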

Examples

Simple Example

As a first example we define two Python functions: one adds its two inputs, the other returns their product and quotient:

def get_sum(x, y):
    # Add the two inputs.
    return x + y


def get_prod_and_div(x: float, y: float) -> dict:
    # Return both results in a dict; the keys act as named output ports.
    return {"prod": x * y, "div": x / y}

These two Python functions are combined in the following example workflow:

def combined_workflow(x=1, y=2):
    tmp_dict = get_prod_and_div(x=x, y=y)
    return get_sum(x=tmp_dict["prod"], y=tmp_dict["div"])
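To see what the combined workflow computes, the functions can be called directly in plain Python. The sketch below repeats the definitions from above so that it runs on its own; the second call uses illustrative input values that are not part of the example workflow.

```python
def get_sum(x, y):
    return x + y


def get_prod_and_div(x: float, y: float) -> dict:
    return {"prod": x * y, "div": x / y}


def combined_workflow(x=1, y=2):
    tmp_dict = get_prod_and_div(x=x, y=y)
    return get_sum(x=tmp_dict["prod"], y=tmp_dict["div"])


# Defaults: prod = 1 * 2 = 2, div = 1 / 2 = 0.5, sum = 2.5
default_result = combined_workflow()
# Illustrative inputs: prod = 8, div = 2.0, sum = 10.0
other_result = combined_workflow(x=4, y=2)
print(default_result, other_result)
```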

For the workflow representation, the Python functions are stored in the example_workflows/arithmetic/workflow.py Python module. The connections between the Python functions are stored in the example_workflows/arithmetic/workflow.json JSON file:

{
  "nodes": [
    {"id": 0, "type": "function", "value": "workflow.get_prod_and_div"},
    {"id": 1, "type": "function", "value": "workflow.get_sum"},
    {"id": 2, "type": "input", "value": 1, "name": "x"},
    {"id": 3, "type": "input", "value": 2, "name": "y"},
    {"id": 4, "type": "output", "name": "result"}
  ],
  "edges": [
    {"target": 0, "targetPort": "x", "source": 2, "sourcePort": null},
    {"target": 0, "targetPort": "y", "source": 3, "sourcePort": null},
    {"target": 1, "targetPort": "x", "source": 0, "sourcePort": "prod"},
    {"target": 1, "targetPort": "y", "source": 0, "sourcePort": "div"},
    {"target": 4, "targetPort": null, "source": 1, "sourcePort": null}
  ]
}
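To illustrate how this JSON can be interpreted, the following sketch evaluates the graph in pure Python. It is not the package's API and not how any of the workflow engines execute it; the `evaluate` function and the `FUNCTIONS` registry are hypothetical. The registry stands in for importing the functions from the workflow.py module so that the example is self-contained.

```python
import json


def get_sum(x, y):
    return x + y


def get_prod_and_div(x, y):
    return {"prod": x * y, "div": x / y}


# Hypothetical registry replacing an import of the workflow.py module.
FUNCTIONS = {
    "workflow.get_prod_and_div": get_prod_and_div,
    "workflow.get_sum": get_sum,
}

WORKFLOW_JSON = """
{
  "nodes": [
    {"id": 0, "type": "function", "value": "workflow.get_prod_and_div"},
    {"id": 1, "type": "function", "value": "workflow.get_sum"},
    {"id": 2, "type": "input", "value": 1, "name": "x"},
    {"id": 3, "type": "input", "value": 2, "name": "y"},
    {"id": 4, "type": "output", "name": "result"}
  ],
  "edges": [
    {"target": 0, "targetPort": "x", "source": 2, "sourcePort": null},
    {"target": 0, "targetPort": "y", "source": 3, "sourcePort": null},
    {"target": 1, "targetPort": "x", "source": 0, "sourcePort": "prod"},
    {"target": 1, "targetPort": "y", "source": 0, "sourcePort": "div"},
    {"target": 4, "targetPort": null, "source": 1, "sourcePort": null}
  ]
}
"""


def evaluate(workflow, functions):
    """Evaluate every output node by recursively resolving incoming edges."""
    nodes = {node["id"]: node for node in workflow["nodes"]}
    cache = {}

    def node_value(node_id):
        if node_id in cache:
            return cache[node_id]
        node = nodes[node_id]
        if node["type"] == "input":
            result = node["value"]
        else:
            # Collect one value per incoming edge, keyed by the target port.
            args = {}
            for edge in workflow["edges"]:
                if edge["target"] != node_id:
                    continue
                value = node_value(edge["source"])
                if edge["sourcePort"] is not None:
                    value = value[edge["sourcePort"]]
                args[edge["targetPort"]] = value
            if node["type"] == "function":
                result = functions[node["value"]](**args)
            else:
                # Output node: a single incoming edge with targetPort null.
                result = args[None]
        cache[node_id] = result
        return result

    return {
        node["name"]: node_value(node["id"])
        for node in workflow["nodes"]
        if node["type"] == "output"
    }


results = evaluate(json.loads(WORKFLOW_JSON), FUNCTIONS)
print(results)
```

With the input values 1 and 2 from the JSON, the `result` output is 2.5, which matches calling `combined_workflow()` directly.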

The keys in each edge definition are:

  • target - ID of the target node
  • targetPort - target port. For a node with multiple input parameters, the target port specifies which input parameter receives the value.
  • source - ID of the source node
  • sourcePort - source port. For a node with multiple output parameters, the source port specifies which output parameter to use.

In the example above, a port is null when the node has only a single input or output value, as for the input and output nodes.
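Because every edge points from a source node to a target node, the edges alone are enough to derive a valid execution order. The sketch below does this with `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or newer); the helper name `execution_order` is hypothetical, and the edge list is the one from the arithmetic example with `null` written as `None`.

```python
from graphlib import TopologicalSorter

edges = [
    {"target": 0, "targetPort": "x", "source": 2, "sourcePort": None},
    {"target": 0, "targetPort": "y", "source": 3, "sourcePort": None},
    {"target": 1, "targetPort": "x", "source": 0, "sourcePort": "prod"},
    {"target": 1, "targetPort": "y", "source": 0, "sourcePort": "div"},
    {"target": 4, "targetPort": None, "source": 1, "sourcePort": None},
]


def execution_order(edges):
    """Return node IDs ordered so that every source precedes its targets."""
    predecessors = {}
    for edge in edges:
        predecessors.setdefault(edge["target"], set()).add(edge["source"])
    return list(TopologicalSorter(predecessors).static_order())


order = execution_order(edges)
print(order)
```

The input nodes 2 and 3 come before node 0, node 0 before node 1, and node 1 before the output node 4. The relative order of nodes 2 and 3 is not fixed, since neither depends on the other.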

The workflow does not require any additional resources, as it only uses built-in functionality of the Python standard library.

The corresponding Jupyter notebooks demonstrate this functionality:

| Example | Explanation |
|---|---|
| example_workflows/arithmetic/aiida.ipynb | Define Workflow with aiida and execute it with jobflow and pyiron_base. |
| example_workflows/arithmetic/jobflow.ipynb | Define Workflow with jobflow and execute it with aiida and pyiron_base. |
| example_workflows/arithmetic/pyiron_base.ipynb | Define Workflow with pyiron_base and execute it with aiida and jobflow. |
| example_workflows/arithmetic/universal_workflow.ipynb | Execute workflow defined in the Python Workflow Definition with aiida, executorlib, jobflow, pyiron_base and pure Python. |

Quantum Espresso Workflow

The second workflow example is the calculation of an energy-volume curve with Quantum Espresso. In the first step the initial structure is relaxed; afterwards it is strained and the total energy is calculated.

| Example | Explanation |
|---|---|
| example_workflows/quantum_espresso/aiida.ipynb | Define Workflow with aiida and execute it with jobflow and pyiron_base. |
| example_workflows/quantum_espresso/jobflow.ipynb | Define Workflow with jobflow and execute it with aiida and pyiron_base. |
| example_workflows/quantum_espresso/pyiron_base.ipynb | Define Workflow with pyiron_base and execute it with aiida and jobflow. |
| example_workflows/quantum_espresso/universal_workflow.ipynb | Execute workflow defined in the Python Workflow Definition with aiida, executorlib, jobflow, pyiron_base and pure Python. |

NFDI4Ing Scientific Workflow Requirements

To demonstrate the compatibility of the Python Workflow Definition with file-based workflows, the workflow benchmark developed as part of NFDI4Ing is implemented for all three workflow frameworks (aiida, jobflow and pyiron_base) based on a shared workflow definition.

Additional source files provided with the workflow benchmark:

| Example | Explanation |
|---|---|
| example_workflows/nfdi/aiida.ipynb | Define Workflow with aiida and execute it with jobflow and pyiron_base. |
| example_workflows/nfdi/jobflow.ipynb | Define Workflow with jobflow and execute it with aiida and pyiron_base. |
| example_workflows/nfdi/pyiron_base.ipynb | Define Workflow with pyiron_base and execute it with aiida and jobflow. |
| example_workflows/nfdi/universal_workflow.ipynb | Execute workflow defined in the Python Workflow Definition with aiida, executorlib, jobflow, pyiron_base and pure Python. |

About

Python Workflow Definition - workflow interoperability for aiida, jobflow and pyiron
