[Suggestion] Dealing with the data node attribute "name" #105

GeigerJ2 opened this issue May 15, 2025 · 0 comments

Currently, on the route PWD -> WfMS -> PWD, the second step regenerates the "name"s of input nodes from the function arguments. This is done by the write_workflow_json functions, which use these helpers, e.g., see here:

def update_node_names(workflow_dict: dict) -> dict:
    node_names_final_dict = {}
    input_nodes = [n for n in workflow_dict[NODES_LABEL] if n["type"] == "input"]
    node_names_dict = {
        n["id"]: list(
            set(
                [
                    e[TARGET_PORT_LABEL]
                    for e in workflow_dict[EDGES_LABEL]
                    if e[SOURCE_LABEL] == n["id"]
                ]
            )
        )[0]
        for n in input_nodes
    }
    counter_dict = Counter(node_names_dict.values())
    node_names_useage_dict = {k: -1 for k in counter_dict.keys()}
    for k, v in node_names_dict.items():
        node_names_useage_dict[v] += 1
        if counter_dict[v] > 1:
            node_names_final_dict[k] = v + "_" + str(node_names_useage_dict[v])
        else:
            node_names_final_dict[k] = v
    for n in workflow_dict[NODES_LABEL]:
        if n["type"] == "input":
            n["name"] = node_names_final_dict[n["id"]]
    return workflow_dict


def set_result_node(workflow_dict):
    node_id_lst = [n["id"] for n in workflow_dict[NODES_LABEL]]
    source_lst = list(set([e[SOURCE_LABEL] for e in workflow_dict[EDGES_LABEL]]))
    end_node_lst = []
    for ni in node_id_lst:
        if ni not in source_lst:
            end_node_lst.append(ni)
    node_id = len(workflow_dict[NODES_LABEL])
    workflow_dict[NODES_LABEL].append(
        {"id": node_id, "type": "output", "name": "result"}
    )
    workflow_dict[EDGES_LABEL].append(
        {
            TARGET_LABEL: node_id,
            TARGET_PORT_LABEL: None,
            SOURCE_LABEL: end_node_lst[0],
            SOURCE_PORT_LABEL: None,
        }
    )
    return workflow_dict

One can still, in principle, modify the "name"s in the PWD. However, when one starts with this PWD:

{
  "version": "0.0.1",
  "nodes": [
    ...
    {"id": 3, "type": "input", "value": 1, "name": "a"},
    ...
  ],
  "edges": [
    {"target": 0, "targetPort": "x", "source": 3, "sourcePort": null},
    ...
  ]
}

loads it into the WfMS and then exports it again to PWD, the result will be this:

{
  "version": "0.0.1",
  "nodes": [
    ...
    {"id": 3, "type": "input", "value": 1, "name": "x"},   <-- This got replaced with the value from the targetPort
    ...
  ],
  "edges": [
    {"target": 0, "targetPort": "x", "source": 3, "sourcePort": null},
    ...
  ]
}
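
The renaming can be reproduced without any WfMS. The following is a minimal sketch, not the library code: `regenerate_input_names` is a hypothetical, simplified stand-in for `update_node_names` (it sorts the ports to pick one deterministically, whereas the quoted helper takes the first element of a set). The key names are taken from the JSON above, and the function node with id 0 and its `value` are assumptions added only to make the example complete.

```python
import copy
from collections import Counter

# Key names as they appear in the PWD JSON above.
NODES_LABEL = "nodes"
EDGES_LABEL = "edges"
SOURCE_LABEL = "source"
TARGET_PORT_LABEL = "targetPort"


def regenerate_input_names(workflow_dict: dict) -> dict:
    """Hypothetical, simplified stand-in for update_node_names.

    Each input node is renamed after a target port it feeds; the
    name stored in the input PWD is ignored entirely.
    """
    result = copy.deepcopy(workflow_dict)
    port_by_id = {}
    for node in result[NODES_LABEL]:
        if node["type"] == "input":
            ports = sorted(
                {
                    edge[TARGET_PORT_LABEL]
                    for edge in result[EDGES_LABEL]
                    if edge[SOURCE_LABEL] == node["id"]
                }
            )
            port_by_id[node["id"]] = ports[0]
    counts = Counter(port_by_id.values())
    seen = Counter()
    for node in result[NODES_LABEL]:
        if node["type"] == "input":
            port = port_by_id[node["id"]]
            if counts[port] > 1:
                # Duplicate port names get a numeric suffix: x_0, x_1, ...
                node["name"] = f"{port}_{seen[port]}"
                seen[port] += 1
            else:
                node["name"] = port
    return result


pwd_in = {
    "version": "0.0.1",
    "nodes": [
        # Assumed function node; the value is a placeholder.
        {"id": 0, "type": "function", "value": "my_module.my_function"},
        {"id": 3, "type": "input", "value": 1, "name": "a"},
    ],
    "edges": [
        {"target": 0, "targetPort": "x", "source": 3, "sourcePort": None},
    ],
}

pwd_out = regenerate_input_names(pwd_in)
print(pwd_in["nodes"][1]["name"], "->", pwd_out["nodes"][1]["name"])  # a -> x
```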

Hence, in the PWD JSON schema, the "name" attributes of input nodes are largely redundant with the port information: as said above, they can still be added or modified, but they are not respected by the WfMS and do not persist across interconversions. pyiron and jobflow don't support global workflow inputs and outputs (yet); instead, these are derived from the arguments of the first function(s) in the workflow. So a proper implementation is not possible at this point, but one could probably still add some helper functions instead.
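
One possible shape for such a helper is sketched below. It is purely illustrative: `restore_input_names` is a hypothetical name, not an existing function, and the sketch assumes that node ids stay stable across the round trip (as id 3 does in the example above), which may not hold in general.

```python
import copy


def restore_input_names(original: dict, exported: dict) -> dict:
    """Hypothetical helper: copy user-chosen input names back by node id.

    Assumes input nodes keep the same "id" in the original and in the
    re-exported PWD; nodes without a match are left untouched.
    """
    original_names = {
        node["id"]: node["name"]
        for node in original["nodes"]
        if node["type"] == "input" and "name" in node
    }
    restored = copy.deepcopy(exported)
    for node in restored["nodes"]:
        if node["type"] == "input" and node["id"] in original_names:
            node["name"] = original_names[node["id"]]
    return restored


# Reduced versions of the two PWDs shown above (only the relevant node).
original_pwd = {
    "version": "0.0.1",
    "nodes": [{"id": 3, "type": "input", "value": 1, "name": "a"}],
    "edges": [{"target": 0, "targetPort": "x", "source": 3, "sourcePort": None}],
}
exported_pwd = {
    "version": "0.0.1",
    "nodes": [{"id": 3, "type": "input", "value": 1, "name": "x"}],
    "edges": [{"target": 0, "targetPort": "x", "source": 3, "sourcePort": None}],
}

restored_pwd = restore_input_names(original_pwd, exported_pwd)
print(restored_pwd["nodes"][0]["name"])  # a
```

Such a helper only papers over the round trip from the outside; the WfMS itself would still see the port-derived names.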

As the input/output "name" attributes don't break the code and the workflows can still be executed either way, we keep them in the PWD for now and postpone a proper implementation.

jan-janssen changed the title from Dealing with the data node attribute "name" to [Suggestion] Dealing with the data node attribute "name" on May 24, 2025