Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workflow guidelines version 1 #1

Merged
merged 8 commits into from
Feb 26, 2024
Merged

Workflow guidelines version 1 #1

merged 8 commits into from
Feb 26, 2024

Conversation

SimonHeybrock
Copy link
Member

@SimonHeybrock SimonHeybrock commented Feb 22, 2024

Add initial version of workflow guidelines.

docs/user-guide/reduction-workflow-guidelines.md Outdated Show resolved Hide resolved

**Note**
Most of this is avoid by following S.2.
A bad exampleAvoid would be to write to total raw counts to the output metadata, as this would require keeping the large data alive in memory, unless it is ensured that the task runs early.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But how do we do this instead? This note just forbids writing certain things as metadata. But that cannot be avoided, e.g., in reflectometry as we need to write wavelength and theta ranges.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will be exceptions, not everything can be "avoided", this is why I chose this formulation instead of making it stricter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I'd like clarification about how strict these guidelines are. As it is written, it sounds like a command. So if this one may be violated, can all rules be broken if we can argue for it?

Copy link
Member Author

@SimonHeybrock SimonHeybrock Feb 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time will tell? I think we should be agile and avoid over-analyzing for now, see what works in practice and what does not, then iterate and improve. "Avoid" does not sound like a command to me?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, in the case you mentioned, shouldn't the wavelength and theta ranges report the used ranges, instead of those in the raw data? I presume the workflow may limit this, and in any case it can be computed from derived data instead of raw data, limiting the problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, in the case you mentioned, shouldn't the wavelength and theta ranges report the used ranges, instead of those in the raw data? I presume the workflow may limit this, and in any case it can be computed from derived data instead of raw data, limiting the problem.

Possibly. The current workflow does not allow limiting theta. Also, what if the bounds are much larger than the data? I'd want to report the limits of the data that is actually used instead.

docs/user-guide/reduction-workflow-guidelines.md Outdated Show resolved Hide resolved
docs/user-guide/reduction-workflow-guidelines.md Outdated Show resolved Hide resolved
docs/user-guide/reduction-workflow-guidelines.md Outdated Show resolved Hide resolved
docs/user-guide/reduction-workflow-guidelines.md Outdated Show resolved Hide resolved
**Reason**
Unless explicitly computed, the exact propagation of uncertainties in broadcast operations is not well-defined.
Dropping uncertainties is not desireable in general, as it may lead to underestimation of the uncertainties, but we realize that the upper-bound approach may not be suitable in all cases.
We should therefore support two strategies, "drop" and "upper-bound", and "upper-bound" should be the default.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really something the user should toggle? How can the upper-bound strategy be valid for some data but not other for the same workflow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pretty sure that at least the IDS or IS will want to toggle this.

docs/user-guide/reduction-workflow-guidelines.md Outdated Show resolved Hide resolved
docs/user-guide/reduction-workflow-guidelines.md Outdated Show resolved Hide resolved

**Reason** Unit tests for providers are easier to write and maintain than for entire workflows.

**Note** This does not mean that we should not write tests for entire workflows, but that we should also write tests for providers.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a guideline for structuring the test directory? And for how to select whether to run integration tests or not. If they take a long time to run, it would be useful to be able to easily deselect them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think splitting unit tests from integration tests (or something like that) is a great suggestion! As we have not done that before, maybe we can add it in a future version, once we have tried it out in a project?

@@ -0,0 +1,139 @@
# Reduction Workflow Guidelines
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing to add from my side 👍

@SimonHeybrock
Copy link
Member Author

@jl-wynen @nvaytet I think I addressed all comments. I have cleaned this up and suggest releasing a first agreed-upon version such that we can begin implementation. I included a listing of the remaining items which we can add subsequently.

@SimonHeybrock SimonHeybrock changed the title Add draft of workflow guidelines Workflow guidelines version 1 Feb 26, 2024
@SimonHeybrock SimonHeybrock merged commit b15bcb4 into main Feb 26, 2024
4 checks passed
@SimonHeybrock SimonHeybrock deleted the guidelines branch February 26, 2024 10:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants