Workflow guidelines version 1 #1
Conversation
**Note**
Most of this is avoided by following S.2.
A bad example would be to write the total raw counts to the output metadata, as this would require keeping the large data alive in memory, unless it is ensured that the task runs early.
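To make the anti-pattern concrete, here is a minimal sketch, assuming a sciline-style workflow (each function a scheduled task) and scipp-style data; all provider names are hypothetical:

```python
from typing import NewType

import scipp as sc

RawData = NewType('RawData', sc.DataArray)
TotalCounts = NewType('TotalCounts', float)

# Anti-pattern: the metadata provider consumes the full raw data directly,
# so the scheduler must keep the large array alive until metadata is written.
def metadata_from_raw(raw: RawData) -> dict:
    return {'total_counts': raw.sum().value}

# If such a summary is needed, computing it in an early, cheap task lets the
# large raw array be released once downstream tasks have consumed it.
def total_counts(raw: RawData) -> TotalCounts:
    return TotalCounts(raw.sum().value)
```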
But how do we do this instead? This note just forbids writing certain things as metadata. But that cannot always be avoided, e.g., in reflectometry we need to write wavelength and theta ranges.
There will be exceptions; not everything can be "avoided". This is why I chose this formulation instead of making it stricter.
Then I'd like clarification about how strict these guidelines are. As it is written, it sounds like a command. So if this one may be violated, can all rules be broken if we can argue for it?
Time will tell? I think we should be agile and avoid over-analyzing for now, see what works in practice and what does not, then iterate and improve. "Avoid" does not sound like a command to me?
Also, in the case you mentioned, shouldn't the wavelength and theta ranges report the used ranges, instead of those in the raw data? I presume the workflow may limit this, and in any case it can be computed from derived data instead of raw data, limiting the problem.
> Also, in the case you mentioned, shouldn't the wavelength and theta ranges report the used ranges, instead of those in the raw data? I presume the workflow may limit this, and in any case it can be computed from derived data instead of raw data, limiting the problem.
Possibly. The current workflow does not allow limiting theta. Also, what if the bounds are much larger than the data? I'd want to report the limits of the data that is actually used instead.
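For example, a provider reporting the limits of the data actually used could read them off the derived data rather than the raw input. A hedged sketch, assuming scipp-style coordinates (names are hypothetical):

```python
import scipp as sc

def used_wavelength_range(reduced: sc.DataArray) -> tuple[float, float]:
    """Wavelength limits of the data actually entering the result,
    computed from derived data instead of the raw input."""
    wav = reduced.coords['wavelength']
    return wav.min().value, wav.max().value
```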
**Reason**
Unless explicitly computed, the exact propagation of uncertainties in broadcast operations is not well-defined.
Dropping uncertainties is not desirable in general, as it may lead to underestimation of the uncertainties, but we realize that the upper-bound approach may not be suitable in all cases.
We should therefore support two strategies, "drop" and "upper-bound", and "upper-bound" should be the default.
Is this really something the user should toggle? How can the upper-bound strategy be valid for some data but not others in the same workflow?
I am pretty sure that at least the IDS or IS will want to toggle this.
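To make the two strategies concrete, here is a minimal numpy sketch (the function name and signature are hypothetical; the "upper-bound" scaling shown treats broadcast copies as fully correlated, which gives an upper bound when the copies are later summed):

```python
import numpy as np

def broadcast_variances(values, variances, n, strategy='upper-bound'):
    """Broadcast 1-D data with variances along a new axis of length n.

    'drop' discards the variances. 'upper-bound' scales each copy's variance
    by the number of copies: summing n fully correlated terms of variance v
    gives n**2 * v, which is reproduced by quadrature-summing n copies that
    each carry n * v.
    """
    bvalues = np.broadcast_to(values, (n, len(values)))
    if strategy == 'drop':
        return bvalues, None
    if strategy == 'upper-bound':
        return bvalues, np.broadcast_to(variances * n, (n, len(values)))
    raise ValueError(f'unknown strategy: {strategy}')
```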
**Reason** Unit tests for providers are easier to write and maintain than for entire workflows.
**Note** This does not mean that we should not write tests for entire workflows, but that we should also write tests for providers.
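As an illustration, a provider-level unit test reduces to calling a plain function. A sketch assuming pytest and numpy (the provider itself is hypothetical):

```python
import numpy as np

# Hypothetical provider: normalize detector counts by a monitor spectrum.
def normalize_by_monitor(counts: np.ndarray, monitor: np.ndarray) -> np.ndarray:
    return counts / monitor

def test_normalize_by_constant_monitor():
    counts = np.array([2.0, 4.0, 6.0])
    monitor = np.array([2.0, 2.0, 2.0])
    np.testing.assert_allclose(
        normalize_by_monitor(counts, monitor), [1.0, 2.0, 3.0]
    )
```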
How about a guideline for structuring the test directory? And for how to select whether to run integration tests or not. If they take a long time to run, it would be useful to be able to easily deselect them.
I think splitting unit tests from integration tests (or something like that) is a great suggestion! As we have not done that before, maybe we can add it in a future version, once we have tried it out in a project?
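One possible mechanism, assuming pytest is used, is a marker that integration tests opt into (a sketch, not an adopted convention; the marker name is an assumption):

```python
import pytest

# The marker would be registered in pytest.ini or pyproject.toml, e.g.:
#   [pytest]
#   markers = integration: slow end-to-end workflow tests

@pytest.mark.integration  # hypothetical marker for slow end-to-end tests
def test_full_reduction_workflow():
    ...  # run the complete workflow on a small input file
```

Slow tests could then be deselected with `pytest -m "not integration"`.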
Co-authored-by: Jan-Lukas Wynen <[email protected]>
@@ -0,0 +1,139 @@
# Reduction Workflow Guidelines |
Nothing to add from my side 👍
Co-authored-by: Jan-Lukas Wynen <[email protected]>
Add initial version of workflow guidelines.