Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guideline rules on name and type conventions #9

Merged
merged 6 commits into from
Mar 11, 2024
Merged

Guideline rules on name and type conventions #9

merged 6 commits into from
Mar 11, 2024

Conversation

jl-wynen
Copy link
Member

@jl-wynen jl-wynen commented Mar 4, 2024

Fixes #7

Please flag any and all names / types that you don't think are appropriate! And suggest any that I missed.

@@ -43,6 +40,56 @@ We plan to include the following in future versions of the guidelines:

- *Provider*: A callable step in a workflow writing with Sciline.

## C: Convention
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please either remove the empty Naming below if you prefer Convention, or move this.

Comment on lines 58 to 64
| --- **Run IDs** --- | | |
| SampleRun, BackgroundRun, ... | Any | Identifier for a run |
| RunType | TypeVar | Constrained to the run types used by the package, see above |
| RunTitle | str | Extracted from NeXus or provided by user, can be used to find files |
| --- **Monitors** --- | | |
| IncidentMonitor \| *Monitor | Any | Identifier for a monitor |
| MonitorType | TypeVar | Constrained to the monitor types used by the package, see above |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we assume that a workflow necessarily wants/needs to use generics for runs types and monitor types. Is that a good assumption? Or should we consider formulating it as "if using generics for X, use this convention"?

Should we maybe split C.1 to discuss naming for generics (and their typevars) separately?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we assume that a workflow necessarily wants/needs to use generics for runs types and monitor types. Is that a good assumption? Or should we consider formulating it as "if using generics for X, use this convention"?

I don't understand what you are saying. When we don't use generics, the SampleRun, ... and RunType simply don't exist. This table doesn't mandate using all entries, it's supposed to tell us how to name them if and when we need them.

Should we maybe split C.1 to discuss naming for generics (and their typevars) separately?

Why? How would you structure that table?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you are saying. When we don't use generics, the SampleRun, ... and RunType simply don't exist.

Say I want to write a workflow that process a sample run and a vanadium run, without use of generics. Which names do I use?

I am suggesting to make two guidelines (each with their own table), or have a separate guideline specifying conventions for naming typevars and identifiers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle, this is already part of it. Simply ignore the Run IDs section and use the '*Filename'-type patterns. Do you think we need to be more explicit and list {Sample,Vanadium}Filename, {Sample,Vanadium}RawData, etc?

Copy link
Member

@SimonHeybrock SimonHeybrock Mar 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the reader is not familiar with using generic providers in a workflow then the current table is quite unclear. Please make a separate guideline. We do not need to make an explicit list.

Comment on lines +90 to +122
- Gracefully promote dtypes for small parameters.
E.g., `sc.scalar(2, unit='m')` and `sc.scalar(2.0, unit='m')` should be usable interchangeably.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
Should we require the same, e.g., for params like bin-edges such that, e.g., passing arange works just like passing linspace?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't it work the same way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Operations may raise when ints are passed instead of floats as coords (bin edges, for example). I am suggesting extending your suggestion for graceful handling.

|-------------------------------------------|---------|-----------------------------------------------------------------|
| --- **Run IDs** --- | | |
| SampleRun, BackgroundRun, ... | Any | Identifier for a run, only used as a type tag |
| RunType | TypeVar | Constrained to the run types used by the package, see above |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is the rule to end typevars with Type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know in general. I've only seen those two in practice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SimonHeybrock Do we need to do anything about this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make this a rule? I think it is quite confusing to know what is a typevar and what is and ID otherwise?

| Name | Type | Description |
|-----------------------------|-------------|---------------------------------------------------------------------------|
| --- **Files** --- | | |
| Filename \| *Filename | str | Simple name of a file, must be processed into FilePath |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Filename even exist? If the file is obtained from SciCat, we only have FilePath, I suppose. Otherwise, can we replace Filename by, e.g., and instrument identifier and a run number?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SciCat datasets can contain any number of files. So we need a pair or (dataset id, file id). And the file id can only really be a file name.

@jl-wynen jl-wynen merged commit 8bf9dcf into main Mar 11, 2024
3 checks passed
@jl-wynen jl-wynen deleted the conventions branch March 11, 2024 13:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Naming conventions
2 participants