New directives syntax #819
Conversation
Supports 3 node types:
- shared: 1 node or less
- mixed: any number of tasks
- wholenode: whole node increments
Also moves threshold to the _PartitionConfig class and changes the gpus_per_node default to 0.
This is useful for various kinds of group/bundle resource aggregation.
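To make the node types concrete, here is a rough, hypothetical sketch (the function and field names are assumptions, not necessarily what _PartitionConfig in this PR does) of how a node count could follow from an aggregate CPU request:

```python
# Hypothetical sketch only; names are illustrative, not flow's actual API.
import math

def nodes_for(node_type, total_cpus, cpus_per_node):
    if node_type == "shared":
        # Shared partitions: the request must fit on a single node or less.
        if total_cpus > cpus_per_node:
            raise ValueError("shared partitions allow at most one node")
        return 1
    if node_type == "wholenode":
        # Whole-node partitions: round up to whole-node increments.
        return math.ceil(total_cpus / cpus_per_node)
    # "mixed": any number of tasks; the scheduler packs them as it sees fit.
    return math.ceil(total_cpus / cpus_per_node)

print(nodes_for("wholenode", 100, 64))  # -> 2
```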
We use these directives internally to help with scheduling. Even when used in templates, specifying the per-process values first allows more control over the resource request, and computing the totals later is cheap and simple.
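For illustration, a small hedged example (directive names assumed from the per-process scheme discussed here) of how cheaply the totals fall out once the per-process values are known:

```python
# Illustrative only; the exact directive names are defined by this PR.
directives = {"processes": 4, "threads_per_process": 2, "gpus_per_process": 1}

directives["cpus"] = directives["processes"] * directives["threads_per_process"]  # 8
directives["gpus"] = directives["processes"] * directives["gpus_per_process"]     # 4
```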
Uses new directives and -per-task options.
directives["gpus"] = directives["processes"] * directives["gpus_per_process"] | ||
if (memory := directives["memory_per_cpu"]) is not None: | ||
directives["memory"] = directives["cpus"] * memory | ||
else: |
Georgia Tech's Slurm scheduler also uses "memory_per_gpu", so if this is the appropriate place, do we need to add the below here also?

elif (memory := directives["memory_per_gpu"]) is not None:
    directives["memory"] = directives["gpus"] * memory
memory_per_gpu is not a directive. In this new scheme, processes is the base unit on which gpus and memory are based. If your users need a specific amount of memory per GPU, they could set:

directives={'processes': N, 'gpus_per_process': G, 'memory_per_process': memory_per_gpu * G}
Wait, I got that wrong. The directive is (currently) memory_per_cpu, so you would have to calculate that based on the number of CPUs per GPU.

I originally suggested memory_per_cpu because that provided a 1:1 mapping to SLURM. However, @b-butler explained that the complexities of bundling and groups essentially require computing an aggregate memory and then dividing to issue the proper SLURM request. Thus, we could change the user-facing directive to memory_per_process.

In either case, the information will be there for Georgia Tech's template to produce --memory-per-gpu if that is a hard requirement.
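For example, a hedged sketch (key names assumed, not flow's actual internals) of how a cluster template could derive a per-GPU value from the aggregated totals:

```python
# Illustrative only: convert aggregated totals back to a per-GPU request.
directives = {"gpus": 4, "memory": 64.0}  # total memory in GB, already aggregated

if directives["gpus"]:
    memory_per_gpu = directives["memory"] / directives["gpus"]  # 16.0 GB per GPU
```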
--memory-per-gpu is a variable in Georgia Tech's submission script. We can set it to any custom value as long as we can import the values in the header block (example: https://github.com/bcrawford39GT/signac_numpy_tutorial/blob/main/signac_numpy_tutorial/project/templates/phoenix.sh), without Signac auto-placing a default --mem option that cannot be removed. This is the main issue: if we add anything, the default --mem option is also added, causing two conflicting memory options. Maybe there could be a variable to not auto-print --mem?
I think the memory_per_process option would be fine instead of --memory-per-cpu and --memory-per-gpu, as long as we can import the value in the header block (example: https://github.com/bcrawford39GT/signac_numpy_tutorial/blob/main/signac_numpy_tutorial/project/templates/phoenix.sh) for manipulation if needed, and Signac does not write --mem automatically. Alternatively, maybe you are saying memory_per_process will be automatically written to --mem?
If supporting the original --memory-per-cpu and --memory-per-gpu is not feasible, that is fine. I was under the impression it was OK, not an issue, and in progress.
We will try to adapt to whatever is feasible. I am just trying to understand the viable options or direction.
The new default template will always write --mem-per-cpu to the slurm file, not --mem. We can put this in a {% block memory %} so that child scripts can override it if needed.
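As a rough illustration of the block-override idea (hypothetical template contents, not flow's actual templates), a child template can replace the parent's memory block entirely:

```python
# Minimal jinja2 sketch: a child template overrides the parent's memory block,
# e.g. to emit --mem-per-gpu instead of the default --mem-per-cpu line.
import jinja2

env = jinja2.Environment(loader=jinja2.DictLoader({
    "base.sh": (
        "#!/bin/bash\n"
        "{% block memory %}#SBATCH --mem-per-cpu={{ memory_per_cpu }}\n{% endblock %}"
    ),
    "phoenix.sh": (
        "{% extends 'base.sh' %}"
        "{% block memory %}#SBATCH --mem-per-gpu={{ memory_per_gpu }}\n{% endblock %}"
    ),
}))

print(env.get_template("phoenix.sh").render(memory_per_gpu="16G"))
# -> #!/bin/bash
#    #SBATCH --mem-per-gpu=16G
```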
OK. As long as the user also has an option to stop the default writing of the new --mem-per-cpu, that should be fine. We just cannot specify it more than once, or it will error out as over-specified.
I wish we didn't need memory at all, but we do. A number of clusters that flow supports default to a ridiculously small amount of memory per rank, so users need the ability to request more. It is the responsibility of the user to request an appropriate amount that does not result in extra charges.

The systems where flow needs to both a) combine memory requests across bundles/groups and b) provide the --mem-per-cpu needed for srun to correctly launch jobs are complex.
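A rough sketch of that complexity (hypothetical numbers and keys, not the actual implementation): memory has to be summed across the operations in a bundle and then re-expressed per CPU for srun:

```python
# Illustrative only: aggregate a bundle's memory, then convert to --mem-per-cpu.
bundle = [
    {"cpus": 4, "memory": 8.0},   # per-operation totals, memory in GB
    {"cpus": 2, "memory": 16.0},
]

total_cpus = sum(op["cpus"] for op in bundle)      # 6
total_memory = sum(op["memory"] for op in bundle)  # 24.0 GB
mem_per_cpu = total_memory / total_cpus            # 4.0 GB per CPU
```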
I agree. We can work with whatever final solution works for everyone. I just wanted to state the barriers so you are aware of them.
I wish memory was free too, on GPUs as well! No need to buy commercial cards then.
Sorry for the delay, I have uploaded my current changes. Some things that still need to be done:
Currently, the code, once debugged, supports homogeneous MPI, heterogeneous non-MPI, and mixed MPI/non-MPI submissions.
Description
This PR removes and changes the supported directive names in favor of a more generic structure that has less ambiguous settings and maps more closely onto modern scheduler requests. One example is moving np to n_processes, which provides some self-documenting effect. We also anchor n_processes as the basis for most other directives, such as n_threads_per_process.
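For illustration, a hypothetical before/after of an operation's directives under this renaming (the old names np, omp_num_threads, and ngpu are used here for comparison; the exact new names and defaults are defined by this PR and may still change):

```python
# Hypothetical sketch only; not an exhaustive or final mapping.
old_directives = {"np": 8, "omp_num_threads": 4, "ngpu": 8}

new_directives = {
    "n_processes": 8,            # replaces the ambiguous "np"
    "n_threads_per_process": 4,  # anchored on n_processes
    "gpus_per_process": 1,       # totals (8 GPUs here) are derived, not set directly
}
```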
Motivation and Context
There have been multiple issues related to directives behavior or confusion based on naming, np and nranks for instance. This clears them up and simplifies the logic to support various workflows, or appropriately disallows them when they are unlikely to work on a scheduler.
Checklist: