I imagine a widespread problem with using HPC systems is not knowing how much memory or walltime to allocate. Not knowing the right amounts (of memory especially) likely causes huge amounts of wasted resources on HPC systems. I know it does in my workflow.
SLURM returns MaxRSS/Elapsed values for each job after completion, which future.batchtools could store in .future/20171118_083108-ebBlfz/.
I'm eventually imagining a function exported from drake that could report on these somehow.
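For reference, the MaxRSS/Elapsed values mentioned above come from Slurm's `sacct` accounting tool. A minimal sketch of how such stats could be collected after a job completes, written in Python for illustration (the job ID and sample output are hypothetical; the real implementation would live in R):

```python
# Sketch: after a job finishes, one could run
#   sacct -j <jobid> --format=JobID,MaxRSS,Elapsed --parsable2 --noheader
# and parse its pipe-delimited output into per-step records.

def parse_sacct(output):
    """Parse '--parsable2' sacct output (JobID|MaxRSS|Elapsed) into dicts."""
    fields = ["JobID", "MaxRSS", "Elapsed"]
    rows = []
    for line in output.strip().splitlines():
        values = line.split("|")
        rows.append(dict(zip(fields, values)))
    return rows

# Hypothetical sacct output for job 123456 (MaxRSS is reported on the
# .batch step, not the parent job line).
sample = "123456||00:10:32\n123456.batch|1523000K|00:10:31"
stats = parse_sacct(sample)
```

These per-job records are the kind of data that could be cached under the `.future/` directory and later summarized for the user.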
That's a great idea. Yes, it seems to be a common problem: using far more or far fewer resources than requested is inefficient for both the user and overall cluster utilization. I agree that one solution is to give users feedback (memory and processing time).
For futures in general, I hope to be able to gather some of these stats using R itself (and therefore for all types of futures). I'm planning to add basic support for this across the board, cf. futureverse/future#59.
However, more system-specific information, like the Slurm stats you're mentioning, obviously has to be implemented by the more specific future classes. For those, I think the collection should probably be part of batchtools itself, and future.batchtools could then provide a way to access/present it.