availableCores(): Add support for HTCondor #50
Comments
@fboehm, I see you're suggesting |
I don't have an HTCondor setup handy to test, the docs say:
|
@HenrikBengtsson - I'm so sorry that I missed this message (from 3 years ago!) until now. @lmichael107 @CHTC has a lot of HTCondor experience, and she may be able to connect us with others at U. Wisconsin-Madison who might also have answers to some of the above HT Condor questions. I regret that I'm clueless here. My past uses of HT Condor were pretty crude in the sense that I don't think I ever understood the HT Condor variables and how to integrate them with R package functions, especially when thinking about the |
HTCondor users, I need your help to add support for HTCondor to `availableCores()`:

HPC schedulers such as Slurm, SGE, and Torque/PBS set environment variables that can be queried to figure out how many CPU cores the scheduler has allotted to the job. This allows the job script to adapt to what it is allowed to run. For example, when submitting an SGE job to use four (4) cores:

the `my_script.sh` script knows how many cores it got by:

Question: How do you achieve the same on HTCondor? Does HTCondor set environment variables in a similar way, or are there other ways to query the number of cores you've been assigned?
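To make the kind of lookup described above concrete, here is a minimal shell sketch. It assumes the commonly documented variables `NSLOTS` (SGE), `SLURM_CPUS_PER_TASK` (Slurm), and `PBS_NUM_PPN` (Torque/PBS); the function name `scheduler_cores` is made up for illustration and is not part of any scheduler or of `availableCores()`.

```sh
#!/bin/sh
# Sketch: report how many cores the scheduler assigned to this job.
# Assumed variables: NSLOTS (SGE), SLURM_CPUS_PER_TASK (Slurm),
# PBS_NUM_PPN (Torque/PBS). Falls back to 1 if none holds a usable value.
scheduler_cores() {
  for value in "${NSLOTS:-}" "${SLURM_CPUS_PER_TASK:-}" "${PBS_NUM_PPN:-}"; do
    case "$value" in
      # Skip unset, non-numeric, or zero-leading values (including "0").
      ''|*[!0-9]*|0*) continue ;;
      *) echo "$value"; return 0 ;;
    esac
  done
  echo 1
}

echo "Number of cores: $(scheduler_cores)"
```

The open question for HTCondor is which variable, if any, could be added to that list.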
FWIW, I tried to search the web for how to do it, but I failed to find anything useful. The closest I found is in Section 2.5.11 of https://www.mn.uio.no/ifi/tjenester/it/hjelp/beregninger/htcondor/condor-manual.pdf:
HTCondor sets several additional environment variables for each executing job that may be useful for the job to reference.

`_CONDOR_SCRATCH_DIR` gives the directory where the job may place temporary data files. This directory is unique for every job that is run, and its contents are deleted by HTCondor when the job stops running on a machine, no matter how the job completes.

`_CONDOR_SLOT` gives the name of the slot (for SMP machines), on which the job is run. On machines with only a single slot, the value of this variable will be `1`, just like the `SlotID` attribute in the machine's ClassAd. This setting is available in all universes. See section 3.7.1 for more details about SMP machines and their configuration.

`CONDOR_VM` equivalent to `_CONDOR_SLOT` described above, except that it is only available in the standard universe. NOTE: As of HTCondor version 6.9.3, this environment variable is no longer used. It will only be defined if the `ALLOW_VM_CRUFT` configuration variable is set to `True`.

`X509_USER_PROXY` gives the full path to the X.509 user proxy file if one is associated with the job. Typically, a user will specify x509userproxy in the submit description file. This setting is currently available in the local, java, and vanilla universes.

`_CONDOR_JOB_AD` is the path to a file in the job's scratch directory which contains the job ad for the currently running job. The job ad is current as of the start of the job, but is not updated during the running of the job. The job may read attributes and their values out of this file as it runs, but any changes will not be acted on in any way by HTCondor. The format is the same as the output of the condor_q -l command. This environment variable may be particularly useful in a USER_JOB_WRAPPER.

`_CONDOR_MACHINE_AD` is the path to a file in the job's scratch directory which contains the machine ad for the slot the currently running job is using. The machine ad is current as of the start of the job, but is not updated during the running of the job. The format is the same as the output of the condor_status -l command.

`_CONDOR_JOB_IWD` is the path to the initial working directory the job was born with.

`_CONDOR_WRAPPER_ERROR_FILE` is only set when the administrator has installed a `USER_JOB_WRAPPER`. If this file exists, HTCondor assumes that the job wrapper has failed and copies the contents of the file to the StarterLog for the administrator to debug the problem.

`CONDOR_IDS` overrides the value of configuration variable `CONDOR_IDS`, when set in the environment.

`CONDOR_ID` is set for scheduler universe jobs to be the same as the `ClusterId` attribute.
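None of the variables listed above gives a core count directly, but `_CONDOR_MACHINE_AD` points to a file describing the slot. One possible route, untested here and resting on the assumption that the machine ad contains a `Cpus = <n>` line for the slot, would be to parse that file. The function name `condor_cpus` is made up for this sketch.

```sh
#!/bin/sh
# Sketch: read the slot's CPU count from the HTCondor machine ad.
# ASSUMPTION: the file named by _CONDOR_MACHINE_AD holds one
# "Name = value" pair per line and includes a "Cpus" attribute.
condor_cpus() {
  ad_file="${_CONDOR_MACHINE_AD:-}"
  if [ -z "$ad_file" ] || [ ! -r "$ad_file" ]; then
    return 1
  fi
  # Anchor at line start so that names merely ending in "Cpus" do not match.
  cpus=$(sed -n 's/^Cpus[[:space:]]*=[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$ad_file" | head -n 1)
  [ -n "$cpus" ] || return 1
  echo "$cpus"
}

# Fall back to a single core when no usable machine ad is available.
echo "Number of cores: $(condor_cpus || echo 1)"
```

Whether `Cpus` reflects what the job was actually granted on a given pool is exactly the kind of thing HTCondor users would need to confirm.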