config for batchtools_sge? #26

Open
nick-youngblut opened this issue Jul 5, 2018 · 9 comments

@nick-youngblut

Sorry if this is in the docs and I can't find it, but is there a way to specify default resources for the template? When just using batchtools, default resources can be set with a ~/.batchtools.conf.R file. However, this file doesn't seem to work with future.batchtools::plan().

@wlandau

wlandau commented Jul 5, 2018

A couple options I use:

future::plan(future.batchtools::batchtools_sge, template = "sge-simple.tmpl") seems to work this way.

@nick-youngblut
Author

How do you provide defaults for the variables in the template file? I'm using a template that includes activating a conda environment:

$ cat ~/.batchtools.sge.tmpl
#!/bin/bash

## The name of the job, can be anything, simply used when displaying the list of running jobs
#$ -N <%= job.name %>

## Combining output/error messages into one file
#$ -j y

## Giving the name of the output log file
#$ -o <%= log.file %>

## One needs to tell the queue system to use the current directory as the working directory
## Or else the script may fail as it will execute in your top level home directory /home/username
#$ -cwd

## Use environment variables
#$ -V

## time
#$ -l h_rt=<%= resources$h_rt %>

## memory
#$ -l h_vmem=<%= resources$h_vmem %>


export PATH=<%= resources$conda.path %>:$PATH
source activate <%= resources$conda.env %>

## Export value of DEBUGME environment var to slave
export DEBUGME=<%= Sys.getenv("DEBUGME") %>

<%= sprintf("export OMP_NUM_THREADS=%i", resources$omp.threads) -%>
<%= sprintf("export OPENBLAS_NUM_THREADS=%i", resources$blas.threads) -%>
<%= sprintf("export MKL_NUM_THREADS=%i", resources$blas.threads) -%>

Rscript -e 'batchtools::doJobCollection("<%= uri %>")'
exit 0

...and I'd like to set defaults for resources$conda.path and resources$conda.env. When just using batchtools, setting resources can be done with a config file:

$ cat ~/.batchtools.conf.R
default.resources = list(h_rt = '00:59:00',
                         h_vmem = '4G',
                         conda.env = "py3",
                         conda.path = "/ebio/abt3_projects/software/miniconda3/bin")
cluster.functions = makeClusterFunctionsSGE(template = "~/.batchtools.tmpl")
temp.dir = "/ebio/abt3_projects/temp_data/"
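
For comparison, here is a minimal sketch of how the same defaults might be supplied through future.batchtools instead of the config file. It assumes that `batchtools_sge()` accepts a `resources` list and exposes it to the template as `resources`. The template path is the one from the `cat` command above, and the `plan()` call is guarded so it only runs where `qsub` and the package are available.

```r
# Defaults copied from the ~/.batchtools.conf.R shown above.
default_resources <- list(
  h_rt       = "00:59:00",
  h_vmem     = "4G",
  conda.env  = "py3",
  conda.path = "/ebio/abt3_projects/software/miniconda3/bin"
)

# Assumption: `resources` passed to the backend reaches the template
# as `resources`. Guarded so this is a no-op on machines without SGE
# or without future.batchtools installed.
has_sge <- nzchar(Sys.which("qsub")) &&
  requireNamespace("future.batchtools", quietly = TRUE)

if (has_sge) {
  future::plan(
    future.batchtools::batchtools_sge,
    template  = "~/.batchtools.sge.tmpl",
    resources = default_resources
  )
}
```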

@wlandau

wlandau commented Dec 7, 2018

From revisiting this section of the README, I think I understand a little more. I am also trying to use the resources argument on SGE. This is my template file sge_batchtools.tmpl:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -o <%= log.file %>
#$ -V
#$ -N <%= job.name %>
#$ -pe smp <%= resources[["slots"]] %>
Rscript -e 'batchtools::doJobCollection("<%= uri %>")'
exit 0

and my script run.R:

library(future.batchtools)
future::plan(batchtools_sge(template = "sge_batchtools.tmpl"))
future(system2("hostname"))

which gives an error:

$ Rscript run.R
Loading required package: future
Error: Fatal error occurred: 101. Command 'qsub' produced exit code 2. Output: 'Unable to read script file because of error: ERROR! -pe option must have range as 2nd argument'
Execution halted

But when I replace <%= resources[["slots"]] %> with 2 in sge_batchtools.tmpl, Rscript run.R submits one job with two slots as desired.

Related: futureverse/future#181, futureverse/future#263, ropensci/drake#169.

@HenrikBengtsson
Collaborator

HenrikBengtsson commented Dec 8, 2018

Don't know SGE well enough, so I could be wrong, but I think you wanna specify parallel environment "smp" (symmetric multiprocessing) as in -pe smp 2.

https://github.com/BIMSBbioinfo/intro2UnixandSGE/blob/master/sun_grid_engine_for_beginners/how_to_submit_a_job_using_qsub.md

@HenrikBengtsson
Collaborator

My bad - I somehow missed that you do indeed specify smp - I should go to sleep now.

@wlandau

wlandau commented Dec 19, 2018

Found the problem in #26 (comment): my run.R script did not actually set the slots element of resources. This worked for me:

library(future.batchtools)
future::plan(batchtools_sge(template = "sge_batchtools.tmpl"))
future(system2("hostname"), resources = list(slots = 2))

As desired, I saw a short-lived job with 2 slots on the cluster.
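
To illustrate why the earlier run failed, here is a small base-R sketch. It assumes the template engine prints nothing for a `NULL` value, which would leave `-pe smp` without its range argument; `cat()` is used here only to mimic that behavior. The `with_defaults()` helper and its default of one slot are hypothetical, one possible way to make sure every field the template references is always set.

```r
# A missing list element is NULL, and cat() prints nothing for NULL.
resources <- list()
line_missing <- capture.output(
  cat("#$ -pe smp ", resources[["slots"]], "\n", sep = "")
)

# Hypothetical helper: merge per-call overrides into defaults (overrides win).
with_defaults <- function(overrides = list(), defaults = list(slots = 1L)) {
  utils::modifyList(defaults, overrides)
}

resources <- with_defaults(list(slots = 2L))
line_set <- capture.output(
  cat("#$ -pe smp ", resources[["slots"]], "\n", sep = "")
)

print(line_missing)  # the directive with no range after "smp"
print(line_set)      # the directive with the range filled in
```

The merged list could then be passed as `resources = with_defaults(list(slots = 2L))` in the `future()` call above.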

@nick-youngblut
Author

At least with the configuration that I have listed above, I get no output from failed jobs. Moreover, it's not clear where the qsub job log file is, given that it's just set as <%= log.file %> in the *.tmpl file. I also haven't found any documentation about how best to troubleshoot failed qsub jobs (e.g., AFAIK, there's no getLog() for future.batchtools, and batchtools::getLog() doesn't work with future.batchtools jobs).

Is there a good way to troubleshoot failed jobs? Preferably, I would like a function to print the stderr/stdout from each job and the qacct -j JOBID info. I really like using future.batchtools + future.apply, but it's always a pain to troubleshoot failed jobs.
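
As a stopgap, the accounting record can be pulled from R with a thin wrapper around `qacct -j`. This is only a sketch: `sge_accounting()` is a hypothetical helper, not part of future.batchtools or batchtools, and it assumes `qacct` is on the `PATH` and that the SGE job ID is already known.

```r
# Hypothetical helper: return the lines printed by `qacct -j <job_id>`.
# Returns character(0) when qacct is not available on this machine.
sge_accounting <- function(job_id) {
  if (!nzchar(Sys.which("qacct"))) {
    message("qacct not found on PATH; nothing to report")
    return(character(0))
  }
  # stdout = TRUE and stderr = TRUE capture both streams as a character vector.
  suppressWarnings(
    system2("qacct", c("-j", as.character(job_id)), stdout = TRUE, stderr = TRUE)
  )
}

# Example call; 12345 is a placeholder job ID.
info <- sge_accounting(12345)
```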

@nick-youngblut
Author

I also haven't found any documentation about how best to troubleshoot failed qsub jobs (e.g., AFAIK, there's no getLog() for future.batchtools, and batchtools::getLog() doesn't work with future.batchtools jobs).

Still a problem.

Also, it's not clear what variables are available in the template. I know of job.name, log.file, and resources, but are there any others? If so, is there documentation on this?

@HenrikBengtsson
Collaborator

I'd like to redirect this question/ask/request to the batchtools package. I agree that {future.batchtools} might be able to improve its documentation on this, but I want to minimize any type of redundancy here and thereby the risk of falling out of sync with {batchtools}; {batchtools} is in charge of how things work below the future layer.
