Commit

richelbilderbeek committed May 22, 2024
2 parents d67bb52 + c5dec77 commit 43a50ee
Showing 8 changed files with 122 additions and 15 deletions.
3 changes: 2 additions & 1 deletion .wordlist.txt
@@ -278,4 +278,5 @@ scontrol
Snakemake
TMP
checkmark

pseudonymisation
schedmd
2 changes: 1 addition & 1 deletion docs/extra/devel.md
@@ -38,7 +38,7 @@
- A little cumbersome but doable!

- For collaboration within a ``sens`` project you can have a "local" ``remote`` repo in your common project folder.
- [More on Git on Bianca](https://www.uppmax.uu.se/support/faq/software-faq/git-on-bianca/)
- [More on Git on Bianca](http://docs.uppmax.uu.se/cluster_guides/git_on_bianca/)



2 changes: 1 addition & 1 deletion docs/extra/julia.md
@@ -24,7 +24,7 @@
- The first time Julia will precompile the package for you!
- You can check the current "central library" by typing ``ml help julia/<version>`` in the BASH shell.
- There you will also find which Python, GCC and OpenMPI versions are compatible.
- Or see the [Julia user guide at UPPMAX](https://www.uppmax.uu.se/support/user-guides/julia-user-guide/){:target="_blank"}
- Or see the [Julia user guide at UPPMAX](http://docs.uppmax.uu.se/software/julia/){:target="_blank"}
- A possibly more up-to-date status can be found from the Julia shell:

``` julia
8 changes: 4 additions & 4 deletions docs/extra/slurm.md
@@ -1,10 +1,10 @@
# More about Slurm

- [Link to Slurm session in Intro to UPPMAX course](https://www.uppmax.uu.se/digitalAssets/560/c_560271-l_1-k_uppmax-slurm-2023-02.pdf){:target="_blank"}
- [Link to Slurm session in Intro to UPPMAX course](https://www.uu.se/download/18.57591c9d18f3ec99a0521784/1715116006615/c_560271-l_1-k_uppmax-slurm-2024-01.pdf){:target="_blank"}
- [Slurm documentation](https://slurm.schedmd.com/){:target="_blank"}
- [Slurm user guide](https://www.uppmax.uu.se/support/user-guides/slurm-user-guide/){:target="_blank"}
- [Discovering job resource usage with `jobstats`](https://www.uppmax.uu.se/support/user-guides/jobstats-user-guide/){:target="_blank"}
- [Plotting your core hour usage](https://www.uppmax.uu.se/support/user-guides/plotting-your-core-hour-usage/){:target="_blank"}
- [Slurm user guide](http://docs.uppmax.uu.se/cluster_guides/slurm/){:target="_blank"}
- [Discovering job resource usage with `jobstats`](http://docs.uppmax.uu.se/software/jobstats/){:target="_blank"}
- [Plotting your core hour usage](http://docs.uppmax.uu.se/software/projplot/){:target="_blank"}

## Example

6 changes: 3 additions & 3 deletions docs/intermediate/install.md
@@ -47,9 +47,9 @@
- Usually it is not a problem to build on Rackham and move to Bianca.
- ``cmake`` is available as a module
- Check with: ``$ ml avail cmake``
- [Guide for compiling **serial** programs](https://www.uppmax.uu.se/support/user-guides/compiling-source-code/){:target="_blank"}
- [Guide for compiling **parallel** programs](https://www.uppmax.uu.se/support/user-guides/mpi-and-openmp-user-guide/){:target="_blank"}
- [Available **combinations** of compilers and parallel libraries](https://www.uppmax.uu.se/support/user-guides/mpi-and-openmp-user-guide/#tocjump_7075108295107558_2){:target="_blank"}
- [Guide for compiling **serial** programs](http://docs.uppmax.uu.se/cluster_guides/compiling_serial/){:target="_blank"}
- [Guide for compiling **parallel** programs](http://docs.uppmax.uu.se/cluster_guides/compiling_parallel/){:target="_blank"}
- [Available **combinations** of compilers and parallel libraries](http://docs.uppmax.uu.se/cluster_guides/compiling_parallel/#mpi-using-the-openmpi-library){:target="_blank"}

???- info "About CPU hardware on Bianca"

107 changes: 107 additions & 0 deletions docs/intermediate/slurm_bianca.md
@@ -8,14 +8,99 @@
- Advanced job submission

## The Slurm Workload Manager

- Free, popular, lightweight
- Open source: https://slurm.schedmd.com
- available at all SNIC centres
- [UPPMAX Slurm user guide](http://docs.uppmax.uu.se/cluster_guides/slurm/)

### More on sbatch
Recap:

sbatch | -A sens2023598 | -t 10:00 | -p core | -n 10 | my_job.sh
-|-|-|-|-|-
slurm batch| project name | max runtime | partition ("job type") | #cores | job script


### More on time limits

- ``-t dd-hh:mm:ss``
- ``0-00:10:00 = 00:10:00 = 10:00 = 10``
- ``0-12:00:00 = 12:00:00``
- ``3-00:00:00 = 3-0``
- ``3-12:10:15``
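
The equivalences above can be checked with a small helper. This is a minimal sketch, not part of Slurm: the function name `slurm_time_to_seconds` is hypothetical, and it assumes the usual reading of the formats (a bare number is minutes, two fields are minutes and seconds, and anything after `dd-` starts with hours).

```bash
#!/bin/bash
# Convert a Slurm-style time limit to seconds (hypothetical helper).
# Handled forms: M, M:S, H:M:S, D-H, D-H:M, D-H:M:S
slurm_time_to_seconds() {
    local spec="$1" days=0 hours=0 minutes=0 seconds=0
    local -a parts
    if [[ "$spec" == *-* ]]; then
        # With a day part, the remaining fields are hours[:minutes[:seconds]]
        days="${spec%%-*}"
        IFS=: read -r hours minutes seconds <<< "${spec#*-}"
    else
        IFS=: read -r -a parts <<< "$spec"
        case "${#parts[@]}" in
            1) minutes="${parts[0]}" ;;
            2) minutes="${parts[0]}"; seconds="${parts[1]}" ;;
            3) hours="${parts[0]}"; minutes="${parts[1]}"; seconds="${parts[2]}" ;;
            *) echo "unrecognised time format: $spec" >&2; return 1 ;;
        esac
    fi
    # The 10# prefix forces base 10, so that "08" is not read as octal
    echo $(( 10#${days:-0} * 86400 + 10#${hours:-0} * 3600 \
           + 10#${minutes:-0} * 60 + 10#${seconds:-0} ))
}
```

For example, `slurm_time_to_seconds 10` and `slurm_time_to_seconds 0-00:10:00` should print the same number of seconds.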

### Job walltime

???- question "When you have no idea how long a program will take to run, what should you book?"

A: a very long time, e.g. ``10-00:00:00``

???- question "When you have an idea of how long a program would take to run, what should you book?"

A: overbook by 50%
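
As an illustration of the 50% rule, the sketch below turns an estimated runtime in minutes into a ``-t`` value. The helper name `walltime_with_margin` is hypothetical, and rounding up to whole minutes is an assumption made here for simplicity.

```bash
#!/bin/bash
# Add a 50% margin to an estimated runtime (in minutes) and print it
# in the dd-hh:mm:ss form accepted by -t (hypothetical helper).
walltime_with_margin() {
    local estimate_min="$1"
    # Multiply by 1.5 using integer arithmetic, rounding up
    local booked=$(( (estimate_min * 3 + 1) / 2 ))
    local days=$(( booked / 1440 ))
    local hours=$(( (booked % 1440) / 60 ))
    local minutes=$(( booked % 60 ))
    printf '%d-%02d:%02d:00\n' "$days" "$hours" "$minutes"
}
```

A job expected to take about an hour could then be submitted with something like `sbatch -t "$(walltime_with_margin 60)" my_job.sh`.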

### More on partitions

- ``-p core``
- “core” is the default partition
- ≤ 16 cores on Bianca
- a script or program written without any thought given to parallelism will use 1 core

- ``-p node``
- if you wish to book full node(s)

### Quick testing

- The “devel” partition

- max 2 nodes per job
- up to 1 hour in length
- only 1 at a time
- ``-p devcore``, ``-p devel``
???- question "Any free nodes in the devel partition? Check status with"

- ``sinfo -p devel``
- ``jobinfo -p devel``

- more on these tools later
- High priority queue for short jobs

- 4 nodes
- up to 15 minutes
- ``--qos=short``

### Debugging or complicated workflows
- Interactive jobs

- handy for debugging code or a script by executing it line by line, or for using programs with a graphical user interface
- ``salloc -n 80 -t 03:00:00 -A sens2023598``
- ``interactive -n 80 -t 03:00:00 -A sens2023598``

- up to 12 hours
- useful together with the ``--begin=<time>`` flag
- ``salloc -A snic2022-22-50 --begin=2022-02-17T08:00:00``

- asks for an interactive job that will start at 08:00 on the given date at the earliest
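
The timestamp for ``--begin`` does not have to be typed by hand. The sketch below builds it for "tomorrow at 08:00"; it assumes GNU `date` (as found on Linux clusters), and the helper name `begin_tomorrow_at` is hypothetical. It only prints the resulting command, so nothing is submitted.

```bash
#!/bin/bash
# Print a timestamp for tomorrow at the given HH:MM (hypothetical helper).
# Assumes GNU date: the -d option is not available in BSD/macOS date.
begin_tomorrow_at() {
    local clock="${1:-08:00}"
    date -d "tomorrow ${clock}" '+%Y-%m-%dT%H:%M:%S'
}

# Print the command instead of running it, so it can be inspected first
echo "salloc -A sens2023598 -n 1 -t 01:00:00 --begin=$(begin_tomorrow_at 08:00)"
```

Replace the project name and resources with your own before running the printed command.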

### Parameters in the job script or the command line?

- Command line parameters override script parameters
- A typical script may be:

```bash
#!/bin/bash
#SBATCH -A sens2023598
#SBATCH -p core
#SBATCH -n 1
#SBATCH -t 24:00:00
```
To override these settings for just a quick test:

```console
sbatch -p devcore -t 00:15:00 jobscript.sh
```

???+ question "Hands-on #1: sbatch/jobinfo"

- login to Bianca
@@ -26,7 +111,29 @@
- write in the HackMD when you’re done

### Memory in core or devcore jobs

- ``-n X``
- Bianca: 8GB per core
- Slurm reports the available memory in the prompt at the start of an interactive job
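
Since memory follows the number of cores, a memory requirement can be translated into a value for ``-n``. This is a minimal sketch assuming the 8 GB per core stated above; the helper name `cores_for_memory` is hypothetical.

```bash
#!/bin/bash
# Number of cores to book to get at least the requested memory
# (hypothetical helper; defaults to 8 GB per core, as on Bianca).
cores_for_memory() {
    local mem_gb="$1" gb_per_core="${2:-8}"
    # Integer ceiling division: round up to whole cores
    echo $(( (mem_gb + gb_per_core - 1) / gb_per_core ))
}
```

A program that needs about 20 GB would then be booked with `sbatch -p core -n "$(cores_for_memory 20)" my_job.sh`, even if it only computes on one core.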

### More flags
- ``-J <jobname>``
- email:

- ``--mail-type=BEGIN,END,FAIL,TIME_LIMIT_80``
- ``--mail-user``

- Don’t use. Set your email correctly in SUPR instead.

- out/err redirection:

- ``--output=slurm-%j.out`` and ``--error=slurm-%j.err``

- by default, where %j will be replaced by the job ID

- ``--output=my.output.file``
- ``--error=my.error.file``
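
Put together, these flags could appear in a job script as sketched below. The job name and file names are placeholders chosen for illustration, and the project name is the course project used earlier.

```bash
#!/bin/bash
#SBATCH -A sens2023598
#SBATCH -p core
#SBATCH -n 1
#SBATCH -t 10:00
#SBATCH -J my_test_job
#SBATCH --output=my_test_job-%j.out
#SBATCH --error=my_test_job-%j.err

# Normal output ends up in the --output file
echo "Job ${SLURM_JOB_ID:-unknown} running on $(hostname)"

# Anything written to stderr ends up in the --error file
echo "This line goes to the error file" >&2
```

Using ``%j`` in both file names keeps the logs of repeated submissions apart.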


## Monitoring jobs
### Monitoring and modifying jobs
6 changes: 3 additions & 3 deletions docs/intermediate/slurm_intro_with_jobstats.md
@@ -442,9 +442,9 @@ Examine the jobs run by user `douglas`. The relevant job numbers are the jobs wi
## Links

- [Slurm documentation](https://slurm.schedmd.com/){:target="_blank"}
- [Slurm user guide](https://www.uppmax.uu.se/support/user-guides/slurm-user-guide/){:target="_blank"}
- [Discovering job resource usage with `jobstats`](https://www.uppmax.uu.se/support/user-guides/jobstats-user-guide/){:target="_blank"}
- [Plotting your core hour usage](https://www.uppmax.uu.se/support/user-guides/plotting-your-core-hour-usage/){:target="_blank"}
- [Slurm user guide](http://docs.uppmax.uu.se/cluster_guides/slurm/){:target="_blank"}
- [Discovering job resource usage with `jobstats`](http://docs.uppmax.uu.se/software/jobstats/){:target="_blank"}
- [Plotting your core hour usage](http://docs.uppmax.uu.se/software/projplot/){:target="_blank"}


!!! abstract "Keypoints"
3 changes: 1 addition & 2 deletions docs/intermediate/transfer.md
@@ -173,8 +173,7 @@ Exercise 1 and 2 are the most important, as:
but do not wait for feedback.
If you have no idea at all, read the linked UPPMAX documentation:

- What is [SUNET](docs.uppmax.uu.se/getting_started/get_inside_sunet)?
- What is [`ssh`](http://docs.uppmax.uu.se/software/ssh/)? What does it allow us to do?
- What is [SUNET](http://docs.uppmax.uu.se/getting_started/get_inside_sunet/)?
- What is [`wharf`](http://docs.uppmax.uu.se/cluster_guides/wharf/)? What does it allow us to do?
- What is [`rsync`](http://docs.uppmax.uu.se/software/rsync/)?
- What is [`transit`](http://docs.uppmax.uu.se/cluster_guides/transit/)?
