Using Slurm

Submitting jobs to MOGON using Slurm

The Simple Linux Utility for Resource Management, SLURM for short, has evolved from a simple resource manager into a highly capable scheduler. It is used by many supercomputers worldwide and plays a central role in your daily work on MOGON by fulfilling several functions:

  • Slurm allocates resources within the cluster depending on your requirements. It is your gateway to access compute nodes from the login nodes—no matter if interactively or via batch jobs.
  • It provides a framework for launching, monitoring and otherwise managing jobs.
  • When there is more work than resources, Slurm schedules pending jobs, balances workloads, and manages the queues for different partitions on MOGON.
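In day-to-day use, these functions are exposed through a handful of command-line tools. The following is a quick orientation; the job ID is a placeholder:

```shell
# Inspect partitions and the state of their nodes
sinfo

# Submit a batch job script to the scheduler
sbatch myjob.sh

# List your own pending and running jobs
squeue -u $USER

# Cancel a job, identified by its job ID
scancel 12345
```

Each of these commands is covered in more detail in the Slurm man pages, e.g. man sbatch.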

Submitting a Job

In the context of Slurm a job is a work package usually defined in a bash script. It contains

  • the resource requirements and metadata,
  • setup of the working environment, and
  • the list of tasks to be executed as job steps.

Here is an example:


#!/bin/bash

#========[ + + + + Requirements + + + + ]========#
#SBATCH --partition=smp
#SBATCH --account=hpckurs

#========[ + + + + Job Steps + + + + ]========#
srun echo "Hello, world!"

As demonstrated in the script above, resource requirements are passed to Slurm line by line, each marked with the #SBATCH keyword. This script submits a job to MOGON’s smp partition, which is intended for jobs that require only a small number of CPUs. The account billed for the consumed resources is hpckurs.

Slurm will reject jobs that do not at least set --partition and --account.

Since this is just a toy example, there is no need for us to load any software modules. Therefore, the setup of the working environment is omitted. The srun command initiates a job step.
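In a more realistic script, the environment setup would sit between the requirements and the job steps, and several job steps can follow one another. A sketch of this structure, with an illustrative module name that will differ on MOGON:

```shell
#!/bin/bash
#========[ + + + + Requirements + + + + ]========#
#SBATCH --partition=smp
#SBATCH --account=hpckurs

#========[ + + + + Environment + + + + ]========#
# Hypothetical module name for illustration only;
# check `module avail` for the modules actually installed
module purge
module load lang/Python

#========[ + + + + Job Steps + + + + ]========#
srun echo "Step 1: hello"
srun python --version
```

Each srun line becomes a separate job step, which Slurm accounts for and monitors individually.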

The job is submitted to Slurm for processing with the command

sbatch <filename>
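Assuming the example script was saved as job.sh, a typical interaction might look like this; the job ID in the output is illustrative:

```shell
$ sbatch job.sh
Submitted batch job 4242
$ squeue -u $USER
```

On success, sbatch prints the ID of the newly created job, which you can then use to monitor it with squeue or cancel it with scancel.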