Job Arrays

Purpose

According to the Slurm Job Array Documentation, “job arrays offer a mechanism for submitting and managing collections of similar jobs quickly and easily.” In general, job arrays are useful for applying the same processing routine to a collection of multiple input data files. Job arrays offer a very simple way to submit a large number of independent processing jobs.

We strongly recommend combining job arrays with parallel processing: treat job arrays as a convenience feature, not as a substitute for performance optimization.
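For example, each array task can itself use several cores, provided the application supports it. A minimal sketch, assuming a multithreaded (e.g. OpenMP) program; my_threaded_app is a placeholder for your own executable:

#SBATCH --array=1-16
#SBATCH --cpus-per-task=4

# Each array task runs on 4 cores of its own.
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
./my_threaded_app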

Maximum Job Array Size

You can look up the maximum job array size with the following command:

$ scontrol show config | grep MaxArraySize

We do not document this value here because it is subject to change.
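The output might look like this (the value shown is purely illustrative; query your cluster for the actual limit):

$ scontrol show config | grep MaxArraySize
MaxArraySize            = 1001

Note that the highest usable array index is one less than MaxArraySize.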

Why is there an array size limit at all?

If there were no limit, or if the limit were huge, some users would see this as an invitation to submit a large number of otherwise unoptimized jobs. The idea behind job arrays is to ease workflows, particularly the submission of a larger number of jobs. However, the motivations to pool jobs and to optimize them still apply.

Job Arrays in Slurm

Submitting a single job array sbatch script creates a specified number of “array tasks” based on this “master” sbatch script. An example job array script is given below:

#!/bin/bash
 
#SBATCH --job-name=arrayJob
#SBATCH --output=arrayJob_%A_%a.out # redirecting stdout
#SBATCH --error=arrayJob_%A_%a.err  # redirecting stderr
#SBATCH --array=1-16 
#SBATCH --time=01:00:00
#SBATCH --partition=short # for mogon I
#SBATCH --ntasks=1        # number of tasks per array job
#SBATCH --mem-per-cpu=4000
 
 
######################
# Begin work section #
######################
 
# Print this sub-job's task ID
echo "My SLURM_ARRAY_TASK_ID: " $SLURM_ARRAY_TASK_ID
 
# Do some work based on the SLURM_ARRAY_TASK_ID
# For example: 
# ./my_process $SLURM_ARRAY_TASK_ID
# 
# where my_process is your executable

In the above example, the --array=1-16 option causes 16 array tasks (numbered 1, 2, …, 16) to be spawned when this master job script is submitted. The array tasks are simply copies of the master script that are automatically submitted to the scheduler on your behalf. However, in each array task the environment variable SLURM_ARRAY_TASK_ID is set to a unique value (in this example, a number in the range 1, 2, …, 16). In your script, you can use this value to select, for example, a specific data file that each array task will be responsible for processing.
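As an illustration, the work section could map each task ID to one line of a plain-text list of input files (file_list.txt and my_process are hypothetical placeholders for your own data and executable):

# file_list.txt contains one input file path per line;
# array task n processes the n-th file.
INPUT=$(sed -n "${SLURM_ARRAY_TASK_ID}p" file_list.txt)
./my_process "$INPUT"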

Job array indices can be specified in a number of ways. For example:

#A job array with index values between 0 and 31:
#SBATCH --array=0-31
 
#A job array with index values of 1, 2, 5, 19, 27:
#SBATCH --array=1,2,5,19,27
 
#A job array with index values between 1 and 7 with a step size of 2 (i.e. 1, 3, 5, 7):
#SBATCH --array=1-7:2

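The index specification can also be given, or overridden, on the sbatch command line at submission time, which allows reusing the same script for different ranges (arrayjob.sh is a placeholder name):

$ sbatch --array=0-31 arrayjob.sh
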
The %A_%a construct in the output and error file names is used to generate unique output and error files based on the master job ID (%A) and the array task ID (%a). In this fashion, each array task writes to its own output and error file.
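For example, with a (hypothetical) master job ID of 12345, array task 7 of the script above would write its stdout to arrayJob_12345_7.out and its stderr to arrayJob_12345_7.err.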

Limiting the number of concurrent jobs of an array

It is possible to limit the number of concurrently running jobs of an array, e.g. to reduce I/O load, with this syntax:

#SBATCH --array=1-1000%50

where at most 50 array tasks will run at the same time.
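On reasonably recent Slurm versions, the throttle of an already submitted array can be adjusted with scontrol (the job ID below is a placeholder):

$ scontrol update JobId=12345 ArrayTaskThrottle=20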

Multiprog for "Uneven" Arrays

The --multi-prog option of srun allows you to run a different executable, with different arguments, for each parallel task in your job. More information can be found on our wiki page on node-local scheduling.
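A minimal sketch (multi.conf and the program names are placeholders): the job requests four tasks, and srun reads one command line per task rank from a configuration file, where %t expands to the task number.

#SBATCH --ntasks=4

srun --multi-prog multi.conf

with multi.conf containing:

# rank(s)  command and arguments
0          ./preprocess input.dat
1-3        ./worker %t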
