We strongly recommend using parallel processing in addition, and treating job arrays as a convenience feature, without neglecting performance optimization.
  
===== Maximal Job Array Size =====

You can look up the maximal job array size using the following command:
<code bash>
$ scontrol show config | grep MaxArraySize
</code>

The reason we do not document this value here is that it is subject to change.

<WRAP center round info>
**Why is there an array size limit at all?**

If unlimited, or if the limit is huge, some users will see this as an invitation to submit a large number of otherwise unoptimized jobs. The idea behind job arrays is to ease workflows, particularly the submission of a larger number of jobs. However, the motivations to [[node_local_scheduling|pool jobs]] and to optimize still apply.
</WRAP>
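
Since the value is subject to change, a submission workflow can query it instead of hard-coding it. A minimal sketch (the intended highest index ''WANT'' is a made-up placeholder):

<code bash>
# query the current limit instead of hard-coding it
MAX=$(scontrol show config | awk -F= '/MaxArraySize/ {gsub(/ /, "", $2); print $2}')

# hypothetical highest array index you intend to use
WANT=5000

# array indices must stay below MaxArraySize
if [ "${WANT}" -ge "${MAX}" ]; then
    echo "Requested index ${WANT} exceeds MaxArraySize (${MAX}); consider pooling jobs." >&2
fi
</code>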
===== Job Arrays in Slurm =====
  
  
In the above example, the ''%%--array=1-16%%'' option will cause 16 array-tasks (numbered 1, 2, ..., 16) to be spawned when this master job script is submitted. The “array-tasks” are simply copies of this master script that are automatically submitted to the scheduler on your behalf. However, in each array-task an environment variable called ''SLURM_ARRAY_TASK_ID'' will be set to a unique value (in this example, a number in the range 1, 2, ..., 16). In your script, you can use this value to select, for example, a specific data file that each array-task will be responsible for processing.
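
As a minimal sketch (''my_program'' and the input file names are placeholders, not something provided on the cluster), an array-task could select its input file like this:

<code bash>
#!/bin/bash
#SBATCH --array=1-16
#SBATCH -J array_example      # job name (placeholder)

# hypothetical input files: input_1.dat ... input_16.dat
INPUT="input_${SLURM_ARRAY_TASK_ID}.dat"

# each array-task processes the file matching its own index
srun ./my_program "${INPUT}"
</code>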
  
Job array indices can be specified in a number of ways. For example:
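
A few common forms, using standard Slurm syntax (a generic illustration):

<code bash>
# alternative ways to specify the indices (choose one):
#SBATCH --array=0-15         # a simple range: 0, 1, ..., 15
#SBATCH --array=1,3,5,7      # an explicit list of indices
#SBATCH --array=1-7:2        # a range with a step size of 2
</code>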
The ''%A_%a'' construct in the output and error file names is used to generate unique output and error files based on the master job ID (''%A'') and the array-task ID (''%a''). In this fashion, each array-task will be able to write to its own output and error file.
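
For example (the file name prefix is arbitrary):

<code bash>
#SBATCH --output=myjob_%A_%a.out   # e.g. myjob_12345_7.out for array-task 7
#SBATCH --error=myjob_%A_%a.err
</code>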
  
==== Limiting the number of concurrent jobs of an array ====

It is possible to limit the number of concurrently executed jobs of an array, e.g. as one approach to minimize I/O overhead, with this syntax:

<code bash>
#SBATCH --array=1-1000%50
</code>

where a limit of 50 concurrent jobs would be in place.

===== Multiprog for "Uneven" Arrays =====

The ''%%--multi-prog%%'' option in ''srun'' allows you to assign each parallel task in your job a different executable and arguments. More information can be found at [[node_local_scheduling|our wiki page on node-local scheduling]].

Create your submission script with the basic details. For example, call it ''job.sh'':

<code bash>
#!/bin/sh
#SBATCH -n 16           # 16 cores
#SBATCH -t 1-03:00:00   # 1 day and 3 hours
#SBATCH -p compute      # partition name
#SBATCH -J my_job_name  # sensible name for the job

srun --multi-prog test.config
</code>

The file ''test.config'' contains the parameters required by the ''%%--multi-prog%%'' option.

The configuration file contains three fields, separated by blanks. These fields are:

  * Task number
  * Executable file
  * Arguments

Parameters available:

  * ''%t'' - the task number of the responsible task
  * ''%o'' - the task offset (the task's relative position in the task range)

An example configuration file:
<code bash>
0-3             hostname
4,5             echo            task:%t
6               echo            task:%t-%o
7               echo            task:%o
8-15            hostname
</code>

Please note that if you're using a custom executable, you should supply the full PATH to the file.

For example:
<code bash>
0   /home/users/jbloggs/bin/my_bin input1
1   /home/users/jbloggs/bin/my_bin input2
...
</code>

===== ZDV-taskfarm as a Shortcut for Multiprog =====

The module ''tools/staskfarm/0.1'' provides a shortcut that saves you from writing the more cumbersome configuration files by hand. The script comes with a help message:

<code bash>
$ staskfarm -h
</code>

It can be used within a Slurm environment, e.g.:

<code bash>
#SBATCH -N 2
#SBATCH ...
#SBATCH --ntasks=x

# preliminary tasks go here

module load tools/staskfarm/0.1

taskfarm <command> [filename(s)]
</code>

Note that ''%%--ntasks%%'' needs to be specified with a number ''x'' lower than or equal to the total number of processors supplied by the nodes requested with ''-N'', divided by the number of threads per task.

The code is hosted at [[https://github.com/cmeesters/staskfarm|github]], where a "nicer" readme layout resides.
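
As a worked example of the ''%%--ntasks%%'' constraint above (the per-node core count of 32 is an assumption for illustration only):

<code bash>
# assumption: 2 nodes with 32 cores each = 64 processors in total
# with 4 threads per task, x must not exceed 64 / 4 = 16
#SBATCH -N 2
#SBATCH --ntasks=16
#SBATCH --cpus-per-task=4
</code>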