====== Node-local scheduling ======
  
There are some use cases where you may want to simply request a **full cluster node** from Slurm and then run **many** //(e.g. far more than 64)// **small** //(e.g. each only a fraction of the total job runtime)// tasks on this node. You will then need some **local scheduling** on the node to ensure proper utilization of all cores.
  
To accomplish this, we suggest you use the [[http://www.gnu.org/software/parallel/|GNU Parallel]] program. The program is installed to ''/cluster/bin'', but you can also simply load the [[modules|modulefile]] ''software/gnu_parallel'', which additionally gives you access to its man page.
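
For example, after loading the modulefile both the program and its documentation are available (a quick check, assuming nothing beyond the module name given above):

<code bash>
# make GNU Parallel and its man page available
module load software/gnu_parallel

parallel --version   # verify the program is found
man parallel         # browse the documentation
</code>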
</file>
  
Now of course we could submit 150 jobs to Slurm, or we could use one job which processes the files one after another, but the most elegant way is to submit one job for 64 cores (e.g. a whole node on Mogon I) and process the files in parallel. This is especially convenient, since we can then use the ''nodeshort'' queue, which has better scheduling characteristics than ''short'' (and both schedule better than their ''long'' counterparts).
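
In miniature, the idea looks like this (a sketch only, not the full ''parallel_job'' script below; ''process_file.sh'' and the input file names are hypothetical placeholders):

<code bash>
# 150 input files on a 64-core node: GNU Parallel keeps all cores busy,
# starting the next task as soon as a core becomes free
parallel -j 64 ./process_file.sh {} ::: data_{001..150}.dat
</code>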
  
<file bash parallel_job>
#SBATCH -p parallel  # for Mogon II
#SBATCH --nodes=3    # appropriate number of nodes
#SBATCH -n 24        # example value for Mogon I, see below
#SBATCH -t 300
#SBATCH -c 8         # we assume an application which scales to 8 threads, but
                     # -c / --cpus-per-task can be omitted (default is 1)
                     # or set to a different value.
#SBATCH -o <your logfile prefix>_%j.log
  
</file>
  
<WRAP center round info 95%>
The number of tasks (given by ''-n'') times the number of CPUs per task (given by ''-c'') needs to equal the number of nodes (given by ''-N'') times the number of CPUs per node (to be inferred from ''scontrol show node <nodename>'' or from the [[nodes|wiki]]). For the header above this works out as: 24 tasks * 8 CPUs per task = 192 CPUs = 3 nodes * 64 CPUs per node on Mogon I. Or, in bash terms:

<code bash>
# ensure that
(( SLURM_CPUS_PER_TASK * SLURM_NTASKS == SLURM_NNODES * SLURM_CPUS_ON_NODE ))
</code>
</WRAP>
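
How the individual tasks are launched inside such a multi-node allocation is not shown in this excerpt. One common pattern (a sketch; ''work.sh'' and the input names are hypothetical, and on recent Slurm versions the per-step ''--exclusive'' option is called ''--exact'') is to let GNU Parallel drive one ''srun'' job step per task:

<code bash>
# start one job step per task; srun places each step on a node with free
# CPUs within the allocation, --exclusive keeps steps from sharing CPUs
parallel -j "$SLURM_NTASKS" \
    srun -N1 -n1 -c "$SLURM_CPUS_PER_TASK" --exclusive ./work.sh {} ::: input_*.dat
</code>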
  
====== SLURM multiprog for uneven arrays ======