start:working_on_mogon:io_odds_and_ends:slurm_localscratch

Differences

This shows you the differences between two versions of the page.

start:working_on_mogon:io_odds_and_ends:slurm_localscratch [2021/06/10 17:21]
meesters
start:working_on_mogon:io_odds_and_ends:slurm_localscratch [2022/06/20 18:05] (current)
meesters [Signalling in SLURM -- difference between signalling submission scripts and applications] - minor grammar fixes and removed doubled lines
Line 1: Line 1:
 ====== Local Scratch Space ======
  
-On every node, there is local scratch space available to your running jobs that you should use if required by your jobs IO-pattern.
+On every node, there is local scratch space available to your running jobs.
 Every job can therefore use a directory called ''/localscratch/${SLURM_JOB_ID}/'' on the local disk. If a job array starts, this directory is also called ''/localscratch/${SLURM_JOB_ID}/''; the variable ''SLURM_ARRAY_TASK_ID'' is the index of a subjob within the job array and is unrelated to ''$SLURM_JOB_ID''.
 +
 +<callout type="info" icon="true" title="When to use Local Scratch">
 +If your jobs merely read and write big files linearly, there is no need to use local scratch or a ramdisk. However, there are scenarios where using the local scratch might be beneficial:
 +  * if your job produces many temporary files
 +  * if your job reads a file or a set of files in a directory repeatedly during run time (with multiple threads or concurrent jobs, this results in a random access pattern on the global file system, which is a true performance killer)
 +</callout>
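
The second scenario above can be sketched as a "copy once, read locally" pattern. This is only a minimal sketch under assumptions: ''INPUT_DIR'', ''RESULT_DIR'' and the file-counting step that stands in for the application are hypothetical, and the ''mktemp'' fallbacks exist only so the sketch can be tried outside of a Slurm job.

```shell
#!/bin/bash
# Sketch: stage input into node-local scratch, work there, copy results back.
# INPUT_DIR, RESULT_DIR and the "application" step are hypothetical placeholders.
set -euo pipefail

# Inside a Slurm job the job directory is /localscratch/${SLURM_JOB_ID}/.
# Outside of a job (assumption, for trying the sketch) fall back to a temp dir.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    JOBDIR="/localscratch/${SLURM_JOB_ID}"
else
    JOBDIR="$(mktemp -d)"
fi

# Hypothetical locations on the global file system (temp dirs if unset).
INPUT_DIR="${INPUT_DIR:-$(mktemp -d)}"
RESULT_DIR="${RESULT_DIR:-$(mktemp -d)}"

mkdir -p "$JOBDIR/input" "$JOBDIR/tmp"

# 1. Read the input from the global file system exactly once.
cp -r "$INPUT_DIR/." "$JOBDIR/input/"

# 2. Let the application read repeatedly from the local copy and write its
#    temporary files locally. Placeholder: count the staged files.
find "$JOBDIR/input" -type f | wc -l > "$JOBDIR/tmp/file_count.txt"

# 3. Copy only the final result back to the global file system.
cp "$JOBDIR/tmp/file_count.txt" "$RESULT_DIR/"
```

The global file system then sees one linear read and one small write, while all repeated or random access happens on the local disk.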
  
 <callout type="info" icon="true"> <callout type="info" icon="true">
Line 85: Line 91:
 $ sbatch --signal=SIGUSR2@600 ...
 </code> </code>
-This would send the signal ''SIGUSR2'' to the application ten minutes before hitting the walltime of the job. Note that the slurm documentation states that there is uncertainty of up to 1 minute.
+This would send the signal ''SIGUSR2'' to the application ten minutes before hitting the walltime of the job. Note that the slurm documentation states that there is an uncertainty of up to 1 minute.
  
 **Usually** this requires you to use
Line 93: Line 99:
 </code> </code>
  
-or rather
-
-<code bash>
-#SBATCH --signal=B:SIGUSR2@600 ...
-</code>
-
-withing a submission script to signal the batch-job (instead of all the children of but not the batch job itselft). The reason is: If using a submission script like the one above, you trap the signal within the script, not the application.
+within a submission script to signal the batch job (instead of all the children of the batch job but not the batch job itself). The reason is: if you use a submission script like the one above, you trap the signal within the script, not the application.
  
 </callout> </callout>
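
To illustrate trapping the signal in the script rather than in the application, here is a minimal sketch of a submission script that catches ''SIGUSR2'' in the batch shell and passes it on. The names ''my_application'', ''forward_signal'', ''APP_PID'' and ''SIGNALLED'' are hypothetical, and the ''sleep'' merely stands in for a real program that would install its own handler.

```shell
#!/bin/bash
#SBATCH --signal=B:SIGUSR2@600   # signal the batch shell 600 s before the walltime

# Hypothetical placeholder for the real application.
my_application() { sleep 1; }

SIGNALLED=0

# This handler runs in the batch shell, not in the application.
forward_signal() {
    SIGNALLED=1
    echo "caught SIGUSR2 in the submission script" >&2
    # Pass the signal on; the application may already be gone, so ignore errors.
    kill -USR2 "$APP_PID" 2>/dev/null || true
}
trap forward_signal USR2

# Run the application in the background: bash executes a trap only after the
# current foreground command has finished, whereas the builtin 'wait' returns
# as soon as a trapped signal arrives.
my_application &
APP_PID=$!

# 'wait' may have been interrupted by the signal, so repeat it until the
# application has really exited.
while kill -0 "$APP_PID" 2>/dev/null; do
    wait "$APP_PID" || true
done
```

Instead of forwarding the signal, the handler could equally copy results from the job's local scratch directory back to the global file system; which of the two fits depends on whether the application can react to signals itself.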
  • start/working_on_mogon/io_odds_and_ends/slurm_localscratch.1623338503.txt.gz
  • Last modified: 2021/06/10 17:21
  • by meesters