====== Node-local scheduling ======
There are some use cases where you would want to simply request a **full cluster node** from slurm and then run **many** //(e.g. many more than 64)// **small** //(e.g. each only a fraction of the total job runtime)// tasks on this full node. In that case you need some **local scheduling** on the node to ensure proper utilization of all cores.

To accomplish this, we suggest you use [[https://www.gnu.org/software/parallel/|GNU parallel]].
Now, of course, we could submit 150 jobs using slurm, or we could use one job which processes the files one after another, but the most elegant way is to submit one job for 64 cores (e.g. a whole node on Mogon I) and process the files in parallel. This is especially convenient, since we can then use the ''/localscratch'' directory of that node.
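What GNU parallel does for us can be sketched with plain bash job control. The following stand-alone snippet is only an illustration (the file names and the word-count "work" are invented); GNU parallel adds proper scheduling, logging and error handling on top of this idea:

```shell
#!/bin/bash
# Sketch of node-local scheduling with plain bash job control.
# (Illustration only - GNU parallel automates this throttling.)
MAXJOBS=4                         # at most 4 tasks at once

workdir=$(mktemp -d)              # scratch directory for the demo
for i in $(seq 1 10); do
    echo "some input data $i" > "$workdir/data_$i.txt"
done

for f in "$workdir"/data_*.txt; do
    ( wc -w < "$f" > "${f%.txt}.out" ) &   # per-file work, in the background
    while (( $(jobs -rp | wc -l) >= MAXJOBS )); do
        sleep 0.1                 # throttle until a slot is free
    done
done
wait                              # wait for the remaining tasks

ls "$workdir"/data_*.out | wc -l  # prints 10
```

With GNU parallel the whole loop collapses to a single line, e.g. ''parallel -j 4 "wc -w < {} > {.}.out" ::: data_*.txt''.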
<file bash parallel_job>
#!/bin/bash

#SBATCH --job-name=demo_gnu_parallel
#SBATCH --output=res_gnu_parallel.txt
#SBATCH --ntasks=4
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100
#SBATCH -p short
#SBATCH -A <your account>
# will load the most recent version
module load tools/parallel

# Store working directory to be safe
SAVEDPWD=$(pwd)
# set jobdir
export JOBDIR=/localscratch/$SLURM_JOB_ID

# suppose we want to process 150 data files, we need to create them for the purpose of the example:
for ((i=0; i < 151; i++)); do
    fname="data_${i}.txt"
    echo {0..4} >> "$fname"
    echo "end of file ${i}" >> "$fname"
done

# First, we copy the input data files to the local filesystem of our node
# (we pretend this is useful - an actual use case would be a program with random I/O on those files)
cp data_*.txt "$JOBDIR"
# Change directory to jobdir
cd "$JOBDIR"

# we could set the number of threads for the program to use like this:
# export OMP_NUM_THREADS=4
# but in this case the program is not threaded

# -t enables verbose output to stderr
# We could also set -j $SLURM_NTASKS to make the number of concurrent tasks explicit
# The --delay parameter is used to distribute I/O load at the beginning of the job
# by delaying the start of consecutive tasks
# --progress will output the current progress of the parallel task execution
# {} will be replaced by each filename
# {#} will be replaced by the consecutive job number
# Both variants will have equal results:
#parallel -t -j 16 --delay 1 --progress "wc -w < {} > {.}.out" ::: data_*.txt
find . -name 'data_*.txt' | parallel -t -j $SLURM_NTASKS --delay 1 --progress "wc -w < {} > {.}.out"
# See the GNU Parallel documentation for more examples and explanation

# Now capture exit status code, parallel will have set it to the number of failed tasks
STATUS=$?

# Copy output data back to the previous working directory
cp $JOBDIR/data_*.out $SAVEDPWD/

exit $STATUS
</file>
The job script is submitted with:

<code bash>
$ sbatch parallel_job
</code>
After this job has run, we should have the resulting ''data_*.out'' files in our working directory.
===== Multithreaded Programs =====
Let's further assume that our program is able to work in parallel itself using OpenMP.
We determined that ''OMP_NUM_THREADS=8'' is a reasonable setting for our program.
This means we can launch ''64 / 8 = 8'' concurrent processes on a full 64-core node.
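The arithmetic behind this can be checked directly in the shell (64 cores and 8 threads are the assumed example values for a Mogon I node):

```shell
#!/bin/bash
# how many 8-threaded processes fit on a 64-core node?
CORES_PER_NODE=64        # full node (example value)
OMP_NUM_THREADS=8        # threads per process (example value)

PROCESSES=$((CORES_PER_NODE / OMP_NUM_THREADS))
echo "$PROCESSES"        # prints 8 - a suitable -j value for GNU parallel
```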
<file bash>
#!/bin/bash
#SBATCH --job-name=demo_gnu_parallel
#SBATCH --output=res_gnu_parallel.txt
#SBATCH --cpus-per-task=8
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100
#SBATCH -p short
#SBATCH -A <your account>

# will load the most recent version
module load tools/parallel

# Store working directory to be safe
SAVEDPWD=$(pwd)

JOBDIR=/localscratch/$SLURM_JOB_ID
RAMDISK=$JOBDIR/ramdisk

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# -t enables verbose output to stderr
# We could also set -j $((SLURM_CPUS_ON_NODE / OMP_NUM_THREADS)) explicitly
# The --delay parameter is used to distribute I/O load at the beginning of the job
# by delaying the start of consecutive tasks
# --progress will output the current progress of the parallel task execution
# {} will be replaced by each filename
# {#} will be replaced by the consecutive job number
# Both variants will have equal results ('./openmp_program' is a placeholder):
#parallel -t -j 16 --delay 1 --progress "./openmp_program {} > {.}.out" ::: data_*.txt
find . -name 'data_*.txt' | parallel -t -j $((SLURM_CPUS_ON_NODE / OMP_NUM_THREADS)) --delay 1 --progress "./openmp_program {} > {.}.out"
# See the GNU Parallel documentation for more examples and explanation

# Now capture exit status code, parallel will have set it to the number of failed tasks
STATUS=$?

exit $STATUS
</file>

===== Running on several hosts =====

We do not recommend supplying a hostlist to GNU parallel (e.g. via its ''--sshlogin'' option). Instead, let ''srun'' place the tasks on the allocated nodes, as in the following example:

<file bash multi_host>
#!/bin/bash
#SBATCH -J <your meaningful job name>
#SBATCH -A <your account>
#SBATCH -p nodeshort # for Mogon I
#SBATCH -p parallel  # for Mogon II (keep only the line matching your cluster)
#SBATCH --nodes=3 # appropriate number of Nodes
#SBATCH -n 24 # example value for Mogon I, see below
#SBATCH -t 300
#SBATCH -c 8 # we assume an application which scales to 8 threads, but
             # -c / --cpus-per-task can be omitted (default is 1)
             # or set to a different value.
#SBATCH -o <your logfile prefix>_%j.log

# adjust / overwrite those two commands to enhance readability & overview
# parameterize srun: start each task exclusively, on one node, with the cpus reserved for it
srun="srun --exclusive -N1 -n1 -c $SLURM_CPUS_PER_TASK"
# parameterize parallel: one job slot per slurm task, with a job log to allow resuming
parallel="parallel --delay 0.2 -j $SLURM_NTASKS --joblog runtask.log --resume"

# your preprocessing goes here

# start the run with GNU parallel ('./your_application' is a placeholder)
$parallel "$srun ./your_application {1}" ::: <your input arguments / files>
</file>
<WRAP center round info 95%>
The number of tasks (given by ''-n'') multiplied by the number of cpus per task (given by ''-c'') should be equal to the total number of cpus on the allocated nodes:
<code bash>
# ensure the allocation is fully used
[ $((SLURM_CPUS_PER_TASK * SLURM_NTASKS)) -eq $((SLURM_CPUS_ON_NODE * SLURM_JOB_NUM_NODES)) ]
</code>
</WRAP>
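Outside of a job this check can be dry-run with mocked values; the variable names below are the ones Slurm exports inside a job, while the numbers are invented for a 3-node example:

```shell
#!/bin/bash
# sanity check: ntasks * cpus-per-task == cpus-per-node * number of nodes
# mocked stand-ins for the variables Slurm would export inside a job:
SLURM_NTASKS=24
SLURM_CPUS_PER_TASK=8
SLURM_CPUS_ON_NODE=64
SLURM_JOB_NUM_NODES=3

if [ $((SLURM_NTASKS * SLURM_CPUS_PER_TASK)) -eq \
     $((SLURM_CPUS_ON_NODE * SLURM_JOB_NUM_NODES)) ]; then
    RESULT="allocation fully used"
else
    RESULT="allocation NOT fully used"
fi
echo "$RESULT"    # prints: allocation fully used (24 * 8 == 64 * 3 == 192)
```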

====== SLURM multiprog for uneven arrays ======

The ''--multi-prog'' option of [[https://slurm.schedmd.com/srun.html|srun]] allows starting a different executable (with different arguments) per task, all described in a single configuration file:
<file bash master_slave_simple.sh>
#!/bin/bash
#
#SBATCH --job-name=test_ms
#SBATCH --output=res_ms.txt
# parameters of this snippet, choose sensible values for your setup
#SBATCH --ntasks=4
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100
# for the purpose of this course
#SBATCH -p short
#SBATCH -A <your account>

srun <other parameters> --multi-prog multi.conf
</file>

Then, of course, the ''multi.conf'' file has to be present. It could look like this:

<file bash multi.conf>
0      echo     'I am the master'
1-3    bash -c  'printenv SLURM_PROCID'
</file>
Indeed, as the naming suggests, you can use such a setup to emulate a master-slave environment. But then the processes have to take care of their communication themselves (e.g. using MPI).
The configuration file contains three fields per line, separated by whitespace:

  * Task number (a single number, a comma-separated list, or a range like ''1-3'')
  * Executable
  * Arguments

Parameters available:

  * ''%t'' - will be replaced with the task number of the responsible task
  * ''%o'' - will be replaced with the task's offset within its range
====== The ZDV taskfarm script, an alternative to multiprog ======

The script is publicly hosted; on the cluster it is available as the module ''tools/staskfarm''.
The slurm multi-prog setup can be difficult for some scenarios:

  * only one executable can be specified per task (e.g. no chain of commands or shell loops are possible, such as ''cd dir_%t && ./exe'')
  * there is a limitation on the maximum number of characters per task description (256)
  * building the multi-prog file can be onerous, if you cannot make use of the ''%t'' and ''%o'' parameters
  * the number of commands must match exactly the number of slurm tasks (''-n'')
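If neither ''%t'' nor ''%o'' fits your case, such a configuration file can simply be generated with a small loop. In this sketch ''./myprog'' and the input file names are placeholders:

```shell
#!/bin/bash
# generate a multi-prog configuration: one line per task, matching -n exactly
NTASKS=4
conf=$(mktemp)
for ((t = 0; t < NTASKS; t++)); do
    echo "$t ./myprog input_${t}.dat" >> "$conf"   # placeholder program/input
done
cat "$conf"
# inside a job this file would then be used as:
#   srun -n $NTASKS --multi-prog "$conf"
```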

[[job_arrays|Slurm Job Arrays]] are often a better option than multi-prog, unless the individual tasks are too short or too numerous to be scheduled as single jobs.

The taskfarm script makes using multi-prog setups easy. Please only use it if your tasks have roughly the same run time; otherwise large parts of the reserved nodes can be left idle.

For a full listing of the command line interface you can load the module and ask the script itself for help:
<code shell>
$ module load tools/staskfarm
$ staskfarm -h
</code>

===== Taskfarm: Working with one application on many files =====

<file bash taskfarm_file>
#!/bin/bash

#SBATCH -J taskfarm_example
#SBATCH -o taskfarm_example_%j.out
#SBATCH -N 2 # in this example we take 2 nodes
#SBATCH -n 128 # optional argument - the optimal setting (or omission) has to be tried on a case-by-case basis
#SBATCH -A <your account>
#SBATCH -p nodeshort

# will load the most recent module version of the taskfarm
module load tools/staskfarm

# - suppose we have a program which requires 2 inputs:
#   'program <sample>_R1.fastq <sample>_R2.fastq'
# - assume further we have 302 such file pairs
# - and we want to work on them in a round robin manner

# 1st we "create" our input files for the purpose of this example:
for ((i=0; i < 302; i++)); do
    touch "sample_${i}_R1.fastq"
    touch "sample_${i}_R2.fastq"
done

# 2nd, we specify our input command.
# Instead of a 'real' application we just echo the file names.
# And we use pattern expansion to retrieve the 2nd file name,
# as we cannot (always) loop over several expressions.
echo '#!/bin/bash' > cmd_file.sh
echo 'echo "$1 ${1/_R1/_R2}"' >> cmd_file.sh
chmod +x cmd_file.sh
cmd=$(pwd)/cmd_file.sh

# 3rd, start the taskfarm:
staskfarm $cmd *_R1.fastq

# finally, we need to clean up our mess:
rm *fastq cmd_file.sh
</file>
===== Taskfarm: Screening one application with many parameters =====

<WRAP center round alert 80%>
As stated, the most sensible use case for the taskfarm are applications with roughly equal run times for all inputs. When screening parameters in a simulation, it is likely that run times greatly depend on the parameters. In that case it might be better to use GNU parallel as shown above.
</WRAP>
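Such a parameter grid is easy to enumerate with bash brace expansion; five values for the first argument times two for the second give ten combinations:

```shell
#!/bin/bash
# enumerate the example grid: arg1 in 1..5, arg2 in 0..1
combinations=( {1..5},{0..1} )

echo "${combinations[0]}"      # prints: 1,0
echo "${#combinations[@]}"     # prints: 10
```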
<file bash taskfarm_file2>
#!/bin/bash

#SBATCH -J taskfarm_example2
#SBATCH -o taskfarm_example2_%j.out
#SBATCH -N 1
#SBATCH -n 4
#SBATCH -c 2
#SBATCH -A <your account>
#SBATCH --time 10
#SBATCH -p short

# will load the most recent module version of the taskfarm
module load tools/staskfarm

# - suppose we have a program which requires 2 inputs:
#   all natural numbers for arg1 between 1 and 5 and for arg2 just 0 and 1
# - assume further we want to run a permutation test

# 1st we "create" our input parameters:
permutations=$(echo {1..5},{0..1})

# 2nd, we specify our input command.
# Instead of a 'real' application we just echo the parameters.
echo '#!/bin/bash' > cmd_file.sh
echo 'echo "$@"' >> cmd_file.sh
chmod +x cmd_file.sh
cmd=$(pwd)/cmd_file.sh

# 3rd, start the taskfarm:
# NOTE: We start each application with 2 threads (-t 2)
#       and the '--parameters' flag signals that the arguments
#       are parameters, not file names
staskfarm -t 2 --parameters $cmd $permutations

# finally, we need to clean up our mess:
rm cmd_file.sh
</file>