

# OpenMPI

To compile your code with OpenMPI, you need to load an OpenMPI module:

```
module load mpi/OpenMPI/<version-compiler-version>
mpicc [compilation_parameter] -o <executable> <input_file.c> [input_file2.c ...]
```

Since F90 has some problems with the OpenMPI build compiled with the system compiler, you need to load the MPI module matching the compiler you intend to use. At the moment there are modules for the following compilers:

| Compiler version | OpenMPI module |
| --- | --- |
| gcc/4.4.7 | mpi/openmpi/1.10/gcc/4.4.7 |
| gcc/4.9.3 | mpi/openmpi/1.10/gcc/4.9.3 |
| gcc/5.1.0 | mpi/openmpi/1.10/gcc/5.1.0 |
| gcc/5.3.0 | mpi/openmpi/1.10/gcc/5.3.0 |
| intel/composer/2013_4_183 | mpi/openmpi/1.10/intel/composer/2013 |
| intel/composer/2016 | mpi/openmpi/1.10/intel/composer/2016 |

```
module load <compiler/of/your/choice>
module load <mpi/openmpi/1.10/compiler/version>
```
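A complete sequence might look like the following; the gcc/5.3.0 pairing is taken from the table above, and the file and executable names are purely illustrative:

```
# load a compiler and the OpenMPI module built against it
module load gcc/5.3.0
module load mpi/openmpi/1.10/gcc/5.3.0

# compile with the MPI wrapper compiler
mpicc -O2 -o hello_mpi hello_mpi.c
```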

To execute your program you need to have the correct OpenMPI module loaded (see Compilation above).

You execute your program by running it with srun, which behaves like mpirun/mpiexec because the MPI modules are compiled against Slurm.

One example would be:

```
#!/bin/bash

#SBATCH -N 2                 # the number of nodes
#SBATCH -p nodeshort         # on Mogon I
#SBATCH -p parallel          # on Mogon II
#SBATCH -t <sufficient time>
#SBATCH --mem <sufficient memory, if the default per node is not sufficient>
#SBATCH -J <jobname>

# M1 - example

# M2 - example

srun -N2 -n <should fit the number of MPI ranks> <your application>
```
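A filled-in version of the template above could look like this; the partition, run time, module versions, core count per node, and program name are illustrative assumptions, not recommendations:

```
#!/bin/bash

#SBATCH -N 2
#SBATCH -p parallel          # Mogon II; use nodeshort on Mogon I
#SBATCH -t 00:30:00
#SBATCH -J hello_mpi

# load the same modules that were used at compile time
module load gcc/5.3.0
module load mpi/openmpi/1.10/gcc/5.3.0

# 2 nodes x 64 ranks per node = 128 MPI ranks
# (assuming 64 cores per node -- check your partition)
srun -N2 -n 128 ./hello_mpi
```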

In the case of hybrid applications (multiprocessing + threading), see to it that `-c` in your Slurm parameterization (the number of threads per process) times `-n` (the number of ranks) equals the number of cores per node. This might not always be the best choice, as some applications profit from the hyperthreading on Mogon II or saturate the FPUs on Mogon I. In that case you should experiment to find the optimal performance. Please do not hesitate to ask for advice if you do not know how to approach this problem.
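The `-n` / `-c` arithmetic above can be sketched as follows; the 64 cores per node and 2 nodes are assumptions, so substitute the values for your partition:

```shell
CORES_PER_NODE=64       # assumed node size -- check your partition
THREADS_PER_RANK=8      # value passed to -c
NODES=2                 # value passed to -N

# ranks per node so that ranks * threads fills each node exactly
RANKS_PER_NODE=$(( CORES_PER_NODE / THREADS_PER_RANK ))
# total rank count, i.e. the value passed to -n
TOTAL_RANKS=$(( RANKS_PER_NODE * NODES ))

echo "srun -N ${NODES} -n ${TOTAL_RANKS} -c ${THREADS_PER_RANK} <your application>"
```

With these assumed numbers the script prints `srun -N 2 -n 16 -c 8 <your application>`.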

In order to run larger OpenMPI jobs it might be necessary to increase the memory for your job. Here are a few hints on how much memory OpenMPI needs to function. Since this is largely dependent on the application you are running, consider this a guideline for the memory MPI uses to communicate.

| Number of cores | Memory demand (`-M <value>`) |
| --- | --- |
| 64 | default |
| 128 | 512 MByte (`-M 512000`) |
| 256 | 768 MByte (`-M 768000`) |
| 512 | 1280 MByte (`-M 1280000`) |
| 1024 | may be problematic, see below |
| 2048 | may be problematic, see below |
| 4096 | may be problematic, see below |

Attention: for jobs with more than 512 cores there might be problems with execution. Depending on the communication scheme used by MPI, the job might fail due to memory limits.

Every MPI program needs some time to start up and get ready for communication. This time increases with the number of cores. Here are some rough numbers on how much time MPI needs to start up properly.

| Number of cores | Startup time |
| --- | --- |
| up to 256 | 5 - 10 sec |
| up to 2048 | 20 - 30 sec |
| 4096 | ~40 sec |