MPI

We recommend always using the latest versions available!

The following MPI implementations are currently available on MOGON: OpenMPI, Intel MPI and MVAPICH2 (see the sections below).

The corresponding modulefiles are all in the mpi/ namespace.
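
To see what is installed, you can, for instance, list that namespace (the exact module names and versions shown depend on the software stack you have loaded):

module avail mpi/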

Documentation

Most MPI modules provide library-specific MPI documentation as manpages, e.g.:

module load mpi/<your MPI flavor>
man MPI_Abort

Benchmarks

If your software can be linked against different MPI versions, you can choose the optimal one for your configuration, e.g. depending on package size or number of threads.

MOGON 3

These measurements have not been performed yet.

MPI Implementations

OpenMPI

Compilation OpenMPI

To compile your code with OpenMPI you need to load an OpenMPI module.

module load mpi/OpenMPI/<version-compiler-version>
mpicc [compilation_parameter] -o <executable> <input_file.c> [input_file2.c ...]
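
For illustration, a compile run might look like this, using the MOGON I module version from the job script example below; the source file and executable names are placeholders:

module load mpi/OpenMPI/2.0.2-GCC-6.3.0   # example version, check 'module avail mpi/OpenMPI'
mpicc -O2 -o my_mpi_app my_mpi_app.c      # -O2 chosen as an example optimization flag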

Execution

To execute your program you need to have the correct OpenMPI module loaded (see Compilation OpenMPI above).

You execute your program by running it with srun, which behaves like mpirun/mpiexec, since the MPI modules are compiled against Slurm.

One example would be:

#!/bin/bash

#SBATCH -N 2 # the number of nodes
#SBATCH -p nodeshort # on MOGON I
#SBATCH -p parallel  # on MOGON II
#SBATCH -A <your slurm account>
#SBATCH -t <sufficient time>
#SBATCH --mem <sufficient memory, if default / node is not sufficient>
#SBATCH -J <jobname>

#M1 - example
#module load mpi/OpenMPI/2.0.2-GCC-6.3.0

# M2 - example
#module load mpi/OpenMPI/2.0.2-GCC-6.3.0-2.27-opa

srun -N2 -n <should fit the number of MPI ranks> <your application>

In the case of hybrid applications (multiprocessing + threading), see to it that -c in your Slurm parameterization (the number of threads per process) times the number of ranks per node equals the number of cores per node. This might not always be the best choice, as some applications might profit from the hyperthreading on MOGON II or saturate the FPUs on MOGON I; in that case you should experiment to find the optimal performance. Please do not hesitate to ask for advice if you do not know how to approach this problem.
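
A minimal sketch of such a hybrid job script, assuming (for illustration only) a node with 64 cores; account, time, module version and application name are placeholders as above:

#!/bin/bash

#SBATCH -N 1                     # one node
#SBATCH -p parallel              # on MOGON II
#SBATCH -A <your slurm account>
#SBATCH -t <sufficient time>
#SBATCH -J hybrid_example
#SBATCH -n 16                    # 16 MPI ranks ...
#SBATCH -c 4                     # ... with 4 threads each: 16 * 4 = 64 cores

module load mpi/OpenMPI/<version-compiler-version>

# make the thread count visible to OpenMP-based applications
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

srun -n 16 -c 4 <your hybrid application>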

Memory

In order to run larger OpenMPI jobs it might be necessary to increase the memory for your job. The table below is a rough guideline for the memory OpenMPI itself needs for communication; the actual demand largely depends on the application you are running.

Number of cores    Memory demand (--mem <value>)
64                 default
128                512 MByte (--mem 512M)
256                768 MByte (--mem 768M)
512                1280 MByte (--mem 1280M)
1024               may be problematic, see below
2048               may be problematic, see below
4096               may be problematic, see below

Attention:

For jobs with more than 512 cores there might be problems with execution: depending on the communication scheme used by MPI, the job might fail due to memory limits.
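
For example, a job with 512 MPI ranks could request the value suggested in the table above by adding the corresponding option to its job script:

#SBATCH -n 512
#SBATCH --mem 1280M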

Startup time

Every MPI program needs a certain amount of time to start up before it is ready to communicate, and this time increases with the number of cores. Here are some rough numbers on how much time MPI needs to properly start up.

Number of cores    Startup time
up to 256          5 - 10 sec
up to 2048         20 - 30 sec
4096               ~40 sec

Intel MPI

Compilation IntelMPI

To compile your code with IntelMPI you need to load an IntelMPI module.

module load mpi/impi/<version-compiler-version>
mpicc [compilation_parameter] -o <executable> <input_file.c> [input_file2.c ...]

Execution

To execute your program you need to have the correct IntelMPI module loaded (see Compilation IntelMPI above), i.e. the same one you used to compile your software.

One example would be:

#!/bin/bash

#SBATCH -N 2
#SBATCH -p nodeshort # on MOGON I
#SBATCH -p parallel  # on MOGON II
#SBATCH -A <your slurm account>
#SBATCH -t <sufficient time>
#SBATCH --mem <sufficient memory, if default / node is not sufficient>
#SBATCH -J <jobname>

#M1 - example
#module load mpi/impi/2017.2.174-iccifort-2017.2.174-GCC-6.3.0

srun -N 2 -n 64  <mpi-application>

Use -n to specify the total number of MPI ranks; to control the number of ranks per node, use --ntasks-per-node.

MVAPICH2

(MPI-3.1 over OpenFabrics-IB, Omni-Path, OpenFabrics-iWARP, PSM, and TCP/IP)

MVAPICH2 is an MPI-3.1 implementation. Based on the MPI 3.1 standard, it delivers performance, scalability and fault tolerance for high-end computing systems and servers using InfiniBand, Omni-Path, Ethernet/iWARP, and RoCE networking technologies.

Features

The current release supports the following ten underlying transport interfaces:

  • OFA-IB-CH3: This interface supports all InfiniBand compliant devices based on the OpenFabrics Gen2 layer. This interface has the most features and is most widely used. For example, this interface can be used over all Mellanox InfiniBand adapters, IBM eHCA adapters and Qlogic adapters.
  • OFA-IB-Nemesis: This interface supports all InfiniBand compliant devices based on the OpenFabrics libibverbs layer with the emerging Nemesis channel of the MPICH2 stack. This interface can be used by all Mellanox InfiniBand adapters.
  • OFA-iWARP-CH3: This interface supports all iWARP compliant devices supported by OpenFabrics. For example, this layer supports Chelsio T3 adapters with the native iWARP mode.
  • OFA-RoCE-CH3: This interface supports the emerging RoCE (RDMA over Converged Ethernet) interface for Mellanox ConnectX-EN adapters with 10/40GigE switches. It provides support for RoCE v1 and v2.
  • TrueScale(PSM-CH3): This interface provides native support for TrueScale adapters from Intel over PSM interface. It provides high-performance point-to-point communication for both one-sided and two-sided operations.
  • Omni-Path(PSM2-CH3): This interface provides native support for Omni-Path adapters from Intel over PSM2 interface. It provides high-performance point-to-point communication for both one-sided and two-sided operations.
  • Shared-Memory-CH3: This interface provides native shared memory support on multi-core platforms where communication is required only within a node, such as on SMP-only systems, laptops, etc.
  • TCP/IP-CH3: The standard TCP/IP interface (provided by MPICH2) to work with a range of network adapters supporting the TCP/IP interface. This interface can also be used with the IPoIB (TCP/IP over InfiniBand) support of InfiniBand. However, it will not deliver good performance/scalability compared to the other interfaces.
  • TCP/IP-Nemesis: The standard TCP/IP interface (provided by the MPICH2 Nemesis channel) to work with a range of network adapters supporting the TCP/IP interface. This interface can also be used with the IPoIB (TCP/IP over InfiniBand) support of InfiniBand. However, it will not deliver good performance/scalability compared to the other interfaces.
  • Shared-Memory-Nemesis: This interface provides native shared memory support on multi-core platforms where communication is required only within a node, such as on SMP-only systems, laptops, etc.

For even more information, visit the MVAPICH homepage.

Compilation

To compile your code with MVAPICH2 you need to load an appropriate module.

module load mpi/MVAPICH2/<version-compiler-version>-slurm
mpicc [compilation_parameter] -o <executable> <input_file.c> [input_file2.c ...]

Execution

Applications are executed in the same way as with OpenMPI: run them with srun inside your job script.