MPI

We recommend always using the latest available versions!

The following MPI implementations are currently available on MOGON:

The corresponding modulefiles are all in the mpi/ namespace.
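To find out which versions are actually installed, you can query the module system. A sketch, assuming a standard Environment-Modules/Lmod setup (the exact commands and output depend on the cluster):

```shell
# List all modulefiles in the mpi/ namespace:
module avail mpi/

# On Lmod-based systems, "module spider" searches the entire module tree,
# including modules not visible with the currently loaded toolchain:
module spider mpi
```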

Documentation

Most MPI modules provide library-specific MPI documentation as manpages, e.g.:

$ module load mpi/<your MPI flavor> 
$ man MPI_Abort

Benchmarks

If your software allows linking against different MPI versions, you can choose the optimal one for your configuration, i.e. the typical package (message) size and the number of threads.

nodeshort

The average latency in microseconds was measured for common MPI operations; the standard deviation of these measurements was also calculated. Package sizes increase from 2 bytes to 4096 bytes. The maximum number of threads was exhausted for each node.



OpenMPI

To compile your code with OpenMPI you need an OpenMPI module.

module load mpi/OpenMPI/<version-compiler-version>
mpicc [compilation_parameter] -o <executable> <input_file.c> [input_file2.c ...]
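As a concrete sketch (the module version is only an example taken from the job script below; hello_mpi.c is a hypothetical file name):

```shell
# Example: compile a trivial MPI program with OpenMPI.
# The module version is an assumption -- check `module avail mpi/OpenMPI`
# for the versions actually installed.
module load mpi/OpenMPI/2.0.2-GCC-6.3.0

# hello_mpi.c is a hypothetical minimal MPI program:
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

mpicc -O2 -o hello_mpi hello_mpi.c
```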

To execute your program, the OpenMPI module used for compilation (see above) must be loaded.

Run your program with srun, which behaves like mpirun/mpiexec, since the MPI modules are compiled against Slurm.

One example would be:

#!/bin/bash
 
#SBATCH -N 2 # the number of nodes
#SBATCH -p nodeshort # on MOGON I (use only one of the two partition lines)
#SBATCH -p parallel  # on MOGON II
#SBATCH -A <your slurm account>
#SBATCH -t <sufficient time>
#SBATCH --mem <sufficient memory, if default / node is not sufficient>
#SBATCH -J <jobname>
 
#M1 - example
#module load mpi/OpenMPI/2.0.2-GCC-6.3.0 
 
# M2 - example
#module load mpi/OpenMPI/2.0.2-GCC-6.3.0-2.27-opa
 
srun -N2 -n <should fit the number of MPI ranks> <your application>

In the case of hybrid applications (multiprocessing + threading), see to it that -c in your Slurm parameterization (the number of threads per process) times -n (the number of ranks) equals the number of cores per node. This might not always be the best choice, as some applications profit from the hyperthreading on MOGON II or saturate the FPUs on MOGON I; in this case you should experiment to find the optimal performance. Please do not hesitate to ask for advice if you do not know how to approach this problem.
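A sketch of such a hybrid job, assuming a hypothetical 64-core node with 8 ranks of 8 threads each (partition, account, version and core counts are placeholders):

```shell
#!/bin/bash
#SBATCH -N 2            # two nodes
#SBATCH -n 16           # 16 MPI ranks in total, i.e. 8 per node
#SBATCH -c 8            # 8 threads per rank: 8 ranks x 8 threads = 64 cores/node
#SBATCH -p parallel     # placeholder partition
#SBATCH -A <your slurm account>
#SBATCH -t 01:00:00

# Module version is an example -- use the one your application was built with:
module load mpi/OpenMPI/2.0.2-GCC-6.3.0

# Propagate the per-rank core count to the threading runtime (OpenMP here):
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

srun ./hybrid_application
```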

In order to run larger OpenMPI jobs it might be necessary to increase the memory for your job. The following table gives a rough guideline for the memory OpenMPI itself needs for communication; the actual demand depends largely on the application you are running.

Number of cores | Memory demand (--mem <value>)
  64            | default
 128            | 512 MByte (--mem 512M)
 256            | 768 MByte (--mem 768M)
 512            | 1280 MByte (--mem 1280M)
1024            | may be problematic, see below
2048            | may be problematic, see below
4096            | may be problematic, see below

Attention: For jobs with more than 512 cores there might be problems during execution: depending on the communication scheme used by MPI, the job might fail due to memory limits.

Every MPI program needs some time to start up and get ready for communication, and this time increases with the number of cores. Here are some rough numbers for MPI start-up times:

Number of cores | Startup time
 256            | 5 - 10 sec
2048            | 20 - 30 sec
4096            | ~40 sec

IntelMPI

To compile your code with IntelMPI you need an IntelMPI module.

module load mpi/impi/<version-compiler-version>
mpicc [compilation_parameter] -o <executable> <input_file.c> [input_file2.c ...]

To execute your program, the IntelMPI module used to compile your software (see Compilation above) must be loaded.

One example would be:

#!/bin/bash
 
#SBATCH -N 2
#SBATCH -p nodeshort # on MOGON I
#SBATCH -p parallel  # on MOGON II
#SBATCH -A <your slurm account>
#SBATCH -t <sufficient time>
#SBATCH --mem <sufficient memory, if default / node is not sufficient>
#SBATCH -J <jobname>
 
#M1 - example
#module load mpi/impi/2017.2.174-iccifort-2017.2.174-GCC-6.3.0
 
srun -N 2 -n 64  <mpi-application>

Use -n to specify the total number of MPI ranks.
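Note that with srun, -n requests a total task count, not a per-node one; a per-node layout can be requested explicitly with --ntasks-per-node. A sketch (node and rank counts are placeholders):

```shell
# Two equivalent ways to place 64 ranks on 2 nodes (counts are examples):
srun -N 2 -n 64 <mpi-application>                  # 64 ranks in total
srun -N 2 --ntasks-per-node=32 <mpi-application>   # 32 ranks on each node
```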

MVAPICH2

(MPI-3.1 over OpenFabrics-IB, Omni-Path, OpenFabrics-iWARP, PSM, and TCP/IP)

MVAPICH2 is an implementation of the MPI-3.1 standard that provides performance, scalability and fault tolerance for high-end computing systems and servers using InfiniBand, Omni-Path, Ethernet/iWARP, and RoCE networking technologies.

The current release supports the following ten underlying transport interfaces:

  • OFA-IB-CH3: This interface supports all InfiniBand compliant devices based on the OpenFabrics Gen2 layer. This interface has the most features and is most widely used. For example, this interface can be used over all Mellanox InfiniBand adapters, IBM eHCA adapters and Qlogic adapters.
  • OFA-IB-Nemesis: This interface supports all InfiniBand compliant devices based on the OpenFabrics libibverbs layer with the emerging Nemesis channel of the MPICH2 stack. This interface can be used by all Mellanox InfiniBand adapters.
  • OFA-iWARP-CH3: This interface supports all iWARP compliant devices supported by OpenFabrics. For example, this layer supports Chelsio T3 adapters with the native iWARP mode.
  • OFA-RoCE-CH3: This interface supports the emerging RoCE (RDMA over Converged Ethernet) interface for Mellanox ConnectX-EN adapters with 10/40GigE switches. It provides support for RoCE v1 and v2.
  • TrueScale(PSM-CH3): This interface provides native support for TrueScale adapters from Intel over PSM interface. It provides high-performance point-to-point communication for both one-sided and two-sided operations.
  • Omni-Path(PSM2-CH3): This interface provides native support for Omni-Path adapters from Intel over PSM2 interface. It provides high-performance point-to-point communication for both one-sided and two-sided operations.
  • Shared-Memory-CH3: This interface provides native shared memory support on multi-core platforms where communication is required only within a node, such as on SMP-only systems, laptops, etc.
  • TCP/IP-CH3: The standard TCP/IP interface (provided by MPICH2) to work with a range of network adapters supporting the TCP/IP interface. This interface can also be used with IPoIB (TCP/IP over InfiniBand) support of InfiniBand. However, it will not deliver good performance/scalability compared to the other interfaces.
  • TCP/IP-Nemesis: The standard TCP/IP interface (provided by the MPICH2 Nemesis channel) to work with a range of network adapters supporting the TCP/IP interface. This interface can also be used with IPoIB (TCP/IP over InfiniBand) support of InfiniBand. However, it will not deliver good performance/scalability compared to the other interfaces.
  • Shared-Memory-Nemesis: This interface provides native shared memory support on multi-core platforms where communication is required only within a node, such as on SMP-only systems, laptops, etc.

For even more information, visit the MVAPICH homepage.

To compile your code with MVAPICH2 you need an appropriate module.

module load mpi/MVAPICH2/<version-compiler-version>-slurm
mpicc [compilation_parameter] -o <executable> <input_file.c> [input_file2.c ...]

Applications are executed the same way as with OpenMPI, i.e. via srun.

  • start/development/parallelization/mpi.txt
  • Last modified: 2020/04/16 19:31
  • by jrutte02