From Fedora Project Wiki
Revision as of 21:22, 22 July 2009

This is a draft document

Introduction

Message Passing Interface (MPI) is an API for parallelizing programs across multiple nodes and has been around since 1994 [http://en.wikipedia.org/wiki/Message_Passing_Interface]. MPI can also be used for parallelization on SMP machines, where it is considered very efficient as well: close to 100% scaling on parallelizable code, compared to the ~80% commonly obtained with threads because of suboptimal memory allocation on NUMA machines. Before MPI, nearly every supercomputer manufacturer had its own programming language for writing programs; MPI made porting software easy.

There are many MPI implementations available, such as LAM-MPI (obsoleted by Open MPI), Open MPI (the MPI compiler used in RHEL), MPICH (not yet in Fedora), MPICH2, and MVAPICH1 and MVAPICH2 (not yet in Fedora).

Some MPI libraries work better on some hardware than others, and some software works best with a particular MPI library, so the choice of library must be made at the user level. In addition, people doing high performance computing may want to use more efficient compilers, so it must be possible to have several builds of the same library, compiled with different compilers, installed at the same time. This must be taken into account when writing spec files.

Packaging of MPI compilers

MPI compilers MUST be installed (including binaries, man pages, etc.) in %{_libdir}/%{name}/%{version}-<compiler>, where <compiler> is normally gcc in Fedora.

The runtime of MPI compilers (mpirun, the libraries, the manuals, etc.) MUST be packaged into %{name}, and the development headers and libraries into %{name}-devel.

As the compiler is installed outside PATH (so that several MPI compilers and builds can be installed side by side), the relevant environment variables must be loaded before the compiler can be used or MPI programs can be run. This is done using environment modules.

The module file MUST prepend the MPI bindir %{_libdir}/%{name}/%{version}-<compiler>/bin to the user's PATH and set LD_LIBRARY_PATH to %{_libdir}/%{name}/%{version}-<compiler>/lib. Files MUST NOT be placed in /etc/ld.so.conf.d.
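The effect that loading such a module file should have on the environment can be sketched in plain shell. This is only an illustration of the two requirements above, not an actual module file (those are written in the environment modules' own syntax); the prefix /usr/lib64/openmpi/1.3.1-gcc is a hypothetical expansion of %{_libdir}/%{name}/%{version}-<compiler>.

```shell
#!/bin/sh
# Hypothetical install prefix, standing in for
# %{_libdir}/%{name}/%{version}-<compiler> (assumed values, not from a real package).
MPI_HOME="/usr/lib64/openmpi/1.3.1-gcc"

# Requirement 1: prepend the MPI bindir to the user's PATH.
PATH="$MPI_HOME/bin:$PATH"

# Requirement 2: set LD_LIBRARY_PATH to the MPI libdir.
LD_LIBRARY_PATH="$MPI_HOME/lib"

export PATH LD_LIBRARY_PATH
```

Because the library directory is made known through LD_LIBRARY_PATH only while the module is loaded, nothing needs to be registered in /etc/ld.so.conf.d.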

If the packager wishes to provide alternatives support, it MUST be placed in a subpackage so that alternatives support does not need to be installed if not wished for.

The MPI compiler package MUST provide RPM macros that make loading and unloading the support easy in spec files, e.g. by placing the following in /etc/rpm/macros.openmpi:

%_openmpi_load \
 . /etc/profile.d/modules.sh; \
 module load openmpi-%{_arch}; \
 export CFLAGS="$CFLAGS %{optflags}"
%_openmpi_unload \
 . /etc/profile.d/modules.sh; \
 module unload openmpi-%{_arch};

Loading and unloading the compiler in spec files is then as easy as %{_openmpi_load} and %{_openmpi_unload}.

Packaging of MPI software

If the software supports it, it MUST also be packaged in serial mode (for instance: foo). The MPI-enabled bits MUST be placed in a subpackage whose suffix denotes the MPI compiler used (for instance: foo-mpi for Open MPI, the traditional MPI compiler in Fedora, and foo-mpich2 for MPICH2).

To prevent name clashes, the binaries of the software placed in %{_bindir} have to be suffixed with the name of the MPI compiler (e.g. bar_mpi (for Open MPI) or bar_mpich2), or be placed in %{_libdir}/%{name}/%{version}-<MPI compiler>/bin with a module file placed in the modules directory as instructed in environment modules.
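The suffixing option can be sketched as the kind of shell step that might run in an install phase. Everything here is hypothetical and chosen only for illustration: the temporary directory stands in for the build root, the stage directory for wherever the MPI-enabled build placed its binaries, and bar for the binary name.

```shell
#!/bin/sh
set -eu

# Hypothetical paths and names, for illustration only.
buildroot=$(mktemp -d)           # stand-in for the RPM build root
stage="$buildroot/stage/bin"     # where the MPI-enabled build put its binaries
bindir="$buildroot/usr/bin"      # stand-in for %{_bindir}
suffix="_mpi"                    # _mpi for Open MPI, _mpich2 for MPICH2

mkdir -p "$stage" "$bindir"
touch "$bindir/bar"              # serial binary, already installed as "bar"
touch "$stage/bar"               # MPI-enabled binary with the same name

# Install every MPI-enabled binary under a suffixed name so that it cannot
# clash with the serial binary or with builds against other MPI compilers.
for f in "$stage"/*; do
    install -m 0755 "$f" "$bindir/$(basename "$f")$suffix"
done

ls "$bindir"
```

After the loop, the stand-in bindir holds both the serial bar and the suffixed bar_mpi; repeating the step with suffix="_mpich2" would add a third binary without touching the other two.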

Note on libraries
If a package compiled with multiple MPI compilers contains shared libraries, the different library versions MUST have different suffixes too (_mpi, _mpich2 and so on); otherwise the libraries built against different MPI compilers will clash.

When packaging MPI-enabled software, the packager MUST package at least a version compiled against Open MPI. Packages made against other MPI compilers in Fedora SHOULD be made, but that is left up to the maintainer.

The packages MUST have explicit Requires on the MPI runtime used, as rpm might not pick up the correct version. (This still needs to be checked; at least libmpi may be provided by all of them.)