Hyperion Cluster: Using Open MPI

The Hyperion Cluster uses Open MPI for its MPI library. Detailed information on Open MPI can be found on the Open MPI website. For information on how to compile programs using Open MPI, please see the Open MPI FAQ: Compiling MPI Applications.

Open MPI is installed in /usr/local/openmpi. Since many parallel programs use Fortran, and Fortran libraries built with one compiler are rarely compatible with code built by another, Open MPI has been compiled with each of the three compiler suites we support: GCC, Portland Group (PGI), and Intel. Under /usr/local/openmpi there is a separate directory for each compiler suite:

  • /usr/local/openmpi/gcc
  • /usr/local/openmpi/intel
  • /usr/local/openmpi/pgi

As of February 2009, 64-bit versions of Open MPI have been installed for the GCC, Portland Group (PGI), and Intel compilers. 32-bit programs are not supported. Underneath each compiler subdirectory is another directory named x86_64 (to avoid ambiguity), and underneath that are the directories containing the complete Open MPI installation for each compiler family:

$ ls /usr/local/openmpi/gcc/x86_64/
bin  etc  include  lib  share

$ ls /usr/local/openmpi/pgi/x86_64
bin  etc  include  lib  share

$ ls /usr/local/openmpi/intel/x86_64
bin  etc  include  lib  share

Non-architecture-specific files, such as man pages and other documentation, can be found under the etc/ and share/ directories. Most important to users are the man pages, located under share/man:

  • /usr/local/openmpi/gcc/x86_64/share/man
  • /usr/local/openmpi/pgi/x86_64/share/man
  • /usr/local/openmpi/intel/x86_64/share/man
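
For example, you can read one of these man pages before setting up your MANPATH (as described below) by pointing man directly at the appropriate directory with its -M option:

$ man -M /usr/local/openmpi/gcc/x86_64/share/man mpirun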

Since different users will prefer or need different compilers, it is up to each individual user to set their PATH, MANPATH, and (if necessary) LD_LIBRARY_PATH to include the correct directories for Open MPI. For example, if you are using the PGI compilers with Open MPI and you use BASH for your shell, you would add the following to your .bashrc file:

export PATH=/usr/local/openmpi/pgi/x86_64/bin:${PATH}
export MANPATH=/usr/local/openmpi/pgi/x86_64/share/man:${MANPATH}

If you use tcsh for your shell, you would add something like this to your .cshrc file:

setenv PATH /usr/local/openmpi/pgi/x86_64/bin:${PATH}
setenv MANPATH /usr/local/openmpi/pgi/x86_64/share/man:${MANPATH}

Keep in mind that these examples are for the PGI compilers. If you are using GCC or the Intel compilers, replace "pgi" with "gcc" or "intel", respectively. You will also still need to add the location of your chosen compiler to your PATH. By default, the GCC compilers are already in your PATH; if you'd like to use the PGI or Intel compilers, you will also need to add one of these directories to your PATH (see the example after this list):

  • PGI compilers: /usr/local/pgi/linux86-64/current/bin
  • Intel compilers: /usr/local/intel/Compiler/current/bin/intel64
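
For example, a BASH user who wants the PGI compilers could add something like this to their .bashrc (substitute the Intel directory listed above, or the tcsh setenv equivalent, as appropriate):

# Put the PGI compilers on my PATH (directory listed above)
export PATH=/usr/local/pgi/linux86-64/current/bin:${PATH}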

When compiling, it is recommended that you use the appropriate Open MPI wrapper compiler command (mpicc, mpiCC, mpic++, mpicxx, mpif77, mpif90) instead of calling your preferred compiler directly. This is discussed in the Open MPI FAQ: Compiling MPI Applications.
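
For example, once your environment is set up as described below, compiling is simply a matter of calling the appropriate wrapper command with your usual compiler options (the source file names here are placeholders):

$ mpicc  -o my_c_program   my_c_program.c      # C
$ mpiCC  -o my_cxx_program my_cxx_program.cpp  # C++
$ mpif90 -o my_f90_program my_f90_program.f90  # Fortran 90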

Step-by-Step Instructions for Compiling and Running MPI Applications

First, you need to write some code that uses MPI. For an example program, I'm using this C program:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include "mpi.h"

int main(int argc, char* argv[]){
  int my_rank;
  int p;
  int source;
  int dest;
  int tag=0;
  char message[100];
  char my_name[20];
  MPI_Status status;

  /* Start up MPI */
  MPI_Init(&argc, &argv);
  
  /* Find out process rank */
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
  
  /* Find out number of processes */
  MPI_Comm_size(MPI_COMM_WORLD, &p);

  /* What's my hostname? */
  gethostname(my_name, 20);    
  sleep(45);

  if (my_rank == 0) {
    printf("MPIHello running on %i processors.\n", p);
    printf("Greetings from processor %i, on host %s.\n", my_rank, my_name);
    for (source=1; source<p; source++) {
      MPI_Recv(message, 100, MPI_CHAR, source, tag, MPI_COMM_WORLD, &status);
      printf("%s", message);
    }
  } else if (my_rank != 0) {
    sprintf(message, "Greetings from processor %i, on host %s.\n", my_rank, my_name);
    dest=0;
    MPI_Send(message, strlen(message)+1, MPI_CHAR, dest, tag, MPI_COMM_WORLD); 
  }
  sleep(45);

  /* Shut down MPI */
  MPI_Finalize();
  return 0;
}

This code is based on an example in "Parallel Programming with MPI" [Pacheco, Peter S., p. 41], with some modifications. Feel free to copy it and try to compile and run it yourself.

For these examples, I will be using the PGI compilers and I use BASH as my shell, so I added the following to my .bashrc file:

# Add correct Open MPI binaries to PATH
MPI=/usr/local/openmpi/pgi/x86_64
export PATH=${MPI}/bin:${PATH}
export LD_LIBRARY_PATH=${MPI}/lib:${LD_LIBRARY_PATH}

# Explicitly tell Open MPI which compilers to use
export OMPI_CC=pgcc
export OMPI_CXX=pgCC
export OMPI_F77=pgf77
export OMPI_FC=pgf95

This sets up my environment correctly by adding the correct compiler and Open MPI bin directories to my PATH and explicitly specifying the compilers I want Open MPI to use. If you will always be using the same compilers, you should add lines like these to your .bashrc or .cshrc file so they are set automatically every time you log in.
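
A quick way to confirm that everything is set correctly is to ask the shell which mpicc it will run; with the .bashrc lines above, you should see the PGI build:

$ which mpicc
/usr/local/openmpi/pgi/x86_64/bin/mpicc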

If you'd like to use the Intel Compilers, you should add this to your .bashrc file:

# Add correct Open MPI binaries to my PATH
MPI=/usr/local/openmpi/intel/x86_64
export PATH=${MPI}/bin:${PATH}
export LD_LIBRARY_PATH=${MPI}/lib:${LD_LIBRARY_PATH}

# Explicitly tell Open MPI which compilers to use
export OMPI_CC=icc
export OMPI_CXX=icpc
export OMPI_F77=ifort
export OMPI_FC=ifort

If you're using the GCC compilers, they are already in your PATH by default, so you only need to add Open MPI settings to your .bashrc file:

# Add correct Open MPI binaries to my PATH
MPI=/usr/local/openmpi/gcc/x86_64
export PATH=${MPI}/bin:${PATH}
export LD_LIBRARY_PATH=${MPI}/lib:${LD_LIBRARY_PATH}

# Explicitly tell Open MPI which compilers to use
export OMPI_CC=gcc
export OMPI_CXX=g++
export OMPI_F77=gfortran
export OMPI_FC=gfortran

If you use csh or tcsh as your shell, don't worry - I haven't forgotten about you. Here are the same settings as above in csh syntax, which you can add to your .cshrc file.

For the PGI compilers, add this to your .cshrc file:

# Add correct Open MPI binaries to PATH
set MPI=/usr/local/openmpi/pgi/x86_64
setenv PATH ${MPI}/bin:${PATH}
setenv LD_LIBRARY_PATH ${MPI}/lib:${LD_LIBRARY_PATH}

# Explicitly tell Open MPI which compilers to use
setenv OMPI_CC pgcc
setenv OMPI_CXX pgCC
setenv OMPI_F77 pgf77
setenv OMPI_FC pgf95

For Intel compilers, add this to your .cshrc file:

# Add correct Open MPI binaries to PATH
set MPI=/usr/local/openmpi/intel/x86_64
setenv PATH ${MPI}/bin:${PATH}
setenv LD_LIBRARY_PATH ${MPI}/lib:${LD_LIBRARY_PATH}

# Explicitly tell Open MPI which compilers to use
setenv OMPI_CC icc
setenv OMPI_CXX icpc
setenv OMPI_F77 ifort
setenv OMPI_FC ifort

For GCC, add this to your .cshrc file:

# Add correct Open MPI binaries to PATH
set MPI=/usr/local/openmpi/gcc/x86_64
setenv PATH ${MPI}/bin:${PATH}
setenv LD_LIBRARY_PATH ${MPI}/lib:${LD_LIBRARY_PATH}

# Explicitly tell Open MPI which compilers to use
setenv OMPI_CC gcc
setenv OMPI_CXX g++
setenv OMPI_F77 gfortran
setenv OMPI_FC gfortran

To make sure that mpicc will use the correct compiler commands, pass the --showme switch to mpicc, or to whichever Open MPI wrapper compiler you will be using:

$ mpicc --showme
pgcc -I/usr/local/openmpi-1.2.8/pgi-8.0/x86_64/include -pthread 
-L/usr/local/openmpi-1.2.8/pgi-8.0/x86_64/lib -lmpi -lopen-rte -lopen-pal
-libverbs -lrt -lnuma -ldl -Wl,--export-dynamic -lnsl -lutil

$ mpiCC --showme
pgCC -D_REENTRANT -I/usr/local/openmpi-1.2.8/pgi-8.0/x86_64/include 
-L/usr/lib64 -L/usr/local/openmpi/pgi-8.0/x86_64/lib 
-L/usr/local/openmpi-1.2.8/pgi-8.0/x86_64/lib -lmpi_cxx -lmpi -lopen-rte 
-lopen-pal -libverbs -lrt -lnuma -ldl -Wl,--export-dynamic -lnsl -lutil 
-lpthread -ldl

$ mpif90 --showme
pgf95 -I/usr/local/openmpi-1.2.8/pgi-8.0/x86_64/include 
-I/usr/local/openmpi-1.2.8/pgi-8.0/x86_64/lib 
-L/usr/local/openmpi-1.2.8/pgi-8.0/x86_64/lib -lmpi_f90 -lmpi_f77 -lmpi 
-lopen-rte -lopen-pal -libverbs -lrt -lnuma -ldl -Wl,--export-dynamic 
-lnsl -lutil -lpthread -ldl

If everything above looks okay, then compile the program for real:

$ mpicc -o mpihello mpihello.c

Now create a job submission script. You can call it whatever you want. For this example, I'm using the name mpihello.sh:

#!/bin/bash
#$ -N mpihello
#$ -pe orte 16
#$ -cwd
#$ -V
#$ -R y

MPI=/usr/local/openmpi/pgi/x86_64
export PATH=${MPI}/bin:${PATH}
export LD_LIBRARY_PATH=${MPI}/lib:${LD_LIBRARY_PATH}
mpirun ./mpihello
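
Note that mpirun is invoked without an explicit process count. When Open MPI is built with Grid Engine support, it detects the slots allocated by the '-pe orte 16' request automatically; if that does not happen with your build, you can pass the count explicitly using Grid Engine's NSLOTS variable:

mpirun -np ${NSLOTS} ./mpihello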

If you're curious about what some of the switches above do, here is a brief description of each. For more details, please see the qsub man page:

-cwd          Execute the job from the current working directory. This switch
              will activate Grid Engine’s path aliasing facility, if the
              corresponding configuration files are present (see ge_aliases(5)).

-pe parallel_environment n[-[m]]|[-]m,...
              Parallel programming environment (PE) to instantiate. For more
              detail about PEs, please see sge_types(1).

-N name       The name of the job. The name should follow the "name"
              definition in sge_types(1). Invalid job names will be denied
              at submit time.

              If the -N option is not present, Grid Engine assigns the name
              of the job script to the job after any directory pathname has
              been removed from the script-name. If the script is read from
              standard input, the job name defaults to STDIN.

-R y[es]|n[o]
              Indicates whether a reservation for this job should be done.
              Reservation is never done for immediate jobs, i.e. jobs
              submitted using the -now yes option. Please note that
              regardless of the reservation request, job reservation might
              be disabled using max_reservation in sched_conf(5) and might
              be limited only to a certain number of high priority jobs.

              By default jobs are submitted with the -R n option.

-V            Available for qsub, qsh, qrsh with command and qalter.

              Specifies that all environment variables active within the
              qsub utility be exported to the context of the job.

For parallel jobs, you should always use the '-R y' option, which will create a "reservation" for your parallel job.

Now we can finally submit this job to the cluster with the qsub command:

$ qsub mpihello.sh 
Your job 2270 ("mpihello") has been submitted

After submitting the job, you can check the status with the qstat command:

$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID 
-----------------------------------------------------------------------------------------------------------------
   2270 0.55500 mpihello   prentice     r     02/05/2009 14:45:30 all.q@node45.hyperion               16    
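
If you want more detail about a pending or running job, the standard Grid Engine commands qstat -j and qdel can be used to inspect or remove it (using the job ID reported by qsub):

$ qstat -j 2270
$ qdel 2270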

When the job completes, the standard output and standard error will be written to files named jobname.ojobid and jobname.ejobid, respectively. If the parallel environment produces any messages or errors, they will be written to the files jobname.pojobid and jobname.pejobid:

-rw-r--r-- 1 prentice admin    0 Feb  5 14:45 mpihello.e2270
-rw-r--r-- 1 prentice admin  857 Feb  5 14:47 mpihello.o2270
-rw-r--r-- 1 prentice admin    0 Feb  5 14:45 mpihello.pe2270
-rw-r--r-- 1 prentice admin    0 Feb  5 14:45 mpihello.po2270

Notice that all of the output files, except for the standard output file, are empty. That's a good sign: it means there were no errors! Now we can use more to examine the output:

$ more  mpihello.o2270
MPIHello running on 16 processors.
Greetings from processor 0, on host node45.hyperion.
Greetings from processor 1, on host node45.hyperion.
Greetings from processor 2, on host node45.hyperion.
Greetings from processor 3, on host node45.hyperion.
Greetings from processor 4, on host node45.hyperion.
Greetings from processor 5, on host node45.hyperion.
Greetings from processor 6, on host node45.hyperion.
Greetings from processor 7, on host node45.hyperion.
Greetings from processor 8, on host node41.hyperion.
Greetings from processor 9, on host node41.hyperion.
Greetings from processor 10, on host node41.hyperion.
Greetings from processor 11, on host node41.hyperion.
Greetings from processor 12, on host node41.hyperion.
Greetings from processor 13, on host node41.hyperion.
Greetings from processor 14, on host node41.hyperion.
Greetings from processor 15, on host node41.hyperion.