Using the MOAB Workload Manager - MOAB MPICH2 Job

Example 3: Submitting a parallel job with mvapich2

To compile the trapezoid program using the Intel compiler, we first have to make sure it is in our path by executing:

$ module load intel
$ module load intel-mvapich2

We can check if we have the right compiler by running which mpicc and mpicc -v. To compile our program we run:

$ mpicc -o trap-mpichv2 trap.c -lm

Note: The above command may produce the message: "warning: feupdateenv is not implemented and will always fail." This is normal and no cause for concern unless you know for certain that your compilation requires the use of this function.

To submit the program to the batch system, you must create a startup script with the appropriate topology. Here is an example that requests 8 nodes, and thus a total of 8 processes. The host file that mvapich2 uses must list each host as host:N, where N is the number of processes to run on that host. The mpdboot processes are started at one per node, and mpirun should start N processes per node. The argument to mpdboot must match the number of nodes, and the argument to mpirun must match the number of nodes times the number of processes per node.

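#!/bin/bash

# Request 8 nodes, merge stderr into stdout, and allow 60 minutes of walltime.
#MOAB -l nodes=8
#MOAB -j oe
#MOAB -l walltime=60:00

# Load the MVAPICH2 environment built with the Intel compiler.
module load intel-mvapich2

# Run the executable compiled above from the submission directory.
mpirun "$PBS_O_WORKDIR/trap-mpichv2"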

This script first sets up the MVAPICH2 environment for the Intel compiler, and then mpirun starts the executable trap-mpichv2 on 8 nodes. In this example we have set the walltime to 1 hour (60 minutes). Although setting the walltime is not mandatory, it makes it easier for the scheduler to schedule your job. Be sure not to underestimate the walltime; overestimate it slightly instead.
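The advice about padding the walltime can be sketched as a small shell helper. This is only an illustration: the function name walltime_with_margin and the 20 percent margin are hypothetical choices, not site policy. It prints the value in HH:MM:SS form, which can be used in a walltime request.

```bash
#!/bin/bash
# Hypothetical helper: pad an estimated runtime (in minutes) by a percentage
# and print it as HH:MM:SS for use in a "-l walltime=..." request.
walltime_with_margin() {
    local minutes=$1 margin_pct=$2
    # Integer arithmetic, rounding up to the next whole minute.
    local padded=$(( (minutes * (100 + margin_pct) + 99) / 100 ))
    printf '%02d:%02d:00\n' $(( padded / 60 )) $(( padded % 60 ))
}

# A run estimated at 50 minutes, padded by an illustrative 20 percent.
walltime_with_margin 50 20
```

With these inputs the helper prints 01:00:00, which corresponds to the 60 minutes requested in the script above.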

If the file is saved as trap-mpichv2.sh, the job can be submitted with:

$ msub trap-mpichv2.sh 

The above script makes the MOAB job submission very flexible. The number of processes you request is satisfied not by a fixed topology but by whichever nodes have free CPUs. For example, a 128-process job may end up running on 64 nodes with 2 free CPUs each, on 128 nodes with 1 free CPU each, or on some combination of the two.
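Because the layout is decided at run time, it can be useful to print what the scheduler actually granted. The following is a minimal sketch. It assumes the resource manager sets PBS_NODEFILE to a file with one host name per granted slot, as Torque does; that is an assumption about this cluster, suggested only by the script's use of PBS_O_WORKDIR. The function names slots_per_node and total_slots are hypothetical.

```bash
#!/bin/bash
# Summarize a Torque-style node file (assumed: one host name per granted slot)
# as "host:N" lines, where N is the number of slots on that host.
slots_per_node() {
    sort "$1" | uniq -c | awk '{ print $2 ":" $1 }'
}

# The total number of processes is the number of lines in the node file.
total_slots() {
    awk 'END { print NR }' "$1"
}

# Inside a running job, report the layout if the node file is available.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    slots_per_node "$PBS_NODEFILE"
    echo "total processes: $(total_slots "$PBS_NODEFILE")"
fi
```

Placed in the job script before the mpirun line, this would write the per-node counts to the job output in the same host:N form described earlier. Outside a job, where PBS_NODEFILE is unset, it prints nothing.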

This example script can be found on the cluster in /panfs/storage.local/system/tutorial/example2/trap-mpichv2.sh.



 
