Using the MOAB Workload Manager - MOAB MPICH2 Job

Example 3: Submitting a parallel job with mvapich2

To compile the trapezoid program with the Intel compiler, first make sure the compiler and its mvapich2 build are in your path by executing:

$ module load intel
$ module load intel-mvapich2

We can check if we have the right compiler by running which mpicc and mpicc -v. To compile our program we run:

$ mpicc -o trap-mpichv2 trap.c -lm

Note: The above command may produce the message: "warning: feupdateenv is not implemented and will always fail." This is normal and no cause for concern unless you know for certain that your compilation requires the use of this function.
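The compiler check described above can also be scripted so that a build stops early when the modules have not been loaded. This is a minimal sketch; the helper name have_cmd is hypothetical and not part of the cluster's tooling.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the named command is on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd mpicc; then
    echo "mpicc found at: $(command -v mpicc)"
else
    echo "mpicc not found; load the compiler modules first" >&2
fi
```

Running this before the compile step reports which mpicc will be used, which is the same information that which mpicc gives interactively.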

To submit the program to the batch system, you must create a startup script that requests the appropriate topology. Here is an example that requests 8 nodes with one process each, for a total of 8 processes. The host file that mvapich2 uses must list each host as host:N, where N is the number of processes to run on that host. The mpdboot processes are started at one per node, and mpirun should start N processes per node. The argument to mpdboot must therefore match the number of nodes, and the argument to mpirun must match the number of nodes times the number of processes per node.

#!/bin/bash

# Request 8 nodes
#MOAB -l nodes=8
# Join standard error and standard output into one file
#MOAB -j oe
# Limit the run time to 60 minutes
#MOAB -l walltime=60:00

# Set up the mvapich2 environment for the Intel compiler
module load intel-mvapich2

# Start the executable from the directory the job was submitted from
mpirun $PBS_O_WORKDIR/trap-mpichv2

This script first sets up the mvapich2 environment for the Intel compiler, and then mpirun starts the executable trap-mpichv2 on 8 nodes. In this example the walltime is set to 1 hour (60 minutes). Setting the walltime is not mandatory, but it makes it easier for the scheduler to schedule your job. Be sure not to underestimate the walltime; overestimate it a bit instead.
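The advice to overestimate the walltime slightly can be turned into simple arithmetic. The sketch below pads a runtime estimate by a percentage and prints it in HH:MM:SS form; the function name padded_walltime and the default 20% margin are illustrative assumptions, not values recommended by the scheduler.

```shell
#!/bin/bash
# Hypothetical helper: add a percentage safety margin to an estimate given
# in minutes and print the result as HH:MM:SS for a walltime request.
padded_walltime() {
    local minutes=$1
    local margin_pct=${2:-20}   # assumed default margin of 20%
    # Integer arithmetic, rounded up to the next whole minute.
    local total=$(( (minutes * (100 + margin_pct) + 99) / 100 ))
    printf '%02d:%02d:00\n' $(( total / 60 )) $(( total % 60 ))
}

padded_walltime 50      # 50 minutes plus 20% -> 01:00:00
padded_walltime 45 10   # 45 minutes plus 10% -> 00:50:00
```

The printed value can then be written into the walltime line of the job script.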

If the file is saved as trap-mpichv2.sh, the job can be submitted with:

$ msub trap-mpichv2.sh 

The above script makes the MOAB job submission very flexible. The number of processes you request is satisfied not by a fixed topology but by whichever nodes have free CPUs. For example, a job submitted with nodes=128 may end up running on 64 nodes with 2 free CPUs each, on 128 nodes with 1 free CPU each, or on some linear combination of the two.
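To see how a job was actually laid out, the node list can be summarized in the host:N form. This sketch assumes the resource manager provides a PBS-style node file with one line per allocated CPU, typically exposed as $PBS_NODEFILE; that is an assumption about the cluster setup, and the function names layout_hosts and layout_summary are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: collapse a node list (one line per allocated CPU)
# into "host:N" lines, where N is the number of processes on that host.
layout_hosts() {
    sort "$1" | uniq -c | awk '{ print $2 ":" $1 }'
}

# Hypothetical helper: report the number of distinct nodes and the total
# number of processes in the same node list.
layout_summary() {
    local procs nodes
    procs=$(grep -c . "$1")
    nodes=$(sort -u "$1" | grep -c .)
    echo "nodes=$nodes procs=$procs"
}

# Made-up allocation: 4 processes spread over 2 nodes with 2 free CPUs each.
demo=$(mktemp)
printf 'n1\nn2\nn1\nn2\n' > "$demo"
layout_hosts "$demo"
layout_summary "$demo"
rm -f "$demo"
```

Inside a running job the same functions could be called on "$PBS_NODEFILE" instead of the demo file, provided that variable is set on your system.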

This example script can be found on the cluster in /panfs/storage.local/system/tutorial/example2/trap-mpichv2.sh.