The HPC cluster includes one Intel Xeon Phi node, with an associated general access queue, phi_q.
Xeon Phi is the brand name for Intel's line of co-processors based on the Many Integrated Core (MIC) architecture. The Xeon Phi is a PCI Express add-on card installed in a regular host computer. Each Xeon Phi card contains a 60-core CPU @ 1 GHz (four hardware threads per core, 240 threads total), 8 GB of RAM, and its own embedded Linux OS.
Submitting Phi Jobs
To submit a job to the Phi node, simply submit it to the general access phi_q queue. An example submit script is below:
#!/bin/bash
#MOAB -N "my_phi_job"
#MOAB -l nodes=1
#MOAB -j oe
#MOAB -q phi_q
#MOAB -l walltime=00:00:10

$PBS_O_WORKDIR/my_phi_program.sh
Execution Modes on the Phi
Jobs can be compiled in one of three ways for running on the Phi system:
- Host Only - Similar to running a regular HPC job, and the Phi card is not used.
- Phi Only - A program is compiled on the host and is then run entirely on the Phi card.
- Hybrid - Parts of the program run on the host, and parts run on the Phi card.
Compiling for the Phi
You can compile most jobs for the Phi Node on the Submit Node, the same way you would for any other job.
If you experience any compilation errors related to missing libraries, you may need to start an interactive job to compile your code directly on the Phi node. From the login node, run:
$ qsub -I -l nodes=1 -q phi_q -l walltime=1:00:00
You will see output similar to:
qsub: waiting for job 7608043.moab.local to start
It can take a very long time for the Phi node to become available. Once it becomes available and your job runs, you will see an interactive shell:
qsub: job 7608043.moab.local ready

[paulvdm@hpc-15-35 ~]$
Only Intel compilers support the Phi architecture and only the Intel MPI implementation supports running MPI programs on the Phi. All compilation occurs on the host system.
Keep in mind that the Phi runs its own flavor of Linux on a rather small system. It does not have access to general HPC libraries like NetCDF. If you need to use those libraries, you will need to compile them for the MIC architecture yourself in your home directory.
The Intel Math Kernel Library (MKL) is available for both the host (x86_64) and the MIC architectures. Just compile your software with the -mkl flag.
Compiling OpenMP Multi-threaded Applications
To compile for Phi Only mode, use the -mmic flag with icc or ifort. This creates a binary that can run only on the Phi:
[paulvdm@hpc-15-35 ~]$ module load intel
[paulvdm@hpc-15-35 ~]$ icc -o helloworld-openmp helloworld-openmp.c -mmic -openmp
[paulvdm@hpc-15-35 ~]$ ./helloworld-openmp
-bash: ./helloworld-openmp: cannot execute binary file
[paulvdm@hpc-15-35 ~]$ ssh mic0
[paulvdm@hpc-15-35-mic0 paulvdm]$ ./helloworld-openmp
Hello World from thread = 187
...
Number of threads = 240
To compile for Hybrid mode, you must use the offload engine. Your program runs on the host computer, but parts of the code are "offloaded" to the Phi.
In your code, this is done by marking the part of the code that needs to run on the Phi with the offload target(mic) pragma. It is usually an omp parallel for loop or a call to an MKL function.
You can specify which data you want to transfer between the host system and the Phi card using the in, out, and inout clauses. For example:
#pragma offload target(mic) out(a:length(N))
#pragma omp parallel for
for (i = 0; i < N; i++)
    a[i] = i;
[paulvdm@hpc-15-35 ~]$ icc -o offload offload.c -openmp
[paulvdm@hpc-15-35 ~]$ ./offload
Refer to the Intel Website for more information about the offload pragma.
Compiling for MPI Applications
MPI Applications for the Phi must be compiled with Intel's MPI compiler. If you create a Phi Only application, you must create an executable for the MIC architecture.
# Load the intel-impi-mic module and compile using mpicc or mpif90 with the -mmic flag.
#
# We advise using a prefix or suffix in the name of the executable that reflects the
# target architecture.
[paulvdm@hpc-15-35 example2]$ module load intel-impi-mic
[paulvdm@hpc-15-35 example2]$ mpicc -mmic -o trap-impi-mic trap.c -lm -openmp
[paulvdm@hpc-15-35 example2]$ ssh mic0
[paulvdm@hpc-15-35-mic0 paulvdm]$ cd tutorial/example2
[paulvdm@hpc-15-35-mic0 example2]$ mpiexec ./trap-impi-mic
If you create a Hybrid application, you must compile two versions of your app: one for the MIC architecture and one for the x86_64 host system architecture:
# Compile an executable for both the MIC and x86_64 architectures, then create
# a configuration file for mpiexec.hydra.
[paulvdm@hpc-15-35 example2]$ module purge
[paulvdm@hpc-15-35 example2]$ module load intel-impi-mic
[paulvdm@hpc-15-35 example2]$ which mpicc
/panfs/storage.local/opt/intel/impi/current/mic/bin/mpicc
[paulvdm@hpc-15-35 example2]$ mpicc -mmic -o trap-impi-mic trap.c -lm
[paulvdm@hpc-15-35 example2]$ module swap intel-impi-mic intel-impi-x86_64
[paulvdm@hpc-15-35 example2]$ which mpicc
/panfs/storage.local/opt/intel/impi/current/intel64/bin/mpicc
[paulvdm@hpc-15-35 example2]$ mpicc -cc=icc -o trap-impi-x86_64 trap.c -lm
[paulvdm@hpc-15-35 example2]$ cat myrun
-host hpc-15-35 -np 8 /home/paulvdm/tutorial/example2/trap-impi-x86_64
-host mic0 -np 60 /home/paulvdm/tutorial/example2/trap-impi-mic
[paulvdm@hpc-15-35 example2]$ mpiexec.hydra --configfile myrun
Warning: although you can use Intel MPI on the regular compute nodes, we advise you not to. It is not integrated with the resource manager, and it will not use our fast InfiniBand data network.
Compiling for OpenCL Programs
You must load the intel-oc module to gain access to the ioc64 offline compiler and the OpenCL libraries. The version installed on the HPC is 1.2-3.0.67279.
Below is a list of some errors you may encounter. If you find an error that is not listed here, please let us know.
hello-world-openmp.c(9): warning #161: unrecognized #pragma
  #pragma omp parallel private(nthreads, tid)
Are you compiling with the -openmp compiler flag? This is the warning you will see if you do not use it.
[paulvdm@hpc-15-35 example2]$ mpicc -o trap-impi-x86_64 trap.c -lm
ld: skipping incompatible /panfs/storage.local/opt/intel/impi/current/mic/lib/libmpigf.so when searching for -lmpigf
ld: skipping incompatible /panfs/storage.local/opt/intel/impi/current/mic/lib/libmpigf.a when searching for -lmpigf
ld: cannot find -lmpigf
Are you using the correct mpicc? Use the module list command to see which modules you have loaded, and which mpicc to see which mpicc is in your path.
[paulvdm@hpc-15-35 example2]$ module load intel-impi-x86_64
intel-impi-x86_64(6):ERROR:150: Module 'intel-impi-x86_64' conflicts with the currently loaded module(s) 'intel-impi-mic'
intel-impi-x86_64(6):ERROR:102: Tcl command execution failed: conflict gnu-mvapich gnu-mvapich2 gnu-openmpi intel-mvapich intel-mvapich2 pgi-mvapich pgi-mvapich2 pgi-openmpi pgi-new-openmpi pgi-new-mvapich2 pgi-benchmark-openmpi-1.6-2 intel-openmpi intel-mvapich2 intel-mvapich intel-impi-mic
You cannot load both the intel-impi-x86_64 and intel-impi-mic modules at the same time. Use module swap or module purge/module load. You will also get this error when you load an MPI module in your shell startup file (e.g., ~/.bashrc).
$ ./helloworld-openmp
./helloworld-openmp: error while loading shared libraries: libiomp5.so: cannot open shared object file: No such file or directory
This is a system error; please submit a ticket to us.
password:
If the system asks you for a password when you try to log in to the Phi or run a hybrid parallel program, you need to fix your ~/.ssh/authorized_keys file. The easiest way to do this is to delete your ~/.ssh directory (rm -rf ~/.ssh), log out, and then log back in to the HPC submit node.
$ ssh mic0
ssh: Could not resolve hostname mic0: Name or service not known
You can login to the Phi only from the host computer, which is currently hpc-15-35.
Phi hangs when attempting to login
Unfortunately, we have noticed some stability issues when mounting the Panasas filesystem on the Phi. If you encounter an error, please let us know.
[paulvdm@hpc-15-35 example2]$ mpiexec.hydra --configfile myrun
CMA: unable to get RDMA device list
librdmacm: couldn't read ABI version.
librdmacm: assuming: 4
You can disregard this notice; your program will run without error. There is a clash between the HPC InfiniBand drivers and the Intel InfiniBand-over-PCI-Express drivers.