Introduction to GPU accelerated jobs
Currently we have 31 nodes in the yoshi cluster (ygpu01-ygpu31) equipped with GPU boards. The exact hardware config is:
- 2x NVIDIA Tesla M2070
- 2x Xeon X5570
- 24GB RAM
- QDR Infiniband between all GPU nodes
In order to use the GPU cards, you need to allocate them through the queuing system using the --gres=gpu:2 option. You can also use just one card by submitting with --gres=gpu:1. You also have to explicitly state the partition to run in using --partition=gpu-main (or gpu-test for the GPU test queue).
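For illustration, a minimal sketch of the corresponding sbatch directives for a two-GPU job could look like this; the job name, time limit, memory request and the program to run are placeholders that you need to adapt:

#!/bin/bash
#SBATCH --job-name=gpu-job          # placeholder job name
#SBATCH --partition=gpu-main        # or gpu-test for the 2 hour test queue
#SBATCH --nodes=1
#SBATCH --gres=gpu:2                # request both GPU cards (use gpu:1 for a single card)
#SBATCH --time=01:00:00             # placeholder time limit
#SBATCH --mem=1G                    # placeholder memory request

srun ./my_gpu_program               # placeholder for your GPU application

Submit such a script with sbatch <scriptname>.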
GROMACS example using GPU acceleration
Here I give a simple example using GROMACS. First I'll use an interactive session to explore the GPU feature; at the end I'll supply a complete batch script for use with sbatch.
dreger@yoshi:~/gpu> sinfo | grep gpu
gpu-test     up    2:00:00      1   idle ygpu01
gpu-main     up   infinite     30   idle ygpu[02-31]
The test partition gpu-test, which consists of the single node ygpu01, will most likely be free since it has a time limit of 2 hours. So we'll use that for testing:
dreger@yoshi:~/gpu> srun --time=02:00:00 --nodes=1 --tasks=8 --gres=gpu:2 --partition=gpu-test --mem=1G --pty /bin/bash
dreger@ygpu01:~/gpu> env | grep CUDA
CUDA_VISIBLE_DEVICES=0,1
dreger@ygpu01:~/gpu> nvidia-smi
Thu Jun 18 14:16:19 2015
+------------------------------------------------------+
| NVIDIA-SMI 340.65     Driver Version: 340.65        |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M2070         Off  | 0000:14:00.0     Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |      9MiB /  5375MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla M2070         Off  | 0000:15:00.0     Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |      9MiB /  5375MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|  No running compute processes found                                         |
+-----------------------------------------------------------------------------+
The nvidia-smi command shows some information about the GPUs; at the moment no processes are running on them. Now we'll start a simple GROMACS computation:
dreger@ygpu01:~/gpu> module load gromacs/non-mpi/4.6.7-cuda
dreger@ygpu01:~/gpu> genbox -box 9 9 9 -p -cs spc216 -o waterbox.gro
dreger@ygpu01:~/gpu> grompp -f run.mdp -c waterbox.gro -p topol.top
dreger@ygpu01:~/gpu> mdrun
[...]
Using 2 MPI threads
Using 4 OpenMP threads per tMPI thread
2 GPUs detected:
  #0: NVIDIA Tesla M2070, compute cap.: 2.0, ECC: yes, stat: compatible
  #1: NVIDIA Tesla M2070, compute cap.: 2.0, ECC: yes, stat: compatible
2 GPUs auto-selected for this run.
Mapping of GPUs to the 2 PP ranks in this node: #0, #1
[...]
               Core t (s)   Wall t (s)        (%)
       Time:      262.880       34.401      764.2
                 (ns/day)    (hour/ns)
Performance:       25.121        0.955
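Finally, here is a sketch of a complete batch script for use with sbatch that corresponds to the interactive session above. The resource requests mirror the srun example; the job name and time limit are placeholders, and run.mdp plus topol.top are assumed to already exist in the working directory, as in the interactive run:

#!/bin/bash
#SBATCH --job-name=gromacs-gpu      # placeholder job name
#SBATCH --partition=gpu-main
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --gres=gpu:2
#SBATCH --mem=1G
#SBATCH --time=04:00:00             # placeholder time limit

module load gromacs/non-mpi/4.6.7-cuda

# Build a 9x9x9 nm water box, prepare the run input, then let mdrun
# auto-select both GPUs as in the interactive session above.
genbox -box 9 9 9 -p -cs spc216 -o waterbox.gro
grompp -f run.mdp -c waterbox.gro -p topol.top
mdrun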
