====== Introduction to GPU accelerated jobs ======

<note info>This information is severely outdated</note>
  
Currently we have 31 nodes in the yoshi cluster (ygpu01-ygpu31) equipped with GPU boards. The exact hardware config is:
Here I give a simple example using GROMACS. First I'll use an [[interactivesessions|interactive session]] to explore the GPU feature; at the end I'll supply a complete batch script for use with ''sbatch''.
  
<xterm>
dreger@yoshi:~/gpu> **sinfo | grep gpu**
gpu-test     up    2:00:00      1   idle ygpu01
gpu-main     up   infinite     30   idle ygpu[02-31]
</xterm>

The test partition ''gpu-test'', which consists of the single node ygpu01, will most likely be free, since it has a time limit of 2 hours. So we'll use that for testing:

<xterm>
dreger@yoshi:~/gpu> **srun --time=02:00:00 --nodes=1 --tasks=8 --gres=gpu:2 --partition=gpu-test --mem=1G --pty /bin/bash**
dreger@ygpu01:~/gpu> **env | grep CUDA**
CUDA_VISIBLE_DEVICES=0,1
dreger@ygpu01:~/gpu> **nvidia-smi**
Thu Jun 18 14:16:19 2015
+------------------------------------------------------+
| NVIDIA-SMI 340.65     Driver Version: 340.65         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M2070         Off  | 0000:14:00.0     Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |      9MiB /  5375MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla M2070         Off  | 0000:15:00.0     Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |      9MiB /  5375MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|  No running compute processes found                                         |
+-----------------------------------------------------------------------------+
</xterm>
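
If you only need a single GPU, you can request it with ''--gres=gpu:1''; Slurm will then restrict the job to one device and set ''CUDA_VISIBLE_DEVICES'' accordingly. A minimal sketch (the reported device index may differ depending on which GPU you are allocated):

<xterm>
dreger@yoshi:~/gpu> **srun --time=00:30:00 --nodes=1 --tasks=4 --gres=gpu:1 --partition=gpu-test --mem=1G --pty /bin/bash**
dreger@ygpu01:~/gpu> **env | grep CUDA**
CUDA_VISIBLE_DEVICES=0
</xterm>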

The ''nvidia-smi'' command gives some information about the GPUs. Currently no process is running on them. We'll start a simple GROMACS computation:

<xterm>
dreger@ygpu01:~/gpu> **module load gromacs/non-mpi/4.6.7-cuda**
dreger@ygpu01:~/gpu> **genbox -box 9 9 9 -p -cs spc216 -o waterbox.gro**
dreger@ygpu01:~/gpu> **grompp -f {{:services:cluster:run.mdp|}} -c waterbox.gro -p {{:services:cluster:topol.top|}}**
dreger@ygpu01:~/gpu> **mdrun**
[...]
Using 2 MPI threads
Using 4 OpenMP threads per tMPI thread

2 GPUs detected:
  #0: NVIDIA Tesla M2070, compute cap.: 2.0, ECC: yes, stat: compatible
  #1: NVIDIA Tesla M2070, compute cap.: 2.0, ECC: yes, stat: compatible

2 GPUs auto-selected for this run.
Mapping of GPUs to the 2 PP ranks in this node: #0, #1
[...]
               Core t (s)   Wall t (s)        (%)
       Time:      262.880       34.401      764.2
                 (ns/day)    (hour/ns)
Performance:       25.121        0.955
</xterm>
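
Here ''mdrun'' chose the thread layout and GPU mapping automatically. If you want to control it yourself, GROMACS 4.6 accepts the ''-ntmpi'', ''-ntomp'' and ''-gpu_id'' options; a sketch that reproduces the automatic choice above (2 thread-MPI ranks with 4 OpenMP threads each, mapped to GPUs 0 and 1) would be:

<xterm>
dreger@ygpu01:~/gpu> **mdrun -ntmpi 2 -ntomp 4 -gpu_id 01**
</xterm>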

While your job runs you can log in to the node and call ''nvidia-smi'' to see whether the GPUs are being used at all:

<xterm>
dreger@ygpu01:~> **nvidia-smi**
Thu Jun 18 14:25:21 2015
+------------------------------------------------------+
| NVIDIA-SMI 340.65     Driver Version: 340.65         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M2070         Off  | 0000:14:00.0     Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |     67MiB /  5375MiB |     76%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla M2070         Off  | 0000:15:00.0     Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |     67MiB /  5375MiB |     77%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0     11481  mdrun                                                 55MiB |
|    1     11481  mdrun                                                 55MiB |
+-----------------------------------------------------------------------------+
</xterm>
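
If you are not sure which node your job is running on, ''squeue'' will show it. The job ID, name and runtime below are just placeholders:

<xterm>
dreger@yoshi:~> **squeue -u $USER**
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
             12345  gpu-test     bash   dreger  R       8:15      1 ygpu01
</xterm>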

Please check your job log files to see whether your program has problems using the GPUs. In the case of GROMACS this might look like:

<xterm>
**NOTE: GPU(s) found, but the current simulation can not use GPUs
      To use a GPU, set the mdp option: cutoff-scheme = Verlet
      (for quick performance testing you can use the -testverlet option)**

Using 8 MPI threads

2 GPUs detected:
  #0: NVIDIA Tesla M2070, compute cap.: 2.0, ECC: yes, stat: compatible
  #1: NVIDIA Tesla M2070, compute cap.: 2.0, ECC: yes, stat: compatible

**2 compatible GPUs detected in the system, but none will be used.
Consider trying GPU acceleration with the Verlet scheme!**
</xterm>

In this case a ''cutoff-scheme'' was specified that cannot be used with GPU acceleration.

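The fix is the one suggested in the log: set ''cutoff-scheme = Verlet'' in your mdp file (or pass ''-testverlet'' to ''mdrun'' for a quick test). Checking the option could look like this, assuming the file is called ''run.mdp'' as above:

<xterm>
dreger@ygpu01:~/gpu> **grep cutoff-scheme run.mdp**
cutoff-scheme            = Verlet
</xterm>
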
Compare the timings with a test run on the same node that does not use the GPUs. In some cases the GPUs will not help at all, even though ''nvidia-smi'' shows a high utilization. For this example without GPUs (note the missing ''-cuda'' in the module load command) we get:

<xterm>
dreger@ygpu01:~/gpu> **module load gromacs/non-mpi/4.6.7**
dreger@ygpu01:~/gpu> **grompp -f run.mdp -c waterbox.gro -p topol.top**
dreger@ygpu01:~/gpu> **mdrun**

               Core t (s)   Wall t (s)        (%)
       Time:      844.970      106.315      794.8
                 (ns/day)    (hour/ns)
Performance:        8.128        2.953
</xterm>

So in this case the calculation runs about three times faster with the two GPU cards (34.4 s vs. 106.3 s wall time).

===== Example batch file =====

A job script for the example given above could look like:

<xterm>
#!/bin/bash

# send a mail notification when the job ends
#SBATCH --mail-user=dreger@physik.fu-berlin.de
#SBATCH --mail-type=end

# %j in the file names is replaced by the job id
#SBATCH --output=job%j.out
#SBATCH --error=job%j.err
# request 8 tasks with 1024 MB per CPU and 2 GPUs on a single node of gpu-main
#SBATCH --ntasks=8
#SBATCH --mem-per-cpu=1024
#SBATCH --time=01:00:00
#SBATCH --gres=gpu:2
#SBATCH --nodes=1
#SBATCH --partition=gpu-main

module load gromacs/non-mpi/4.6.7-cuda

# tag the output files with the job id and the node name
TAG="${SLURM_JOB_ID}-$(hostname -s)-cuda"

grompp -f run.mdp -c waterbox.gro -p topol.top -o output-$TAG
# -testverlet switches to the Verlet cutoff-scheme so the GPUs can be used (quick testing only)
mdrun -nt ${SLURM_CPUS_ON_NODE} -testverlet -v -deffnm output-$TAG
</xterm>

Please make sure you change the email address if you use this for your own tests ;)
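
Assuming you save the script as ''job.sh'' (the file name is arbitrary), submitting and checking it would look like this; the job ID is just a placeholder:

<xterm>
dreger@yoshi:~/gpu> **sbatch job.sh**
Submitted batch job 12345
dreger@yoshi:~/gpu> **squeue -u $USER**
</xterm>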
  