Interactive sessions with Slurm

You don't have to run your jobs non-interactively using sbatch. It is also possible to open an interactive shell through the queuing system, much like logging into a node with ssh. You may only use ssh to log into a node to check whether your sbatch job is running fine (using top, for example). You may not start calculations this way, since that bypasses the queuing system and takes away resources assigned to other users. When starting an interactive shell through the queuing system, you get the added benefit that the environment is set up just like for an sbatch job. So you can even use this to test and debug MPI-based jobs running on multiple nodes.

Using ''srun''

In order to start an interactive session you need to use the srun command instead of sbatch. The srun command accepts the same options as sbatch (e.g. for specifying the number of nodes, the number of CPUs or the amount of memory). Please note that you have to specify a time limit for interactive sessions, too. Here is a simple example allocating one core on one node for one day and starting a login shell:

<xterm>
dreger@yoshi:~> srun --time=1-00:00:00 --nodes=1 --tasks=1 --mem=1G --pty /bin/bash
dreger@y063:~> squeue -al -u dreger
Tue Jun 16 16:14:43 2015
             JOBID PARTITION     NAME     USER    STATE       TIME TIMELIMIT  NODES NODELIST(REASON)
            191575      main     bash   dreger  RUNNING       0:16 1-00:00:00      1 y063
</xterm>

The option --pty is important to make the terminal behave correctly. In this session you can now check the environment and the limits set up by Slurm:

<xterm>
Environment:
dreger@y063:~> env | grep ^SLURM
[…]
SLURM_NTASKS_PER_NODE=1
SLURM_JOB_ID=191575

Limits:
dreger@y063:~> cat /cgroup/cpuset/slurm/uid_${SLURM_JOB_UID}/job_${SLURM_JOB_ID}/cpuset.cpus
0
dreger@y063:~> cat /cgroup/memory/slurm/uid_${SLURM_JOB_UID}/job_${SLURM_JOB_ID}/memory.limit_in_bytes
1073741824
</xterm>

Using multiple cores or nodes

In the next example we will allocate 2 nodes with 8 cores each and run an MPI version of hello-world:

<xterm>
dreger@yoshi:~> mpicc -o mpi_hello mpi_hello.c
dreger@yoshi:~> srun --time=1-00:00:00 --nodes=2 --tasks-per-node=8 --mem=1G --pty /bin/bash
dreger@y015:~> squeue -al -u dreger
Tue Jun 16 16:25:39 2015
             JOBID PARTITION     NAME     USER    STATE       TIME TIMELIMIT  NODES NODELIST(REASON)
            191577      main     bash   dreger  RUNNING       0:05 1-00:00:00      2 y[015-016]

dreger@y015:~> mpirun mpi_hello
Process 0 on y015 out of 16
Process 2 on y015 out of 16
Process 4 on y015 out of 16
Process 1 on y015 out of 16
Process 3 on y015 out of 16
Process 6 on y015 out of 16
Process 7 on y015 out of 16
Process 5 on y015 out of 16
Process 12 on y016 out of 16
Process 10 on y016 out of 16
Process 13 on y016 out of 16
Process 15 on y016 out of 16
Process 8 on y016 out of 16
Process 9 on y016 out of 16
Process 11 on y016 out of 16
Process 14 on y016 out of 16
</xterm>