====== Interactive sessions with Slurm ======

You don't have to run your jobs non-interactively using ''sbatch''. It's also possible to open an interactive shell through the queuing system, just like when using ssh to log into a node. You may only use ssh to log into a node to check whether your sbatch job is running fine (using top, for example). You may not start calculations this way, since that bypasses the queuing system and takes away resources assigned to other users. When you start an interactive shell through the queuing system, you get the added benefit that the environment is set up just like for an sbatch job. You can therefore even use this to test and debug MPI-based jobs running on multiple nodes.

===== Using ''srun'' =====

To start an interactive session, use the ''srun'' command instead of ''sbatch''. The ''srun'' command accepts the same options as ''sbatch'' (e.g. for specifying the number of nodes, the number of CPUs or the amount of memory). Please note that you have to specify a time limit for interactive sessions, too. Here is a simple example that allocates one core on one node for one day and starts a login shell:

<code>
dreger@yoshi:~> srun --time=1-00:00:00 --nodes=1 --tasks=1 --mem=1G --pty /bin/bash
dreger@y063:~> squeue -al -u dreger
Tue Jun 16 16:14:43 2015
  JOBID PARTITION     NAME     USER    STATE       TIME  TIMELIMIT  NODES NODELIST(REASON)
 191575      main     bash   dreger  RUNNING       0:16 1-00:00:00      1 y063
</code>

The option ''--pty'' is important to make the terminal behave correctly. Within this session you can now check the environment and the limits set up by Slurm:

//Environment//

<code>
dreger@y063:~> env | grep ^SLURM
[...]
SLURM_NTASKS_PER_NODE=1
SLURM_JOB_ID=191575
</code>

//Limits//

<code>
dreger@y063:~> cat /cgroup/cpuset/slurm/uid_${SLURM_JOB_UID}/job_${SLURM_JOB_ID}/cpuset.cpus
0
dreger@y063:~> cat /cgroup/memory/slurm/uid_${SLURM_JOB_UID}/job_${SLURM_JOB_ID}/memory.limit_in_bytes
1073741824
</code>

===== Using multiple cores or nodes =====

In the next example we allocate 2 nodes with 8 cores each and run an {{:services:cluster:mpi_hello.c|MPI version of hello-world}}:

<code>
dreger@yoshi:~> mpicc -o mpi_hello mpi_hello.c
dreger@yoshi:~> srun --time=1-00:00:00 --nodes=2 --tasks-per-node=8 --mem=1G --pty /bin/bash
dreger@y015:~> squeue -al -u dreger
Tue Jun 16 16:25:39 2015
  JOBID PARTITION     NAME     USER    STATE       TIME  TIMELIMIT  NODES NODELIST(REASON)
 191577      main     bash   dreger  RUNNING       0:05 1-00:00:00      2 y[015-016]
dreger@y015:~> mpirun mpi_hello
Process 0 on y015 out of 16
Process 2 on y015 out of 16
Process 4 on y015 out of 16
Process 1 on y015 out of 16
Process 3 on y015 out of 16
Process 6 on y015 out of 16
Process 7 on y015 out of 16
Process 5 on y015 out of 16
Process 12 on y016 out of 16
Process 10 on y016 out of 16
Process 13 on y016 out of 16
Process 15 on y016 out of 16
Process 8 on y016 out of 16
Process 9 on y016 out of 16
Process 11 on y016 out of 16
Process 14 on y016 out of 16
</code>
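The linked ''mpi_hello.c'' is not reproduced on this page. As a rough sketch, an MPI hello-world that produces output of the form shown above could look like the following; it uses only standard MPI calls, and the file linked above may differ in detail:

<code c>
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, namelen;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                 /* initialize the MPI environment         */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank of this process within the job    */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of MPI processes          */
    MPI_Get_processor_name(name, &namelen); /* hostname of the node running this rank */

    /* produces lines of the form "Process N on <node> out of M" as shown above */
    printf("Process %d on %s out of %d\n", rank, name, size);

    MPI_Finalize();
    return 0;
}
</code>

Because the interactive shell was started through ''srun'', the Slurm environment describing the allocation is already in place, which is why ''mpirun'' above can start 16 processes spread over y015 and y016 without being given an explicit host list or process count (assuming an MPI installation built with Slurm support).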