The ~/home~ directories on the cluster are separate for each cluster and separate from our regular home directories, so you will need to copy over any configuration you may need, such as SSH keys.
For every account on the cluster, a directory is created on the cluster-wide filesystem ~/scratch~.
You cannot write output files from running jobs to ~/home/username~, and ~/scratch/username~ must also be the working directory for the job. This is easily accomplished by submitting your jobs directly from ~/scratch/username~.
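For example (the project directory and job script names below are placeholders):
#+BEGIN_SRC bash
# submit from /scratch so it becomes the job's working directory
cd /scratch/$USER/my_project   # placeholder project directory
sbatch job.sh                  # placeholder job script
#+END_SRC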
~/.cache points to a temporary filesystem, which you can use nonetheless. You can also just keep output data on ~/scratch~, as nothing on that filesystem will ever be deleted. The main difference is that ~/home~ is just a single server, while ~/scratch~ is a cluster filesystem based on BeeGFS that uses many servers at the same time.
When submitting a job, you absolutely must specify the amount of memory to be allocated for it. By default, a job gets ~1MB~ of memory per allocated CPU. This is a ridiculously small value, which we set deliberately to make sure that some thought goes into how much memory you need. If you allocate more memory than you actually need, your job might wait longer than necessary for a free spot in the cluster.
If you specify less memory than needed, your program will be killed automatically by Slurm.
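As a rough sketch (all values and names here are placeholders, not recommendations), the memory request goes into the job script via Slurm's ~--mem-per-cpu~ option (~--mem~ would request a total amount per node instead):
#+BEGIN_SRC bash
#!/bin/bash
#SBATCH --job-name=example      # placeholder job name
#SBATCH --cpus-per-task=4       # placeholder CPU count
#SBATCH --mem-per-cpu=2G        # memory per allocated CPU; adjust to what your program needs
#SBATCH --time=01:00:00         # placeholder walltime

./my_program                    # hypothetical program call
#+END_SRC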
Jobs that do a lot of I/O operations on a shared cluster filesystem like ~/scratch~ can severely slow down the whole system. If your job does not use multiple nodes and is not reading and writing very large files, it might be a good idea to move input and output files to the ~/tmp~ folder on the compute node itself.

~/tmp~ is a RAM-based filesystem, meaning that anything you store there is actually kept in memory, so space is quite limited. Currently, all jobs on a node can use at most 20% of the total system memory for space in ~/tmp~. If you need more space, you should consider using ~/dev/shm~, where you can use up to 50% of the total system memory per job.

Space used in ~/tmp~ and ~/dev/shm~ counts towards your job's memory usage and is thus limited by the ~--mem~ option.
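A minimal sketch of this staging pattern (file names, the program, and the memory request are placeholders):
#+BEGIN_SRC bash
#!/bin/bash
#SBATCH --mem=8G                            # space used in /tmp counts towards this limit

# stage the input from /scratch to the node-local /tmp
cp /scratch/$USER/my_project/input.dat /tmp/

# run in /tmp so the heavy I/O stays on the compute node
cd /tmp
./my_program input.dat > output.dat         # hypothetical program call

# copy the results back to /scratch before the job ends
cp /tmp/output.dat /scratch/$USER/my_project/
#+END_SRC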
The login node allows password-based login only from within the university network. We generally recommend SSH access with key files. To get your SSH key onto the login node when you are not at the university, you have two options:
The latter is done via:
#+BEGIN_SRC bash
ssh-copy-id \
    -i ~/.ssh/id_sheldon \
    -o ProxyJump=username@login.physik.fu-berlin.de \
    username@headnode.physik.fu-berlin.de
#+END_SRC
This assumes a key file named ~id_sheldon~. You will need to change ~username~ to your username and ~headnode~ to the name of the head node (login node) of the cluster.
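Afterwards you can log in with the key through the same jump host (using the same placeholder names as above):
#+BEGIN_SRC bash
# log in to the cluster head node, jumping via the department login node
ssh -i ~/.ssh/id_sheldon \
    -o ProxyJump=username@login.physik.fu-berlin.de \
    username@headnode.physik.fu-berlin.de
#+END_SRC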
Modules are a staple in the HPC world. They are a way to change your environment to include paths that are not normally in your binary (~PATH~) or library (~LD_LIBRARY_PATH~) search paths, so that you can use additional programs or different versions of a program.
These are the most important commands:
#+BEGIN_SRC bash
# show available modules
module avail

# load a module
module load name_of_module   # e.g. module load gromacs/double/2020.4

# unload a module (usually not necessary in a job script, but you can use
# modules interactively, too)
module unload name_of_module
#+END_SRC
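In a job script, loading a module typically looks like this sketch (the module version is taken from the example above; the memory request and the program invocation are hypothetical):
#+BEGIN_SRC bash
#!/bin/bash
#SBATCH --mem-per-cpu=2G               # placeholder memory request

# make the packaged software available in this job's environment
module load gromacs/double/2020.4

# hypothetical GROMACS run; replace with your actual command
gmx_d mdrun -deffnm my_run
#+END_SRC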
Somebody has to build the software. This is done by us and by interested users; for example, the GROMACS packages are mostly built by users in AG Netz. The software modules can be found in ~/net/opt~. If you want to contribute, let us know!