Use local storage on the compute nodes

If many jobs write to or read from the NFS server that hosts the cluster home directories at the same time, the server can become very slow or even crash. It is therefore very important that all users make use of the local storage available on the nodes whenever possible; in most cases this will also speed up your jobs. To do so, tell the queuing system how much local disk space you want to reserve for your job (see the short sketch below). The queuing system will then create a directory named /local_scratch/$PBS_JOBID on the node. After the computation has finished, you must copy the results you want to keep from the local disk back to your home directory.
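A minimal sketch of how the reservation can be requested, assuming this site uses the TORQUE file resource to size the local scratch space (as in the full example below); the script name job.sh is only a placeholder:

# request 10 GB of local scratch on the qsub command line ...
qsub -l file=10gb job.sh

# ... or equivalently as a directive inside the job script
#PBS -l file=10gb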

Example for TORQUE (the sheldon cluster)

#!/bin/bash
#PBS -N local-file
#PBS -l walltime=1:00:00
# reserve 10 GB of local scratch on the node
#PBS -l file=10gb
#PBS -m bea -M dreger@physik.fu-berlin.de

# location of local storage directory on the node
local_dir=/local_scratch/$PBS_JOBID

if [[ -d "$local_dir" ]]; then
    echo "# found local storage at $local_dir. copying data from $PBS_O_WORKDIR to $local_dir."
    echo "# maximum file size is:" $(ulimit -f)
    # copy necessary input data
    cp "$PBS_O_WORKDIR/input.dat" "$local_dir"
    cd "$local_dir"
else
    local_dir=
    echo "# no local storage found. running calculations in $PBS_O_WORKDIR"
    cd "$PBS_O_WORKDIR"
fi

# run jobs now
md5sum input.dat > result.out

# copy results back to $PBS_O_WORKDIR after job has finished
if [[ -n "$local_dir" ]]; then
    cp result.out "$PBS_O_WORKDIR"
fi
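
To submit the job, hand the script to qsub; the script name local-file.sh is only a placeholder, and the job id printed by qsub can be used to watch the job:

qsub local-file.sh
# check the job's status using the id returned by qsub
qstat <jobid>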