services:cluster:queuing-system
Differences
This shows you the differences between two versions of the page.
cluster:queuing-system [2012/03/02 17:47] – dreger | services:cluster:queuing-system [2012/10/18 17:25] – pneuser | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== | + | ====== |
The login node of the HPC cluster is '' | The login node of the HPC cluster is '' | ||
Line 26: | Line 26: | ||
===== Submitting a job to the HPC cluster ===== | ===== Submitting a job to the HPC cluster ===== | ||
- | In oder to do any calculations on the HPC cluster **you have to submit your jobs to the queuing system** using the '' | + | In order to do any calculations on the HPC cluster **you have to submit your jobs to the queuing system** using the '' |
You submit jobs to the queuing system by writing a job-script which tells the queuing system about the resources your job needs and about the programs which are to be run. Basically the job-script is a shell script with some magic comments at the top (lines starting with ''# | You submit jobs to the queuing system by writing a job-script which tells the queuing system about the resources your job needs and about the programs which are to be run. Basically the job-script is a shell script with some magic comments at the top (lines starting with ''# | ||
Line 41: | Line 41: | ||
#PBS -l walltime=1: | #PBS -l walltime=1: | ||
#PBS -l nodes=1: | #PBS -l nodes=1: | ||
- | #PBS -m bea -M hpcuser@zedat.fu-berlin.de | + | #PBS -m bea -M hpcuser@physik.fu-berlin.de |
## go to the directory the user typed ' | ## go to the directory the user typed ' | ||
Line 53: | Line 53: | ||
< | < | ||
- | dreger@sheldon: | + | hpcuser@sheldon: |
26103.torque.physik.fu-berlin.de | 26103.torque.physik.fu-berlin.de | ||
+ | hpcuser@sheldon: | ||
+ | Job id Name | ||
+ | ------------------------- ---------------- --------------- -------- - ----- | ||
+ | 26103.torque | ||
+ | </ | ||
+ | You will receive an email message at the specified address when the job starts and a second one when it's finished: | ||
+ | |||
+ | < | ||
+ | From: hpc-torque@physik.fu-berlin.de | ||
+ | Subject: PBS JOB 26103.torque.physik.fu-berlin.de | ||
+ | Date: Fri, 02 Mar 2012 18:38:38 +0100 | ||
+ | To: hpcuser@physik.fu-berlin.de | ||
+ | Message-Id: < | ||
+ | |||
+ | PBS Job Id: 26103.torque.physik.fu-berlin.de | ||
+ | Job Name: | ||
+ | Exec host: n109/4 | ||
+ | Execution terminated | ||
+ | Exit_status=0 | ||
+ | resources_used.cput=00: | ||
+ | resources_used.mem=8688kb | ||
+ | resources_used.vmem=28128kb | ||
+ | resources_used.walltime=00: | ||
+ | </ | ||
+ | |||
+ | In your test directory you should find some files that were produced by your job: | ||
+ | |||
+ | < | ||
+ | hpcuser@sheldon: | ||
+ | -rw-r--r-- 1 hpcuser fbedv 222 Mar 2 18:37 jobfile1 | ||
+ | -rw-r--r-- 1 hpcuser fbedv 1663 Mar 2 18:37 outputfile.txt | ||
+ | -rw------- 1 hpcuser fbedv 0 Mar 2 18:37 some-good-name.e26103 | ||
+ | -rw------- 1 hpcuser fbedv 256 Mar 2 18:37 some-good-name.o26103 | ||
</ | </ | ||
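The listing above shows two files named after the job plus a job number, `some-good-name.o26103` and `some-good-name.e26103`. Assuming the usual Torque convention that the `.o` file holds the job's standard output and the `.e` file its standard error, a quick sanity check after a run could look like the following sketch. The function name `check_job_output` is hypothetical and not part of the cluster setup.

```shell
#!/bin/bash
# Sketch: check the <name>.o<jobnum> / <name>.e<jobnum> files of a finished job.
# Assumes .o = stdout and .e = stderr (usual Torque convention).
check_job_output() {
    local name=$1 jobnum=$2
    local out="$name.o$jobnum" err="$name.e$jobnum"

    # The stdout file should exist once the job has finished.
    [ -f "$out" ] || { echo "missing $out" >&2; return 2; }

    # A non-empty stderr file usually deserves a closer look.
    if [ -s "$err" ]; then
        echo "job $jobnum wrote to stderr, see $err"
        return 1
    fi
    echo "job $jobnum finished with empty stderr"
}
```

Run in the directory of the example above, `check_job_output some-good-name 26103` would report an empty stderr file, since `some-good-name.e26103` has size 0 in the listing.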
- | Documentation for the options | + | '' |
- | The most important recources that can and should be specified by you jobfile using the -l PBS option are: | + | === Requesting resources === |
- | === Complicated | + | The most important resources that can (and should!) be specified by your job-script using the ''# |
- | < | + | |
+ | ^ Resource ^ Format ^ Description ^ Example ^ Default ^ | ||
+ | | nodes | {< | ||
+ | | walltime | seconds, or [[HH: | ||
+ | | pmem | size* | Maximum amount of physical memory used by any single process of the job. In our case this means per core. | **pmem=8gb** -> request 8gb RAM per core | pmem=2gb | | ||
+ | | file | size* | The amount of **local disk space per core** requested for the job. The space can be accessed at / | ||
+ | |||
+ | **size* format** = integer, optionally followed by a multiplier {b, | ||
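Since walltime may be given either as plain seconds or in a colon-separated form, it can help to convert between the two when estimating a request. The sketch below assumes the colon form is the usual Torque `[[HH:]MM:]SS` (the table cell above is cut off after `[[HH:`); the helper name `walltime_to_seconds` is made up for illustration.

```shell
#!/bin/bash
# Sketch: convert a walltime given as seconds or [[HH:]MM:]SS into seconds.
walltime_to_seconds() {
    local spec=$1 total=0 field
    # Accept only one to three ':'-separated groups of digits.
    case $spec in
        ''|*[!0-9:]*|:*|*:|*::*|*:*:*:*)
            echo "invalid walltime: $spec" >&2
            return 1 ;;
    esac
    local IFS=:
    set -- $spec            # split the fields on ':'
    for field in "$@"; do
        # Drop leading zeros so that e.g. "08" is not parsed as octal.
        while [ "${#field}" -gt 1 ] && [ "${field#0}" != "$field" ]; do
            field=${field#0}
        done
        # Each further field shifts the running total by a factor of 60.
        total=$(( total * 60 + field ))
    done
    echo "$total"
}

walltime_to_seconds 1:30:00   # prints 5400
walltime_to_seconds 90        # prints 90
```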
+ | |||
+ | === Recommendations on resource usage === | ||
+ | |||
+ | Note that in general it is a bad idea to specify far too large values for pmem or walltime //just to be on the safe side//, since this will very likely delay execution of your jobs. An explanation for this behaviour will be given in an upcoming section on backfill strategy of the queuing system. | ||
+ | |||
+ | Please try to use local disk space on the compute nodes whenever possible. Since access to local storage is faster than access to your $PBS_O_WORKDIR, | ||
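One common way to follow this recommendation is to copy the input to node-local storage, run the program there, and copy the results back at the end. The following is only a sketch of that pattern: the local scratch path is cut off in the table above, so the directory is passed in as a parameter, and both `run_on_local_scratch` and `SCRATCH_BASE` are hypothetical names. Note that it copies the whole working directory, which is only sensible for small inputs.

```shell
#!/bin/bash
# Sketch: stage a working directory to local scratch, run a command there,
# then copy everything back. Usage:
#   run_on_local_scratch <workdir> <scratch_base> <command> [args...]
run_on_local_scratch() {
    local workdir=$1 scratch_base=$2
    shift 2
    local scratch status

    # Private scratch directory for this job below the given base path.
    scratch=$(mktemp -d "$scratch_base/job.XXXXXX") || return 1

    # Stage in: copy the contents of the working directory to scratch.
    cp -r "$workdir/." "$scratch/" || { rm -rf "$scratch"; return 1; }

    # Run the command inside the scratch directory and remember its status.
    ( cd "$scratch" && "$@" )
    status=$?

    # Stage out: copy results back, then clean up the scratch directory.
    cp -r "$scratch/." "$workdir/"
    rm -rf "$scratch"
    return $status
}
```

Inside a job-script this might be called as `run_on_local_scratch "$PBS_O_WORKDIR" "$SCRATCH_BASE" ./my_program input.dat`, where `SCRATCH_BASE` has to be set to the local disk path documented for the cluster and `./my_program` stands in for your own executable.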
+ | |||
+ | === Advanced job-script | ||
+ | |||
+ | < | ||
+ | #!/bin/bash | ||
#PBS -N some_good_name | #PBS -N some_good_name | ||
- | #PBS -l nodes=12: | + | #PBS -l nodes=12: |
#PBS -l walltime=100: | #PBS -l walltime=100: | ||
#PBS -l file=1000M | #PBS -l file=1000M | ||
- | #PBS -m ea -M mymail@zedat.fu-berlin.de | + | #PBS -m ea -M hpcuser@physik.fu-berlin.de |
cd $PBS_O_WORKDIR | cd $PBS_O_WORKDIR | ||
Line 79: | Line 130: | ||
export seq=`cat seq` | export seq=`cat seq` | ||
awk ' | awk ' | ||
- | |||
infile=${flag}.inp | infile=${flag}.inp | ||
Line 104: | Line 154: | ||
===== Run a job interactively ===== | ===== Run a job interactively ===== | ||
- | If you want to run a job on a clusternode | + | If you want to run a job on a compute node interactively (e.g. for debugging purposes), simply put a **'' |
with the torque resource options you are normally using in the #PBS lines in your job script: | with the torque resource options you are normally using in the #PBS lines in your job script: | ||
< | < | ||
- | qsub **-I** -l cput=20: | + | hpcuser@sheldon: |
</ | </ | ||
qsub does not return in this case; instead, as soon as you get scheduled, you get an interactive shell on a node. | qsub does not return in this case; instead, as soon as you get scheduled, you get an interactive shell on a node. | ||
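As an illustration of combining the interactive switch with resource options, the sketch below only assembles and prints a `qsub -I` command line so that it can be checked before it is run. The helper name `interactive_qsub_cmd` and the default values are assumptions for this example, not cluster settings; the resource names `nodes`, `walltime` and `pmem` are the ones from the table above.

```shell
#!/bin/bash
# Sketch: build (but do not run) an interactive qsub command line.
# Arguments: walltime (default 00:30:00) and pmem (default 2gb).
interactive_qsub_cmd() {
    local walltime=${1:-00:30:00}
    local pmem=${2:-2gb}
    # -I requests an interactive job; -l takes a comma-separated resource list.
    echo "qsub -I -l nodes=1,walltime=${walltime},pmem=${pmem}"
}

interactive_qsub_cmd 01:00:00 4gb
# prints: qsub -I -l nodes=1,walltime=01:00:00,pmem=4gb
```

Printing the command first keeps the example harmless on machines without Torque; on the login node the printed line can be pasted into the shell to actually request the interactive session.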
+ | |||
===== Requesting resources ===== | ===== Requesting resources ===== | ||
The following resources can be requested from the queuing system: | The following resources can be requested from the queuing system: |
services/cluster/queuing-system.txt · Last modified: 2014/06/11 14:40 by dreger