This document describes how to use Python for numerical and/or batch jobs on the cluster, with a heavy focus on using IPython to manage the multiprocessing. Other solutions are possible, but they are not the focus of this introductory document. Furthermore, this document explains how to set up a parallel IPython environment; it does not discuss its usage in depth.
This is version 0.2(1390214905) of this document. If you have any suggestions, send them to behrmann@physik.fu-berlin.de.
A ready-made Python environment has been built for the cluster and is located at /opt/python2.7. This path holds Python 2.7.6 and the following packages:

behrmj87@sheldon:~> pip freeze
Cython==0.19.2
cffi==0.8.1
ipython==1.1.0
matplotlib==1.1.1
mpi4py==0.6.0
numpy==1.8.0
pandas==0.13.0
pip==1.5.0
pycparser==2.10
pyparsing==2.0.1
python-dateutil==2.2
pytz==2013.9
pyzmq==14.0.1
scipy==0.13.2
six==1.5.2
sympy==0.7.4.1
toolz==0.5.1
tornado==3.1.1
wsgiref==0.1.2
To use this newer Python, either evaluate the following two commands in your shell or add them to your .bashrc (or other respective shell configuration file):

export PATH=/opt/python2.7/bin:${PATH}
export PYTHONPATH=/opt/python2.7/lib/python2.7/site-packages

The first line adds all binaries from the Python installation in /opt to your PATH, and the second one sets the path in which this Python installation looks for its libraries.
The Python package pip is also provided, which allows you to install packages you may need on your own. To install Python packages locally in your $HOME, use

pip install --user --ignore-installed packagename

which will install the package packagename into $HOME/.local/lib/python2.7/site-packages. If you install packages locally in your home folder, then you need to change the PYTHONPATH variable mentioned above as follows:

PYTHONPATH=${HOME}/.local/lib/python2.7/site-packages:/opt/python2.7/lib/python2.7/site-packages
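The order matters here: entries in PYTHONPATH are prepended to sys.path in the order given, so a package in the local directory shadows the system copy. A minimal sketch of this behaviour (the two directory names are placeholders, not real paths on the cluster):

```python
import os
import subprocess
import sys

# Build an environment where PYTHONPATH lists two (placeholder) directories
# and ask a fresh interpreter in which order they appear on sys.path.
env = dict(os.environ)
env['PYTHONPATH'] = '/first/dir' + os.pathsep + '/second/dir'
code = ('import sys; '
        'print(sys.path.index("/first/dir") < sys.path.index("/second/dir"))')
out = subprocess.check_output([sys.executable, '-c', code], env=env)

# The first PYTHONPATH entry comes first on sys.path, so it wins lookups.
print(out.decode().strip())  # True
```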
A subset of the already installed packages can also be found packaged as wheel packages [1] in /opt/python2.7/share/wheel and can be installed with the command

pip install --user --ignore-installed --use-wheel --no-index --find-links=/opt/python2.7/share/wheel packagename

This is particularly useful in virtualenvs for which, for one reason or another, you do not want to use the provided system packages.
IPython can easily be set up for cluster usage. The IPython parallel documentation can be found here, with the setup explained here.
IPython does provide support for the cluster's queueing system PBS, but the rationale behind that configuration remains unclear to the author, so we provide a somewhat simpler configuration.
To create an IPython profile, you can run the bash script below:

#!/bin/bash
ipython profile create --parallel --profile=mpi
cd ${HOME}/.ipython/profile_mpi
patch < /opt/python2.7/share/configs/ipcluster/ipcluster_config.patch
patch < /opt/python2.7/share/configs/ipcluster/ipcontroller_config.patch
patch < /opt/python2.7/share/configs/ipcluster/ipengine_config.patch

which will immediately patch the configuration so that the engines are started using mpiexec and the controller listens not only on the local host, which is important if you are using multiple nodes.
Alternatively, you can change the lines yourself. In $HOME/.ipython/profile_mpi/ipcluster_config.py you need to set

c.IPClusterEngines.engine_launcher_class = 'MPIEngineSetLauncher'

and in $HOME/.ipython/profile_mpi/ipcontroller_config.py you need to set

c.HubFactory.ip = '*'
Both files are recommended reading for further options that can be changed.
Furthermore, you may want to set

c.IPEngineApp.startup_script = u'/path/to/my/startup.py'

to point to a startup script that imports all needed packages on the engines.
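Such a startup script is plain Python that every engine executes once at launch, so imports done there are available in all subsequent tasks. A minimal sketch (the file name, and the banner it prints, are just examples):

```python
# startup.py - hypothetical engine startup script
import os
import socket

# Import heavyweight packages here once (e.g. numpy, scipy, sympy)
# instead of per task; this sketch only announces which process and
# host the engine runs on.
banner = 'engine %d ready on %s' % (os.getpid(), socket.gethostname())
print(banner)
```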
The usage of IPython in a parallel environment is explained here (for direct access to the IPython engines) and here (for task farming).
Having set up our environment, we will now describe how to write job files using IPython's cluster capabilities. Below you can find a sample job file. To understand the header with the information for PBS, please read the wiki page on the queueing system.
#!/bin/bash
#PBS -N ipython_testjob
#PBS -l walltime=0:02:00
#PBS -l nodes=2:ppn=4
#PBS -m bea -M behrmann@physik.fu-berlin.de

## set environment
export PATH=/opt/python2.7/bin/:${PATH}
export PYTHONPATH=/opt/python2.7/lib/python2.7/site-packages/

## go to the directory the user typed 'qsub' in
cd $PBS_O_WORKDIR

## start IPython cluster with eight engines using the mpi profile
## remember 8 (engines) = 2 (nodes) x 4 (processors)
ipcluster start -n 8 --profile=mpi &

## the engines are ready after 30s, we will give them an
## extra 10s as grace period
sleep 40

## run the script you want to run
## python can be used instead of ipython, but in case of errors
## ipython offers more valuable tracebacks
ipython ./test.py

## stop the IPython cluster again
ipcluster stop --profile=mpi
The sample Python script below prints the PIDs of the engines and information about the nodes on which they are running. Furthermore, it tests all numbers below 5 million for primality, once sequentially and once in parallel, printing the times for both runs and whether their results agree.
from __future__ import print_function, division, absolute_import

from IPython.parallel import Client
import time

# connect to cluster running the mpi profile
# connection will fail without explicit mentioning of
# which profile to connect to
rc = Client(profile='mpi')
dview = rc[:]

@dview.remote(block=True)
def getpid():
    import os
    return os.getpid()

print(getpid())

@dview.remote(block=True)
def gethostname():
    import os
    return os.uname()

print(set(gethostname()))

# external imports on engines
# if local=True is not set the serial calculation below will fail
with dview.sync_imports(local=True):
    import numpy as np
    import scipy as sp
    import sympy

starttime = time.time()
serial_result = map(lambda n: sympy.ntheory.isprime(n), xrange(5000000))
serialtime = time.time() - starttime

starttime = time.time()
parallel_result = dview.map_sync(lambda n: sympy.ntheory.isprime(n), xrange(5000000))
paralleltime = time.time() - starttime

correctness = (serial_result == parallel_result)
print('Serial and parallel result are equal: {0}.'.format(correctness))
print('Serial calculation took {0} seconds, parallel took {1} seconds'.format(serialtime, paralleltime))
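Running the script above requires a live IPython cluster. If you only want to see the serial-versus-parallel agreement check in isolation, the same pattern can be sketched with a stdlib thread pool (multiprocessing.dummy); note that, because of the GIL, threads will not actually speed up this CPU-bound work, so this illustrates only the map comparison, not the speedup:

```python
from __future__ import print_function
from multiprocessing.dummy import Pool  # thread-based stand-in, no cluster needed
import time

def isprime(n):
    # naive trial division; the real script uses sympy.ntheory.isprime
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

numbers = list(range(10000))  # far fewer numbers than the real job

starttime = time.time()
serial_result = list(map(isprime, numbers))
serialtime = time.time() - starttime

pool = Pool(4)
starttime = time.time()
parallel_result = pool.map(isprime, numbers)
paralleltime = time.time() - starttime
pool.close()
pool.join()

print('Serial and parallel result are equal: {0}.'.format(
    serial_result == parallel_result))
```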
As described on the page of the queueing system, the queue can also be used interactively. This is useful for debugging purposes and for massively parallel interactive work. In that case the IPython cluster has to be started manually, using the ipcluster command from the job script above.
MPI can be used with IPython, as described here. An advanced example of using IPython in a parallel workflow, both with and without MPI, can be found here.
[1] Documentation for the wheel package