Differences
This shows you the differences between two versions of the page.
software:python [2016/03/31 12:26] meesters [Selectively Eliminate Attribute Access] |
— (current) | ||
---|---|---|---|
- | ====== Python ====== | ||
- | |||
- | ===== Available versions ===== | ||
- | |||
- | Currently, the following versions of Python are installed and can be used via the specified modulefile: | ||
- | |||
- | ^ Version | ||
- | | //2.6.6// | //None// | //None (System default at ''/ | ||
- | | 2.7.7 | pip, virtualenv, virtualenvwrapper, | ||
- | | 3.3.5 | pip, virtualenv, virtualenvwrapper, | ||
- | | 3.4.1 | pip, virtualenv, virtualenvwrapper, | ||
- | | 3.5 | pip, virtualenv, virtualenvwrapper, | ||
- | |||
- | We recommend **avoiding Python 2.6.6**, since we can provide better support for the versions that we have installed manually. | ||
- | |||
- | **Note**: The within module '' | ||
- | |||
- | If you need additional Python packages, you can easily install them yourself either [[# | ||
- | |||
- | ===== Additional packages ===== | ||
- | |||
- | In general, it is very easy to set up a personal Python environment in which you can install third-party packages yourself (without needing root privileges). The preparation steps needed on Mogon are described below. | ||
- | |||
- | While the first variant is already sufficient, using virtualenvs, | ||
- | Virtualenvs can also be shared between users if they are created in your group's project directory. | ||
- | |||
- | ==== Home directory ==== | ||
- | |||
- | First, create some directories in which installed packages will be placed: | ||
- | |||
- | <code bash> | ||
- | $ mkdir -p ~/ | ||
- | $ mkdir -p ~/ | ||
- | </ | ||
- | |||
- | Then add the created '' | ||
- | |||
- | <code bash> | ||
- | $ echo ' | ||
- | $ source ~/.bashrc | ||
- | </ | ||
- | |||
- | Now create a configuration file for '' | ||
- | |||
- | <code bash> | ||
- | $ echo -e ' | ||
- | $ mkdir -p ~/.pip | ||
- | $ echo -e ' | ||
- | </ | ||
- | |||
- | If you now use '' | ||
- | |||
- | ==== Using virtualenv ==== | ||
- | |||
- | A so-called virtualenv can be seen as an isolated, self-contained Python environment of third-party packages. \\ | ||
- | Different virtualenvs interfere neither with each other nor with the system-wide installed packages. | ||
- | |||
- | It is advised to make use of [[http:// | ||
- | |||
- | If you are using Python 2.6.6, you need to install '' | ||
- | |||
- | <code bash> | ||
- | $ easy_install virtualenv | ||
- | Searching for virtualenv | ||
- | Reading http:// | ||
- | Best match: virtualenv 1.10.1 | ||
- | [...] | ||
- | Processing dependencies for virtualenv | ||
- | Finished processing dependencies for virtualenv | ||
- | </ | ||
- | |||
- | We need to remove the easy_install configuration file created above, since the path set there would interfere with virtualenv: | ||
- | <code bash> | ||
- | $ rm ~/ | ||
- | $ rm ~/ | ||
- | </ | ||
- | |||
- | Now you can simply create, activate, use, deactivate and destroy as many virtualenvs as you want: | ||
- | |||
- | === Create === | ||
- | Creating a virtualenv will simply set up a directory structure and install some baseline packages: | ||
- | <code bash> | ||
- | $ virtualenv ENV | ||
- | New python executable in ENV/ | ||
- | Installing Setuptools...done. | ||
- | Installing Pip...done. | ||
- | </ | ||
- | |||
- | With virtualenvs, | ||
- | <code bash> | ||
- | $ virtualenv --python=/ | ||
- | $ virtualenv --python=/ | ||
- | </ | ||
- | |||
- | If you want to install the pre-installed third-party packages (numpy, scipy, matplotlib, etc.) yourself, just omit the '' | ||
- | |||
- | |||
- | === Activate === | ||
- | To work in a virtualenv, you first have to activate it, which sets some environment variables for you: | ||
- | <code bash> | ||
- | $ source ENV/ | ||
- | (ENV)$ # Note the name of the virtualenv in front of your prompt - nice, heh? | ||
- | </ | ||
- | |||
- | === Use === | ||
- | Now you can use your virtualenv: newly installed packages are installed only inside the virtualenv and are visible only to the Python interpreter you start from within the virtualenv: | ||
- | <code bash> | ||
- | (ENV)$ easy_install requests | ||
- | Searching for requests | ||
- | Reading https:// | ||
- | Best match: requests 1.2.3 | ||
- | [...] | ||
- | Processing dependencies for requests | ||
- | Finished processing dependencies for requests | ||
- | </ | ||
- | or | ||
- | <code bash> | ||
- | (ENV)$ pip install requests | ||
- | Downloading/ | ||
- | Downloading requests-1.2.3.tar.gz (348kB): 348kB downloaded | ||
- | Running setup.py egg_info for package requests | ||
- | Installing collected packages: requests | ||
- | Running setup.py install for requests | ||
- | Successfully installed requests | ||
- | Cleaning up... | ||
- | </ | ||
- | |||
- | And now compare what happens with the python interpreter from inside the virtualenv and with the system python interpreter: | ||
- | <code bash> | ||
- | (ENV)$ python -c ' | ||
- | (ENV)$ / | ||
- | Traceback (most recent call last): | ||
- | File "< | ||
- | ImportError: | ||
- | </ | ||
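If you want to check from within Python which kind of interpreter is running, the following minimal sketch may help. The function name ''in_virtualenv'' is made up for this illustration; the check relies on ''sys.real_prefix'' (set by classic virtualenv releases) and on ''sys.base_prefix'' (Python 3.3 and later).

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtualenv/venv."""
    # Classic virtualenv releases set sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.base_prefix differ from
    # sys.prefix; on interpreters without base_prefix the two compare equal.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("inside a virtualenv: %s" % in_virtualenv())
```

Run once with the interpreter of an activated virtualenv and once with the system interpreter, this should report different answers.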
- | |||
- | === Deactivate === | ||
- | Deactivating a virtualenv reverts the activation step and all its changes to your environment: | ||
- | <code bash> | ||
- | (ENV)$ deactivate | ||
- | $ | ||
- | </ | ||
- | |||
- | === Destroy === | ||
- | To destroy a virtualenv, simply delete its directory: | ||
- | <code bash> | ||
- | $ rm -r ENV | ||
- | </ | ||
- | |||
- | ==== virtualenvwrapper ==== | ||
- | |||
- | Using multiple virtualenvs can be made much more user friendly using [[http:// | ||
- | |||
- | If you are using Python 2.6.6, you can install and configure it using | ||
- | <code bash> | ||
- | $ easy_install --prefix=$HOME/ | ||
- | $ echo ' | ||
- | </ | ||
- | |||
- | If you are using any other version of Python, virtualenvwrapper is already installed and you just need to | ||
- | <code bash> | ||
- | $ echo ' | ||
- | </ | ||
- | |||
- | Re-login to apply the changes. | ||
- | |||
- | ====== Load Environment Modules (module load [mod]) ====== | ||
- | To load environment modules in Python: | ||
- | <code python> | ||
- | execfile('/ | ||
- | module(' | ||
- | module(' | ||
- | </ | ||
- | |||
- | From Python 3.4.1 onwards, a //modules// module ;-) is available on Mogon, e.g. | ||
- | <code python> | ||
- | import modules | ||
- | modules.module(' | ||
- | import os | ||
- | os.environ[' | ||
- | </ | ||
- | This, of course, requires an environment, | ||
- | |||
- | ====== Job submission ====== | ||
- | For Python, you can use the basic but friendly bsub package from: https:// | ||
- | |||
- | <code python> | ||
- | from bsub import bsub | ||
- | |||
- | BAM2FQ = " | ||
- | STAR = "star --align .." | ||
- | SAM2BAM = " | ||
- | for dataset in datasets: | ||
- | bam2fq = bsub(" | ||
- | bam2fq = bam2fq( BAM2FQ % dataset ) | ||
- | star = bam2fq.then( | ||
- | sam2bam = star.then( | ||
- | print "First job_id:" | ||
- | print "Last job_id:" | ||
- | last = sam2bam.job_id | ||
- | | ||
- | print "still running? %s" % ( " | ||
- | </ | ||
- | |||
- | ====== Performance Hints ====== | ||
- | |||
- | Many of the hints are inspired by [[http:// | ||
- | |||
- | ===== Profiling and Timing ===== | ||
- | |||
- | Rather than guessing, profile how much time a program, or a task within it, actually takes. Guessing bottlenecks is hard; profiling is often worth the effort. The above-mentioned Cookbook covers this topic. | ||
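As a minimal sketch of a first measurement, the standard library modules ''timeit'' and ''cProfile'' are sufficient. The function ''slow_sum'' is a made-up placeholder for your own code, and the sketch assumes Python 3.

```python
import cProfile
import io
import pstats
import timeit


def slow_sum(n):
    """Hypothetical workload: sum of squares with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_it(n=10000, repeat=3, number=10):
    """Return the best wall-clock time (seconds) for `number` calls."""
    timer = timeit.Timer(lambda: slow_sum(n))
    return min(timer.repeat(repeat=repeat, number=number))


def profile_it(n=10000):
    """Run slow_sum under cProfile and return the report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    slow_sum(n)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return stream.getvalue()


if __name__ == "__main__":
    print("best of 3 repeats: %.6f s" % time_it())
    print(profile_it())
```

''timeit'' answers "how long does this take?", while ''cProfile'' answers "where is the time spent?". For a whole script, ''python -m cProfile script.py'' gives a similar report without changing the code.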
- | |||
- | ===== Regular Expressions ===== | ||
- | |||
- | Avoid them as much as you can. If you have to use them, compile them before any loop, e.g.: | ||
- | <code python> | ||
- | import re | ||
- | myreg = re.compile(' | ||
- | for stringitem in list: | ||
- | | ||
- | # or | ||
- | | ||
- | </ | ||
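A self-contained sketch of the same idea follows. The pattern and the function ''parse_items'' are invented for illustration only; the point is that ''re.compile'' is called once, outside the loop.

```python
import re

# Hypothetical pattern: a word, '=', and an integer, e.g. "nodes=4".
# Compiled once at import time instead of once per loop iteration.
KEY_VALUE = re.compile(r"(\w+)=(\d+)")


def parse_items(items):
    """Collect (key, int(value)) pairs from all strings that match."""
    search = KEY_VALUE.search  # bound method, looked up only once
    pairs = []
    for item in items:
        match = search(item)
        if match:
            pairs.append((match.group(1), int(match.group(2))))
    return pairs


if __name__ == "__main__":
    print(parse_items(["nodes=4", "no match", "cores=64"]))
```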
- | |||
- | ===== Use Functions ===== | ||
- | |||
- | A little-known fact is that code defined in the global scope runs slower than the same code defined in a function. The speed difference has to do with the implementation of local versus global variables (operations involving locals are faster). So, if you want to make the program run faster, simply put the scripting statements in a function (also: see [[http:// | ||
- | |||
- | The speed difference depends heavily on the processing being performed. | ||
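A minimal sketch of this layout: all work moves into a function, and the module level only calls it. The names ''main'' and ''limit'' and the command line handling are assumptions made for this example.

```python
import sys


def main(limit):
    """All work happens here, so 'total' and 'i' are fast local variables."""
    total = 0
    for i in range(limit):
        total += i % 7
    return total


if __name__ == "__main__":
    # Hypothetical invocation: python script.py [limit]
    limit = int(sys.argv[1]) if len(sys.argv) > 1 else 1000000
    print(main(limit))
```

Had the loop been written directly at module level, ''total'' and ''i'' would be global names and every access would go through a dictionary lookup.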
- | |||
- | |||
- | ===== Selectively Eliminate Attribute Access ===== | ||
- | |||
- | Every use of the dot (.) operator to access attributes comes with a cost. Under the covers, this triggers special methods, such as __getattribute__() and __getattr__(), | ||
- | |||
- | You can often avoid attribute lookups by using the ''from module import name'' form of import as well as making selected use of bound methods. See the illustration in [[http:// | ||
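Both techniques can be sketched as follows; the two function names are made up for this comparison and both functions return the same result.

```python
import math
from math import sqrt  # direct name, no 'math.' lookup on every call


def roots_with_lookups(values):
    """Baseline: two attribute lookups in every iteration."""
    result = []
    for value in values:
        result.append(math.sqrt(value))
    return result


def roots_without_lookups(values):
    """Same result, with the lookups hoisted out of the loop."""
    result = []
    append = result.append  # bound method, resolved once
    for value in values:
        append(sqrt(value))
    return result


if __name__ == "__main__":
    print(roots_without_lookups([0, 4, 9]))
```

Whether the second form is worth the loss in readability depends on how hot the loop is, so it is best applied selectively and after profiling.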
- | |||
- | ===== Too many print statements ===== | ||
- | |||
- | Although the print statement of Python 2.x became a function in Python 3.x, output is still flushed. | ||
- | |||
- | To avoid constant flushing and use buffered output instead, either use Python' | ||
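
One possible way to reduce the number of writes, shown here only as a sketch, is to assemble the output in memory and hand it to the stream in a single call. The function ''write_report'' and its line format are invented for this example, and the sketch assumes Python 3.

```python
import io
import sys


def write_report(values, stream=None):
    """Format all lines in memory, then write them to the stream at once."""
    if stream is None:
        stream = sys.stdout
    buffer = io.StringIO()
    for value in values:
        buffer.write("value: %d\n" % value)
    stream.write(buffer.getvalue())  # one write instead of one per line
    return len(values)


if __name__ == "__main__":
    write_report(range(5))
```
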
- | ===== Compile Code!!! ===== | ||
- | |||
- | Remember that every Python Module on Mogon comes with [[http:// | ||
- | |||
- | While we cannot give a comprehensive intro in this wiki document, we recommend using Cython whenever possible and give this little example: | ||
- | |||
- | Imagine you have a (tested) script that you need to call frequently. Then create modules that your main script can import and write a setup script like this: | ||
- | <code python> | ||
- | # script: setup.py | ||
- | # | ||
- | |||
- | from distutils.core import setup | ||
- | from distutils.extension import Extension | ||
- | from Cython.Distutils import build_ext | ||
- | |||
- | named_extension = Extension( | ||
- | "name of your extension", | ||
- | [" | ||
- | " | ||
- | extra_compile_args=[' | ||
- | extra_link_args=[' | ||
- | include_path = ['/ | ||
- | ) | ||
- | |||
- | setup( | ||
- | name = " | ||
- | cmdclass = {' | ||
- | ext_modules = [named_extension] | ||
- | ) | ||
- | </ | ||
- | |||
- | Replace '' | ||
- | <code bash> | ||
- | $ python ./setup.py build_ext --inplace | ||
- | </ | ||
- | This will create a file '' | ||
- | |||
- | In Cython you can release the global interpreter lock (GIL), see [[http:// | ||
- | |||
- | In particular [[http:// | ||
- | ====== Things to consider ====== | ||
- | |||
- | Python is an interpreted language. As such, it should not be used for lengthy runs in an HPC environment. Please take advantage of the option to compile your own modules with Cython; consult the relevant [[http:// | ||
- | |||
- | ====== Special packages ====== | ||
- | |||
- | Please note that we have already installed numpy, scipy and matplotlib in the versions of Python that we provide additionally. | ||
- | |||
- | ===== NumPy ===== | ||
- | |||
- | http:// | ||
- | |||
- | When installing NumPy, the first installation attempt fails at exit. Don't worry: the installation has already finished at that point. To be sure, you can simply run the command again and see it exit cleanly. | ||
- | |||
- | Note that NumPy can also be linked against the [[software: | ||
- | * MKL: http:// | ||
- | * ACML: http:// | ||