Installation

Quick installation instructions

The following command sequence creates a virtual environment named ratatosk and installs all required modules and dependencies.

INSTALL_PREFIX=~/opt
mkdir -p $INSTALL_PREFIX
mkvirtualenv --no-site-packages -p python2.7 ratatosk
git clone https://github.com/spotify/luigi.git $INSTALL_PREFIX/luigi
cd $INSTALL_PREFIX/luigi && python setup.py develop
pip install numpy
git clone https://github.com/percyfal/ratatosk.git $INSTALL_PREFIX/ratatosk
cd $INSTALL_PREFIX/ratatosk && python setup.py develop
cd $INSTALL_PREFIX/ratatosk/test && nosetests -v -s test_commands.py

Pre-requisites

It is recommended that you first create a virtual environment in which to install the packages. Install virtualenvwrapper and use mkvirtualenv to create a virtual environment. Note that you need to pass the --no-site-packages -p python2.7 options. Finally, add the following to your .bashrc:

source path/to/bin/virtualenvwrapper.sh
export WORKON_HOME=~/.virtualenvs

If you used virtualenv-burrito (https://github.com/brainsik/virtualenv-burrito) to install virtualenvwrapper, virtualenvwrapper.sh is located in ~/.venvburrito.
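A slightly more defensive variant of the .bashrc lines above is sketched below. It sources virtualenvwrapper.sh only if the file exists, so a wrong path produces a warning instead of an error at every login. The variable VW_SCRIPT is a hypothetical name introduced here for illustration, and its default is the same placeholder path as above, which you must replace with the real location.

```shell
# Sketch only: VW_SCRIPT is a hypothetical helper variable, and its default
# is a placeholder path that must be edited for your system.
VW_SCRIPT="${VW_SCRIPT:-path/to/bin/virtualenvwrapper.sh}"

# Keep an existing WORKON_HOME; otherwise fall back to ~/.virtualenvs.
export WORKON_HOME="${WORKON_HOME:-$HOME/.virtualenvs}"

if [ -f "$VW_SCRIPT" ]; then
    # Load mkvirtualenv, workon and related commands into the current shell.
    . "$VW_SCRIPT"
else
    echo "warning: $VW_SCRIPT not found; virtualenvwrapper not loaded" >&2
fi
```

After opening a new shell, running mkvirtualenv without arguments should print a usage message if the script was loaded.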

Installation

You can download and install ratatosk from the Python Package Index with the command

pip install ratatosk

Alternatively, to install the development version of ratatosk, do

git clone https://github.com/percyfal/ratatosk
cd ratatosk
python setup.py develop

Known installation issues

luigi

The luigi release on the Python Package Index is out of date. You may therefore need to install the development version from GitHub manually.

pygraphviz

Installing Pygraphviz with pip install pygraphviz often fails because the installer cannot find the graphviz library. One solution is to modify the setup.py that comes with the pygraphviz package. After a failed pip install in a virtual environment named virtualenv (or whatever you called it), you will typically find the failed build in ~/.virtualenvs/virtualenv/build/pygraphviz. In that directory, modify the following section in setup.py:

# If the setup script couldn't find your graphviz installation you can
# specify it here by uncommenting these lines or providing your own:
# You must set both 'library_path' and 'include_path'

# Linux, generic UNIX
library_path='/usr/lib64/graphviz'
include_path='/usr/include/graphviz'
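To find suitable values for library_path and include_path on a given machine, a small shell helper such as the following may help. It is only a sketch: first_existing_dir is a hypothetical function name, and the candidate directories other than the two shown above are common guesses, not locations confirmed by the pygraphviz documentation.

```shell
# Print the first argument that is an existing directory; return 1 if none is.
first_existing_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Candidate locations are assumptions; extend the lists for your system.
library_path=$(first_existing_dir /usr/lib64/graphviz /usr/lib/graphviz /usr/local/lib/graphviz) \
    || echo "no graphviz library directory found" >&2
include_path=$(first_existing_dir /usr/include/graphviz /usr/local/include/graphviz) \
    || echo "no graphviz include directory found" >&2

# Print the values in the form used by setup.py.
echo "library_path='$library_path'"
echo "include_path='$include_path'"
```

If either value comes out empty, graphviz and its development headers are probably not installed in any of the listed locations.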

Dependencies

To begin with, you may need to install Tornado and Pygraphviz (see Luigi for further information).

The tests depend on the following software to run:

  1. bwa
  2. samtools
  3. GATK - set an environment variable GATK_HOME to point to your installation path
  4. picard - set an environment variable PICARD_HOME to point to your installation path
  5. fastqc

Running the tests

Make sure that an instance of the daemon ratatoskd is running in the background. It may be convenient to run it in a screen session.

ratatoskd &

Change to the test directory (test) and run

nosetests -v -s test_commands.py

To run a single test (e.g. TestCommand.test_bwaaln), do

nosetests -v -s test_commands.py:TestCommand.test_bwaaln

Task visualization and tabulation

By default, the tests use a local scheduler, implemented in luigi. For production purposes, there is also a central planner. Among other things, it allows for visualization of the task flow by using Tornado and Pygraphviz. Results are displayed at http://localhost:8081 and “collected” at http://localhost:8082/api/graph.

In addition, I have extended the luigi daemon and server code to generate a table representation of the tasks (at http://localhost:8083). The eventual aim is to define a grouping function that groups task lists by a given feature (e.g. sample or project).

In order to view tasks, run

ratatoskd &

in the background and run the tests:

nosetests -v -s test_commands.py