Genomics requires the use of many computational tools starting from assessing the quality of sequenced data and assembly, to annotation, and comparison and analyses. But Bioinformatics software are often hard to install. And upgrades changes the input/output behaviour, making experiments difficult to reproduce. To make matters worse, Genomicists often lack necessary computational training to setup complex Bioinformatics software, at times tens of them to compare and choose the best for the task at hand.
Easy access to Bioinformatics software is needed:
- on personal computer for small tasks
- on HPC and fat servers for more computationally intensive tasks, without having to go through sysadmin (which can be very time consuming for most labs)
- setup should be reproducible and shareable
Docker is a new technology that make reproducible setups possible. Docker works by creating "to the specification" image from a Dockerfile which are then run in an isolated container. Dockerfiles or the resulting images can be persisted forever, and shared or published over the Internet, making it possible for anybody to recreate the exact same setup at any point of time in the future.
The aim of this project is to make complex Genomics software and even BioLinux! available in just one command.
After switch has been installed:
# list available packages
switch -l
# switch to BioLinux
switch biolinux
# Run a command in BioLinux
echo 'MNTLWLSLWDYPGKLPLNFMVFDTKDDLQAAYWRDPYSIPLAVIFEDPQPISQRLIYEIRTNPSYTLPPPPTKLYSAPISCRKNKTGHWMDDILSIKTGESCPVNNYLHSGFLALQMITDITKIKLENSDVTIPDIKLIMFPKEPYTADWMLAFRVVIPLYMVLALSQFITYLLILIVGEKENKIKEGMKMMGLNDSVF' > query.fa
switch biolinux blastp -remote -html -db nr -query query.fa > result.html
Switch runs the specified package within a docker container. Processes are run in the container as the same user name, the same home directory, the same login shell and the same working directory as on the host. Volumes (external drives, NAS) mounted on the host are available in the container at the path.
Using switch is just like opening a new terminal tab, except:
- while user name is preserved, the user id need not be the same. This doesn't affect file permissions on mounted volumes though.
- login shell will be preserved only if it's one of bash/zsh/fish on the host. If not, bash will be used.
- current working directory will be preserved only if that path exists in the container. If not, home directory will be used.
One can obtain shell access to the container, or directly run a command available in the container. In the latter case the container terminates automatically once the command has been executed, output is printed to the terminal and can be redirected, and returns the exit status of the the command run within container.
We have tested switch on Mac OS X Yosemite, Ubuntu 14.04.1, CentOS 7.
Installing docker - https://docs.docker.com/installation/mac/
Installing docker - https://docs.docker.com/installation/ubuntulinux/
Add yourself to docker group so you can run docker client without sudo:
$ sudo usermod -aG docker `whoami`
# then logout and login again for the above command to take effect
Installing docker - https://docs.docker.com/installation/centos/
Add yourself to docker group so you can run docker client without sudo:
$ sudo usermod -aG docker `whoami`
# then logout and login again for the above command to take effect
Disable SELinux as it gets in the way of mounting volumes within the container:
$ sed -i .bak 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# then reboot your system
The above command backs up the original file to /etc/selinux/config.bak. If
you are concerned about disabling SELinux, do note that we are trying to work
out a better solution.
The following should give an encouraging message:
$ docker run hello-world
$ git clone https://github.com/yeban/switch
$ cd switch
$ gem install bundler && bundle
$ bundle exec bin/switch biolinux
Obtaining BioLinux can take long. You may quickly want to test switch's featureset using our docker baseimage, which is essentially Ubuntu 14.04.1.
$ bundle exec bin/switch baseimage
- stress test against various use cases
- brew recipe for Mac