INSaFLU (“INSide the FLU”) is a free, web-based bioinformatics suite (https://insaflu.insa.pt/) that takes primary data (reads) and automatically generates the outputs that are the core first-line “genetic requests” for effective and timely laboratory surveillance of influenza and SARS-CoV-2 (e.g., type and sub-type, gene and whole-genome consensus sequences, variant annotation, alignments and phylogenetic trees). Data integration is continuously scalable, meeting the need for real-time epidemiological surveillance during flu and COVID-19 epidemics.
Here, you can find how to easily set up your local INSaFLU instance.
- Processor: 8 cores (minimum 4 if only the surveillance module is required);
- RAM: 32GB of memory (minimum 16GB if only the surveillance module is required);
- Disk Space: 1TB (suggestion; depends on the volume of data to process).
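The figures above can be compared against a host before installing. The following is a minimal sketch, assuming a Linux server with `nproc`, `/proc/meminfo` and `df` available; `kib_to_gib` and `DATA_DIR` are hypothetical names introduced for illustration and are not part of INSaFLU. It only prints the values, since the kernel usually reports slightly less memory than is physically installed.

```shell
# Rough pre-install check: print what this host has next to the figures listed above.

# Convert a size in KiB (the unit of /proc/meminfo and `df -k`) to whole GiB, rounded down.
kib_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

# DATA_DIR is a hypothetical placeholder: point it at the directory that will hold INSaFLU data.
DATA_DIR="${DATA_DIR:-.}"

cores=$(nproc 2>/dev/null || echo 0)
mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)
# Column 4 of the POSIX-format df output is the available space, here in KiB.
disk_kib=$(df -Pk "$DATA_DIR" 2>/dev/null | awk 'NR==2 {print $4}')

echo "CPU cores : ${cores} (recommended 8, minimum 4)"
echo "RAM       : $(kib_to_gib "${mem_kib:-0}") GiB (recommended 32, minimum 16)"
echo "Free disk : $(kib_to_gib "${disk_kib:-0}") GiB (suggested 1024)"
```

The minimum values apply only when the surveillance module alone is required.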
Docker:
- Install docker on your Linux server (recent versions of docker already include docker compose);
- Install the docker extension local-persist:
$ chmod a+x install.sh
$ sudo ./install.sh
INSaFLU:
$ git clone https://github.com/INSaFLU/docker.git
$ cd docker
## to define the directory where the data will be saved and the web port exposed, edit the .env file:
$ cp .env_temp .env
$ vi .env
OR
$ nano .env
## test if docker is installed and running
$ docker ps
$ docker run hello-world
## (optional) to run the new viral detection module you need to set up the software and databases
## This step can take several hours
$ ./up_televir.sh
## register TELEVIR references in INSaFLU. This step can take some time; consider running it in the background.
$ docker exec -it insaflu-server register-televir-references
## Now run INSaFLU
$ ./up.sh
## create a user, in another terminal (or use 'screen' for the previous steps)
$ docker exec -it insaflu-server create-user
Now you can open a web browser and go to the address "127.0.0.1:<port defined in .env>". The default port is 8080.
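Before opening the browser, it can help to confirm from the command line that the web server is answering. The sketch below assumes `curl` is installed on the host; `insaflu_url` and `wait_for_insaflu` are hypothetical helper names, and the retry count and 10-second pause are arbitrary choices rather than INSaFLU settings.

```shell
# Hypothetical helpers; not part of INSaFLU.

# Build the local address for a given port (default 8080, as stated above).
insaflu_url() {
    echo "http://127.0.0.1:${1:-8080}/"
}

# Poll the address until it answers or the number of attempts runs out.
# Usage: wait_for_insaflu [port] [attempts]
wait_for_insaflu() {
    url=$(insaflu_url "${1:-}")
    tries=${2:-30}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f: treat HTTP errors (>= 400) as failure; -s: silent; --max-time: give up after 5 s.
        if curl -fs -o /dev/null --max-time 5 "$url"; then
            echo "INSaFLU is answering at $url"
            return 0
        fi
        i=$((i + 1))
        sleep 10
    done
    echo "No answer from $url after $tries attempts" >&2
    return 1
}

# Example, using the port you set in .env:
#   wait_for_insaflu 8080
```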
To stop:
$ ./stop.sh
To start again:
$ ./up.sh
With these commands you can interact with the INSaFLU container to perform several tasks.
How to run: $ docker exec -it insaflu-server <command>
Commands:
* create-user ## create a user in insaflu;
* list-all-users ## list all users in insaflu;
* update-password ## update password for a specific user;
* remove-fastq-files ## remove fastq files to free space on the hard drive. You must have a copy of these files;
* unlock-upload-files ## unlock sample files that are left as zombies when uploading multiple samples;
* restart-apache ## restart the web server, for example, after changing something in the insaflu/env/insaflu.env file;
* upload-reference-dbs ## upload the new references that you have placed in db/references;
* update-nextstrain_builds ## update the nextstrain builds to the latest information;
* update-insaflu ## update insaflu software to a new version;
* test-email-server ## test your SMTP server; change the parameters first in the insaflu/env/insaflu.env file;
* register-televir-references ## register TELEVIR references in INSaFLU;
* confirm-email-account
Examples:
$ docker exec -it insaflu-server create-user
$ docker exec -it insaflu-server update-tbl2asn
$ docker exec -it insaflu-server restart-apache
$ docker exec -it insaflu-server update-password <some login>
$ docker exec -it insaflu-server update-insaflu
You can customize your environment. Relevant variables include the maximum size of read files for upload (e.g., MAX_FASTQ_FILE_UPLOAD = 104857600), whether files should be downsized after upload (i.e., DOWN_SIZE_FASTQ_FILES = True/False), the maximum file size after downsizing (e.g., MAX_FASTQ_FILE_WITH_DOWNSIZE = 429916160), and the maximum length of external consensus sequences for Nextstrain analysis (e.g., MAX_LENGTH_SEQUENCE_TOTAL_FROM_CONSENSUS_FASTA = 104857600). To change these variables, edit the config file .env as described below:
### get into INSaFLU docker
$ docker exec -it insaflu-server /bin/bash
### change the values here
$ vi /insaflu_web/INSaFLU/.env
### get out of the INSaFLU docker (or press Ctrl+D)
$ exit
### restart apache
$ docker exec -it insaflu-server restart-apache
If you want the changes to persist across future updates of the INSaFLU webserver, you also need to update "insaflu/env/insaflu.env".
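The example values quoted above are whole multiples of 1048576, so if the limits are expressed in bytes (an assumption; the text does not state the unit) they correspond to 100 MiB (104857600) and 410 MiB (429916160). The sketch below shows that arithmetic and one way to change a value without opening an editor. It assumes the config file holds one `NAME = value` pair per line; `mib_to_bytes` and `set_env_value` are hypothetical helpers, not INSaFLU commands.

```shell
# Hypothetical helpers; not part of INSaFLU.

# Convert a size in MiB to the equivalent number of bytes (1 MiB = 1024 * 1024 bytes).
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Replace the value of an existing "NAME = value" line in a config file.
# Returns non-zero and leaves the file untouched if NAME is not present.
set_env_value() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    if awk -v k="$key" -v v="$value" '
        # Keep everything up to and including "=" and swap only the value.
        match($0, "^[ \t]*" k "[ \t]*=[ \t]*") { print substr($0, 1, RLENGTH) v; found = 1; next }
        { print }
        END { exit (found ? 0 : 1) }
    ' "$file" > "$tmp"; then
        # Write back through the existing file so its permissions are kept.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Example (run where the config file lives; the path is the one used above):
#   set_env_value /insaflu_web/INSaFLU/.env MAX_FASTQ_FILE_UPLOAD "$(mib_to_bytes 200)"
```

Refusing to add an unknown name is a deliberate choice here: a typo in a variable name then shows up as an error instead of silently creating a setting that is never read. Restart apache afterwards, as in the manual procedure.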
TELEVIR is a module that can be installed in INSaFLU. It is used to detect viral sequences in metagenomic samples. Installing it is optional and is done with the command "./up_televir.sh"; the installation can take several hours.
TELEVIR offers a modular workflow framework, where each module can have multiple software available. TELEVIR software installation, with associated databases in the case of detection or host depletion, is controlled via the file components/televir/config_install.py by setting the individual variables to True or False.
To configure installation, modify the file components/televir/config_install.py and set the variables to True or False according to the software you want to install. Then run the commands "./build.sh" and "./up_televir.sh" to install the software.
To update TELEVIR databases, modify the file components/televir/config_install.py and set the variable UPDATE to True. Then run the commands "./build.sh" and "./up_televir.sh" to update the databases. This will update the Centrifuge index and the RefSeq nucleotide reference database. Remember to run register-televir-references after updating the databases to ensure that the references are correctly registered in INSaFLU.
$ docker exec -it insaflu-server update-televir-databases
This command will take some time to run; consider running it in the background (e.g. using the -d option of docker exec).
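As an alternative to `docker exec -d`, which does not show the command's output, a long-running command can be started in the background with its output kept in a log file. This is a minimal sketch using the standard `nohup` utility; `run_in_background` and the log file name are hypothetical and not part of INSaFLU.

```shell
# Hypothetical helper; not part of INSaFLU.
# Start a command detached from the terminal, send its output to a log file,
# and print the process ID of the background job.
run_in_background() {
    log=$1
    shift
    nohup "$@" > "$log" 2>&1 < /dev/null &
    echo $!
}

# Example (note: no -it, because a background job has no terminal attached):
#   run_in_background televir-update.log docker exec insaflu-server update-televir-databases
#   tail -f televir-update.log
```

The same pattern applies to the other long-running steps, such as register-televir-references.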
A local TELEVIR installation provides additional databases that are not available in the web version. These databases are:
- The Centrifuge bacterial database
- The Kraken2 bacterial database
Some influenza sequences of the abricate database for "contigs2sequences" assignment currently used on the INSaFLU free website (the latest version can be found here: https://insaflu.readthedocs.io/en/latest/data_analysis.html#type-and-sub-type-identification) are not included in this repository because they are protected by the terms of GISAID sharing (we gratefully acknowledge the Authors, Originating and Submitting laboratories, as indicated in the lists provided in the Documentation). These sequences need to be collected by the user, and the database needs to be built following the abricate instructions on "making your own database" (https://github.com/tseemann/abricate). Please contact us if you need help building the database currently used on the INSaFLU free website.
These steps are for users who already have a previous docker installation of INSaFLU. The re-installation keeps all data generated by older installations.
$ cd <directory of the previous installation of insaflu docker>
$ ./stop.sh
$ git pull
$ docker image ls
REPOSITORY                              TAG              IMAGE ID       CREATED        SIZE
<user_name>                             insaflu-server   637475d74da0   16 hours ago   8.38GB
docker_db_insaflu                       latest           30f7aa670a79   16 hours ago   331MB
prodrigestivill/postgres-backup-local   latest           5ff2ca2295f1   25 hours ago   326MB
postgres                                10               3cfd168e7b61   2 weeks ago    200MB
centos                                  7                7e6257c9f8d8   2 months ago   203MB
funkyfuture/deck-chores                 1                848ca42ff6aa   3 weeks ago    321MB
$ docker image rm -f <IMAGE ID that exists in your docker for insaflu-server>
# In this example:
$ docker image rm -f 637475d74da0
$ ./build.sh
$ ./up.sh
# It will give an error:
Recreating insaflu-server ... error
.....
.....
Continue with the new image? [yN]y
# Press 'y' to update the insaflu-server
You can update only the INSaFLU website to the latest version (keeping your previous data).
For INSaFLU versions equal to or higher than 1.5.2:
### update INSaFLU website
$ docker exec -it insaflu-server update-insaflu
For INSaFLU versions before 1.5.2:
It is necessary to install the latest INSaFLU docker.
NOTE: Updating the INSaFLU docker updates all software and the INSaFLU website. Updating only the INSaFLU website updates INSaFLU alone.
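The version rule above can be written as a small check. This is a sketch that assumes you already know the installed version number and that `sort -V` (version sort, as provided by GNU coreutils) is available; `version_ge` and `update_path` are hypothetical names, not INSaFLU commands.

```shell
# Hypothetical helpers; not part of INSaFLU.

# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
    # The smaller of the two versions sorts first; $1 >= $2 exactly when that is $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the update route for an installed version, following the 1.5.2 threshold above.
update_path() {
    if version_ge "$1" 1.5.2; then
        echo "docker exec -it insaflu-server update-insaflu"
    else
        echo "reinstall the INSaFLU docker"
    fi
}

# Example:
#   update_path 1.4.9
```

A plain string comparison would rank 1.10.0 below 1.5.2, which is why a version-aware sort is used.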
