Authors: Alice Patania, Pierluigi Selvaggi, Mattia Veronese, Ottavia Dipasquale, Paul Expert, and Giovanni Petri
Understanding how gene expression translates to and affects human behaviour is one of the ultimate aims of neuroscience. In this paper, we present a pipeline based on Mapper, a topological simplification tool, to produce and analyze gene co-expression data. We first validate the method by reproducing key results from the literature on the Allen Human Brain Atlas, and the correlations between resting-state fMRI and gene co-expression maps. We then analyze a dopamine-related gene set and find that the co-expression networks produced by Mapper return a structure that matches the well-known anatomy of the dopaminergic pathway. Our results suggest that topological network descriptions can be a powerful tool to explore the relationships between genetic pathways and their association with brain function and its perturbation due to illness and/or pharmacological challenge.
DISCLAIMER: Running all the scripts in this repository produces the list of all the results found in the paper, but not the figures or the standard exploratory analysis (i.e. the histograms and KS tests). I am willing to change this decision if anyone needs them: write to me or open an issue.
- put up the datasets
- make a parameters selection script
- make all code into scripts that can be run from command line
- make a script to compute the agreement matrix
- make the shortest path script
- write a tutorial on how to run all the code
- find a way to host the datasets that are too big for git
- add the list of sample ids used by Richiardi et al. in their paper
- add dependencies
- data:
  - normalized dataset: download the data used in the study here.
  - the two lists of genes used in the study:
    - dopamine.txt
    - richiardi.txt
- code:
  - MapperTools.py: All the functions needed to build the graph.
  - parameters.py: Computes the statistics used for the choice of parameters. Takes as input the dataset id (dopamine, richiardi, or full) and saves the statistics in a csv in the folder output.
  - selection.py: Selects the optimal parameters using the output from parameters.py. Takes as input the dataset id (dopamine, richiardi, or full) and saves the parameters in a txt in the folder output.
  - run.py: Builds the graph for the optimal parameters found by selection.py. Takes as input the dataset id (dopamine, richiardi, or full) and saves the adjacency matrix and node information in 2 pickled dictionaries in the folder output.
  - agreement_matrix.py: Computes the agreement matrix for the different graphs built by run.py. Takes as input the dataset id (dopamine, richiardi, or full) and saves the matrix as a pickled pandas DataFrame in the folder output.
  - shortest_path.py: Computes the shortest path from the nodes containing samples of VTA and substantia nigra to the rest of the brain. Takes as input the dataset id (dopamine, richiardi, or full) and saves the information for each node in a pickled dictionary in the folder output.
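To illustrate the idea behind the shortest-path step, here is a minimal sketch of a multi-source breadth-first search on an unweighted graph. It is not the code in shortest_path.py: the function name `hop_distances`, the dict-of-neighbours layout of the adjacency, and the toy node ids are all assumptions made for this example, and the actual pickled dictionaries written by run.py may be organised differently.

```python
from collections import deque


def hop_distances(adjacency, sources):
    """Multi-source breadth-first search on an unweighted graph.

    adjacency: dict mapping a node id to an iterable of neighbouring node ids
               (an assumed layout, not necessarily the one saved by run.py).
    sources:   iterable of node ids to start from, e.g. the nodes containing
               samples from the regions of interest.
    Returns a dict mapping each reachable node to the number of edges
    separating it from the nearest source. Unreachable nodes are omitted.
    """
    distances = {node: 0 for node in sources}
    queue = deque(distances)
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distances:
                # First visit in BFS order gives the shortest hop count.
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


# Hypothetical toy graph: a path 0 - 1 - 2 - 3, a branch 1 - 4,
# and an isolated node 5 that no source can reach.
toy_adjacency = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1], 5: []}
toy_sources = [0]  # stand-in for the source nodes

print(hop_distances(toy_adjacency, toy_sources))
```

Starting the search from all source nodes at once gives, for every other node, the distance to the closest source in a single pass, which is why a multi-source search is used here instead of one search per source.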
If you make use of this work in your research, please cite the following paper:
Patania, Alice, Pierluigi Selvaggi, Mattia Veronese, Ottavia DiPasquale, Paul Expert, and Giovanni Petri. "Topological gene-expression networks recapitulate brain anatomy and function." bioRxiv (2018): 476382.
@article{patania2018topological,
title={Topological gene-expression networks recapitulate brain anatomy and function},
author={Patania, Alice and Selvaggi, Pierluigi and Veronese, Mattia and DiPasquale, Ottavia and Expert, Paul and Petri, Giovanni},
journal={bioRxiv},
pages={476382},
year={2018},
publisher={Cold Spring Harbor Laboratory}
}
$ python parameters.py name_gene_list
$ python selection.py name_gene_list
$ python run.py name_gene_list
$ python agreement_matrix.py name_gene_list
where name_gene_list is one of (dopamine, richiardi, or full).
The file shortest_path.py can be run with any output from run.py and selection.py. In the paper we only looked at the outcomes from the dopamine-related Mapper graphs, but it can be run on any other output.
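The four pipeline commands above can also be chained from a small driver script. The sketch below is not part of the repository: the helper names `build_commands` and `run_pipeline` are hypothetical, and it assumes the scripts live in the current working directory and take the dataset id as their only argument, as in the commands listed above.

```python
import subprocess
import sys

# Dataset ids and script order taken from the commands listed above.
DATASET_IDS = ("dopamine", "richiardi", "full")
PIPELINE_SCRIPTS = (
    "parameters.py",
    "selection.py",
    "run.py",
    "agreement_matrix.py",
)


def build_commands(dataset_id, python=sys.executable):
    """Return the pipeline commands, in order, for one dataset id."""
    if dataset_id not in DATASET_IDS:
        raise ValueError(
            "unknown dataset id %r, expected one of %s"
            % (dataset_id, ", ".join(DATASET_IDS))
        )
    return [[python, script, dataset_id] for script in PIPELINE_SCRIPTS]


def run_pipeline(dataset_id):
    """Run each step in turn and stop at the first one that fails."""
    for command in build_commands(dataset_id):
        subprocess.run(command, check=True)


# Only print the commands here; call run_pipeline("dopamine") to execute them.
for command in build_commands("dopamine", python="python"):
    print(" ".join(command))
```

Passing `check=True` makes a failing step raise an exception, so a later step never runs on missing or stale output from an earlier one.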
An up-to-date Python 3.5 distribution with the standard packages provided by the Anaconda distribution is required.
In particular, the code was tested with:
pandas (version) etc