Inspiration
Going from a FASTA file to a phylogenetic tree is a multi-step process, which FastNewPy aims to automate.
What it does
Accepts a FASTA file and generates a multiple sequence alignment (MSA) file and a Newick tree with renamed leaves, ready to use in your favorite phylogenetic tree visualizer.
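As a rough illustration of the leaf-renaming step, here is a minimal standard-library sketch. It is not the project's actual code: the function name `rename_leaves` and the mapping format are hypothetical, and it assumes an unquoted Newick string whose leaf labels contain no parentheses, commas, colons, or semicolons.

```python
import re

# A leaf label directly follows "(" or "," and runs until the next Newick
# delimiter. Internal node labels follow ")" and branch lengths follow ":",
# so neither of those is matched by this pattern.
LEAF_RE = re.compile(r"(?<=[(,])([^(),:;]+)")


def rename_leaves(newick, names):
    """Return the Newick string with leaf labels replaced using `names`.

    `names` maps an old leaf label to a new one (hypothetical format).
    Labels missing from the mapping are kept. Spaces become underscores
    because unquoted Newick labels cannot contain spaces.
    """
    def replace(match):
        old = match.group(1).strip()
        new = names.get(old, old)
        return new.replace(" ", "_")

    return LEAF_RE.sub(replace, newick)


if __name__ == "__main__":
    tree = "(A:0.1,(B:0.2,C:0.3):0.4);"
    print(rename_leaves(tree, {"A": "Homo sapiens", "B": "Mus musculus"}))
```

A regular expression keeps the sketch dependency-free; a tree library would be the safer choice for quoted labels or comments embedded in the Newick text.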
How I built it
Python (Deepnote for collaboration, Visual Studio for debugging).
Challenges I ran into
Calling MUSCLE from Python.
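One common way to drive a command-line aligner from Python is the standard `subprocess` module. The sketch below is an assumption about how this could look, not the project's code: the helper names `muscle_command` and `run_muscle` are hypothetical, and the executable is assumed to be on PATH under the name `muscle`. The command-line syntax differs between MUSCLE 3.x (`-in`/`-out`) and MUSCLE 5 (`-align`/`-output`), so the version is passed explicitly.

```python
import shutil
import subprocess


def muscle_command(fasta_path, msa_path, major_version=5):
    """Build the MUSCLE argument list for the given major version."""
    if major_version >= 5:
        # MUSCLE 5 syntax: muscle -align <input> -output <output>
        return ["muscle", "-align", fasta_path, "-output", msa_path]
    # MUSCLE 3.x syntax: muscle -in <input> -out <output>
    return ["muscle", "-in", fasta_path, "-out", msa_path]


def run_muscle(fasta_path, msa_path, major_version=5):
    """Run MUSCLE if it is on PATH; raise a clear error otherwise."""
    if shutil.which("muscle") is None:
        raise FileNotFoundError("muscle executable not found on PATH")
    cmd = muscle_command(fasta_path, msa_path, major_version)
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(cmd, check=True, capture_output=True, text=True)
    return msa_path
```

Passing the arguments as a list avoids shell quoting problems with file paths that contain spaces.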
Accomplishments that I'm proud of
Having an almost-working program by the submission deadline, and an OS-detecting auto-installer for MUSCLE (in progress).
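The OS-detection part of such an installer could start from `platform.system()` in the standard library. This is only a sketch under that assumption: the function name `detect_os` and the returned labels are hypothetical, and the download and install steps are deliberately left out.

```python
import platform

# platform.system() returns names such as "Linux", "Darwin", or "Windows".
_OS_LABELS = {
    "linux": "linux",
    "darwin": "macos",
    "windows": "windows",
}


def detect_os(system_name=None):
    """Map a platform name to a short label, or "unknown" if unrecognized.

    `system_name` can be passed explicitly for testing; by default the
    value comes from platform.system().
    """
    name = system_name if system_name is not None else platform.system()
    return _OS_LABELS.get(name.lower(), "unknown")


if __name__ == "__main__":
    print(detect_os())
```

An installer can then branch on the label and stop with a clear message when it gets "unknown", rather than guessing which binary to fetch.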
Tasks completed this BNFOthon
- Create a multiple sequence alignment from the given FASTA file containing the amino acid sequence of the COL5A1 gene.
- Investigate the evolutionary relatedness of collagens from other organisms using a phylogenetic tree, and overlay the protein structures for mouse and human.
- Model the protein's 3D structure using AlphaFold and ChimeraX, and validate it.
- Find conserved domains and motifs.
What I learned
It is difficult to build a unified, automated pipeline.
What's next for FastNewPy
Getting the program to work fully as intended. Adding checks for different FASTA sequence ID formats, plus the ability to name leaves based on them. Optionally, generating a visualization of the phylogenetic tree automatically.
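A check for different sequence ID formats might look something like the sketch below. It covers only two common header conventions (UniProt-style `sp|ACCESSION|ENTRY ... OS=Organism OX=...` and NCBI-style `ACCESSION description [Organism]`) and falls back to the first token otherwise. The function name `parse_header`, the returned dictionary, and the example headers are all illustrative, not taken from the project.

```python
import re

# UniProt-style: ">sp|ACCESSION|ENTRY_NAME description OS=Organism OX=..."
UNIPROT_RE = re.compile(r"^(?:sp|tr)\|(?P<acc>[^|]+)\|(?P<entry>\S+)")
# The organism runs from "OS=" up to the next two-letter "XX=" field.
UNIPROT_ORGANISM_RE = re.compile(r"\bOS=(.+?)(?=\s+[A-Z]{2}=|$)")
# NCBI-style: ">ACCESSION description [Organism]"
NCBI_ORGANISM_RE = re.compile(r"\[([^\[\]]+)\]\s*$")


def parse_header(header):
    """Classify a FASTA header line and extract an ID and organism name."""
    header = header.lstrip(">").strip()
    seq_id = header.split(None, 1)[0] if header else ""

    match = UNIPROT_RE.match(header)
    if match:
        organism = UNIPROT_ORGANISM_RE.search(header)
        return {
            "format": "uniprot",
            "id": match.group("acc"),
            "organism": organism.group(1) if organism else None,
        }

    match = NCBI_ORGANISM_RE.search(header)
    if match:
        return {"format": "ncbi", "id": seq_id, "organism": match.group(1)}

    # Unknown format: keep the first token as the ID, no organism.
    return {"format": "plain", "id": seq_id, "organism": None}


if __name__ == "__main__":
    # Made-up headers, for illustration only.
    print(parse_header(">sp|P12345|EXAMPLE_HUMAN Example protein OS=Homo sapiens OX=9606"))
    print(parse_header(">XP_000001.1 example protein [Mus musculus]"))
```

The organism field, when present, is a natural candidate for the leaf name, with the sequence ID as a fallback.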
Differences between presentation data and code:
The presentation used Clustal Omega for the MSA data; the code uses MUSCLE.
The presentation used Jalview to generate the Newick tree file; the code uses Biopython.
iTOL was used to create the final phylogenetic tree in the presentation: https://itol.embl.de/
Current limitations
Only fully works on Linux; support for other operating systems is in progress. Requires Biopython to be installed.
Built With
- biopython
- muscle
- python