Inspiration

The multi-step process of getting from a FASTA file to a phylogenetic tree.

What it does

Accepts a FASTA file and generates a multiple sequence alignment (MSA) file and a Newick tree with renamed leaves, which can be loaded into your favorite phylogenetic tree visualizer.

How I built it

Python (Deepnote for collaboration, Visual Studio for debugging).

Challenges I ran into

Using MUSCLE through Python.
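One common way to drive MUSCLE from Python is to shell out with `subprocess`. The sketch below is an assumption about how that could look, not the project's actual code: the helper names are hypothetical, and the flags differ by MUSCLE major version (`-in`/`-out` in v3, `-align`/`-output` in v5), so check which version is installed before relying on it.

```python
import shutil
import subprocess


def build_muscle_command(fasta_path, msa_path, major_version=5, exe="muscle"):
    """Return the argument list for a MUSCLE alignment run (hypothetical helper)."""
    if major_version >= 5:
        # MUSCLE v5 syntax
        return [exe, "-align", fasta_path, "-output", msa_path]
    # MUSCLE v3 syntax
    return [exe, "-in", fasta_path, "-out", msa_path]


def run_muscle(fasta_path, msa_path, major_version=5, exe="muscle"):
    """Run MUSCLE and fail loudly if it is missing or exits with an error."""
    if shutil.which(exe) is None:
        raise FileNotFoundError(f"{exe!r} was not found on PATH")
    cmd = build_muscle_command(fasta_path, msa_path, major_version, exe)
    # check=True raises CalledProcessError on a non-zero exit status
    subprocess.run(cmd, check=True, capture_output=True, text=True)
    return msa_path
```

Keeping the command construction separate from the call makes the version-specific flags easy to check without MUSCLE installed.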

Accomplishments that I'm proud of

Having an almost working program by the submission deadline. An OS-detecting auto-installer for MUSCLE (in progress).
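The OS-detection step of such an installer could be sketched as follows. This is only an illustration using the standard `platform` module; the function name and the returned labels are hypothetical and are not official MUSCLE download names.

```python
import platform


def muscle_platform_key(system=None, machine=None):
    """Normalize the OS and CPU names into an (os, arch) pair.

    The returned labels are hypothetical keys an installer could use to
    choose a download; they are not official MUSCLE asset names.
    """
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    os_names = {"linux": "linux", "darwin": "macos", "windows": "windows"}
    arch_names = {
        "x86_64": "x86_64",
        "amd64": "x86_64",
        "arm64": "arm64",
        "aarch64": "arm64",
    }
    if system not in os_names:
        raise RuntimeError(f"Unsupported operating system: {system}")
    # Unknown CPU names are passed through unchanged
    return os_names[system], arch_names.get(machine, machine)
```

Called with no arguments, it inspects the machine it runs on; passing explicit values makes the mapping testable on any OS.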

Tasks completed this BNFOthon

Create a multiple sequence alignment of the given FASTA file containing the amino acid sequence of the COL5A1 gene. Investigate the evolutionary relatedness of collagens from other organisms using a phylogenetic tree and by overlaying the protein structures for mouse and human. Model the 3D structure using AlphaFold and ChimeraX, and validate it. Find conserved domains and motifs.

What I learned

It is difficult to make a unified automatic pipeline.

What's next for FastNewPy

Actually working fully as intended. Adding checks for different FASTA sequence ID formats and the ability to name leaves based on them. Optionally generating a visualization of the phylogenetic tree automatically.
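As a rough idea of what the ID-format checks could look like, the sketch below recognizes two common header layouts (UniProt-style and NCBI-style) and derives a leaf name from each. The function name, the regular expressions, and the `Genus_species_ACCESSION` naming scheme are all assumptions for illustration, and the headers in the usage example are made up.

```python
import re

# UniProt-style: sp|ACCESSION|ENTRY_NAME description OS=Genus species OX=...
UNIPROT = re.compile(r"^(?:sp|tr)\|([^|]+)\|\S+.*?\bOS=(.+?)\s+OX=")
# NCBI-style: ACCESSION description [Genus species]
NCBI = re.compile(r"^(\S+)\s.*\[([^\]]+)\]\s*$")


def leaf_name(header):
    """Turn a FASTA header (with or without '>') into 'Genus_species_ACCESSION'."""
    header = header.lstrip(">").strip()
    for pattern in (UNIPROT, NCBI):
        match = pattern.search(header)
        if match:
            accession, species = match.groups()
            return f"{species.replace(' ', '_')}_{accession}"
    # Unknown format: fall back to the first whitespace-delimited token
    return header.split()[0] if header else ""


# Made-up example headers, one per supported format
print(leaf_name(">sp|P12345|EXAMPLE_HUMAN Example protein OS=Homo sapiens OX=9606"))
print(leaf_name(">XP_000000.1 example protein [Mus musculus]"))
```

Headers that match neither pattern keep their original ID, so unrecognized formats still produce a usable leaf label.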

Differences between presentation data and code:

Clustal Omega was used for the MSA data in the presentation. MUSCLE is used for the MSA in the code.

Jalview was used for Newick tree file generation in the presentation. Biopython is used for Newick tree file generation in the code.
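For readers who want to see roughly what the Biopython route involves, here is a minimal sketch that goes from an aligned FASTA to Newick text with renamed leaves. It is not the project's code: the function name is hypothetical, and the choice of identity distances with neighbor joining is an assumption (a protein matrix such as `blosum62` could be passed to `DistanceCalculator` instead).

```python
from io import StringIO

from Bio import AlignIO, Phylo
from Bio.Phylo.TreeConstruction import DistanceCalculator, DistanceTreeConstructor


def alignment_to_newick(aligned_fasta, leaf_names=None):
    """Build a neighbor-joining tree from an aligned FASTA and return Newick text.

    aligned_fasta: path or open handle to an MSA in FASTA format.
    leaf_names: optional dict mapping sequence IDs to new leaf labels.
    """
    alignment = AlignIO.read(aligned_fasta, "fasta")
    calculator = DistanceCalculator("identity")  # assumption: simple identity distance
    constructor = DistanceTreeConstructor(calculator, "nj")
    tree = constructor.build_tree(alignment)
    # Rename leaves, leaving IDs without a mapping untouched
    for leaf in tree.get_terminals():
        leaf.name = (leaf_names or {}).get(leaf.name, leaf.name)
    out = StringIO()
    Phylo.write(tree, out, "newick")
    return out.getvalue()
```

The returned string can be written to a `.nwk` file and uploaded to a viewer such as iTOL.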

iTOL used to create final presentation phylogenetic tree. https://itol.embl.de/

Current limitations

Only fully works on Linux; support for other operating systems is in progress. Requires installing Biopython.


