Why Alternative Splicing?

Between 90% and 95% of multi-exon human genes are alternatively spliced. Many human diseases are caused by the expression or non-expression of certain proteins, and misregulation of alternative splicing has been shown to cause or modify disease. Both genetic and biological factors influence alternative splicing. One such factor is the presence of DNA sequences that are likely to bind the RNA-binding proteins that regulate splicing. Another is the tissue in which transcription occurs. To decipher the splicing mechanism, computational biologists have put considerable effort into building a splicing code: a model that helps predict and understand alternative splicing from genomic data.

Our Solution

In this project, we set out to improve upon the state of the art in predictive computational models for alternative splicing: deep neural networks. With the advent of machine learning, splicing codes have become much more accurate at predicting how often a given exon is included in the mature RNA, and thus how frequently different protein isoforms are synthesized. This semester, we explored one particular formulation of the problem: that a DNA sequence is best thought of as an image and is therefore amenable to the machine learning architectures that were developed for image processing tasks. With this in mind, we set out to create a Convolutional Neural Network to predict alternative splicing.
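
To make the "DNA as an image" idea concrete, here is a minimal sketch in plain NumPy. It is not our actual model. It assumes a one-hot encoding in which each base becomes a row with four channels (A, C, G, T), and it slides a single hand-written filter along the sequence the way a convolutional layer would. The names `one_hot`, `conv1d_valid`, and `gt_filter` are hypothetical, chosen for illustration. In a real CNN the filter weights are learned from data rather than set by hand.

```python
import numpy as np

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a (len(seq), 4) array.

    Bases outside A/C/G/T (for example N) become all-zero rows.
    """
    encoded = np.zeros((len(seq), len(BASES)), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            encoded[i, j] = 1.0
    return encoded


def conv1d_valid(image, kernel):
    """Slide a (k, 4) filter along the sequence axis (stride 1, no padding)."""
    k = kernel.shape[0]
    n_out = image.shape[0] - k + 1
    return np.array(
        [np.sum(image[i:i + k] * kernel) for i in range(n_out)],
        dtype=np.float32,
    )


# Hand-written filter for the dinucleotide "GT", which marks the start of
# most introns. This is an illustrative assumption, not a learned weight.
gt_filter = one_hot("GT")

x = one_hot("CAGGTAAGT")             # shape (9, 4): the sequence "image"
scores = conv1d_valid(x, gt_filter)  # one score per window of 2 bases
hits = [i for i, s in enumerate(scores) if s == 2.0]  # exact "GT" matches
print(hits)
```

A filter like this acts as a motif detector: its score peaks wherever the window matches the pattern. Stacking many learned filters, followed by pooling and dense layers, is the general shape of a convolutional model for sequence data.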
