Inspiration

We were inspired to take on this project by its meaningful problem statement. As a team, we feel strongly about promoting mental health, and we thought that a project centered on predicting psychiatric disorders from EEG signals would be a way both to support a good cause and to explore novel machine learning techniques. While doing literature searches to inform our project decisions, we read about other interesting work in this field and were drawn further into the project. Looking at the data, we also saw promising neural network opportunities in the natural graph-like structure of the electrodes. After further literature review, we settled on a graph neural network based on previous applications in this domain.

What it does

In this project, we aim to predict psychiatric disorder diagnoses using EEG data. The data used in this project was collected from healthy individuals and from individuals with psychiatric disorders, including addictive disorder, anxiety disorder, mood disorder, obsessive compulsive disorder, schizophrenia, and trauma and stress related disorder. Two types of EEG data, power spectral density (PSD) and coherence, were collected with 19 electrodes.

After data cleaning and exploratory data analysis, we trained several multi-class classification models to predict an individual's psychiatric disorder and evaluated their performance. We started with classical machine learning models, including tree-based models such as random forest and light gradient-boosting machine (LightGBM). Then, we moved on to more complex neural networks.

We built a graph neural network (GNN) that predicts an individual’s main disorder with 38% accuracy and the specific disorder with 25% accuracy. The GNN mimics the brain by representing the electrodes as nodes in a graph and the connections between them as edges, which lets us run convolutions on the graph. The GNN was especially useful because it let us use our domain-specific knowledge. We placed an edge between two electrodes, weighted by their normalized average coherence, only if that coherence was large enough in at least one band. With these edges, the model can learn specific patterns of power densities using the structure of the brain. Additionally, the GNN was supported by a convolutional neural network (CNN) that looked for patterns in the raw coherence values by processing a 19 x 19 grid of coherences for each band. Placing associated electrodes close to each other on the x and y axes also encoded useful spatial information. The outputs of the GNN and CNN were then passed through several fully connected layers to produce our categorical predictions.
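The edge rule described above can be sketched roughly as follows. This is a minimal illustration and not our actual model code: the function names, the threshold value, the choice to normalize by dividing by the largest edge weight, and the standard symmetric-normalized graph convolution step are all assumptions made for the example.

```python
import numpy as np

def build_adjacency(coherence, threshold):
    """Build a weighted electrode graph from per-band coherence.

    coherence: array of shape (n_bands, n_electrodes, n_electrodes).
    threshold: hypothetical cutoff; an edge is kept if the coherence
               exceeds it in at least one band.
    """
    coherence = np.asarray(coherence, dtype=float)
    keep = (coherence > threshold).any(axis=0)   # large enough in any band
    mean_coh = coherence.mean(axis=0)            # average coherence across bands
    weights = np.where(keep, mean_coh, 0.0)
    np.fill_diagonal(weights, 0.0)               # no self-edges at this stage
    max_w = weights.max()
    if max_w > 0:
        weights = weights / max_w                # assumed normalization to [0, 1]
    return weights

def graph_conv(adjacency, features, weight):
    """One graph convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    This is the common symmetric-normalized form, used here as an
    assumption; the layer in our model may differ.
    """
    a_hat = adjacency + np.eye(adjacency.shape[0])   # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)
```

In this sketch, `features` would hold one row per electrode (for example, its PSD values per band), so each convolution step mixes an electrode's features with those of the electrodes it is strongly coherent with.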

How we built it

We split the work across the team. Ben was responsible for video creation and data exploration, Lauren made visualizations and experimented with classification using tree-based models, Jonathan focused on building and fine-tuning various models, and Ian architected and built several neural network models.

Challenges we ran into

The first challenge we encountered was understanding the problem statement and the data. We had not worked with EEG data before, so we did literature searches to understand how researchers have used EEG data and modeling to study psychiatric disorders. We learned a lot about neurotechnology and terms like power density and coherence, and we learned to use this data to gain insights into the brain's structure and create a tailored, creative solution. Another challenge was deciding how to format and handle the high-dimensional dataset. Each column name carries a lot of information (e.g. the frequency band and the spatial information given by the electrode) that we wanted to pull out to further inform our modeling.
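Pulling the band and electrode out of each column name can be sketched as below. The naming scheme shown (`AB.A.delta.a.FP1` for PSD and `COH.A.delta.a.FP1.b.FP2` for coherence) is an assumption made for illustration and may not match the dataset exactly; `EEGFeature` and `parse_column` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

class EEGFeature(NamedTuple):
    kind: str                   # "psd" or "coherence"
    band: str                   # e.g. "delta", "alpha"
    electrode_a: str
    electrode_b: Optional[str]  # only set for coherence columns

# Assumed (hypothetical) column naming scheme:
#   PSD:       "AB.A.delta.a.FP1"
#   coherence: "COH.A.delta.a.FP1.b.FP2"
PSD_RE = re.compile(r"^AB\.[A-Z]\.(?P<band>\w+)\.[a-z]\.(?P<a>\w+)$")
COH_RE = re.compile(
    r"^COH\.[A-Z]\.(?P<band>\w+)\.[a-z]\.(?P<a>\w+)\.[a-z]\.(?P<b>\w+)$"
)

def parse_column(name: str) -> Optional[EEGFeature]:
    """Return the parsed feature, or None for non-EEG columns."""
    match = PSD_RE.match(name)
    if match:
        return EEGFeature("psd", match["band"], match["a"], None)
    match = COH_RE.match(name)
    if match:
        return EEGFeature("coherence", match["band"], match["a"], match["b"])
    return None
```

Once parsed this way, PSD columns can be grouped into per-electrode feature vectors and coherence columns into per-band electrode-by-electrode grids.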

Accomplishments that we're proud of

We worked with models and data that we hadn’t worked with before, and this was a valuable learning experience. It required lots of reading, whiteboarding, and rewriting code. We also strove to build models around the project context, letting our literature searches and exploratory data analysis drive our model architecture and feature choices. Although this was difficult, it was rewarding and fun.

In addition, our video was the highest quality we have ever produced. It included a good number of demonstrations and thoroughly explained the problem at hand. There is also a slightly hidden piece of music playing in the background of the video; it was composed entirely within the 36 hours of the Datathon and is only the composer's second composition ever. Don't listen too carefully though; it may not be entirely proper background music.

What we learned

We learned about graph neural networks and how to implement them. We also learned how to visualize EEG data using readily available packages like MNE, how EEG data can be quantified using power spectral density (PSD) and coherence, and different ways we could structure the data to represent these features.

What's next for IntrEEGing NEURONets

While PSD and coherence values have some predictive power for psychiatric disorders, our work shows that there is still room to improve model performance. Some ideas we experimented with were looking into hemisphere asymmetries (e.g. frontal alpha asymmetry is associated with mood and depression) and bringing in information about functional relationships between different parts of the brain. Through feature engineering and creative model architectures, we would try to predict main disorder categories more accurately and also try to predict specific disorders. More granular EEG data, such as data collected with more electrodes, and a larger sample size may be beneficial in future work.
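As one illustration of the hemisphere asymmetry idea, frontal alpha asymmetry is commonly computed as the difference of log alpha power between a right and a left frontal electrode. The sketch below assumes that definition and assumes the F3 (left) and F4 (right) electrodes are the pair used; the function name is hypothetical and this is not necessarily the feature we computed.

```python
import math

def frontal_alpha_asymmetry(alpha_left: float, alpha_right: float) -> float:
    """Return ln(right alpha power) - ln(left alpha power).

    alpha_left:  alpha-band power at a left frontal electrode (assumed F3).
    alpha_right: alpha-band power at a right frontal electrode (assumed F4).
    """
    if alpha_left <= 0 or alpha_right <= 0:
        raise ValueError("alpha power must be positive")
    return math.log(alpha_right) - math.log(alpha_left)
```

A value of zero means equal alpha power on both sides, and the sign indicates which hemisphere has more alpha power. A feature like this could be appended to the PSD features for each individual.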
