CxC Intact Submission

Technique

We estimated the probability that certain words appear in doctors' notes from various specializations, then applied Bayes' theorem to each note to predict its specialization.
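As a rough illustration of this idea, here is a minimal sketch of a word-count classifier built on Bayes' theorem. It is not our exact notebook code: the function names `train` and `predict`, the toy notes, and the use of add-one (Laplace) smoothing with a word-independence ("naive") assumption are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """Count labels and per-label words from (tokens, label) pairs."""
    label_counts = Counter()            # number of notes per specialization
    word_counts = defaultdict(Counter)  # word frequencies per specialization
    vocab = set()                       # every word seen during training
    for tokens, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab

def predict(tokens, model):
    """Return the label with the highest log posterior for the given tokens."""
    label_counts, word_counts, vocab = model
    total_notes = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_notes in label_counts.items():
        # log P(label): the prior from how often each specialization appears
        score = math.log(n_notes / total_notes)
        # Add-one smoothing (an assumption here) avoids zero probabilities
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

if __name__ == "__main__":
    # Hypothetical toy data, not taken from the datathon dataset
    toy_notes = [
        (["chest", "pain", "ecg"], "cardiology"),
        (["knee", "fracture", "cast"], "orthopedics"),
        (["heart", "murmur", "ecg"], "cardiology"),
    ]
    model = train(toy_notes)
    print(predict(["ecg", "heart"], model))
```

Working in log space keeps the product of many small word probabilities from underflowing on longer notes.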

We used spaCy to remove stop words and to lemmatize the remaining words, so that different forms of the same word are grouped together.
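A preprocessing step along these lines might look like the sketch below. The helper names `lemmas` and `preprocess`, the choice of the `en_core_web_sm` model, and the decision to keep only alphabetic tokens are assumptions for illustration; the token attributes `is_alpha`, `is_stop`, and `lemma_` are standard spaCy.

```python
def lemmas(doc):
    """Return lowercase lemmas of the alphabetic, non-stop-word tokens in doc."""
    return [tok.lemma_.lower() for tok in doc if tok.is_alpha and not tok.is_stop]

def preprocess(nlp, text):
    """Run a loaded spaCy pipeline over text and return its cleaned lemmas."""
    return lemmas(nlp(text))

if __name__ == "__main__":
    import spacy

    # Assumes the model was installed with:
    #   python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")
    print(preprocess(nlp, "The patient was complaining of chest pains."))
```

Loading the pipeline once and passing it in avoids reloading the model for every note.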

Challenges Faced

We were all busy with midterms during the first half of this datathon. We weren't on campus for the remainder of the time, which complicated communication.

Overall Outcomes

Despite our limited experience in data science, we achieved an F-score of 80%, which was pretty good for our first datathon. There are definitely some optimizations that could be made, but we were quite happy with this result.

Built With

jupyter
python
spacy

Updates

Max Huang started this project — Feb 24, 2023 08:50 PM EST
