Technique
We estimated how likely each word is to appear in doctors' notes from each specialization, then applied Bayes' theorem to a new note to predict which specialization it belongs to.
We used spaCy to remove stop words and to lemmatize the remaining words, so that different forms of the same word are counted together.
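To make the idea concrete, here is a minimal sketch of a word-probability classifier built on Bayes' theorem. It is not our exact notebook code: the function names, the toy notes, the add-one smoothing, and the use of log probabilities are all assumptions made for illustration, and the tokens are assumed to have already been stop-word filtered and lemmatized (for example with spaCy).

```python
import math
from collections import Counter, defaultdict


def train(notes):
    """notes: list of (tokens, category) pairs; tokens are assumed pre-lemmatized."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    category_counts = Counter()         # category -> number of notes
    vocabulary = set()
    for tokens, category in notes:
        category_counts[category] += 1
        word_counts[category].update(tokens)
        vocabulary.update(tokens)
    return word_counts, category_counts, vocabulary


def predict(tokens, word_counts, category_counts, vocabulary):
    """Return the category with the highest posterior score for the tokens."""
    total_notes = sum(category_counts.values())
    best_category, best_score = None, -math.inf
    for category, n_notes in category_counts.items():
        # log P(category) + sum of log P(word | category)
        score = math.log(n_notes / total_notes)
        total_words = sum(word_counts[category].values())
        for token in tokens:
            if token not in vocabulary:
                continue  # ignore words never seen during training
            count = word_counts[category][token]
            # Add-one smoothing (an assumption) avoids zero probabilities.
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Toy, made-up notes for illustration only (not from the datathon data).
training_notes = [
    (["chest", "pain", "ecg", "heart"], "cardiology"),
    (["heart", "murmur", "ecg"], "cardiology"),
    (["rash", "itch", "skin"], "dermatology"),
    (["skin", "lesion", "biopsy"], "dermatology"),
]
model = train(training_notes)
print(predict(["heart", "ecg"], *model))
```

Working with sums of logs instead of products of probabilities keeps long notes from underflowing to zero, which is why the sketch scores categories that way.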
Challenges Faced
We were all busy with midterms during the first half of this datathon. We weren't on campus for the remainder of the time, which complicated communication.
Overall Outcomes
Despite our limited experience in data science, we achieved an F-score of 80%, which was pretty good for our first datathon. There are definitely some optimizations that could still be made, but we were quite happy with this result.
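For readers unfamiliar with the metric, the sketch below shows how an F-score can be computed for a single category as the harmonic mean of precision and recall. The function name and the example labels are hypothetical, and this does not reproduce how the datathon score was calculated or averaged across categories.

```python
def f1_score(true_labels, predicted_labels, positive):
    """F1 for one category: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no correct positive predictions
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only.
true_labels = ["cardiology", "cardiology", "dermatology", "dermatology", "cardiology"]
predicted = ["cardiology", "dermatology", "dermatology", "cardiology", "cardiology"]
score = f1_score(true_labels, predicted, positive="cardiology")
```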
Built With
- jupyter
- python
- spacy