Inspiration
We wanted to offer companies useful analysis of their brands and products, which would foster healthy competition. We also wanted to help researchers and data analysts better understand trends in the world.
What it does
The code collects tweets that contain certain keywords and shows how many of those tweets were positive and how many were negative. It can do this for two sets of tweets at once, which allows for easy comparison.
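To make the comparison concrete, here is a minimal sketch of how positive and negative counts for two keyword sets could be summarized side by side. It assumes each tweet has already been labeled "positive" or "negative"; the function names `summarize` and `compare` and the sample labels are hypothetical and not taken from the project.

```python
from collections import Counter


def summarize(labels):
    """Count 'positive' and 'negative' labels and compute their shares."""
    counts = Counter(labels)
    positive = counts["positive"]
    negative = counts["negative"]
    total = positive + negative
    if total == 0:
        # No labeled tweets: avoid dividing by zero.
        return {"positive": 0, "negative": 0,
                "positive_pct": 0.0, "negative_pct": 0.0}
    return {
        "positive": positive,
        "negative": negative,
        "positive_pct": 100.0 * positive / total,
        "negative_pct": 100.0 * negative / total,
    }


def compare(labels_a, labels_b):
    """Summarize two sets of labeled tweets for side-by-side display."""
    return {"a": summarize(labels_a), "b": summarize(labels_b)}


# Hypothetical labels for two keyword searches.
brand_a = ["positive", "positive", "negative", "positive"]
brand_b = ["negative", "negative", "positive", "negative"]
report = compare(brand_a, brand_b)
print(report["a"]["positive_pct"], report["b"]["positive_pct"])
```

A front end could render the two summaries as paired bars or percentages.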
How we built it
We used Tweepy, a Python library, to collect tweets through the Twitter API, and NLTK to train a Naive Bayes classifier. Flask handled communication between the back end and the front end. The front end was built with HTML, CSS, jQuery, and Bootstrap. The site was hosted on Heroku.
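The project relied on NLTK for its Naive Bayes classifier. As an illustration of the underlying technique only, the sketch below is a small stand-in written in plain Python with add-one (Laplace) smoothing. The class name `TinyNaiveBayes` and the training sentences are hypothetical, and this is not the project's actual code or training data.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Bag-of-words Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.label_counts = Counter()            # training texts per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()                       # every word seen in training

    def train(self, examples):
        """examples: iterable of (text, label) pairs."""
        for text, label in examples:
            self.label_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def classify(self, text):
        """Return the label with the highest log-probability score."""
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            # Log prior for the label.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                # Smoothed log likelihood of the word given the label.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical hand-labeled training tweets.
training = [
    ("i love this phone", "positive"),
    ("great product and great service", "positive"),
    ("i hate this phone", "negative"),
    ("terrible service and awful product", "negative"),
]
model = TinyNaiveBayes()
model.train(training)
print(model.classify("love this great product"))
print(model.classify("awful phone i hate"))
```

A real classifier would need far more training data and proper tokenization of tweets (hashtags, mentions, emoji), which this sketch skips.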
Challenges we ran into
We ran into issues working with the Twitter API and with communication between the back end and the front end.
Accomplishments that we're proud of
We are proud of the accuracy of the data and of the fact that we lived up to our ambition and delivered a working project.
What we learned
We learned a great deal about libraries specific to Twitter and working with the Twitter API. This was also the first time we used technologies such as Heroku and Flask.
What's next for Twitter Keyword Analysis
We would like to improve the Naive Bayes classifier, which currently ignores the order of words. We would also like to make the site more dynamic in the way new keywords can be added. It would be beneficial for companies if we could collect data from other social media platforms and offer hourly analysis of the most popular brands and trends.
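The word-order limitation can be seen in a small sketch. With single-word features, two tweets containing the same words in a different order look identical to the classifier. Adding word-pair (bigram) features is one possible way to capture some order, though not necessarily the approach we would take. The helper names and feature-dictionary format below are assumptions for illustration.

```python
def unigram_features(text):
    """One feature per distinct word; word order is lost."""
    return {f"has({word})": True for word in text.lower().split()}


def bigram_features(text):
    """One feature per adjacent word pair; keeps local word order."""
    words = text.lower().split()
    return {f"has({a} {b})": True for a, b in zip(words, words[1:])}


# Two hypothetical tweets with the same words and opposite meanings.
tweet_a = "not good actually bad"
tweet_b = "not bad actually good"

# Identical single-word features: the classifier cannot tell them apart.
print(unigram_features(tweet_a) == unigram_features(tweet_b))

# Different bigram features: "not good" versus "not bad" is now visible.
print(bigram_features(tweet_a) == bigram_features(tweet_b))
```

Bigrams increase the number of features considerably, so they would likely require more training data to be useful.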