A Comprehensive Data Analysis on a WhatsApp Group Chat
- Have a look at a detailed Medium Article for this project!
- WhatsApp Chat Data Analysis - Notebook on GitHub
If the Notebook fails to load:
- Check the complete code on this basic HTML page.
- WhatsApp Chat Data Analysis - Notebook on Jovian
Overview
- Introduction
- Data Retrieval & Preprocessing
- Exploratory Data Analysis
- Overall frequency of total messages on the group.
- Top 10 most active days.
- Top 10 active users on the group (with a twist)
- Ghosts present in the group. (shocking results.)
- Top 10 users who sent the most media.
- Top 10 most used emojis.
- Most active hours and days.
- Heatmaps of weekdays and months.
- Most active hours, weekdays, and months.
- Most used words - WordCloud
Introduction:
WhatsApp has quickly become the world’s most popular text and voice messaging application. It specializes in cross-platform messaging and has over 1.5 billion monthly active users, which makes it the most popular mobile messenger app worldwide.
I considered several projects for analysing data, such as the Air Quality Index or the clichéd Covid-19 data analysis.
But then I thought: why not analyse a WhatsApp group chat of college students and find out who is most active, who the ghosts are (the ones who do not reply), what my sleep schedule looks like, which emoji is used most, which times of the day are most active, and whether the group uses phones during college teaching hours?
These would be some interesting insights for sure, more for me than for you, since the people in this chat are people I know personally.
Exploratory Data Analysis
Importing Necessary Libraries
We will be using:
- Regex (re) to extract and manipulate strings based on specific patterns.
- pandas for analysis.
- matplotlib and seaborn for visualization.
- emoji to deal with emojis.
- wordcloud for the most used words.
- datetime for datetime manipulation.
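To give a feel for how re and datetime fit together, here is a minimal sketch that parses a few chat lines into records. It is not the notebook's actual code: the line format, the LINE_PATTERN regex, the parse_line helper, and the sample messages are all assumptions made for illustration, and real WhatsApp exports differ by device and locale.

```python
import re
from datetime import datetime

# Assumed export format (hypothetical; real exports vary by device and locale):
#   "12/31/20, 9:15 PM - Alice: Happy new year!"
LINE_PATTERN = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{2}), "
    r"(?P<time>\d{1,2}:\d{2} [AP]M) - "
    r"(?P<user>[^:]+): "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict for one chat line, or None if the line does not match."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None  # e.g. system notices or continuation lines
    timestamp = datetime.strptime(
        f"{match['date']} {match['time']}", "%m/%d/%y %I:%M %p"
    )
    return {
        "timestamp": timestamp,
        "user": match["user"],
        "message": match["message"],
    }


# Made-up sample lines, not taken from the real chat.
sample = [
    "12/31/20, 9:15 PM - Alice: Happy new year!",
    "1/1/21, 12:05 AM - Bob: Same to you",
    "Messages and calls are end-to-end encrypted.",
]

# Keep only the lines that look like user messages.
records = [r for r in map(parse_line, sample) if r is not None]
```

A list of dicts like records can be passed straight to pandas.DataFrame, which is where the rest of the analysis would pick up.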