Inspiration

Our inspiration for creating the TimeTrek extension came from a common problem in how people use the Internet. In today's digital world, we spend a significant amount of time browsing the web, but we are often unaware of how that time is spent. After a quick series of interviews, we concluded that almost all of the students we spoke with get distracted by random, non-informative pieces of text every 20 minutes. That's why we decided to create a tool that helps users, especially students, track the usefulness of their web activity. The extension measures the productivity of their daily online activity and recommends healthier ways to surf the web. This encourages users to focus on studying and saves their energy and time.

What it does

As you browse the web, the extension scrapes the text content of the web pages you visit. It employs natural language processing and machine learning techniques to determine whether the content is productive (e.g., educational articles, research papers) or non-productive (e.g., social media posts, clickbait articles). At the end of the day or on demand, you can access a statistics dashboard. It offers a summary of your web browsing habits, showcasing the percentage of time spent on productive versus non-productive content. Additionally, it may provide insights into your most frequently visited categories of websites.
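The dashboard arithmetic described above can be sketched roughly as follows. This is a minimal illustration rather than the extension's actual code: the `(label, seconds)` record format and the `summarize` function name are assumptions made for the example.

```python
from collections import defaultdict


def summarize(visits):
    """Return the percentage of total time spent per label.

    `visits` is an iterable of (label, seconds) pairs. This record
    format is a hypothetical one chosen for the sketch.
    """
    totals = defaultdict(float)
    for label, seconds in visits:
        totals[label] += seconds

    grand_total = sum(totals.values())
    if grand_total == 0:
        # No browsing time recorded yet, so there is nothing to report.
        return {}

    return {
        label: round(100 * seconds / grand_total, 1)
        for label, seconds in totals.items()
    }


# Example: 30 minutes on productive pages, 10 minutes on non-productive ones.
visits = [
    ("productive", 1200),
    ("non-productive", 600),
    ("productive", 600),
]
summary = summarize(visits)
```

Grouping by label first keeps the function usable if more categories are added later, since nothing in it is tied to exactly two labels.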

How we built it

We started by sampling a custom dataset that served as the foundation for training our model. This dataset combines various online sources, carefully selected to represent a diverse range of web content, such as scientific articles, Reddit jokes, and various prelabeled websites. To classify the text content of web pages as productive or non-productive, we used a pretrained GloVe model. To fine-tune this model, we used the scikit-learn library to vectorize our dataset into word embeddings. We developed the Chrome extension using web technologies: HTML, CSS, and JavaScript. We used the Chart.js library to plot the productivity graph in the extension. We also store all of the data in local storage so that we can use it to provide the user with statistics later.
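One common way to turn page text into a fixed-length feature vector from word embeddings is to average the vectors of the words it contains. The write-up does not say exactly how the text was vectorized, so the sketch below is an assumption, and the tiny `TOY_EMBEDDINGS` table merely stands in for real pretrained GloVe vectors, which have far more dimensions and words.

```python
import re

# Hypothetical 3-dimensional vectors standing in for real GloVe embeddings.
TOY_EMBEDDINGS = {
    "research": [0.9, 0.1, 0.0],
    "paper": [0.8, 0.2, 0.0],
    "meme": [0.0, 0.1, 0.9],
    "funny": [0.1, 0.0, 0.8],
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def embed_text(text, embeddings, dim=3):
    """Average the vectors of all known words in `text`.

    Words missing from `embeddings` are skipped. If no word is known,
    a zero vector of length `dim` is returned.
    """
    vectors = [embeddings[token] for token in tokenize(text) if token in embeddings]
    if not vectors:
        return [0.0] * dim
    # zip(*vectors) groups the values by dimension so each can be averaged.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


doc_vector = embed_text("A funny meme", TOY_EMBEDDINGS)
```

Vectors produced this way could then be passed to a scikit-learn classifier such as `LogisticRegression` together with the productive and non-productive labels; which classifier the team actually used is not stated here.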

Challenges we ran into

One of the primary challenges was the absence of a readily available dataset with text labeled as productive or non-productive. Creating our custom dataset required extensive effort in data collection, labeling, and cleaning. Developing a Chrome extension that interacts with web pages was also challenging because of the many restrictions the browser imposes. In particular, Chrome's policies made it difficult to establish a connection between two of the extension's files. Deciding which ML algorithms and NLP techniques to use for training the model was another challenge. The variety of options available for model architectures, feature extraction, and classification methods led to discussions and debates within the team.

Accomplishments that we're proud of

We successfully assembled our main custom dataset from existing Kaggle datasets. This dataset serves as the resource for training and fine-tuning our model, despite the initial challenge of finding labeled data. Our team made significant progress in fine-tuning the model. This involved adapting a pretrained GloVe model and customizing it for our specific task, resulting in an accuracy of 0.994 on the test set of our collected dataset. We developed a statistics dashboard that offers users insights into their daily web browsing habits, including the percentage of time spent on productive versus non-productive content. This feature helps users adjust their Internet activity toward a more productive lifestyle.

What we learned

We gained practical experience in web development, machine learning, and NLP techniques, including working with Chrome extensions, handling datasets, fine-tuning models, and integrating AI into real-world applications. We learned to overcome technical challenges, make decisions under pressure, and find creative solutions to unexpected issues. Collaboration and communication within our team were essential. We learned the importance of dividing tasks, leveraging each team member's strengths, and working cohesively towards a common goal. The hackathon's tight deadline taught us to manage our time effectively, prioritize tasks, and make efficient use of the limited time available. Creating and working with a custom dataset provided insights into data collection, cleaning, labeling, and the importance of high-quality data for machine learning projects. We had the opportunity to connect with peers, mentors, and industry professionals during the hackathon, expanding our network and potentially opening doors for future collaborations. Participating in our first hackathon fostered a growth mindset, encouraging us to embrace challenges, learn from failures, and continuously improve our skills.

What's next for TimeTrek

We are going to allow users to customize the criteria for what they consider productive or non-productive content based on, for instance, their major, interests, or hobbies. This customization can make the tool even more personalized and effective. We will enhance the statistics dashboard to provide deeper insights into users' browsing habits: it will analyze trends over time, offer suggestions for improvement, and help users set goals for productive online behavior. We are going to continue to fine-tune and update the machine learning model with a larger and more diverse dataset, and we will explore advanced NLP techniques to improve content classification accuracy. We will collaborate with educational institutions to promote the tool among students as a means to enhance online learning and productivity. We also plan to partner with researchers in the field of digital well-being and productivity to contribute to academic studies and explore new insights into online behavior. Finally, we will develop a sustainable monetization strategy, which could include a premium subscription with more personalized suggestions and more insightful data. Premium plan users will get easy-to-understand, useful statistics and a better NLP model that can classify content into more categories, such as life science, gaming, and malicious content. These advantages should make the premium subscription an appealing choice for free plan users, whether students or adults.
