Focii: An Anti-Procrastination Chrome Extension
Focii utilizes a custom machine-learning algorithm (no OpenAI) to determine whether a website is related to someone's current study keywords in order to block unrelated and distracting websites.
Inspiration
As with any project, we wanted to make a product that we could use in our day-to-day lives. As avid procrastinators, we thought an anti-procrastination Chrome extension would be extremely helpful for our study habits.
What it does
Our extension takes in a list of keywords from the user that describe what they are studying. For example, if someone were studying Vector Calculus, some keywords might be "vector calculus, vectors, calculus, curves, parametric, dot product, cross product". Every website the user visits is compared to these keywords using our custom machine learning algorithm and is blocked if it is unrelated to them, so the user can only visit websites that would help them study.
How we built it
Frontend
We built the front end in vanilla JavaScript, HTML, and CSS. To scrape the website keywords, we obtain all of the text on the website, filter out basic keywords that are unrelated to the meaning of the website, and send the result to Summary.js, which filters out conjunctions and meaningless words and outputs a list of keywords that describe the overall content of the website. We then send the user-defined study keywords and the website keywords to the backend, which decides whether or not to block the website.
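To make the keyword-filtering step concrete, here is a minimal sketch of the idea. It is written in Python only for consistency with the backend; the extension itself does this step in JavaScript with Summary.js. The stop-word list, the `extract_keywords` name, and the frequency-based ranking are all assumptions for illustration, not how Summary.js works internally.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {
    "the", "and", "for", "but", "nor", "yet", "with", "that", "this",
    "from", "are", "was", "were", "has", "have", "not", "you", "your",
}


def extract_keywords(page_text, top_n=10):
    """Return up to top_n of the most frequent non-stop-words in page_text."""
    words = re.findall(r"[a-z]+", page_text.lower())
    # Drop stop words and very short tokens, which carry little meaning.
    meaningful = [w for w in words if len(w) > 2 and w not in STOP_WORDS]
    return [word for word, _ in Counter(meaningful).most_common(top_n)]
```

Calling `extract_keywords(page_text)` on the scraped text would give the list of website keywords that is sent to the backend alongside the study keywords.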
Backend
Our backend is built entirely in Python. We use the pre-trained sentence-transformers model all-mpnet-base-v2 to embed the keywords from the user and from the website, so that we can compare the two semantically and determine whether they are similar.
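As a rough sketch of the embedding step, the snippet below loads the model by name and encodes a list of keywords. The helper names `load_model` and `embed_keywords` are hypothetical, and the whitespace cleanup is an assumption; only `SentenceTransformer(...)` and its `encode` method come from the sentence-transformers library.

```python
def load_model(name="all-mpnet-base-v2"):
    """Load the pre-trained model (downloads the weights on first use)."""
    # Imported lazily so the rest of this sketch works without the library.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(name)


def embed_keywords(keywords, model):
    """Return one embedding per non-empty keyword or phrase."""
    # Assumption: strip stray whitespace and skip empty entries before encoding.
    cleaned = [k.strip() for k in keywords if k.strip()]
    return model.encode(cleaned)


# Example usage (commented out because it downloads the model):
# model = load_model()
# study_vectors = embed_keywords(["vector calculus", "dot product"], model)
# site_vectors = embed_keywords(["gradient", "line integral"], model)
```

With all-mpnet-base-v2, each keyword becomes a 768-dimensional vector, so `study_vectors` above would have one 768-dimensional row per keyword.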
We started by simply comparing the word embeddings with cosine similarity and blocking a website when the similarity fell below some threshold value, but we found that this alone was not accurate enough for our classification task. Instead, for each list of keywords we average all of the word embeddings to obtain a single embedding representing the keywords, multiply it by a weight, and add a constant error term (a bias). We then compare the two transformed embeddings using cosine similarity. Because the averaged embeddings pass through a linear equation, we can optimize the weight and error term to minimize our classification error.
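The average-then-transform comparison can be sketched in a few lines. This is an illustration under assumptions: the weight and error term are treated as single scalars applied to every component, the function names are hypothetical, and the two-dimensional vectors are toy stand-ins for real embeddings.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length vectors component by component."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def affine(vector, weight, bias):
    """Apply weight * x + bias to every component (scalar weight and bias assumed)."""
    return [weight * x + bias for x in vector]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_block(study_vectors, site_vectors, weight, bias, threshold):
    """Block when the transformed mean embeddings are less similar than threshold."""
    study = affine(mean_vector(study_vectors), weight, bias)
    site = affine(mean_vector(site_vectors), weight, bias)
    return cosine_similarity(study, site) < threshold


# Toy 2-D "embeddings" standing in for real model output.
study = [[1.0, 0.0], [0.8, 0.2]]
related_site = [[0.9, 0.1]]
unrelated_site = [[0.0, 1.0]]
```

With `weight=1.0` and `bias=0.0` the transform is the identity, so this reduces to plain cosine similarity on the averaged embeddings. A nonzero bias shifts both vectors in the same direction, which changes how spread out the similarity scores are, and that is what gives the optimizer something to tune.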
We collected labeled training data so that we could use supervised learning to optimize both our blocking threshold and the parameters of the linear transformation. We use SciPy's Nelder-Mead optimization method (since we didn't have access to gradients), with the blocking classification error as the objective function to minimize. We found that by averaging the word embeddings and optimizing the weight, error term, and blocking threshold on our training data, we were able to reduce our classification error by 75%.
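The optimization step might look roughly like the sketch below, which assumes NumPy and SciPy are installed. The four labeled pairs are made-up toy data, the scalar weight and error term are an assumption, and the starting point `x0` is arbitrary; only `scipy.optimize.minimize` with `method="Nelder-Mead"` reflects what the write-up describes.

```python
import numpy as np
from scipy.optimize import minimize


def transformed_similarity(u, v, weight, bias):
    """Cosine similarity after applying weight * x + bias to both vectors."""
    a = weight * u + bias
    b = weight * v + bias
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def classification_error(params, pairs, labels):
    """Fraction of pairs whose predicted label disagrees with the true label."""
    weight, bias, threshold = params
    wrong = 0
    for (u, v), related in zip(pairs, labels):
        predicted_related = transformed_similarity(u, v, weight, bias) >= threshold
        wrong += predicted_related != related
    return wrong / len(labels)


# Toy training set: (study mean embedding, site mean embedding) and a label
# saying whether the site is related (True = allow, False = block).
pairs = [
    (np.array([1.0, 0.2, 0.0]), np.array([0.9, 0.3, 0.1])),
    (np.array([1.0, 0.2, 0.0]), np.array([0.0, 0.1, 1.0])),
    (np.array([0.0, 1.0, 0.3]), np.array([0.1, 0.9, 0.2])),
    (np.array([0.0, 1.0, 0.3]), np.array([1.0, 0.0, 0.1])),
]
labels = [True, False, True, False]

# Arbitrary starting guess for (weight, bias, threshold).
x0 = np.array([1.0, 0.1, 0.5])
result = minimize(classification_error, x0, args=(pairs, labels), method="Nelder-Mead")
weight, bias, threshold = result.x
```

One caveat with this kind of objective: the misclassification rate is piecewise constant in its parameters, so a simplex method such as Nelder-Mead can stall on a flat region, and the choice of starting point can matter.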
Challenges we ran into
All of us had almost zero experience with JavaScript, which was a big hurdle. We started off wanting to do everything in JavaScript, but we eventually realized we lacked the expertise to effectively classify whether websites should be blocked, so we switched our backend to Python, where machine learning techniques were also easier to use.
Accomplishments that we're proud of
Our blocking algorithm is highly effective in classifying whether or not a website should be blocked based on the website content and the user-defined keywords. Since the back end just takes in two lists of keywords/phrases and compares them for similarity, we could generalize our extension for a whole host of content-filtering applications including spam, hate speech, spoilers (our personal favorite), and censorship.
We are also proud of the fact that we have a usable Chrome extension!
What we learned
We learned how to link a front end and a back end to create a full-stack application! We also learned that managing permissions, states, and scopes in JavaScript is super hard.
What's next for Focii
- Personalized content-filtering (continuous learning based on users' website activity and user feedback to determine optimal parameters for a user's studying and browsing habits)
- Keep everything on the front end so there is nothing stored on a server
- Add pomodoro timer
Built With
- chrome
- chromewebapi
- css
- html
- javascript
- machine-learning
- python
- pytorch
- rest
- restapi
- scipy
- summary.js
- transformers


