Inspiration
What it does
iCursor is a Chrome extension that moves your cursor to where you are looking in the browser. It combines gaze prediction on live webcam footage with saliency mapping (a heatmap of points of interest on the page).
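As a rough illustration of how a gaze estimate and a saliency map might work together, here is a minimal sketch that snaps a noisy gaze point to the most salient nearby cell. This is an assumption about one plausible way to combine the two signals, not the project's actual method; `Point`, `snapToSaliency`, the grid layout, and the `radius` parameter are all hypothetical.

```typescript
// Hypothetical types and names: none of this is taken from the iCursor codebase.
type Point = { x: number; y: number };

/**
 * Snap a noisy gaze estimate to the most salient cell within `radius` cells.
 * `saliency` is a row-major grid (saliency[y][x]); higher means more salient.
 * Coordinates are grid cells, not screen pixels.
 */
function snapToSaliency(gaze: Point, saliency: number[][], radius: number): Point {
  const rows = saliency.length;
  const cols = rows > 0 ? saliency[0].length : 0;
  if (rows === 0 || cols === 0) return gaze;

  // Clamp the gaze estimate into the grid so lookups are always valid.
  const gx = Math.min(cols - 1, Math.max(0, Math.round(gaze.x)));
  const gy = Math.min(rows - 1, Math.max(0, Math.round(gaze.y)));

  let best: Point = { x: gx, y: gy };
  let bestScore = saliency[gy][gx];

  // Scan the bounding box of the search circle, keeping only cells inside it.
  for (let y = Math.max(0, gy - radius); y <= Math.min(rows - 1, gy + radius); y++) {
    for (let x = Math.max(0, gx - radius); x <= Math.min(cols - 1, gx + radius); x++) {
      const inCircle = (x - gx) ** 2 + (y - gy) ** 2 <= radius ** 2;
      if (inCircle && saliency[y][x] > bestScore) {
        bestScore = saliency[y][x];
        best = { x, y };
      }
    }
  }
  return best;
}
```

A small radius keeps the cursor close to the raw gaze estimate, while a larger one leans more heavily on the saliency map.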
How we built it
- MediaPipe for expressions and facial feature positioning / landmarks.
- TranSalNet for saliency prediction (points of interest on an active page).
- Python and TensorBoard, plus exploration of many other models, for gaze prediction.
- TypeScript, React, Next.js, Node, and Chakra for building the frontend.
- Lots of Chrome Extension wrangling...
Challenges we ran into
- Gaze prediction with a laptop webcam.
- Chrome extensions: your code can be executing in several different contexts, and it's not always clear which context a given piece of code is running in. This gets particularly frustrating when you're expecting a JS module to be in the environment but it's just Not There.
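One way to make the "which context am I in?" question less mysterious is to log a guess at startup. The sketch below is a heuristic under stated assumptions, not part of iCursor: it relies on service workers having no `document`, on extension pages (such as the pop-up) being served from a `chrome-extension:` URL, and on `chrome.runtime.id` being defined for extension code. The names `GlobalsProbe`, `classifyContext`, and `probeGlobals` are hypothetical, and special cases such as offscreen documents are not handled.

```typescript
// Hypothetical helper names; a heuristic sketch, not the project's code.
type ExtensionContext = "service-worker" | "extension-page" | "content-script" | "web-page";

interface GlobalsProbe {
  hasDocument: boolean;     // false inside an extension service worker
  protocol: string | null;  // location.protocol, e.g. "chrome-extension:"
  hasRuntimeId: boolean;    // chrome.runtime.id is defined for extension code
}

// Pure classification, so it can be reasoned about (and tested) outside a browser.
function classifyContext(p: GlobalsProbe): ExtensionContext {
  if (!p.hasRuntimeId) return "web-page";
  if (!p.hasDocument) return "service-worker";
  if (p.protocol === "chrome-extension:") return "extension-page";
  return "content-script";
}

// Reads the real globals defensively; every lookup tolerates a missing object.
function probeGlobals(): GlobalsProbe {
  const g = globalThis as any;
  return {
    hasDocument: typeof g.document !== "undefined",
    protocol: g.location?.protocol ?? null,
    hasRuntimeId: typeof g.chrome?.runtime?.id === "string",
  };
}
```

Calling `console.log(classifyContext(probeGlobals()))` at the top of each entry point gives a quick hint about where a missing module is actually being looked up.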
Accomplishments that we're proud of
Our initial ambition was to tackle gaze tracking, which is inherently complex. We exhausted the existing solutions and found them lacking, so we decided to develop our own set of models for the task. As we spent more time on this, it became clear that, given the time constraints, we would fall short before the deadline. Nonetheless, we gave it our best shot.
What we learned
We learned to work with Chrome extensions, building the functionality that lets the pop-up communicate with the services that control the tabs. We also built logic for gaze calibration and detection within our models, and ran into plenty of hurdles along the way.
- Learned how to deploy a machine learning model to a strange environment (a Chrome extension!)
- Learned about WebGazer.
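The pop-up to service communication described above could look roughly like the sketch below. The message names (`START_TRACKING`, `STOP_TRACKING`, `GET_STATUS`), the `TrackerState` shape, and `handlePopupMessage` are hypothetical, since the write-up does not show the real protocol. The only extension API assumed is `chrome.runtime.onMessage.addListener`, and the wiring is skipped when `chrome` is not available.

```typescript
// Hypothetical message shapes; the real extension's protocol is not shown in the write-up.
type PopupMessage =
  | { type: "START_TRACKING" }
  | { type: "STOP_TRACKING" }
  | { type: "GET_STATUS" };

interface TrackerState { tracking: boolean }
interface Reply { ok: boolean; tracking: boolean }

// Pure handler: takes the current state and a message, returns the next state and a reply.
function handlePopupMessage(
  state: TrackerState,
  msg: PopupMessage
): { state: TrackerState; reply: Reply } {
  switch (msg.type) {
    case "START_TRACKING":
      return { state: { tracking: true }, reply: { ok: true, tracking: true } };
    case "STOP_TRACKING":
      return { state: { tracking: false }, reply: { ok: true, tracking: false } };
    case "GET_STATUS":
      return { state, reply: { ok: true, tracking: state.tracking } };
  }
}

// Service worker wiring. Guarded so the file also loads outside an extension.
const chromeApi = (globalThis as any).chrome;
if (chromeApi?.runtime?.onMessage) {
  // Note: a service worker can be shut down when idle, so in-memory state like
  // this is lost; a real extension would likely persist it somewhere.
  let state: TrackerState = { tracking: false };
  chromeApi.runtime.onMessage.addListener(
    (msg: PopupMessage, _sender: unknown, sendResponse: (r: Reply) => void) => {
      const result = handlePopupMessage(state, msg);
      state = result.state;
      sendResponse(result.reply); // replying synchronously keeps the channel simple
    }
  );
}
```

From the pop-up, a call such as `chrome.runtime.sendMessage({ type: "START_TRACKING" })` would then reach this listener. Keeping the handler pure means the logic can be exercised without loading the extension at all.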
What's next for iCursor
- Spending more time on training
- Fine-tuning the saliency model for web/computer content
- Finer UX controls and calibration
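For the calibration item, one simple baseline worth trying is a per-axis linear fit: ask the user to look at a few known screen positions, then fit `screen = scale * raw + offset` by least squares. This is a sketch under that assumption, not the calibration logic we built; `Sample`, `AxisFit`, `fitAxis`, and `applyAxis` are hypothetical names.

```typescript
// Hypothetical names; a baseline sketch, not the project's calibration code.
type Sample = { raw: number; target: number }; // raw model output vs. known screen coordinate
interface AxisFit { scale: number; offset: number }

// Ordinary least squares for one axis: target ≈ scale * raw + offset.
function fitAxis(samples: Sample[]): AxisFit {
  const n = samples.length;
  if (n < 2) throw new Error("need at least two calibration samples");

  const meanRaw = samples.reduce((acc, s) => acc + s.raw, 0) / n;
  const meanTarget = samples.reduce((acc, s) => acc + s.target, 0) / n;

  let covariance = 0;
  let variance = 0;
  for (const s of samples) {
    covariance += (s.raw - meanRaw) * (s.target - meanTarget);
    variance += (s.raw - meanRaw) ** 2;
  }
  if (variance === 0) throw new Error("calibration samples must not all share the same raw value");

  const scale = covariance / variance;
  return { scale, offset: meanTarget - scale * meanRaw };
}

// Map a new raw prediction to a screen coordinate using the fitted parameters.
function applyAxis(fit: AxisFit, raw: number): number {
  return fit.scale * raw + fit.offset;
}
```

Running `fitAxis` once for x and once for y gives a pair of fits that can be applied to each new prediction. A linear map will not capture head movement or lens distortion, so it is only a starting point.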
Citations
@inproceedings{papoutsaki2016webgazer,
author = {Alexandra Papoutsaki and Patsorn Sangkloy and James Laskey and Nediyana Daskalova and Jeff Huang and James Hays},
title = {WebGazer: Scalable Webcam Eye Tracking Using User Interactions},
booktitle = {Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI)},
pages = {3839--3845},
year = {2016},
organization = {AAAI}
}



