Inspiration

At the start of this hackathon, a conversation with a mentor about ADA regulations led me to think about where inaccessibility hurts people the most. Healthcare kept coming to mind. Patients leave hospitals with dense, detail-heavy documents that they can hardly comprehend. I wanted to build something that breaks down this barrier and increases accessibility. I believe patients do not need more information. They need a better understanding of the information they already have.

What it does

Navis lets you upload a medical document and have a live voice conversation with an AI about it. You can ask questions, get plain-English explanations, and navigate the document by voice. When the AI mentions page 7, the viewer scrolls there. Speech-impaired users can use ASL instead of voice. Everything personal is stripped out before anything reaches the AI. Other features include downloading tailored documents, such as a medication list, plus a calendar event for every medication (so users never miss a dose!). Users can also look up nearby pharmacies.
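The medication calendar feature can be sketched as a small iCalendar (RFC 5545) generator. The property names follow the RFC, but the `med` object shape, the UID domain, and the daily-recurrence choice are illustrative assumptions, not Navis's actual data model:

```javascript
// Minimal sketch of building an iCalendar (RFC 5545) event for a daily
// medication reminder. The `med` shape ({ name, dose, hour }) and the
// fixed timestamps are assumptions for illustration.
function medicationToIcs(med) {
  const stamp = "20240101T000000Z"; // fixed DTSTAMP keeps the sketch deterministic
  return [
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "PRODID:-//Navis//Medication Reminders//EN",
    "BEGIN:VEVENT",
    `UID:${med.name.toLowerCase().replace(/\s+/g, "-")}@navis.example`,
    `DTSTAMP:${stamp}`,
    `DTSTART:20240101T${String(med.hour).padStart(2, "0")}0000`,
    "RRULE:FREQ=DAILY", // repeat every day so no dose is missed
    `SUMMARY:Take ${med.name} (${med.dose})`,
    "END:VEVENT",
    "END:VCALENDAR",
  ].join("\r\n"); // RFC 5545 requires CRLF line endings
}

const ics = medicationToIcs({ name: "Metformin", dose: "500 mg", hour: 8 });
```

A string like this can be offered as a `.ics` download, which phone and desktop calendar apps import directly.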

How we built it

Gemini Live handles the voice conversation, with the document's anonymized text loaded as context. A transcript parser reads what Gemini says in real time and controls the PDF viewer, handling scrolling and highlighting without any extra API calls. PHI redaction runs entirely in the browser across 13 identifier categories before anything leaves the device. The ASL classifier was trained from scratch during the hackathon using MediaPipe Hands. The tailored-document downloads were built with the help of Groq AI.
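The in-browser redaction step can be sketched as a series of regex passes over the extracted text. The patterns below cover only a few of the identifier categories (SSNs, phone numbers, emails, dates) and the placeholder tokens are my own; they are illustrative, not Navis's production rule set:

```javascript
// Illustrative in-browser PHI redaction: regex passes over extracted text.
// Covers only a handful of identifier categories; the patterns and
// placeholder labels are assumptions, not the real 13-category rule set.
const PHI_PATTERNS = [
  { label: "[SSN]",   re: /\b\d{3}-\d{2}-\d{4}\b/g },
  { label: "[PHONE]", re: /\b\(?\d{3}\)?[-.\s]\d{3}[-.\s]\d{4}\b/g },
  { label: "[EMAIL]", re: /\b[\w.+-]+@[\w-]+\.[\w.]+\b/g },
  { label: "[DATE]",  re: /\b\d{1,2}\/\d{1,2}\/\d{2,4}\b/g },
];

function redactPhi(text) {
  // Replace each match with a category placeholder, so the model still sees
  // the document's structure without the underlying identifier.
  return PHI_PATTERNS.reduce((t, { label, re }) => t.replace(re, label), text);
}
```

Because this runs before any network call, the raw identifiers never leave the device; only the placeholder-substituted text reaches the AI.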

Challenges we ran into

The biggest challenge was getting ASL mode and sign detection working. The best option would have been a complete pre-trained ASL detection model in TensorFlow.js, but that would have been far too heavy for this project. Instead, I trained a model myself using a combination of MediaPipe Hands and TensorFlow.js, which worked wonderfully.
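The preprocessing for such a classifier can be sketched as follows: MediaPipe Hands returns 21 (x, y, z) landmarks per frame, and a common recipe is to re-center them on the wrist and rescale before feeding a small TensorFlow.js dense network. This normalization is an assumed, typical pipeline, not the exact training code:

```javascript
// Sketch of turning MediaPipe Hands output into a classifier feature vector.
// MediaPipe emits 21 landmarks as {x, y, z}. Re-centering on the wrist
// (landmark 0) and dividing by the largest offset makes the features
// invariant to hand position and distance from the camera. The exact
// normalization Navis uses is an assumption here.
function landmarksToFeatures(landmarks) {
  const wrist = landmarks[0];
  // Offsets relative to the wrist landmark.
  const rel = landmarks.map(p => [p.x - wrist.x, p.y - wrist.y, p.z - wrist.z]);
  // Scale by the largest absolute offset to normalize hand size.
  const maxAbs = Math.max(...rel.flat().map(Math.abs), 1e-6);
  return rel.flat().map(v => v / maxAbs); // 63-element feature vector
}
```

A 63-float vector like this is small enough to classify with a tiny dense network in the browser, which is what makes the lightweight from-scratch approach feasible compared with a full pre-trained vision model.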

Accomplishments that we're proud of

The accomplishment I am most proud of is the auto-scroll feature: if Gemini mentions "Page X", the PDF automatically jumps to that page. I implemented this to improve accessibility and make the document interactive. This was truly a pat on the back for me.
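The trigger for this can be sketched as a regex scan over the live transcript. The function name and scroll callback below are hypothetical stand-ins for the real parser:

```javascript
// Sketch of detecting "page N" mentions in a streaming transcript chunk and
// firing a scroll callback for each. The names are hypothetical; the real
// parser runs over Gemini Live's transcript as it arrives.
function watchForPageMentions(transcriptChunk, onPage) {
  const re = /\bpage\s+(\d+)\b/gi; // case-insensitive, all occurrences
  const pages = [];
  let match;
  while ((match = re.exec(transcriptChunk)) !== null) {
    const page = parseInt(match[1], 10);
    pages.push(page);
    onPage(page); // e.g. pdfViewer.scrollToPage(page)
  }
  return pages;
}
```

Because this is pure text matching on output the app already receives, navigation costs no extra API calls.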

What we learned

The biggest learning was about building trust around a service in the medical industry. It is a delicate field where things need to be extremely accurate. That included being honest with users about exactly what happens to their data. Learning to redact personal information from the document so it never flows into the network was very much part of the process.

What's next for Navis

I first want to expand Navis to support multiple documents, then grow toward a proper HIPAA-grade architecture so it can be eligible for real-world use.
