Inspiration

The World Health Organization estimates that at least 2.2 billion people live with some form of vision impairment. For them, the modern web is filled with "information black holes." But the barriers don't stop there: users with reading difficulties such as dyslexia face walls of dense text, while language differences exclude billions more. The result is a fractured, inaccessible digital experience.

Our inspiration was to build a single, intelligent tool to tear down all these walls at once. We asked: how can we use AI to create a unified accessibility layer for the entire internet, ensuring anyone can not just access, but truly understand digital content, regardless of ability or language?

What it Does

Aura Vision is a comprehensive, AI-powered accessibility and productivity suite built as a Google Chrome Extension. It transforms how users interact with web content through a seamless, integrated experience:

  • AI-Powered Reader Mode & Summarization: With a single click, Aura Vision activates its Reader Mode. Using Mozilla's Readability.js, it cleans away ads and clutter. It then sends the article's text to the Gemini API to generate a concise summary, which is translated into the user's preferred language and spoken aloud. This turns a 20-minute read into a 1-minute insight.

  • Advanced Image Description: Hovering over any image gives users an instant, rich description from the Google Gemini API. The system also runs Tesseract.js OCR to read and include any text found within the image, making banners and infographics fully accessible (see the sketch after this list).

  • Real-Time Text Processing: Selecting any text on a page automatically triggers the AI to detect the source language, translate it if necessary, and read the final text aloud in a clear, natural voice.

  • Full User Control: A sleek, modern settings panel allows users to customize their experience, including language, voice speed and pitch, and enabling or disabling automatic features.
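
For the technically curious, here is a minimal sketch of how the content-script side of the hover and selection flows might be wired up. The message names and handler structure are illustrative assumptions, not our actual identifiers:

```javascript
// content.js (sketch) -- message names here are hypothetical.
// Hovering an image: OCR it locally with Tesseract.js, then ask the
// background worker for a Gemini description that includes the OCR text.
document.addEventListener('mouseover', async (event) => {
  const img = event.target;
  if (img.tagName !== 'IMG' || !img.src) return;

  // Tesseract.js can OCR straight from a URL; 'eng' selects the language model.
  const { data: { text: ocrText } } = await Tesseract.recognize(img.src, 'eng');

  chrome.runtime.sendMessage({
    type: 'DESCRIBE_IMAGE',   // hypothetical message type
    imageUrl: img.src,
    ocrText: ocrText.trim(),
  });
});

// Selecting text: hand it to the background worker, which detects the
// language, translates if needed, and reads the result aloud.
document.addEventListener('mouseup', () => {
  const selection = window.getSelection().toString().trim();
  if (selection) {
    chrome.runtime.sendMessage({ type: 'SPEAK_TEXT', text: selection });
  }
});
```

In practice the hover handler would be debounced and results cached per image, since OCR is slow relative to mouse movement.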

Impact and Social Benefit

Aura Vision directly addresses UN Sustainable Development Goal #10 (Reduced Inequalities) by empowering people with disabilities. Our tool provides a tangible solution to digital exclusion, giving users with visual impairments or reading difficulties the autonomy to consume online information. By summarizing and translating content, it also acts as a powerful learning accelerator for students and professionals, promoting equal access to education, healthcare information, and global knowledge.

Uniqueness

While other tools exist, they are fragmented. Aura Vision's innovation lies in its seamless, all-in-one integration:

1 - vs. Traditional Screen Readers: They are blind to images without alt-text. Aura Vision acts as the "eyes" for these tools.

2 - vs. Reader Mode Apps (like Pocket): They clean up articles but lack built-in AI summarization and translation.

3 - vs. Standalone Translation Tools: They require a clunky, manual process of copying and pasting. Aura Vision makes translation an invisible, automatic part of the reading experience.

Our solution combines three separate product categories into one intuitive tool.

How We Built It & Architecture

Aura Vision is built on a modern, robust architecture. A content.js script, augmented with Readability.js and Tesseract.js, handles all on-page interactions. It sends messages to a background.js service worker, which acts as the central orchestrator. For AI tasks, the background script performs secure fetch calls to the Google Gemini 1.5 Flash API. The JSON response is parsed and sent to the native Chrome TTS API. All user settings are managed via the Chrome Storage API and controlled through a custom-built UI in the extension's popup.
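
To make that flow concrete, here is a rough sketch of the background orchestrator. The message shapes carry over from the earlier sketch and are assumptions; the `fetch` endpoint, `chrome.tts`, and `chrome.storage` calls are the standard public APIs:

```javascript
// background.js (service worker, sketch) -- the central orchestrator.
const GEMINI_API_KEY = '...'; // key handling is simplified for this sketch
const GEMINI_URL =
  'https://generativelanguage.googleapis.com/v1beta/models/' +
  `gemini-1.5-flash:generateContent?key=${GEMINI_API_KEY}`;

async function askGemini(promptText) {
  const res = await fetch(GEMINI_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ contents: [{ parts: [{ text: promptText }] }] }),
  });
  const json = await res.json();
  // First candidate's text; real code must handle errors and safety blocks.
  return json.candidates[0].content.parts[0].text;
}

chrome.runtime.onMessage.addListener((msg) => {
  (async () => {
    // User settings (language, voice speed and pitch) live in chrome.storage.
    const { language = 'en', rate = 1.0, pitch = 1.0 } =
      await chrome.storage.sync.get(['language', 'rate', 'pitch']);

    if (msg.type === 'SPEAK_TEXT') { // hypothetical message type
      const translated = await askGemini(
        `Detect the language of the following text and translate it to ` +
        `${language}. Return only the translation:\n\n${msg.text}`);
      chrome.tts.speak(translated, { lang: language, rate, pitch });
    }
  })();
  return true; // keep the message channel open for async work
});
```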

Challenges We Faced

Our primary challenge was a strategic pivot. We initially aimed to use the on-device Gemini Nano API, but hit a persistent platform bug in pre-release Chrome builds: the browser's internal state reported "No On-Device Feature Used," blocking the local AI pathway entirely. This forced us to re-architect our solution in real time around a more reliable and powerful cloud-based API, and it taught us invaluable lessons in building resilient, production-ready applications that prioritize the user over the initial technical plan.

What We Learned

This project was a masterclass in Universal Design. We learned that by solving for a specific accessibility need, we ended up building a powerful productivity tool that benefits everyone. A student can use the summarizer to study faster, and a professional can use the translator to understand international reports. We learned that the best technology doesn't just grant access; it enhances understanding for all.

What's Next for Aura Vision (Our Grand Vision)

Our vision is to evolve Aura Vision from an accessibility tool into a full-fledged "comprehension engine" for the web. Our roadmap includes:

1 - "Converse with the Page": Allowing users to select content and ask the AI direct questions, like "Explain this paragraph more simply."

2 - Aura Vision for Developers: A new mode that audits websites for accessibility issues and uses AI to automatically generate alt text suggestions, helping fix the web at its source.

3 - Augmented Memory: A feature that allows users to save and tag insights, with the AI proactively resurfacing relevant saved notes as they browse new sites.

Built With

JavaScript, Chrome Extension APIs (TTS, Storage), Mozilla Readability.js, Tesseract.js, Google Gemini 1.5 Flash API


Updates

posted an update

Our AI accessibility extension now features one-click article summarization. Go from a 20-minute read to a 1-minute audible insight, translated into your language. Understand the web faster. #AuraVision #AI #ChromeExtension #Accessibility #EdTech

posted an update

We consume countless pieces of information online every day. A brilliant idea on one site, a critical statistic on another, a powerful image on a third. But where does it all go? Human memory is fallible, and valuable connections are often lost in the digital noise.

This led us to ask a bigger question: What if an accessibility tool could do more than just help you perceive the web? What if it could help you remember and connect it?

This is why we are designing the ultimate evolution for Aura Vision: Memory Mode.

Here’s how it would work:

1- Capture Instantly: While browsing, you find an insightful paragraph or a crucial image. With a simple command, you tell Aura Vision, "Remember this about 'Artificial Intelligence'." The information is instantly saved and tagged in your personal, private knowledge base within the extension.

2- Recall Intelligently: Weeks later, you're reading a different article on a new site. Aura Vision silently understands the context of the page. A subtle notification appears: "You have a saved note related to 'Artificial Intelligence' from another article. Would you like to see it?"

This transforms Aura Vision from an accessibility tool into true augmented memory, a "second brain." It proactively connects your past knowledge with your present browsing, helping you discover insights, build a deeper understanding of complex topics, and never lose a valuable idea again. The more you use it, the smarter your personal web becomes.
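
Since Memory Mode is still on the drawing board, here is nothing more than a thought experiment in code: one plausible shape for the capture-and-recall flow on top of `chrome.storage.local`. Every name below is hypothetical:

```javascript
// Memory Mode (speculative sketch) -- all names are hypothetical.
// Capture: save a tagged note into a private, local knowledge base.
// e.g. rememberNote('Artificial Intelligence', selectedText, location.href);
async function rememberNote(tag, text, sourceUrl) {
  const { notes = [] } = await chrome.storage.local.get('notes');
  notes.push({ tag, text, sourceUrl, savedAt: Date.now() });
  await chrome.storage.local.set({ notes });
}

// Recall: when a new page loads, surface notes whose tag appears in the
// page text. A real version would match semantically (e.g. embeddings)
// rather than on literal tag strings.
async function recallForPage(pageText) {
  const { notes = [] } = await chrome.storage.local.get('notes');
  const lowered = pageText.toLowerCase();
  return notes.filter((note) => lowered.includes(note.tag.toLowerCase()));
}
```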

Our mission started with providing access, and that remains our core. But the future we envision is one where AI doesn't just grant access, but truly augments human intelligence. Memory Mode is the next step on that journey. #AI #Vision #Productivity #FutureTech

posted an update

We're excited to share the next major feature we're developing for Aura Vision! While our core mission began with accessibility, we're now expanding into productivity and learning.

The web can be a cluttered and distracting place. To solve this, we're implementing a beautiful, distraction-free Reader Mode. With one click, Aura Vision will use Mozilla's Readability.js library to extract the main article content, presenting it in a clean, focused view.

But we didn't stop there. This new mode is powered by a new AI capability: instant summarization.

Instead of reading a 2,000-word article, users can now ask Aura Vision to provide a concise, spoken summary of the key points. This transforms Aura Vision from an accessibility tool into a powerful productivity companion for students, researchers, and professionals alike.
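
For readers curious about the extraction step, Readability.js boils down to a couple of calls. The prompt below is an illustrative stand-in, not our exact production prompt:

```javascript
// Reader Mode (sketch): extract the article, then request a spoken summary.
// Readability mutates the document it parses, so we hand it a clone.
const article = new Readability(document.cloneNode(true)).parse();
// article.title, article.textContent, and article.content (clean HTML)
// are now available for the focused reading view.

// Illustrative summarization prompt -- the real one is tuned for length and tone.
const prompt =
  'Summarize the key points of this article in 3-4 short sentences:\n\n' +
  `${article.title}\n\n${article.textContent}`;

chrome.runtime.sendMessage({ type: 'SUMMARIZE', prompt }); // hypothetical message type
```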

This is the next step in making web content not just accessible, but also effortlessly understandable. Stay tuned as we roll out this feature! #AI #Productivity #Accessibility

posted an update

Every developer knows that a project rarely follows a straight line. Our journey with Aura Vision is a perfect example of this, and we wanted to share a challenge we faced and how it ultimately made our project even stronger.

Our initial plan was clear: use the brand-new, built-in Gemini Nano API to provide fast, private, on-device image descriptions. The promise of using on-device AI to power a crucial accessibility tool was incredibly exciting and perfectly aligned with the spirit of this challenge.

However, we quickly hit an unexpected and persistent environmental roadblock. Despite meeting all documented hardware requirements (over 6GB of VRAM) and correctly enabling the feature flags in multiple pre-release Chrome channels (Canary and Dev), the browser's internal state consistently reported "No On-Device Feature Used." This platform-level issue prevented the LanguageModel API from ever loading, completely blocking the local AI pathway.

A core tenet of engineering is resilience. When one path is blocked, you build another.

We decided that waiting for a potential browser update wasn't an option; our mission to deliver a working accessibility tool was too important. So, we pivoted. We re-architected Aura Vision to call the powerful, server-side Google Gemini 1.5 Flash API directly from our extension's background script.
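
The pivot itself fits in a few lines: probe for the on-device model, and fall back to the cloud when it is absent. The on-device `LanguageModel` surface below approximates the pre-release Prompt API and has changed across Chrome builds, so treat that branch as an assumption:

```javascript
// Resilience pattern (sketch): prefer on-device AI, fall back to the cloud.
async function describe(promptText) {
  try {
    // Approximation of the pre-release Prompt API; shape may differ by build.
    if (typeof LanguageModel !== 'undefined' &&
        (await LanguageModel.availability()) === 'available') {
      const session = await LanguageModel.create();
      return await session.prompt(promptText); // on-device Gemini Nano
    }
  } catch (err) {
    console.warn('On-device model unavailable, falling back to cloud:', err);
  }
  return askGemini(promptText); // cloud-side Gemini 1.5 Flash (see architecture sketch)
}
```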

This strategic pivot not only solved our immediate problem but also made the extension more robust. It now guarantees that Aura Vision works for all users, regardless of their hardware capabilities or potential browser bugs.

This journey was a powerful lesson in adaptability. While we were excited about the potential of on-device AI, we're even more proud of building a resilient solution that puts the user's need for a working, reliable accessibility tool first.

We hope sharing our debugging story is helpful to other developers! #Debugging #AI #Accessibility #ProblemSolving

posted an update

Have you ever stopped to think about how much of the internet is visual? News stories, product reviews, social media moments, and even crucial data in charts are all primarily conveyed through images.

For many of us, this experience is seamless. But for millions of people with visual impairments, this visual web is filled with invisible barriers. Every undescribed image is a missing piece of the story, an "information black hole" that can lead to frustration and exclusion.

This is where the idea for Aura Vision was born. It started with a simple but powerful question: Can we use the incredible power of modern AI to bridge this accessibility gap?

We were inspired to build a tool that doesn't just see an image, but gives it a voice. Our mission with Aura Vision goes beyond just one extension. It's about advocating for a more inclusive digital world, where technology proactively ensures that no one is left behind. Accessibility shouldn't be an afterthought; it should be the foundation.

We're excited about this journey and would love to hear your thoughts on digital accessibility in the comments below!
