Inspiration
We were inspired by the need to make web browsing more accessible and efficient. Many people with motor impairments struggle with traditional mouse-and-keyboard interfaces, and even those without impairments often crave faster, hands-free interaction with their devices. We wanted to build a solution that addresses both needs while showcasing the power of voice automation.
What it does
Our project allows users to control their web browser entirely through voice commands. Users can:
Open and close websites by saying commands like "Open Google" or "Close this tab."
Search the web with phrases like "Search for data science jobs on Google."
Click on links by saying "Click link_name."
Summarize webpage content by saying "Give a summary of this page."
Save important information to Notepad by saying "Save this to Notepad."
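To illustrate how spoken phrases like the ones above could be turned into structured actions, here is a minimal sketch of a rule-based command parser. It is not the project's actual code: the `Intent` type, the `parse_command` function, the action names, and the regular expressions are all hypothetical, and a real system would likely need looser matching to cope with recognition errors.

```python
import re
from typing import NamedTuple, Optional


class Intent(NamedTuple):
    """A parsed voice command (hypothetical structure for illustration)."""
    action: str
    argument: str = ""


# Hypothetical patterns modeled on the example phrases; most specific first.
_PATTERNS = [
    (re.compile(r"^search for (?P<arg>.+?) on google$", re.IGNORECASE), "search"),
    (re.compile(r"^open (?P<arg>.+)$", re.IGNORECASE), "open"),
    (re.compile(r"^close this tab$", re.IGNORECASE), "close_tab"),
    (re.compile(r"^click (?P<arg>.+)$", re.IGNORECASE), "click"),
    (re.compile(r"^give a summary of this page$", re.IGNORECASE), "summarize"),
    (re.compile(r"^save this to notepad$", re.IGNORECASE), "save"),
]


def parse_command(transcript: str) -> Optional[Intent]:
    """Map a transcribed phrase to an Intent, or None if nothing matches."""
    # Drop surrounding whitespace and trailing sentence punctuation.
    text = transcript.strip().rstrip(".!?").strip()
    for pattern, action in _PATTERNS:
        match = pattern.match(text)
        if match:
            # Patterns without an argument group yield an empty argument.
            arg = match.groupdict().get("arg") or ""
            return Intent(action, arg.strip())
    return None
```

The argument keeps its original capitalization so that a link name such as "Careers" can later be matched against the page text as spoken.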
How we built it
Speech Recognition: Web Speech API, Python SpeechRecognition
Web Automation: Selenium
Backend: Python, OpenAI API
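As a rough sketch of how the web automation layer might sit behind the speech layer, the function below dispatches an already-parsed action to a Selenium-style driver. This is an assumption about the architecture, not the project's real implementation: `execute`, `KNOWN_SITES`, `build_search_url`, and the action names are hypothetical. The driver is duck-typed, so any object exposing WebDriver's `get`, `close`, and `find_element` methods works.

```python
from urllib.parse import quote_plus

# Hypothetical table of spoken site names; a real tool would need many more.
KNOWN_SITES = {"google": "https://www.google.com"}


def build_search_url(query: str) -> str:
    """Build a Google search URL with the query safely encoded."""
    return "https://www.google.com/search?q=" + quote_plus(query)


def execute(driver, action: str, argument: str = "") -> str:
    """Run one browser action on a Selenium-style driver.

    Returns the URL navigated to, the link text clicked, or "" for
    actions with no result. Raises ValueError for unsupported input.
    """
    if action == "open":
        site = argument.strip().lower()
        if site not in KNOWN_SITES:
            raise ValueError(f"unknown site: {argument!r}")
        url = KNOWN_SITES[site]
        driver.get(url)
        return url
    if action == "search":
        url = build_search_url(argument)
        driver.get(url)
        return url
    if action == "click":
        # "partial link text" is the string value of Selenium's
        # By.PARTIAL_LINK_TEXT locator strategy.
        driver.find_element("partial link text", argument).click()
        return argument
    if action == "close_tab":
        driver.close()
        return ""
    raise ValueError(f"unsupported action: {action!r}")
```

With Selenium installed, a real browser could be driven by creating `driver = webdriver.Chrome()` (after `from selenium import webdriver`) and calling, for example, `execute(driver, "search", "data science jobs")`.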
Challenges we ran into
Speech Recognition Accuracy: Ensuring commands were interpreted correctly in noisy environments.
Web Automation: Managing dynamic web pages with varying structures.
Real-time Performance: Reducing latency for an instant-response experience.
Accomplishments that we're proud of
Building a fully functional prototype within a limited timeframe.
Creating an accessible tool that can benefit people with motor impairments.
Implementing advanced summarization and automation features.
What we learned
Integrating speech recognition with web automation is both challenging and rewarding. Real-time voice control requires efficient processing pipelines. Accessibility features can significantly improve user experience.
What's next for Voice controlled automation
Multi-language Support: Expanding to support non-English commands.
Customizable Commands: Allowing users to set their own shortcuts.
Browser Extensions: Developing an easily installable Chrome/Firefox extension.
Enhanced AI Summarization: Further refining AI models for even better content understanding and summaries.