SpeakSecure

Inspiration

This project was inspired by my experience volunteering with Special Olympics, where accessibility for people who are blind or visually impaired was celebrated and encouraged. My goal was to create a system that allows secure password entry by voice. By leveraging the Google Speech-to-Text API, I aimed to build a more inclusive and accessible authentication process.

What it does

The project captures a user's spoken audio through a web interface, processes the audio to transcribe the spoken words into text using the Google Speech-to-Text API, and then compares the transcribed password against a stored hashed password for authentication purposes.
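The comparison step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the project lists bcrypt, while this sketch substitutes PBKDF2 from the Python standard library so it runs without extra packages. The transcript normalization (lowercasing, removing punctuation) is an assumption added here because speech-to-text output often varies in casing and punctuation, and the function names and iteration count are hypothetical.

```python
import hashlib
import hmac
import os
import string

ITERATIONS = 200_000  # hypothetical work factor, chosen for illustration only


def normalize_transcript(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace (assumed step)."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())


def hash_passphrase(phrase, salt=None):
    """Return (salt, digest) for a spoken passphrase using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each stored passphrase
    digest = hashlib.pbkdf2_hmac(
        "sha256", normalize_transcript(phrase).encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_transcript(transcript, salt, stored_digest):
    """Hash the transcript with the stored salt and compare in constant time."""
    _, candidate = hash_passphrase(transcript, salt)
    return hmac.compare_digest(candidate, stored_digest)
```

Only the salt and digest would be stored; the transcript itself is discarded after the comparison.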

How we built it

I built the system using a combination of front-end and back-end technologies. On the front-end, I used HTML and JavaScript to capture audio input from the user's microphone. I then used the MediaRecorder API to record the audio and send it to a Python Flask back-end server. On the server side, I processed the audio file with the Google Speech-to-Text API to convert the spoken audio to text. I also implemented password hashing for secure password storage and comparison.
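The server-side flow described above might look something like this framework-free sketch. It assumes the transcription call and the password check are passed in as callables, so it makes no claims about the Google client's signatures or the project's Flask routes. The names `authenticate_audio`, `AuthResult`, and `MAX_AUDIO_BYTES`, and the specific status codes, are hypothetical.

```python
from dataclasses import dataclass

MAX_AUDIO_BYTES = 1_000_000  # hypothetical upload limit


@dataclass
class AuthResult:
    status: int  # HTTP-style status code the route handler would return
    body: dict   # JSON-serializable payload for the front-end


def authenticate_audio(audio, transcribe, check_password):
    """Validate uploaded audio, transcribe it, and check the spoken password.

    `transcribe` takes audio bytes and returns text (for example, a wrapper
    around a speech-to-text client); `check_password` takes the transcript
    and returns True or False.
    """
    if not audio:
        return AuthResult(400, {"error": "no audio received"})
    if len(audio) > MAX_AUDIO_BYTES:
        return AuthResult(413, {"error": "audio too large"})
    try:
        transcript = transcribe(audio)
    except Exception:
        # Report an upstream failure explicitly instead of a bare 500.
        return AuthResult(502, {"error": "transcription failed"})
    if not transcript.strip():
        return AuthResult(422, {"error": "no speech recognized"})
    ok = bool(check_password(transcript))
    return AuthResult(200 if ok else 401, {"authenticated": ok})
```

Keeping the external call behind a callable also makes the flow easy to exercise with stubs, without network access or cloud credentials.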

Challenges we ran into

• Ensuring accurate and secure transmission of audio data from the client to the server.
• Handling the audio data correctly on the server side and interfacing with the Google Speech-to-Text API.
• Dealing with CORS (Cross-Origin Resource Sharing) issues when making requests from the front-end to the back-end.
• Configuring the environment properly to authenticate with Google Cloud services.
• Debugging server errors such as "INTERNAL SERVER ERROR" when processing the audio.
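To illustrate the CORS issue, here is a small sketch of the response headers a back-end could attach so that a browser accepts cross-origin requests from a known front-end. The header names are standard; the allowed origin, the function name, and the choice of methods are assumptions for illustration, not the project's actual configuration.

```python
ALLOWED_ORIGINS = {"http://localhost:8000"}  # hypothetical front-end origin


def cors_headers(origin, allowed=ALLOWED_ORIGINS):
    """Return CORS response headers for an allowed origin, or none otherwise."""
    if origin not in allowed:
        return {}  # unknown origin: the browser will block the response
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
        "Vary": "Origin",  # caches must not reuse this response across origins
    }
```

Echoing back only origins from an allow-list is generally safer than a wildcard for an endpoint that handles authentication.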

Accomplishments that we're proud of

This was my first time working with Google's Speech-to-Text API. I also had not used HTTP request methods since this past summer, so recalling those skills was a nice challenge. I successfully set up a system that captures audio from the user's microphone and transmits it to the back-end for processing. Integrating the Google Speech-to-Text API to transcribe audio accurately was a challenge; the result is best demonstrated in the Jupyter Notebook where I first experimented with the technology before using it in the project.

What we learned

• Capturing and processing audio data in real-time using web technologies.
• The intricacies of using Google Cloud APIs, particularly the Speech-to-Text API.
• The importance of error handling and debugging in a multi-component system.
• The best practices for securing user authentication, such as not storing plain text passwords and utilizing hashing.

What's next for SpeakSecure

• Adopting more secure practices for handling passwords, starting with a closer look at how hashing is used.
• Implementing more robust error handling and logging to better understand and resolve issues when they arise.
• Enhancing security measures, possibly by adding multi-factor authentication or implementing more advanced voice recognition features to prevent spoofing.
• Refining the user interface to make it more user-friendly and accessible, with clear prompts and feedback.

Built With

apache
bcrypt
css
flask
google-speech-to-text-api
gunicorn
html
javascript
mediarecorder
python

Updates

Emily Sam started this project — Jan 21, 2024 02:56 AM EST

Leave feedback in the comments!
