Inspiration
The challenge of delivering speeches with clarity and confidence inspired us to create SpeechSage. Strong communication skills also lead to better job opportunities, which matters to college undergraduates. Whether for presentations, interviews, or public speaking, we wanted to offer a tool that helps users improve those skills.
What it does
SpeechSage analyzes speech input and provides real-time feedback on content, tone, and pace. It helps users refine their speeches, ensuring they deliver confidently and clearly.
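As an illustration of what pace feedback can look like, here is a minimal sketch that estimates words per minute from a transcript and the recording length. The function name `estimatePace` and the thresholds `SLOW_WPM` and `FAST_WPM` are assumptions made for this example, not values taken from SpeechSage.

```typescript
// Hypothetical pace bands in words per minute; these thresholds are
// illustrative assumptions, not values taken from SpeechSage.
const SLOW_WPM = 110;
const FAST_WPM = 170;

type PaceLabel = "too slow" | "comfortable" | "too fast";

interface PaceReport {
  wordsPerMinute: number;
  label: PaceLabel;
}

// Estimate speaking pace from a transcript and the recording length.
function estimatePace(transcript: string, durationSeconds: number): PaceReport {
  if (durationSeconds <= 0) {
    throw new RangeError("durationSeconds must be positive");
  }
  // Split on whitespace and drop empty strings so stray spaces do not count.
  const wordCount = transcript
    .trim()
    .split(/\s+/)
    .filter((word) => word.length > 0).length;
  const wordsPerMinute = Math.round((wordCount / durationSeconds) * 60);

  let label: PaceLabel = "comfortable";
  if (wordsPerMinute < SLOW_WPM) {
    label = "too slow";
  } else if (wordsPerMinute > FAST_WPM) {
    label = "too fast";
  }
  return { wordsPerMinute, label };
}
```

A real-time version could call a function like this on a sliding window of the most recent words, so the label reflects the current pace and not the average over the whole speech.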
How we built it
We built SpeechSage using React for the front end and integrated AI models from Google Gemini for content analysis. Additionally, HumeAI was used to analyze the emotions conveyed in the speech. The app provides a user-friendly interface for easy input and real-time feedback. A crucial part of the development involved managing the integration of these APIs, ensuring seamless communication between the speech input, emotion detection, content analysis, and the feedback delivery system.
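The flow described above can be sketched roughly as follows: run the emotion and content analyses in parallel, then merge the results into one report for the feedback view. This is a sketch under stated assumptions, not our actual integration code. The types `EmotionScore`, `ContentFeedback`, and `SpeechReport`, and the injected `analyzeEmotion` and `analyzeContent` functions, are hypothetical stand-ins for the HumeAI and Google Gemini calls and do not reflect either service's real API.

```typescript
// Hypothetical result shapes; the real HumeAI and Gemini responses differ.
interface EmotionScore {
  name: string;
  score: number;
}

interface ContentFeedback {
  summary: string;
  suggestions: string[];
}

interface SpeechReport {
  dominantEmotion: string | null;
  summary: string;
  suggestions: string[];
}

// Hypothetical analyzer signatures; the actual API calls would sit behind these.
type EmotionAnalyzer = (audio: Uint8Array) => Promise<EmotionScore[]>;
type ContentAnalyzer = (transcript: string) => Promise<ContentFeedback>;

// Merge both analyses into a single report for the feedback view.
function mergeFeedback(
  emotions: EmotionScore[],
  content: ContentFeedback
): SpeechReport {
  // Pick the highest-scoring emotion, or null when none were detected.
  const dominant = emotions.reduce<EmotionScore | null>(
    (best, current) =>
      best === null || current.score > best.score ? current : best,
    null
  );
  return {
    dominantEmotion: dominant ? dominant.name : null,
    summary: content.summary,
    suggestions: [...content.suggestions],
  };
}

// Run both analyses concurrently so total latency is close to the slower call.
async function analyzeSpeech(
  audio: Uint8Array,
  transcript: string,
  analyzeEmotion: EmotionAnalyzer,
  analyzeContent: ContentAnalyzer
): Promise<SpeechReport> {
  const [emotions, content] = await Promise.all([
    analyzeEmotion(audio),
    analyzeContent(transcript),
  ]);
  return mergeFeedback(emotions, content);
}
```

Passing the analyzers in as parameters keeps the merge logic testable with stubbed responses, without calling either service.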
Challenges we ran into
Integrating the APIs and ensuring accurate real-time feedback was a significant hurdle. The APIs differed in how they accepted parameters, and the security layer in some of them restricted deployment on client websites. Managing the flow between user input, processing time, and feedback delivery posed additional technical challenges. Still, we are proud that we obtained meaningful results despite these constraints.
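One common way to work around this kind of restriction, assuming the issue is that credentials must not be exposed in browser code, is to route requests through a small server-side proxy that attaches the key. The sketch below shows only the request-building step. `UPSTREAM_URL`, the `X-Api-Key` header name, and `buildUpstreamRequest` are placeholders for illustration and do not correspond to the actual HumeAI or Gemini endpoints.

```typescript
interface ClientPayload {
  transcript: string;
}

interface UpstreamRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Placeholder endpoint; substitute the real provider URL on the server.
const UPSTREAM_URL = "https://api.example.com/v1/analyze";

// Build the request the server forwards upstream. The API key is supplied
// by the server (for example from an environment variable) and is never
// sent to or read from the browser.
function buildUpstreamRequest(
  payload: ClientPayload,
  apiKey: string
): UpstreamRequest {
  if (apiKey.length === 0) {
    throw new Error("API key is not configured on the server");
  }
  return {
    url: UPSTREAM_URL,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Hypothetical header name; providers differ in how they accept keys.
      "X-Api-Key": apiKey,
    },
    // Forward only the fields we expect, not the raw client body.
    body: JSON.stringify({ transcript: payload.transcript }),
  };
}
```

With this split, the React client only talks to our own server, and the key stays out of the bundled front-end code.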
Accomplishments that we're proud of
We’re proud to have built a functional prototype that effectively provides helpful feedback on speech. Successfully integrating both HumeAI for emotional analysis and Google Gemini for content analysis was a major technical achievement. The emotional feedback helps users align their tone with their content, and this dual analysis ensures a holistic improvement in public speaking.
What we learned
We learned the importance of user experience in AI-driven applications, particularly in speech analysis. Keeping both the emotional and the content feedback timely, clear, and helpful taught us how to handle complex interactions between AI services and front-end systems. The security challenges around API deployment also deepened our understanding of API security in web applications.
What's next for SpeechSage
We plan to make SpeechSage easier to use by improving the user experience based on ongoing user feedback. Future updates will add more detailed feedback, expand language support, and refine how we use HumeAI and Google Gemini so the analysis adapts better to different speaking styles and emotional contexts. We also aim to simplify API integration to make the system more robust and secure for client-side deployments.