Inspiration
A lot of people want to speak online but don’t feel safe doing it. Between cyberbullying and privacy concerns, many stay silent. We wanted to give people a way to express themselves without exposing their identity.
What it does
Echo.io lets you speak naturally while an AI character speaks for you. It blends your words, pacing, and emotion into a VTuber-style voice so listeners hear how you feel without hearing your real voice.
How we built it
We stream speech through ElevenLabs and process it in real time. We track timing, speed, and tone to estimate emotion, then clean and re-express the speech through an AI character while keeping it natural.
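As a rough illustration of the timing and speed tracking described above, here is a minimal sketch that derives words per minute and pause length from word timestamps and maps them to a coarse mood label. It is not the project's actual code: the `Word` type, the function names, the mood labels, and every threshold are assumptions made for this example, and it assumes the transcription step provides start and end times for each word. Tone is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical word record; assumes timestamps in seconds are available."""
    text: str
    start: float
    end: float


def words_per_minute(words):
    """Speaking rate over the span from the first word's start to the last word's end."""
    if not words:
        return 0.0
    duration = words[-1].end - words[0].start
    if duration <= 0:
        return 0.0
    return len(words) * 60.0 / duration


def longest_pause(words):
    """Largest silent gap between consecutive words, in seconds."""
    gaps = [b.start - a.end for a, b in zip(words, words[1:])]
    return max(gaps, default=0.0)


def estimate_mood(words, fast_wpm=180.0, slow_wpm=110.0, long_pause=0.8):
    """Coarse mood guess from pacing alone; thresholds are illustrative, not tuned."""
    if not words:
        return "neutral"
    wpm = words_per_minute(words)
    if wpm >= fast_wpm:
        return "excited"
    if wpm <= slow_wpm or longest_pause(words) >= long_pause:
        return "hesitant"
    return "neutral"
```

A label like this could be passed along with the cleaned text so the character voice can adjust its delivery. Pacing alone is a weak signal, so in practice it would likely be combined with the wording itself.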
Challenges we ran into
Live speech is messy. Transcriptions constantly change, words repeat, and AI likes to hallucinate. Making everything feel smooth in real time without breaking character was hard.
Emotions also do not translate very well.
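One common way to handle the shifting transcriptions and repeated words mentioned above is to forward only the words that two consecutive partial transcripts agree on, then collapse immediate repeats. The sketch below shows that idea under stated assumptions: the `TranscriptStabilizer` class and helper names are hypothetical, the input is assumed to be a stream of plain-text partial transcripts, and this is not necessarily how Echo.io solves it.

```python
def collapse_repeats(words):
    """Drop a word when it immediately repeats the previous one (case-insensitive)."""
    out = []
    for w in words:
        if not out or out[-1].lower() != w.lower():
            out.append(w)
    return out


def common_prefix_len(a, b):
    """Number of leading words that two word lists share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class TranscriptStabilizer:
    """Releases a word only once two consecutive partial transcripts agree on it."""

    def __init__(self):
        self.previous = []   # words from the last partial transcript
        self.committed = 0   # how many words have already been released

    def update(self, partial):
        """Take the latest partial transcript and return newly stable words."""
        words = partial.split()
        stable = common_prefix_len(self.previous, words)
        self.previous = words
        if stable <= self.committed:
            return []
        new_words = words[self.committed:stable]
        self.committed = stable
        return new_words
```

For example, feeding in "I want", "I want to", "I want two go" releases "I" and "want" on the second update and holds back the contested third word until later partials agree. The trade-offs are added latency of one update, words that are already released are never revised, and `collapse_repeats` will also remove intentional repetition such as "very very".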
Accomplishments that we're proud of
Even with multiple AI layers, the system stays surprisingly stable. The speaking avatar also works better than we initially expected.
What we learned
So many tools already exist to make our lives easier. We learned a lot, and we found that we can move much faster as long as we understand the basics of the tools we are using.
What's next for Echo.io
We want to find a more reliable way to stream data and to improve the efficiency of the speech-to-text and text-to-speech pipeline. We also plan to offer more speech model options and to estimate mood more accurately from words per minute (wpm) and context.
Built With
- digitalocean
- elevenlabs
- fastapi
- gemini
- react
- tailwindcss
- vite