Inspiration
We were inspired to focus our solution on the elderly, an often-neglected population, and specifically on the growing number of people living with dementia. One of our team members has a family member who worked with dementia patients, and this inspired the concept of providing a familiar face to those living with dementia.
What it does
HarmonyAI.de takes in audio from the user and uses generative AI to play back a video of a loved one responding.
How we built it
HarmonyAI.de transcribes the audio input with Whisper and sends the resulting query to OpenAI through LangChain, in the format of a chatbot. The text reply is converted to speech with IBM Watson text-to-speech, and that speech is fed into a lip-syncing algorithm from Gooey AI to give the appearance of an animated image speaking to the user.
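To make the flow above concrete, here is a minimal sketch of how the four stages could be chained. It is an illustration under stated assumptions, not the project's actual code: the `ResponsePipeline` class, its field names, and the stand-in lambdas are all hypothetical, and each real service (Whisper, OpenAI through LangChain, IBM Watson, Gooey AI) is assumed to be wrapped in a plain Python function that is passed in.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ResponsePipeline:
    """Hypothetical wrapper that chains the four stages in order.

    Each stage is injected as a callable, so no real service API is
    assumed here.
    """
    transcribe: Callable[[bytes], str]         # speech-to-text (Whisper in the write-up)
    generate_reply: Callable[[str], str]       # chatbot reply (OpenAI through LangChain)
    synthesize: Callable[[str], bytes]         # text-to-speech (IBM Watson)
    lip_sync: Callable[[bytes, bytes], bytes]  # face image + speech -> video (Gooey AI)

    def respond(self, user_audio: bytes, face_image: bytes) -> bytes:
        """Turn one spoken utterance into a video of the familiar face replying."""
        query = self.transcribe(user_audio)
        if not query.strip():
            # Nothing intelligible was said, so there is nothing to answer.
            raise ValueError("no speech detected in the audio input")
        reply_text = self.generate_reply(query)
        reply_audio = self.synthesize(reply_text)
        return self.lip_sync(face_image, reply_audio)


# Stand-in stages (assumptions) so the sketch runs without any external service.
demo = ResponsePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    generate_reply=lambda query: f"You said: {query}",
    synthesize=lambda text: text.encode("utf-8"),
    lip_sync=lambda image, speech: image + b"|" + speech,
)

video = demo.respond(b"hello", b"FACE")
print(video)  # b'FACE|You said: hello'
```

Passing the stages in as functions keeps each service replaceable and lets the chain be tested with stubs, which is one way a mostly back-end team might divide the work.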
Challenges we ran into
Collectively, we had more back-end experience, so full-stack development was a challenge for us. We also had to learn about generative AI and lip-syncing.
Accomplishments that we're proud of
We are proud to have successfully implemented a pipeline that processes audio input into the finalized video file.
What we learned
We learned a lot about the minutiae of full-stack development, as well as how to break up tasks based on experience. We also learned that targeting a more specific use case leads to a more interesting product.
What's next for HarmonyAI.de
We have several ideas for functionality we would want to implement given more time and resources. The most exciting of these is the ability to monitor the condition of the patient based on conversational statistics.
Built With
- css
- flask
- gooeyai
- html5
- ibm-watson
- javascript
- node.js
- openai
- python
- react