FiggleSpeak

Inspiration

1 in 12 individuals report having impediments relating to speech, voice, language or swallowing.

However, only 1 in 2 have sought treatment or therapy for their condition in the last 12 months.

Cost is among the most common reasons people do not seek treatment: many cannot afford to attend these sessions regularly. In addition, many lead busy lives and only have small pockets of time available here and there.

Solutions for articulation and pronunciation practice already exist, but they often lack a way for users to get personalised feedback on the specific areas they need to work on.

In addition, at different stages of speaking fluency, there is more to consider than just knowing how to pronounce words, such as breath control and tongue movement. Most apps do not take this into account.

What it does

We therefore propose FiggleSpeak: the all-in-one, AI-powered speech therapy toolkit.

With FiggleSpeak, users can practise their pronunciation live, with quick feedback and pointers.

Here's how it works:

The user opens the website and selects a language and difficulty level. They are served a matching text and record themselves reading it.
The website then sends the audio file for analysis and shows the user which specific phonemes (the basic units of sound) they did not pronounce accurately.
The user may then get additional pointers on how to better pronounce the words they got wrong.
If they wish, the user can also watch an AI-generated person pronouncing the word accurately, for better understanding.
The user's progress is tracked and the experience is lightly gamified, so users can progress naturally.
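
The phoneme feedback step above can be illustrated with a minimal sketch. It assumes the backend already has two phoneme sequences: the expected one for the served text and the one recognised from the recording. It is not taken from the FiggleSpeak codebase. The function names, the ARPAbet-style symbols and the use of Python's standard `difflib` for alignment are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def find_mispronounced(expected, recognised):
    """Align two phoneme sequences and report where they differ.

    Returns a list of (operation, expected_phonemes, recognised_phonemes)
    tuples, one per mismatching region. Hypothetical helper, not the
    project's actual analysis code.
    """
    matcher = SequenceMatcher(a=expected, b=recognised, autojunk=False)
    issues = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            issues.append((op, expected[i1:i2], recognised[j1:j2]))
    return issues


def accuracy(expected, recognised):
    """Fraction of expected phonemes that were matched in order."""
    if not expected:
        return 1.0
    matcher = SequenceMatcher(a=expected, b=recognised, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(expected)


# Illustrative ARPAbet-style phonemes for the word "think";
# here the speaker substituted "T" for "TH".
expected = ["TH", "IH", "NG", "K"]
recognised = ["T", "IH", "NG", "K"]

print(find_mispronounced(expected, recognised))
print(accuracy(expected, recognised))
```

Each reported region could then be mapped back to the word it belongs to, so the user sees which sound to work on rather than a single overall score.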

How we built it

https://imgur.com/a/WvDnqx7

https://imgur.com/a/ffNnVhs

Challenges we ran into

This was our first time hosting an AI API on the web. We ran into many infrastructure problems and struggled to set up a reliable runtime environment.

Accomplishments that we're proud of

Persevering until the end. Every team member improved technically and gained knowledge across all domains.

What we learned

DevOps lifecycle
CI/CD
Dockerisation
Google Cloud services
ASR (automatic speech recognition)
SpeechBrain and other audio AI libraries

What's next for FiggleSpeak

Oral Examination (WIP):
Fixed list of images
User is expected to record themselves describing the image
GPT-4 / Gemini to identify which points the user missed
Phoneme analysis to identify potential mispronunciation of certain words

Conversational Buddy:
Chat with the AI Assistant (powered by GPT-4 / Gemini)
Similar analysis of mispronunciation and confidence as above

More Gamification (incorporating leaderboards and daily exercises)

NOTE: best viewed on mobile

GitHub links:
Frontend: https://github.com/FiggleSpeak/figglespeak-web
Backend: https://github.com/FiggleSpeak/figglebottom-api