Inspiration

Memorization is hard, especially while presenting.

What it does

The user inputs a list of points they want to cover during the presentation, and the server checks in real time whether they've been covered.

How we built it

I used Expo for the frontend and Bun's built-in websocket server for the backend.

The client opens a websocket connection to the server and streams a chunk of 16-bit PCM audio once per second. The server appends each chunk to the user's buffer while computing the root mean square of the newest chunk to detect silence. On silence, the server runs Whisper-v3-turbo locally on the buffered audio to get a transcript, then clears the buffer. The transcript is sent to OpenAI's ChatGPT API, which determines which of the user's points have been covered, and the result is returned to the client.
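The buffering and silence-detection step above can be sketched as follows. This is a minimal illustration, not the project's actual code: the names (`rms`, `SpeechBuffer`, `SILENCE_THRESHOLD`) and the threshold value are assumptions for the sketch.

```typescript
// RMS of a 16-bit PCM chunk, normalized to [0, 1].
function rms(chunk: Int16Array): number {
  let sumSquares = 0;
  for (const sample of chunk) {
    const s = sample / 32768; // normalize 16-bit sample to [-1, 1)
    sumSquares += s * s;
  }
  return Math.sqrt(sumSquares / chunk.length);
}

// Assumed threshold; in practice this would be tuned against real mic input.
const SILENCE_THRESHOLD = 0.01;

class SpeechBuffer {
  private chunks: Int16Array[] = [];

  // Append a chunk. Returns the full buffered audio when silence is
  // detected (ready to hand to the transcriber), or null while the
  // speaker is still talking.
  push(chunk: Int16Array): Int16Array | null {
    if (rms(chunk) < SILENCE_THRESHOLD) {
      if (this.chunks.length === 0) return null; // nothing buffered yet
      const total = this.chunks.reduce((n, c) => n + c.length, 0);
      const out = new Int16Array(total);
      let offset = 0;
      for (const c of this.chunks) {
        out.set(c, offset);
        offset += c.length;
      }
      this.chunks = []; // clear the buffer after flushing
      return out;
    }
    this.chunks.push(chunk);
    return null;
  }
}
```

On a non-null return, the server would run Whisper on the flushed audio and forward the transcript to the ChatGPT API for the coverage check.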

I was able to get an MVP working in an hour using on-device transcription; however, the minimum audio-chunk size was too large and the transcription was noticeably less accurate.

Challenges we ran into

Differences between React and React Native that I wasn't used to.

Accomplishments that we're proud of

Staying awake

What we learned

Native libraries are surprisingly limited compared to web standards

What's next for Engress

Storing presentation sessions and making use of the time/audio data that is currently discarded.

Or overcharging for the ChatGPT wrapper, who knows :>