Learning Buddy

Inspiration

We noticed a persistent gap between digital study tools and the physical classroom environment. While many AI tutors exist, they often require manual, post-lecture data entry. We wanted to build a "living" study assistant: a hardware-software ecosystem that sits on a student's desk, listens to lectures in real time, and immediately transforms that audio into a searchable, interactive knowledge base.

What it does

Learning Buddy is a full-stack AI platform that merges a high-performance web dashboard with a physical ESP32-S3 recording device.

Real-Time Capture: Stream live audio via WebSockets directly from the hardware to the backend.
Contextual RAG: Whether through PDF/DOCX uploads or live recordings, the system chunks and embeds data for grounded AI responses powered by Google Gemini.
Voice-First Interaction: Integrated ElevenLabs Conversational AI allows for low-latency, hands-free dialogue with study materials.
Device Management: A seamless pairing system using 6-character keys to link physical hardware to web accounts.
Gamified Productivity: Includes DeskPet, an interactive digital companion that reacts to learning activity and milestones.
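
To illustrate the device pairing idea, here is a minimal sketch of issuing and claiming 6-character keys. It is not the project's code: the alphabet, the 10-minute expiry, the in-memory store, and the names `PairingRegistry`, `issue_key`, and `claim` are all assumptions made for the example.

```python
import secrets
import time

# Assumed alphabet: uppercase letters and digits without look-alikes (0/O, 1/I).
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"
KEY_LENGTH = 6
KEY_TTL_SECONDS = 600  # assumed expiry window, not from the project


class PairingRegistry:
    """In-memory stand-in for whatever store the backend actually uses."""

    def __init__(self):
        self._pending = {}  # key -> (device_id, expires_at)

    def issue_key(self, device_id, now=None):
        """Create a fresh, unused key for a device that wants to be paired."""
        now = time.time() if now is None else now
        while True:
            key = "".join(secrets.choice(ALPHABET) for _ in range(KEY_LENGTH))
            if key not in self._pending:
                break
        self._pending[key] = (device_id, now + KEY_TTL_SECONDS)
        return key

    def claim(self, key, now=None):
        """Consume a key; return its device id if still valid, else None."""
        now = time.time() if now is None else now
        entry = self._pending.pop(key.strip().upper(), None)
        if entry is None:
            return None
        device_id, expires_at = entry
        return device_id if now <= expires_at else None
```

In this sketch a key is single-use: a second `claim` with the same key returns `None`, which keeps one device from being linked to two accounts.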

How we built it

The project was engineered with a focus on modern reactivity and performance:

Frontend: Developed using SvelteKit 2 and Svelte 5 (Runes), styled with Tailwind CSS v4 for a streamlined UI.
Backend: A Python Flask server utilizing Flask-SocketIO for real-time PCM audio processing and Flask-JWT-Extended for secure authentication.
Hardware: ESP32-S3 Sense firmware (via PlatformIO) utilizing a PDM microphone to stream audio data.
The AI Pipeline: We used faster-whisper for efficient transcription, Google Gemini for intelligence and embeddings, and ElevenLabs for the low-latency voice interface.
Deployment: The frontend builds to static files served directly by the Flask backend, all containerized via Docker.
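
To make the retrieval step of the AI pipeline concrete, the sketch below ranks stored chunks against a query by cosine similarity. The three-dimensional vectors are made-up stand-ins for real embeddings, and `cosine_similarity` and `top_k_chunks` are hypothetical helper names; this write-up does not describe the project's actual retrieval code or vector store.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, indexed_chunks, k=2):
    """Return the text of the k chunks whose vectors are closest to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed_chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy index: (chunk text, placeholder embedding). Real embeddings are much longer.
index = [
    ("notes on mitosis", [0.9, 0.1, 0.0]),
    ("notes on the French Revolution", [0.0, 0.2, 0.9]),
    ("notes on cell division", [0.8, 0.3, 0.1]),
]
print(top_k_chunks([1.0, 0.2, 0.0], index))
```

The selected chunks would then be placed in the prompt so that the model's answer stays grounded in the user's own material.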

Challenges we ran into

The primary hurdle was the Real-time Audio Pipeline. Handling raw PCM chunks over WebSockets from an ESP32 and assembling them into a valid WAV format on the backend required precise buffer management to prevent data loss. Additionally, adopting

Accomplishments that we're proud of

Successfully implementing a reliable WebSocket streaming pipeline from an embedded device to a cloud-based transcription engine.
Achieving a low-latency voice-to-voice experience that stays grounded in the user's specific source materials.
Building a robust, single-server deployment strategy that handles both the high-frequency SocketIO traffic and static frontend delivery.
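
The buffer handling behind the streaming pipeline can be illustrated with Python's standard `wave` module. This is a sketch under stated assumptions: the 16 kHz, 16-bit, mono format and the `PcmRecorder` class are illustrative choices, and the real Flask-SocketIO event handlers are not shown.

```python
import io
import wave

SAMPLE_RATE = 16_000  # Hz, assumed
SAMPLE_WIDTH = 2      # bytes per sample (16-bit), assumed
CHANNELS = 1          # mono, assumed


class PcmRecorder:
    """Accumulates raw PCM chunks and wraps them in a WAV container."""

    def __init__(self):
        self._buffer = bytearray()

    def add_chunk(self, chunk: bytes) -> None:
        # Chunks may split a sample across two messages, so just append bytes
        # and deal with frame alignment once, when the WAV is written.
        self._buffer.extend(chunk)

    def to_wav_bytes(self) -> bytes:
        frame_size = SAMPLE_WIDTH * CHANNELS
        # Drop a trailing partial frame so the WAV data stays sample-aligned.
        usable = len(self._buffer) - (len(self._buffer) % frame_size)
        out = io.BytesIO()
        with wave.open(out, "wb") as wav:
            wav.setnchannels(CHANNELS)
            wav.setsampwidth(SAMPLE_WIDTH)
            wav.setframerate(SAMPLE_RATE)
            wav.writeframes(bytes(self._buffer[:usable]))
        return out.getvalue()
```

Appending raw bytes and aligning only at write time means a chunk boundary that falls in the middle of a sample does not corrupt the audio.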

What we learned

We deepened our collective understanding of Vector Embeddings and the nuances of chunking strategies for varied document types. We also gained significant experience in embedded systems, specifically regarding memory management and maintaining network stability during continuous audio streaming on the ESP32.
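
One common chunking strategy is fixed-size windows with overlap, so that text cut at a boundary still appears with some surrounding context in the next chunk. The sketch below shows that general technique only; `chunk_words` and its default sizes are hypothetical and not taken from the project.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into word windows of chunk_size that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window reaches the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Larger overlaps preserve more context across boundaries at the cost of embedding more duplicated text, which is one of the trade-offs that differs between dense lecture transcripts and structured documents.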

What's next for Learning Buddy

Multi-Device Sync: Allowing several "Buddies" to contribute to a single shared knowledge base.
Edge Processing: Moving transcription or smaller LLM tasks to the edge to increase privacy and reduce latency.
Proactive DeskPet: Evolving the digital companion to provide proactive study reminders and insights based on the user's recorded lecture history.

Tech Stack

Layer	Technology
Frontend	Svelte 5, SvelteKit 2, Tailwind CSS v4, TypeScript
Backend	Python Flask, Flask-SocketIO, MongoDB, Rust
Hardware	ESP32-S3 Sense (C++/Arduino), PDM Microphone
AI/ML	Google Gemini (LLM/RAG), ElevenLabs (Voice), faster-whisper

Submitted to

HackFax x PatriotHacks 2026
- Winner [MLH] Best Use of MongoDB Atlas

Created by

I worked on the hardware design. It was my first time using an ESP32 and I2S for audio recording and audio output. Figuring out the I2S protocol was quite confusing, but I got it to work in the end.

Toan Do
TimsShips Shipman
Gagan M
My passion is using Computer Science with practical Engineering to enhance technology for the future.
Jonathan Ventimiglia

Updates

TimsShips Shipman started this project — Feb 14, 2026 12:00 PM EST
