OmniScribe AI: Smart Learning PDE
(Personal Development Environment)
💡 The Problem: The 40% Information Gap
Academic success is fundamentally limited by the "Note-taker’s Paradox": students cannot effectively process new information while simultaneously recording previous points.
- Students fail to record 40% of critical lecture content (Hartley & Cameron).
- Mind-wandering occurs during 30% to 50% of a typical lecture (Wammes et al.).
- Without comprehensive notes, student retention can drop to just 5%.
OmniScribe AI bridges this gap by creating the world’s first "witness-aware" educational ecosystem. By giving AI "eyes" in the classroom and a "voice" in the study hall, we ensure the lecture never ends—it evolves into a personalized, persistent conversation.
🚀 The Solution: A Multimodal Powerhouse
OmniScribe is built on a sophisticated pipeline that transforms passive listening into active mastery.
1. The Vision Engine (Powered by Overshoot)
We utilize Overshoot as the project’s primary sensory organ, providing the AI with "visual memory."
- The Screen-to-Vision Bridge: We engineered a custom virtual driver to pipe high-resolution screen-share data into Overshoot’s vision pipeline, allowing it to "watch" digital lectures with the student.
- Visual Milestone Detection: Rather than simple periodic captures, we leverage Overshoot to detect critical visual changes—like whiteboard erasures or slide transitions—to trigger context-aware summaries.
- The Real-Time "Rescue" Chat: OmniScribe maintains a rolling visual buffer. When a student loses focus, the AI uses the last 20 seconds of visual context to deliver an instant briefing so the student can rejoin the lecture with confidence.
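The rolling buffer and milestone detection described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Overshoot's API or our production code: frames are modeled as flattened grayscale values, a "milestone" is any frame whose mean pixel difference from the previous frame crosses a threshold, and the names `Frame`, `RollingVisualBuffer`, and the `0.3` cutoff are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Frame:
    timestamp: float      # seconds since the lecture started
    pixels: list[float]   # flattened grayscale values in [0, 1]


def change_score(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a.pixels, b.pixels)) / len(a.pixels)


class RollingVisualBuffer:
    """Keeps only the frames from the last `window_s` seconds."""

    def __init__(self, window_s: float = 20.0, threshold: float = 0.3):
        self.window_s = window_s
        self.threshold = threshold  # hypothetical cutoff for a "milestone"
        self.frames: deque[Frame] = deque()

    def push(self, frame: Frame) -> bool:
        """Add a frame, evict stale ones, and report whether it is a milestone."""
        is_milestone = bool(self.frames) and (
            change_score(self.frames[-1], frame) >= self.threshold
        )
        self.frames.append(frame)
        # The newest frame always survives, so the deque never empties here.
        while self.frames[0].timestamp < frame.timestamp - self.window_s:
            self.frames.popleft()
        return is_milestone

    def context(self) -> list[Frame]:
        """Frames a rescue briefing would be built from, oldest first."""
        return list(self.frames)
```

A caller would `push` every incoming frame, trigger a summary whenever `push` returns `True` (for example after a slide transition), and hand `context()` to the model when the student asks for a rescue briefing.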
2. The Voice Mentor (Powered by LiveKit)
We use LiveKit to deliver the "gold standard" of 1-on-1 tutoring: a voice tutor that responds at the pace of a human conversation.
- Our LiveKit tutor isn't a generalist—it is "Witness-Aware." It receives a processed stream of the lecture context captured by Overshoot, allowing it to answer questions specifically about your professor’s unique diagrams and nomenclature.
- Circle-to-Search-to-Speech: The most intuitive UX feature for learning (we're biased). A student circles a complex diagram on their screen; the LiveKit agent "sees" the selection and explains it by voice with sub-second latency.
- Contextual Tool-Calling via leanMCP: We replaced standard search APIs with a leanMCP implementation. This allows the Voice Mentor to dynamically access external and internal knowledge:
  - Semantic Note Retrieval: The agent can search through the student's previous lecture notes to find connections between today's class and last week's topics.
  - On-the-Fly YouTube Sourcing: If a student is struggling with a concept (e.g., "The Fourier Transform"), the agent calls an MCP tool to fetch highly relevant YouTube tutorials or visualizers, providing links or summaries instantly.
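To make the tool-calling idea concrete, here is a minimal sketch of a tool registry with a note-retrieval tool. It deliberately does not use the leanMCP or MCP SDKs: the registry, `call_tool`, `search_notes`, `find_videos`, and the sample `NOTES` are all hypothetical, and bag-of-words cosine similarity stands in for whatever embedding model a real semantic search would use.

```python
import math
import re
from collections import Counter
from typing import Callable

# Hypothetical stand-in for the student's stored lecture notes.
NOTES = {
    "week1": "Sine and cosine waves, frequency and amplitude basics",
    "week2": "Fourier transform decomposes a signal into frequency components",
    "week3": "Laplace transform and solving differential equations",
}


def _vector(text: str) -> Counter:
    """Bag-of-words term counts (a toy substitute for real embeddings)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search_notes(query: str, top_k: int = 2) -> list[str]:
    """Return ids of the notes most similar to the query, best first."""
    q = _vector(query)
    ranked = sorted(
        NOTES, key=lambda nid: _cosine(q, _vector(NOTES[nid])), reverse=True
    )
    return ranked[:top_k]


def find_videos(topic: str) -> list[str]:
    """Placeholder: a real tool would call a video search service here."""
    return [f"(search results for: {topic})"]


# Tools the voice agent may invoke, keyed by the name it would request.
TOOLS: dict[str, Callable[..., list[str]]] = {
    "search_notes": search_notes,
    "find_videos": find_videos,
}


def call_tool(name: str, **arguments) -> list[str]:
    """Dispatch a tool call by name, as an agent runtime would."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)
```

With the sample notes above, `call_tool("search_notes", query="fourier transform frequency")` ranks the week 2 note first, which is the kind of link between today's class and an earlier topic the agent is meant to surface.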
🛠️ Technical Execution & Innovation
We focused on pushing the boundaries of what is possible with real-time AI primitives:
- Multimodal Hand-off: We developed a proprietary protocol to sync Overshoot’s visual encoding with LiveKit’s voice output, ensuring the tutor’s "brain" is perfectly aligned with its "eyes."
- Latency Optimization: We implemented a tiered prompt architecture that uses Overshoot for lean visual tokens and the Gemini API for high-order reasoning. Flux.1 Pro then turns messy lecture sketches into textbook-quality "Mastery Canvas" illustrations.
- Engineering the Bridge: By solving the challenge of piping screen-shares into a vision-in feed, we’ve made "Witness-Aware" AI accessible for every online learner.
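One simple way to think about the hand-off between the visual stream and the voice tutor is timestamp alignment: when a question arrives, select the visual events that were on screen shortly before it. The sketch below illustrates that idea only; it is not our actual protocol, and `VisualEvent`, `context_for_question`, and the 20-second default are assumptions made for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class VisualEvent:
    timestamp: float   # seconds since the lecture started
    summary: str       # text description of what was on screen


def context_for_question(
    events: list[VisualEvent], asked_at: float, max_age_s: float = 20.0
) -> list[VisualEvent]:
    """Visual events in (asked_at - max_age_s, asked_at], oldest first.

    `events` must already be sorted by timestamp.
    """
    times = [e.timestamp for e in events]
    hi = bisect_right(times, asked_at)               # drop events after the question
    lo = bisect_right(times, asked_at - max_age_s)   # drop events that are too old
    return events[lo:hi]
```

The selected summaries can then be prepended to the tutor's prompt, so a spoken answer refers to what the student was actually looking at rather than to a slide that appeared later.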
📊 Quantifying the Impact
- Closing the Retention Gap: By capturing the "lost" 40%, we aim to move student retention from a baseline of 5% up to 80%+.
- Eliminating the "Introvert Tax": A private, LiveKit-powered voice channel empowers the 40% of introverted students to ask questions without the friction of public interruption.
- Scaling the 2 Sigma Shift: We are democratizing the elite 1-on-1 tutoring experience traditionally reserved for the few, making it available to the many.
📚 Research & References
- Bloom, B. S. (1984). The 2 Sigma Problem: The Search for Methods of Group Instruction as Effective as One-to-One Tutoring.
- Hartley, J., & Cameron, A. (1967). Some observations on the efficiency of lecturing.
- Wammes, J. D., et al. (2016). The consequences of mind wandering during a video-recorded lecture.
Built With
- exa
- firebase
- flux
- gemini
- leanmcp
- livekit
- mcp
- nextjs
- overshoot
- react
- trae
- websockets