Inspiration
LEGO Smart Play showed bricks can have sensors inside them. We wondered: what if AR glasses could bring any LEGO creation to life, no special bricks needed? Just build, look, and play.
What it does
Put on Snap Spectacles, pinch to scan your LEGO creations, and each one gets a unique generated sound. Grab objects with your hands: shake a plane to hear its engines, wave a dragon to hear it roar. Objects make impact sounds when they collide, and background music is generated to match the scene. Every scan is completely different.
How we built it
Built in Lens Studio (TypeScript) for Snap Spectacles. One pinch captures a camera frame plus depth data. Gemini 3 Flash analyzes the image in a single call and returns structured JSON: bounding boxes, labels, sound prompts, collision prompts, colors, and a music style. Sound prompts go to TangoFlux (an open-source audio model running on Replicate); background music goes to Google Lyria. Objects spawn at real-world 3D positions using the depth data and become grabbable via the Spectacles Interaction Kit.
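To give a feel for the single structured call, here is a minimal sketch of the response shape and a guard that validates it before spawning anything. The field names (`soundPrompt`, `musicStyle`, etc.) are illustrative stand-ins, not our exact production schema:

```typescript
// Hypothetical shape of the structured JSON we request from Gemini in
// one call. Field names are illustrative, not the exact shipped schema.
interface SceneObject {
  label: string;                          // e.g. "red dragon"
  box: [number, number, number, number];  // normalized [x, y, w, h]
  soundPrompt: string;                    // prompt forwarded to TangoFlux
  collisionPrompt: string;                // prompt for impact sounds
  colorHex: string;                       // dominant color, e.g. "#d02020"
}

interface SceneAnalysis {
  objects: SceneObject[];
  musicStyle: string;                     // prompt forwarded to Lyria
}

// Fail fast on a malformed response instead of breaking the spawn
// pipeline mid-session.
function parseSceneAnalysis(json: string): SceneAnalysis {
  const data = JSON.parse(json);
  if (!Array.isArray(data.objects) || typeof data.musicStyle !== "string") {
    throw new Error("Unexpected scene analysis shape");
  }
  return data as SceneAnalysis;
}
```

Keeping every field in one schema is what lets a single round-trip drive detection, sound design, and music direction at once.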
Challenges we ran into
- Converting 2D bounding boxes to 3D world positions using synced depth frames
- Prompt engineering: getting Gemini to write sound prompts that actually produce good audio from TangoFlux
- Orchestrating many async operations (vision, sound gen, music gen, VFX) into a smooth experience
- Tuning hand interaction to feel natural (velocity-based sound triggering, hysteresis, cooldowns)
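The 2D-to-3D challenge reduces to a pinhole unprojection once you have a synced depth sample for the box center. A minimal sketch, assuming a simple pinhole model with hypothetical intrinsics (`fx`, `fy`, `cx`, `cy`); the actual Spectacles camera API is different:

```typescript
// Assumed camera intrinsics (focal lengths and principal point, pixels).
interface Intrinsics { fx: number; fy: number; cx: number; cy: number; }

// Lift a pixel coordinate plus its metric depth sample into a
// camera-space 3D point using the pinhole model.
function unproject(
  u: number, v: number,   // pixel coords of the bounding-box center
  depth: number,          // depth at (u, v), in meters
  k: Intrinsics
): [number, number, number] {
  const x = ((u - k.cx) / k.fx) * depth;
  const y = ((v - k.cy) / k.fy) * depth;
  return [x, y, depth];
}
```

The resulting point still has to be transformed from camera space into world space with the device pose at capture time, which is where the frame/depth syncing mattered.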
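The hand-interaction tuning in the last bullet can be sketched as a small state machine: a sound fires only when hand speed crosses an upper threshold, re-arms only after speed drops below a lower one (hysteresis), and never re-fires within a cooldown window. The thresholds here are illustrative, not our tuned values:

```typescript
// Hedged sketch of velocity-based sound triggering with hysteresis and
// a cooldown. Threshold values are illustrative placeholders.
class ShakeTrigger {
  private armed = true;
  private lastFire = -Infinity;

  constructor(
    private readonly fireAbove = 1.2,   // m/s: speed that triggers a sound
    private readonly rearmBelow = 0.4,  // m/s: speed that re-arms the trigger
    private readonly cooldownMs = 500   // minimum gap between sounds
  ) {}

  // Called every frame with the tracked hand speed; returns true when
  // a sound should play this frame.
  update(speed: number, nowMs: number): boolean {
    if (this.armed && speed > this.fireAbove && nowMs - this.lastFire >= this.cooldownMs) {
      this.armed = false;
      this.lastFire = nowMs;
      return true;
    }
    if (!this.armed && speed < this.rearmBelow) this.armed = true;
    return false;
  }
}
```

The hysteresis gap between the two thresholds is what stops a hand hovering near one threshold from machine-gunning the sound effect.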
Accomplishments that we're proud of
One Gemini call handles detection, labeling, sound design, color, and music direction, all via a single structured schema. The result is a fully interactive soundscape where every session is unique and the interactions feel genuinely physical.
It was also very fun to record the video 🎬
What we learned
Gemini 3 Flash's structured output is perfect for AR: complex multi-field responses with zero parsing issues. Generative audio is now fast enough for interactive use. And the combination of vision AI, generative sound, particle VFX, and hand tracking creates something that feels like magic.
What's next for LEGO AR
Combining this with LEGO Smart Play: physical sensors inside the bricks plus AR and generative AI around them. A LEGO city that reacts from the inside and comes alive through your glasses. All the pieces exist today.
Built With
- google-gemini-3-flash
- google-lyria
- lens-studio
- replicate-api
- snap-spectacles
- spectacles-interaction-kit-(sik)
- typescript
