Inspiration
As a software engineer and a newer stand-up comedian in NYC, I’ve felt how slow the comedy feedback loop can be: write a joke, wait days for an open mic, try it on stage, hear silence, rewrite, and repeat. Refining even a solid five minutes can take months.
That frustration led me to build Room Sense, an earlier AI-native performance lab for comedians. Room Sense focused on the “Green Room” experience: comedians could upload a set, get a structured breakdown of its logic and construction, and then jam in real time with four specialized Gemini-powered voice agents, one each for premise, structure, editing, and performance.
Building Room Sense taught me that commentary alone is not enough. Comedians do not just need notes after the fact. They need to feel how material lands in the moment. That raised a new question: what if, instead of only talking to coaching agents, a comedian could rehearse in front of a live virtual room that reacts in real time?
Room Sense Live explores that idea. Instead of practicing to silence, comedians can test jokes in a simulated room that listens, reacts, and offers feedback as the set unfolds.
What it does
Room Sense Live is an AI-native performance lab where comedians rehearse their sets in front of a live virtual audience.
As the performer speaks, the system listens through the microphone and reacts like a real crowd, triggering laughs, oofs, claps, and emotional cues while the room’s energy rises and falls.
The project introduces three modes:
Performance Mode simulates a live audience that reacts in real time as the comedian delivers a set.
Compare Mode runs the same performance through multiple audience personas simultaneously so comedians can see how different crowds interpret the same joke.
Analysis Mode breaks down the performance and allows the comedian to ask individual personas what they thought of specific jokes.
Instead of guessing whether a joke works, comedians can test it with a room first.
How we built it
Room Sense Live is built using Node.js and Express with a browser-based frontend and real-time audio interaction powered by Gemini Live.
The system captures microphone input and streams audio to Gemini models that analyze the spoken set and trigger reactions using tool calls.
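To make the tool-call step concrete, here is a minimal sketch of how a tool call coming back from the model might be turned into a reaction event. The tool name `trigger_reaction`, the `{ name, args }` call shape, and the `type` and `intensity` arguments are assumptions for illustration, not the project's actual schema.

```javascript
// Hypothetical reaction vocabulary, based on the reactions named in this write-up.
const REACTIONS = new Set(["laugh", "oof", "clap"]);

// Convert one tool call (assumed shape: { name, args }) into a reaction event.
// Returns null when the call is not a recognized reaction.
function toReactionEvent(call, now = Date.now()) {
  if (!call || call.name !== "trigger_reaction") return null;
  const { type, intensity } = call.args ?? {};
  if (!REACTIONS.has(type)) return null;
  // Clamp intensity into [0, 1]; fall back to 0.5 when it is missing or invalid.
  const level = Number.isFinite(intensity)
    ? Math.min(1, Math.max(0, intensity))
    : 0.5;
  return { type, intensity: level, at: now };
}
```

Validating and clamping at this boundary means the sound and visual layers never have to handle an unknown reaction type or an out-of-range intensity.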
Audience reactions are expressed through layered sound effects, energy visualization, and visual room effects such as brightness shifts, subtle camera shake, and environmental filters.
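As a rough illustration of the visual side, the sketch below maps a room energy value to a brightness level and a camera-shake amount. It assumes energy is a number between 0 and 1; the function name `roomEffects` and all constants (the 0.6 brightness floor, the 0.8 shake threshold) are made up for this example.

```javascript
// Map room energy (assumed range 0..1) to visual parameters.
// All constants here are illustrative, not the project's tuned values.
function roomEffects(energy) {
  // Treat invalid input as a cold room, then clamp into [0, 1].
  const e = Number.isFinite(energy) ? Math.min(1, Math.max(0, energy)) : 0;
  // A cold room is dimmer (0.6); a hot room reaches full brightness (1.0).
  const brightness = 0.6 + 0.4 * e;
  // Camera shake only kicks in for big moments, above 0.8 energy.
  const shakePx = Math.max(0, e - 0.8) * 10;
  return {
    brightness,
    shakePx,
    // A CSS filter string a browser frontend could apply to the room element.
    cssFilter: `brightness(${brightness.toFixed(2)})`,
  };
}
```

In a browser, the result could be applied with something like `roomElement.style.filter = roomEffects(energy).cssFilter`, where `roomElement` is whatever element renders the room.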
Each persona operates on an independent Gemini Live channel, allowing the same performance input to produce different reactions simultaneously.
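The fan-out idea can be sketched as follows: one channel is opened per persona, and every audio chunk is forwarded to all of them. The `openChannel` factory is a hypothetical stand-in for whatever opens a live session; it is assumed to return an object with a `send(chunk)` method and is not a real Gemini SDK call.

```javascript
// Create one channel per persona and forward each audio chunk to all of them.
// `openChannel(persona)` is a hypothetical factory supplied by the caller.
function createFanOut(personas, openChannel) {
  const channels = new Map(personas.map((p) => [p, openChannel(p)]));
  return {
    // Send the same chunk to every persona's channel.
    send(chunk) {
      for (const channel of channels.values()) channel.send(chunk);
    },
    // List the personas currently attached.
    personas() {
      return [...channels.keys()];
    },
  };
}

// Usage with in-memory fake channels that just record what they receive.
const received = {};
const fanOut = createFanOut(["comedy club", "corporate event"], (persona) => {
  received[persona] = [];
  return { send: (chunk) => received[persona].push(chunk) };
});
fanOut.send("chunk-1");
fanOut.send("chunk-2");
```

Because the channels share nothing except the input, each persona can react on its own schedule without affecting the others.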
For deeper insights, a separate analysis endpoint generates structured feedback about the set using a Gemini reasoning model.
Together, these components turn a simple microphone input into a responsive rehearsal environment.
Challenges we ran into
The biggest challenge was creating the feeling of a “live room” rather than a simple chatbot interaction.
Real audiences react quickly, and reproducing that responsiveness required careful handling of streaming audio, reaction triggers, and UI feedback.
Another challenge was designing reactions that felt believable but still deterministic enough for demo scenarios. This required combining tool calls, energy modeling, and controlled prompts to produce consistent audience responses.
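One way to keep energy modeling deterministic is to make it a pure function of the previous energy, the elapsed time, and the incoming reaction. The sketch below uses exponential decay plus a fixed bump per reaction type; the model, the names `stepEnergy`, `DECAY_SECONDS`, and `BUMPS`, and every constant are assumptions for illustration, not the project's actual tuning.

```javascript
// Illustrative constants: how fast the room cools and how much each reaction moves it.
const DECAY_SECONDS = 5;
const BUMPS = { laugh: 0.3, clap: 0.2, oof: -0.15 };

// Advance room energy by dtSeconds, then apply an optional reaction.
// Pure function: the same inputs always give the same output.
function stepEnergy(energy, dtSeconds, reaction = null) {
  // With no reactions, the room cools exponentially toward 0.
  const decayed = energy * Math.exp(-dtSeconds / DECAY_SECONDS);
  // Scale the bump for this reaction type by its intensity; unknown types add nothing.
  const bump = reaction ? (BUMPS[reaction.type] ?? 0) * reaction.intensity : 0;
  // Keep energy inside [0, 1].
  return Math.min(1, Math.max(0, decayed + bump));
}
```

Since there is no randomness or hidden state, replaying the same scripted sequence of reactions in a demo produces the same energy curve every time.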
Balancing immersion with useful feedback was also tricky. The interface had to support performance, experimentation, and analysis without overwhelming the performer.
Accomplishments that we're proud of
One of the most exciting results is that the system genuinely feels like a room.
When a joke lands, the audience reacts and the room energy rises. When a joke misses, the room cools. That dynamic feedback changes how practicing feels.
We are also proud of the persona comparison feature, which demonstrates how the same joke can land differently depending on the audience.
Finally, the project shows how multimodal AI can move beyond text interfaces and become part of interactive creative environments.
What we learned
This project reinforced how deeply context shapes humor.
A joke isn’t just about the words. Timing, tone, and audience expectations all influence how it lands.
It also showed how powerful real-time multimodal AI can be when used as part of a creative tool rather than just a conversational interface.
Designing AI that reacts to performance rather than simply answering questions opens up new possibilities for rehearsal, coaching, and experimentation.
What's next for Room Sense Live
There are many directions to explore next.
Future versions could include more diverse audience personas, richer emotional signals, and deeper analysis of joke structure and callbacks.
Another direction is using real audience data to train more realistic reaction models.
Ultimately, Room Sense Live could evolve into a creative rehearsal platform not only for comedians, but also for public speakers, storytellers, and performers who want to test how their ideas land with different audiences.
