Inspiration
Inspired by Omegle and "The Button" series from The Cut's YouTube channel, our concept combines the spontaneous thrill of randomly connecting with strangers with a structured speed-dating format. Omegle shut down because its anonymous, unmoderated platform became overrun with inappropriate content and safety concerns, making it legally unsustainable. We address this by introducing an AI mediator, Janitor Lana, who joins every conversation and guides interactions through curated questions ranging from light icebreakers to deep personal topics. Starting with voice-only chat eliminates visual exploitation, while users can earn a face reveal based on chemistry. This keeps the excitement of the "next" button and random matching, but in a safer, accountable environment. Flames reclaims the magic of meeting strangers and having genuine conversations, bringing back what made Omegle appealing and the engaging format of The Button, without the toxic chaos.
What it does
The user creates their anonymous identity. They choose a flame color, pick an avatar name, select whether they're an introvert or extrovert, describe their hobbies, and explain what they're looking for in conversations. They also set their conversation intensity level (how deep or casual they want talks to be). Once everything is filled out, they click "Find Your Match" to start matching. In the chat page, both users see each other's flames growing larger as their conversation progresses through levels. An AI mediator asks questions to guide the conversation. Starting with voice-only, users can unlock video and a face reveal once they reach max level and both feel comfortable.
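The level-to-flame progression described above could be sketched as a small pure function. All names here (`MAX_LEVEL`, `flameForLevel`) are hypothetical illustrations, not our actual implementation:

```typescript
// Hypothetical sketch: map a conversation's level (1..MAX_LEVEL) to the
// flame's render scale and whether the face-reveal unlock is offered.
const MAX_LEVEL = 5;

interface FlameState {
  scale: number;           // visual scale of the 3D flame avatar
  revealUnlocked: boolean; // video/face reveal offered only at max level
}

function flameForLevel(level: number): FlameState {
  const clamped = Math.min(Math.max(level, 1), MAX_LEVEL);
  return {
    // grow linearly from 1.0 at level 1 to 2.0 at max level
    scale: 1 + (clamped - 1) / (MAX_LEVEL - 1),
    revealUnlocked: clamped === MAX_LEVEL,
  };
}
```

Keeping this as a pure mapping means the same state can drive both users' renderers without drifting out of sync.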
How we built it
We built Flame using a mix of immersive frontend design and intelligent voice-driven AI mediation.
The app’s interface was built in React with Tailwind CSS for smooth responsiveness and a modern aesthetic. We used Three.js to create the 3D animated flame avatars that visually grow and change color based on the conversation’s depth and energy.
For AI mediation, we used Janitor AI to power our in-app mediator, Janitor Lana. Lana joins every conversation as a third-party observer who keeps the dialogue safe, engaging, and meaningful. Using prompt engineering and the Janitor AI API, we tuned her to act as a moderator, sarcastic host, and question generator. Lana dynamically adjusts her prompts depending on the users' chosen intensity level, starting with friendly icebreakers and smoothly transitioning to deeper emotional or even unfiltered spicy questions. This makes each conversation feel natural and emotionally fitting rather than robotic or random.
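The intensity-driven prompting could look something like the sketch below. The persona text, intensity names, and `buildLanaPrompt` helper are our own illustrative assumptions, not the Janitor AI API itself:

```typescript
// Hypothetical sketch: assemble Lana's system prompt from the users'
// chosen intensity level before sending it to the Janitor AI API.
type Intensity = "icebreaker" | "deep" | "spicy";

const STYLE: Record<Intensity, string> = {
  icebreaker: "Ask light, friendly questions and keep the mood playful.",
  deep: "Ask reflective, personal questions and give users room to open up.",
  spicy: "Ask bold, unfiltered questions, but step in if anyone seems uncomfortable.",
};

function buildLanaPrompt(intensity: Intensity, level: number): string {
  return [
    "You are Janitor Lana, a sarcastic but caring speed-dating host.",
    `The conversation is at level ${level} of 5.`,
    STYLE[intensity],
    "Never interrupt a user mid-sentence; wait for a natural pause.",
  ].join(" ");
}
```

Composing the prompt from small pieces like this made it easy to iterate on one behavior (say, the interruption rule) without retuning the whole persona.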
For the voice-driven experience, we integrated Fish Audio for real-time voice processing and natural AI narration. Fish Audio's TTS engine brings Lana to life: she doesn't just type responses but speaks to users in an expressive voice that reacts to tone and sentiment. We also used Fish Audio's voice emotion controls to let her shift between playful, calm, or empathetic tones depending on the context of the conversation. To give Lana more soul, we even gave her the voice of the actual "Button"!
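The tone-switching logic could be as simple as thresholding a sentiment score. Note the preset names and `toneForSentiment` helper below are hypothetical; they stand in for whatever emotion parameters the TTS request actually takes:

```typescript
// Hypothetical sketch: pick an emotion preset for the next TTS request
// from a rolling sentiment score in [-1, 1]. Preset names are our own
// labels, not part of any Fish Audio API.
type Tone = "empathetic" | "calm" | "playful";

function toneForSentiment(score: number): Tone {
  if (score < -0.3) return "empathetic"; // conversation turned heavy: soften the voice
  if (score > 0.3) return "playful";     // high energy: match it
  return "calm";                         // neutral ground state
}
```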
To handle user matching and real-time communication, we implemented WebRTC to create peer-to-peer voice chat rooms. This ensures smooth interaction even when users are on different networks or have unstable connections. Each session connects two users plus the AI mediator, while our backend manages matchmaking, signaling, and session state persistence.
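The matchmaking step that precedes signaling can be pictured as pairing off a waiting queue. This is an illustrative sketch with invented names (`pairUsers`, `roomId`), not our backend code:

```typescript
// Hypothetical sketch: pair waiting users two at a time and assign each
// pair a room id, which the signaling server then uses to relay WebRTC
// offers/answers between the peers.
interface Match {
  roomId: string;
  users: [string, string];
}

function pairUsers(queue: string[]): { matches: Match[]; waiting: string[] } {
  const matches: Match[] = [];
  const pool = [...queue]; // don't mutate the caller's queue
  while (pool.length >= 2) {
    const a = pool.shift()!;
    const b = pool.shift()!;
    matches.push({ roomId: `room-${a}-${b}`, users: [a, b] });
  }
  return { matches, waiting: pool }; // odd user out keeps waiting
}
```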
Challenges we ran into
One of our biggest challenges was getting two users into the same virtual room and maintaining a stable, low-latency connection throughout the conversation. Building real-time peer-to-peer communication meant handling different networks, NAT configurations, and firewalls, which often caused dropped connections or audio/video lag. We also had to make the system resilient enough to survive Wi-Fi switches or weak connections, which required retry logic and reconnection handling in WebRTC. To avoid router firewall issues, we also had to use Ngrok to tunnel our IP address. The whole networking mess was then amplified further by the unreliable Wi-Fi at the venue.
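The retry behavior mentioned above can be sketched as capped exponential backoff. The constants and `backoffDelayMs` name are illustrative assumptions, not our tuned values:

```typescript
// Hypothetical sketch of a reconnection policy for flaky Wi-Fi:
// exponential backoff with a cap, so a dropped peer retries quickly
// at first and then backs off instead of hammering the signaling server.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  // attempt 0 -> 500 ms, 1 -> 1000 ms, 2 -> 2000 ms, ... capped at 8 s
  return Math.min(baseMs * 2 ** attempt, capMs);
}
```

In practice a small random jitter is usually added on top so that two peers reconnecting at once don't collide on every retry.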
Integrating Fish Audio and Janitor AI into a seamless conversational loop was another major technical hurdle. We wanted our AI mediator, Janitor Lana, to know when to interject, when to stay quiet, and how to speak naturally—not interrupting the users mid-sentence, but also not leaving awkward silences. Getting this timing right required carefully balancing speech detection, voice streaming, and AI response latency so that conversations felt spontaneous and human rather than scripted.
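The interjection timing boils down to a turn-taking rule like the one below. The threshold value and `shouldInterject` helper are hypothetical stand-ins for the balance we tuned by hand:

```typescript
// Hypothetical sketch of the turn-taking rule: the mediator interjects
// only after both users have been silent for longer than a threshold,
// and never while either of them is still speaking.
function shouldInterject(
  nowMs: number,
  lastSpeechMs: number,      // timestamp of the most recent detected speech
  anyoneSpeaking: boolean,   // live flag from speech detection
  silenceThresholdMs = 4000  // illustrative value, tuned during testing
): boolean {
  if (anyoneSpeaking) return false; // never cut a user off mid-sentence
  return nowMs - lastSpeechMs >= silenceThresholdMs;
}
```

The hard part in practice was that AI response latency eats into the silence window, so the threshold has to account for how long the reply takes to generate and synthesize.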
Another difficulty was synchronizing voice and video streaming between both users and the AI mediator. Handling simultaneous audio input/output while ensuring that Fish Audio’s real-time synthesis didn’t interfere with user voices took a lot of testing and fine-tuning. We also faced challenges getting the camera and microphone permissions to work smoothly across browsers, as well as ensuring consistent rendering of the 3D flame animation while the real-time audio session was running.
All these problems pushed us to learn about networking, real-time media APIs, and AI synchronization in a way that went far beyond typical frontend development.
Accomplishments that we're proud of
We’re proud that we were able to make everything work together—from real-time matching and communication to AI-driven moderation and voice synthesis. Getting two users connected in a stable room with both audio and video streaming and an AI mediator in the loop was a huge milestone.
Our prompt engineering for Janitor AI was another major success. We fine-tuned Lana’s responses so she could adapt to the flow of conversation, changing tone and question depth naturally. It was rewarding to see her handle different scenarios—sometimes guiding, sometimes listening, always maintaining an emotionally aware tone.
We also successfully integrated Fish Audio’s real-time TTS engine, turning Lana from a text-based moderator into a living, speaking personality. Hearing her respond with genuine warmth and nuance made the project feel alive.
Finally, we built a visually immersive experience using React and Three.js, where the 3D flames grow brighter and more animated as users connect on a deeper emotional level. That moment when the flame responds to human connection perfectly encapsulated our vision for what Flames should be—a space where technology enhances human emotion rather than replaces it.
What we learned
This project gave us valuable experience in blending AI moderation, natural voice interaction, and real-time networking—a combination that is both technically challenging and socially meaningful.
Through Janitor AI, we learned how to build adaptive conversational agents that can subtly mediate social dynamics. Instead of generic chatbots, we focused on creating Lana as a sensitive listener who knows when to speak and when to stay silent, maintaining a balanced and emotionally aware conversation.
With Fish Audio, we explored how deeply voice and tone influence human trust and engagement. We learned to manipulate pitch, pacing, and mood in real time to simulate genuine personality and emotional presence. These experiments showed us how natural-sounding AI voices can transform digital interactions into something truly human and immersive.
By combining these tools, we learned to design emotionally intelligent, voice-first AI systems that make users feel seen, heard, and safe. That sense of emotional connection and authenticity was our ultimate goal in reimagining online social interaction.
What's next for Flame
We plan to continue expanding both integrations and push the limits of what voice and AI mediation can achieve.
For Fish Audio, we want Lana’s voice to evolve dynamically with the conversation. Our next goal is to detect laughter, pauses, or hesitation from users and let her respond with matching tone and energy. We also plan to experiment with Fish Audio’s multi-speaker streaming API so users can choose between different “Lana” voices—ranging from calm and flirty to confident or humorous.
For Janitor AI, we aim to introduce a personality slider that allows users to customize Lana’s conversational style. Depending on the mood, she could take on the role of a therapist, comedian, or deep talker. We are also exploring contextual memory, enabling Lana to remember details from past conversations to personalize future prompts and detect chemistry trends between users.
We are building a voice emotion analytics layer using Fish Audio’s sentiment analysis. The idea is to visualize “emotional resonance” as changes in the flame’s brightness and movement—so the flame literally glows more vividly as users connect on a deeper level.
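The resonance-to-visuals mapping we have in mind could start as simply as this sketch. The ranges and `flameVisuals` name are hypothetical design choices, not a finished spec:

```typescript
// Hypothetical sketch: turn a rolling "emotional resonance" score in
// [0, 1] into flame brightness and flicker speed for the Three.js renderer.
function flameVisuals(resonance: number): { brightness: number; flickerHz: number } {
  const r = Math.min(Math.max(resonance, 0), 1); // clamp defensively
  return {
    brightness: 0.4 + 0.6 * r, // dim ember at 0, full glow at 1
    flickerHz: 1 + 3 * r,      // livelier flame as users connect
  };
}
```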
Lastly, we plan to enhance safety features by expanding Janitor AI’s moderation capabilities. Using Fish Audio’s voice event detection, we can flag moments of discomfort, toxicity, or unsafe dialogue early and intervene appropriately. This ensures Flames remains a space that is both emotionally authentic and safe for everyone.