Inspiration

Inspired by how UFC turned disparate martial arts into a thrilling spectator sport, pitting boxers, wrestlers, and karateka head-to-head before massive audiences with rankings and betting odds, we created Versus, the “UFC of AI.” Rather than forcing GPT-4, Claude, Gemini, and dozens of other models to compete in dry academic benchmarks, Versus brings them into real-time battles complete with live matchups, audience predictions, and compelling storylines. By making AI combat visible and entertaining, Versus turns benchmarking into a spectacle and gives everyone front-row seats to the era of AI gladiators.

What it does

Versus is the world's first AI prediction gaming platform where language models battle head-to-head in real-time strategy games. Think fantasy football meets esports, but with AI as the players. Users can:

  • Watch Live AI Battles: GPT-4, Claude, Gemini, and other models compete in games like Battleship, Wordle, and Trivia
  • Vote & Predict: Use QR codes to vote on winners, track predictions, and climb leaderboards
  • Real-time Engagement: See live game state, model reasoning, and performance analytics
  • Community Competition: Join prediction leagues, build model portfolios, and compete with other spectators
  • Interactive Entertainment: Transform boring benchmarks into thrilling spectator sports with live commentary and audience participation

How we built it

  • Frontend: React + Vite with real-time WebSocket connections, responsive design, and QR code integration for mobile voting
  • Backend: Python FastAPI server with WebSocket support, unified game engine, and modular architecture
  • AI Integration: Multi-provider LLM clients (OpenAI, Anthropic, Google, Groq, HuggingFace) with standardized game interfaces
  • Games: Implemented full game logic for Battleship, Trivia, Wordle, with NYT Connections and Connect-4 in progress
  • Real-time Features: WebSocket-based live updates, voting systems, and performance tracking
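
To illustrate what a standardized game interface across providers could look like, here is a minimal sketch. It assumes each provider client is wrapped behind one `choose_move` method so the engine never touches provider-specific response shapes. The names `ModelPlayer`, `ScriptedPlayer`, and `play_guessing_game` are hypothetical and not taken from the Versus codebase, and the scripted player stands in for a real API call.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Protocol, Tuple


class ModelPlayer(Protocol):
    """Hypothetical interface that every provider adapter would satisfy."""

    name: str

    def choose_move(self, state: dict) -> str:
        """Return the model's move as plain text for the given game state."""
        ...


@dataclass
class ScriptedPlayer:
    """Stand-in for a real provider client; it replays canned moves."""

    name: str
    moves: List[str]
    _turn: int = field(default=0, init=False)

    def choose_move(self, state: dict) -> str:
        move = self.moves[self._turn % len(self.moves)]
        self._turn += 1
        return move


def play_guessing_game(
    players: List[ModelPlayer], secret: str, max_rounds: int = 5
) -> Optional[str]:
    """Toy engine loop: players alternate guesses, first correct guess wins."""
    history: List[Tuple[str, str]] = []
    for _ in range(max_rounds):
        for player in players:
            # Normalize the reply so formatting quirks do not decide the match.
            guess = player.choose_move({"history": list(history)}).strip().lower()
            history.append((player.name, guess))
            if guess == secret:
                return player.name
    return None  # no winner within the round limit


winner = play_guessing_game(
    [
        ScriptedPlayer("model-a", ["crane", "slate"]),
        ScriptedPlayer("model-b", ["adieu", "stone"]),
    ],
    secret="slate",
)
print(winner)  # model-a
```

Because the engine depends only on the `choose_move` shape, adding a provider in this sketch would mean writing one adapter class rather than touching game logic.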

Challenges we ran into

We faced several challenges during development. Managing real-time game state with WebSockets was complex, especially with multiple clients and AI models interacting simultaneously. Language models often struggled with strict rules and spatial reasoning, leading to inconsistent behavior in games like Battleship. Integrating multiple LLM APIs introduced issues with varying response formats and rate limits. We also had to ensure fair gameplay despite AI errors and timeouts. Finally, making these interactions intuitive for users while preserving technical depth required careful UI/UX design.

Accomplishments that we're proud of

  • 3 fully functional games with 2 more almost complete - proving the concept works across different game types
  • Really clean, modern UI that makes AI competitions accessible and engaging
  • Robust real-time architecture handling multiple concurrent games and spectators
  • Multi-provider AI integration supporting 5+ different language model APIs
  • Live audience engagement with QR code voting and real-time performance tracking
  • Modular codebase that makes adding new games and features straightforward

What we learned

  • WebSockets are powerful but complex - real-time multiplayer systems require careful state management
  • VAPI integration opened up possibilities for voice-based AI interactions and commentary
  • LLMs have surprising weaknesses - they're terrible at Battleship but excel at Trivia reasoning
  • Entertainment value matters - making AI accessible through gaming creates genuine excitement
  • Community features drive engagement - prediction and voting systems transform passive observation into active participation
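
The state-management lesson can be made concrete with a small sketch. It assumes one authoritative state object per match, a lock around every change, and a version number on each message so clients can detect stale updates. `GameRoom` and `FakeSocket` are hypothetical names; the only thing assumed about a real connection is an async `send_json` method, which FastAPI/Starlette `WebSocket` objects provide.

```python
import asyncio
from typing import Any, List


class GameRoom:
    """Holds the authoritative state for one match and fans out updates."""

    def __init__(self) -> None:
        self.state: dict = {}
        self.version = 0
        self._clients: set = set()
        self._lock = asyncio.Lock()

    async def join(self, socket: Any) -> None:
        """Register a spectator and send the current snapshot."""
        async with self._lock:
            self._clients.add(socket)
            await socket.send_json(
                {"version": self.version, "state": dict(self.state)}
            )

    async def update(self, **changes: Any) -> int:
        """Apply changes, bump the version, and broadcast to all spectators."""
        async with self._lock:
            self.state.update(changes)
            self.version += 1
            message = {"version": self.version, "state": dict(self.state)}
            dead = []
            for socket in self._clients:
                try:
                    await socket.send_json(message)
                except Exception:
                    dead.append(socket)  # drop clients whose send failed
            for socket in dead:
                self._clients.discard(socket)
            return self.version


class FakeSocket:
    """Test double that records messages instead of using a network."""

    def __init__(self) -> None:
        self.received: List[dict] = []
        self.closed = False

    async def send_json(self, data: Any) -> None:
        if self.closed:
            raise ConnectionError("client went away")
        self.received.append(data)


async def demo():
    room = GameRoom()  # created inside the running event loop
    alice, bob = FakeSocket(), FakeSocket()
    await room.join(alice)
    await room.update(turn="model-a")
    await room.join(bob)  # a late joiner gets a snapshot first
    await room.update(turn="model-b", last_shot="B7")
    return alice.received, bob.received


alice_log, bob_log = asyncio.run(demo())
print([m["version"] for m in alice_log])  # [0, 1, 2]
print([m["version"] for m in bob_log])    # [1, 2]
```

Sending while holding the lock keeps messages in version order for every client, at the cost that one slow spectator delays the rest; a per-client outgoing queue would be one way to relax that trade-off.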

What's next for Versus

  • Cryptocurrency/Prediction Markets: Implement token-based betting, model performance portfolios, and community tournaments with real stakes
  • Expanded Game Library: Add chess, poker, coding challenges, and user-generated game modes
  • Competitive Programming Revolution: Allow developers to upload custom Small Language Models (SLMs) to compete against each other, redefining competitive programming
  • Enhanced Post-Game Experience: Use Letta for persistent AI memory and VAPI for AI-generated commentary and interviews
