Hire Or Fire

Landing page
Tutorial tab, press on the purple question mark icon for full tutorial
Choose mode menu
Create lobby with role and experience level
Waiting room: can transfer party leader, kick players, and invite via link or ID
Invite people easily! Anyone in the lobby can invite, not just the party owner
Screens like this to really make it feel like a game
Behavioural question (the first question is the same for all players, the second is a follow-up tailored to your answer)
Example of a follow up question tailored to my answer
Another example of follow up questions where I said I used bug spray to debug
Behavioural answering (the video is not being saved dw!)
Scores are calculated using STAR for behavioural and factoring in things like time complexity for the coding part
Leaderboard updated after each round
Timer between rounds to make it feel like game
Multiple choice round (2 lives quick fire round)
IDE tab in the technical practical round (there's syntax highlighting, can add files in different languages and print code)
Text tab in the technical practical round (you can style text, and styling is preserved when you paste copied text)
Draw tab in the technical practical round (downloads as png)
Top person gets hired, only one person gets hired in this market (there is falling confetti and sound effect)
You're fired (bad sound effect)
Jackbox style answer comparing, shows best and worst answer, in quick fire round
Jackbox style answer comparing, shows best and worst answer
Final rankings! Cooked made a comeback
Last minute cram tips

Link to try out game

https://codejam25-production-57ae.up.railway.app/landing
We recommend 3+ people, as the UI looks better. If it stops working, I may have run out of money on my API key; I only put in a few dollars.

Inspiration

Interviews are stressful, and prepping for them can be awkward, boring, and lonely. Everyone here has probably blanked on a question, stumbled through an answer, or felt unprepared. We wanted to make practice more engaging: a way to challenge yourself, play with others, and actually enjoy leveling up your skills. That spark led to HIRE OR FIRE.

What it does

HIRE OR FIRE is a fast-paced multiplayer interview game where you compete in real time against others preparing for the same role. The gameplay flows like this:

Lobby Creation: Create or join a game lobby with a link or code. Assign a party owner, manage players, and sync everyone for real-time online play. The whole lobby competes together for the same role and level.
Behavioural Round: Answer a STAR-style question on camera, competing against the rest of the lobby. The first question is the same for everyone and is tailored to the lobby's role and level (e.g., Backend Intern). You then get a follow-up question based entirely on your first answer, so it differs from everyone else's.
Theory Round: Rapid-fire multiple-choice questions testing knowledge and speed. Each player has 2 lives and a timer.
Practical Round: Submit answers via IDE, text, or drawing, just like a real interview. Use tools like brushes, fonts, and download options to show your solution creatively.
Final Verdict: Players see who is “HIRED” or “FIRED” in the match.
Match Summary: Highlights of the lobby’s funniest, most dramatic, or impressive moments.
Feedback & Analysis: Personalized report on your performance with tips for improving next game or for the real deal.

How We Built It

Frontend

React + TypeScript SPA structured by game phases, with a shared global state synced through REST + WebSockets.
Lobby system supports real-time player management, ownership, join codes, and synchronized phase transitions.
Phase UIs for behavioural, theory, and practical rounds with timers, instant feedback, and smooth progression.
WebSocket channel /ws/lobby/{id} keeps all players aligned on questions, scores, and countdowns.

Backend Architecture

Tech Stack

FastAPI + SQLAlchemy + WebSockets + OpenAI API (GPT-4o-mini)

Architecture

FastAPI + SQLAlchemy power the match engine, persistent match state, and question pools. LobbyManager, Match, and PhaseManager coordinate lifecycle, scoring, and real-time updates. QuestionManager loads seeded questions and auto-generates missing role/level sets, storing them in the DB for future matches.
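
The pool-then-generate behaviour described for QuestionManager can be sketched roughly as follows. This is a minimal illustration, not the project's code: `QuestionPool`, `get_questions`, and `fake_generate` are hypothetical names, a plain dict stands in for the database, and the generator function stands in for the LLM call.

```python
from typing import Callable

Key = tuple[str, str]  # (role, level)


class QuestionPool:
    """Hypothetical stand-in for the DB-backed pools; a dict plays the database."""

    def __init__(self, seeded: dict[Key, list[str]],
                 generate: Callable[[str, str], list[str]]):
        self._store = dict(seeded)
        self._generate = generate  # assumed to wrap an LLM call in practice

    def get_questions(self, role: str, level: str) -> list[str]:
        key = (role, level)
        if key not in self._store:
            # Pool miss: generate once, then keep the result for future matches.
            self._store[key] = self._generate(role, level)
        return self._store[key]


calls: list[Key] = []


def fake_generate(role: str, level: str) -> list[str]:
    """Placeholder generator that records how often it is invoked."""
    calls.append((role, level))
    return [f"Describe a project relevant to a {level} {role} role."]


pool = QuestionPool(
    {("backend", "intern"): ["Tell me about a time you fixed a bug."]},
    fake_generate,
)
seeded = pool.get_questions("backend", "intern")   # served from the seeded pool
first = pool.get_questions("frontend", "junior")   # generated and stored
second = pool.get_questions("frontend", "junior")  # served from the store
```

The point of the design is that the slow, paid generation step runs at most once per role/level combination.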

Core Components

LobbyManager (lobby/manager.py): In-memory lobby state, WebSocket connection pooling, match lifecycle
WebSocket System (router.py): /ws/lobby/{lobby_id} handles 20+ message types (submit_answer, ready_to_continue, game_end, etc.)
Game State (database/game_state.py): JSON-based match state in OngoingMatch.game_state for flexible schema evolution
PhaseManager (game/phase_manager.py): Tracks phases (behavioural → technical_theory → technical_practical), timers, completion detection
Scoring (game/*_scoring.py): LLM-powered judges (BehaviouralJudge, TheoreticalJudge, PracticalJudge) evaluate answers via OpenAI API
QuestionManager (game/question_manager.py): Database pools + LLM fallback for missing role/level combinations
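
As an illustration of the phase tracking and completion detection listed above, here is a small sketch. Only the three phase names come from this writeup; the `PhaseTracker` class, its `submit` method, and the "advance when every player has submitted" rule are assumptions made for the example, and timers are left out.

```python
PHASES = ["behavioural", "technical_theory", "technical_practical"]


class PhaseTracker:
    """Hypothetical, simplified phase tracker (no timers, no persistence)."""

    def __init__(self, players):
        self.players = set(players)
        self.index = 0
        self.submitted = set()

    @property
    def current(self):
        # None means the match has run through every phase.
        return PHASES[self.index] if self.index < len(PHASES) else None

    def submit(self, player) -> bool:
        """Record a submission; return True if it completed the phase."""
        if self.current is None or player not in self.players:
            return False
        self.submitted.add(player)
        if self.submitted == self.players:
            # Everyone is done: move to the next phase and reset the tracker.
            self.index += 1
            self.submitted = set()
            return True
        return False


tracker = PhaseTracker(["alice", "bob"])
tracker.submit("alice")             # phase still in progress
advanced = tracker.submit("bob")    # last submission completes the phase
```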

Synchronization

Tracker-based system: In-memory dictionaries (ready_to_continue_tracker, ready_to_continue_podium_tracker, etc.) track player readiness per lobby/phase. When all players ready → broadcast completion message → frontend navigates.
Key sync points: Phase progression, score display, comparison→podium navigation, podium→rankings viewing.
Race prevention: question_request_locks prevent duplicate question generation, scores_calculating flags prevent duplicate calculations.
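
A rough sketch of the two ideas above, readiness tracking and lock-guarded question generation. The names `ready_to_continue_tracker` and `question_request_locks` come from this writeup, but their shapes here (a dict of sets, a dict of `asyncio.Lock`) and the helper functions `mark_ready` and `get_or_generate_question` are assumptions for illustration.

```python
import asyncio
from collections import defaultdict

# Assumed shape: lobby_id -> set of player ids that pressed "continue".
ready_to_continue_tracker: dict[str, set[str]] = defaultdict(set)
# Assumed shape: lobby_id -> lock guarding question generation.
question_request_locks: dict[str, asyncio.Lock] = {}
generated: dict[str, str] = {}
generation_calls = 0


def mark_ready(lobby_id: str, player_id: str, all_players: set[str]) -> bool:
    """Return True once every player is ready (the caller would then broadcast)."""
    ready = ready_to_continue_tracker[lobby_id]
    ready.add(player_id)
    if ready >= all_players:
        ready.clear()  # reset for the next sync point
        return True
    return False


async def get_or_generate_question(lobby_id: str) -> str:
    """Generate a question at most once per lobby, even under concurrent requests."""
    global generation_calls
    lock = question_request_locks.setdefault(lobby_id, asyncio.Lock())
    async with lock:
        if lobby_id not in generated:
            generation_calls += 1
            await asyncio.sleep(0)  # stands in for the slow LLM call
            generated[lobby_id] = f"question for {lobby_id}"
        return generated[lobby_id]


async def demo():
    # Three clients ask for the same lobby's question at the same time.
    return await asyncio.gather(
        *(get_or_generate_question("lobby-1") for _ in range(3))
    )


results = asyncio.run(demo())
```

Without the lock, all three requests would reach the generation step before any of them stored a result.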

Database

ongoing_matches: Central match record with JSON game_state column
*_pool tables: Pre-seeded questions (behavioural, technical_theory, technical_practical) by role/level with difficulty values
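
To show why a JSON game_state column allows the schema to evolve without migrations, here is a minimal sketch using the standard-library `sqlite3` module in place of SQLAlchemy. The table and column names `ongoing_matches` and `game_state` come from this writeup; the `match_id` column, the helper functions, and the example state keys are assumptions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ongoing_matches ("
    "match_id TEXT PRIMARY KEY, game_state TEXT NOT NULL)"
)


def save_state(match_id: str, state: dict) -> None:
    """Serialize the whole state dict into the single JSON column."""
    conn.execute(
        "INSERT OR REPLACE INTO ongoing_matches (match_id, game_state) VALUES (?, ?)",
        (match_id, json.dumps(state)),
    )


def load_state(match_id: str):
    """Return the stored state as a dict, or None if the match is unknown."""
    row = conn.execute(
        "SELECT game_state FROM ongoing_matches WHERE match_id = ?", (match_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


save_state("m1", {"phase": "behavioural", "scores": {}})
state = load_state("m1")
state["scores"]["alice"] = 10
state["lives"] = {"alice": 2}  # a new field needs no table migration
save_state("m1", state)
```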

Dev Tools

DevTools emerged out of necessity: much of our backend logic wasn’t reachable through the standard UI. We built a comprehensive set of frontend rendering endpoints and a custom database GUI so the team could accurately track backend development, verify behavior quickly, and test features end-to-end without friction.

AI Integration

Compact, structured, and reliable evaluation across rounds:

Behavioural Judge: STAR-based LLM scoring with an enforced Pydantic structure for consistently shaped output.
Theory Judge: Pure backend validator (correct/incorrect) for fast, consistent scoring.
Practical Judge: Separate IDE and Text scoring models (correctness, completeness, clarity).
Each component is evaluated independently, then merged into a single verdict with tone bands.

All LLM outputs pass through a strict, thoughtfully designed schema validation to keep results consistent.
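
The kind of schema check applied to a behavioural verdict might look like the sketch below. The project uses Pydantic for this; the example uses a standard-library dataclass as a stand-in so it runs without extra dependencies. The field names, the 0 to 25 range per STAR component, and `parse_verdict` are all assumptions for illustration.

```python
from dataclasses import dataclass

STAR_FIELDS = ("situation", "task", "action", "result")


@dataclass(frozen=True)
class StarVerdict:
    situation: int
    task: int
    action: int
    result: int
    feedback: str

    @property
    def total(self) -> int:
        return sum(getattr(self, name) for name in STAR_FIELDS)


def parse_verdict(raw: dict) -> StarVerdict:
    """Reject anything that does not match the expected shape and ranges."""
    for name in STAR_FIELDS:
        value = raw.get(name)
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not 0 <= value <= 25:
            raise ValueError(f"{name} must be an integer from 0 to 25, got {value!r}")
    feedback = raw.get("feedback")
    if not isinstance(feedback, str) or not feedback.strip():
        raise ValueError("feedback must be a non-empty string")
    return StarVerdict(*(raw[name] for name in STAR_FIELDS), feedback=feedback)


verdict = parse_verdict(
    {"situation": 20, "task": 15, "action": 25, "result": 10,
     "feedback": "Clear actions, weak result."}
)
```

A verdict that fails validation can then be retried or replaced by a fallback score instead of reaching the leaderboard.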

Challenges We Ran Into

Consistent UI
Keeping a cohesive visual identity across the whole game while giving each screen its own personality was tougher than expected. We aligned on a simple design system and reusable components to lock in the fundamentals, then layered small, intentional variations per phase. This lets us handle unpredictable content and different interactions without breaking the look and feel across devices.
LLM Latency and Variability
Timed rounds broke when model calls spiked or outputs weren’t clean JSON, forcing costly re-asks. Fix: a fast model with capped tokens and compact prompts, strict timeouts, UI decoupled from LLM calls, and a fixed JSON contract with a resilient extractor (fence strip, regex, brace-count). Impact: lower tail latency and smooth, uninterrupted gameplay even during inference hiccups.
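
A resilient extractor along the lines described (fence strip, regex, brace-count) could be sketched like this. It is an illustrative reconstruction, not the project's code: `extract_json` and `_first_balanced_object` are hypothetical names, and the order of the fallbacks is an assumption.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def _first_balanced_object(text: str):
    """Return the first {...} span with balanced braces, ignoring braces in strings."""
    start = text.find("{")
    if start == -1:
        return None
    depth, in_string, escaped = 0, False, False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    return None


def extract_json(text: str):
    """Try progressively looser strategies; return a dict or None."""
    candidates = [text.strip()]                 # 1. the reply is already clean JSON
    fenced = FENCE_RE.search(text)
    if fenced:
        candidates.append(fenced.group(1).strip())  # 2. strip a code fence
    balanced = _first_balanced_object(text)
    if balanced:
        candidates.append(balanced)             # 3. brace-count through chatter
    for candidate in candidates:
        try:
            value = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(value, dict):
            return value
    return None


chatty = 'Here you go: {"score": 7, "note": "use {braces}"} hope it helps'
parsed = extract_json(chatty)
```

Returning None instead of raising lets the caller fall back to a default score without interrupting a timed round.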
Evaluating Multiple Formats
Designing judges that handle behavioural answers, free-form text, and code (across different languages) required distinct scoring models. Ensuring consistency in tone, fairness, and point scaling across all formats took several iterations.
Ensuring Tone Control
Our judges needed fun, readable feedback without drifting into chaotic LLM randomness. We implemented tone bands (praise/neutral/roast) tied to numeric score ranges so the personality stays consistent across matches.
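
Tying tone to numeric score ranges can be as simple as a lookup table. The three band names come from this writeup; the 0 to 100 scale, the thresholds, and the `tone_for` helper are assumptions chosen for the example.

```python
# (minimum score, tone), checked from highest to lowest. Thresholds are assumed.
TONE_BANDS = [(80, "praise"), (50, "neutral"), (0, "roast")]


def tone_for(score: float) -> str:
    """Map a score on an assumed 0-100 scale to a feedback tone."""
    clamped = max(0, min(100, score))  # out-of-range scores still get a band
    for floor, tone in TONE_BANDS:
        if clamped >= floor:
            return tone
    return "roast"  # unreachable with a 0 floor, kept as a safe default


tones = [tone_for(s) for s in (95, 80, 79.9, 10, -5)]
```

Because the band is chosen in code rather than by the model, the same score always yields the same tone instruction in the feedback prompt.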
Multiplayer Sync & Retries
Under load, some clients advanced or scored out of sync; we tightened server broadcasts, added targeted retries, and aligned transitions (e.g., unanimous skip gating, current-score sync) so the room moves together even with spotty networks.

Accomplishments that we're proud of

Multiplayer that actually works: Real-time sync across browsers, so you can join with your friends from any browser.
AI that feels personal: Follow-up questions aren't generic; they dig into your specific answer, making every game unique.
Polish under pressure: We shipped a fully playable minimum viable product with creative UI, sound effects, animations, and analytics in under 2 days.
Scalable architecture: The modular phase system makes it straightforward to add new round types (video interviews, whiteboarding).

What we learned

WebSockets are hard: Real-time sync requires careful state management, heartbeat pings, and reconnection logic.
LLMs are unpredictable: Even with strict prompts, you need fallback parsing and validation for AI-generated content.
UX matters: A countdown isn't just a timer; it's a psychological tool that builds tension and keeps players engaged.
Multiplayer design is a game-changer: Competing against a friend makes practice feel less like work and more like a sport.

What's next for Hire Or Fire

Custom question set and lobby by job URL: We are adding support for generating a full interview lobby directly from a job posting URL. The system will scrape and summarize the job description, extract required skills, responsibilities, and seniority signals, then auto-build a tailored question set that matches the role. Candidates will enter a lobby where every prompt, scoring rubric, and difficulty level is aligned with that specific posting. This creates role-specific interviews that feel intentional rather than generic, and it lets companies test exactly what they care about without writing their own questions.
TTS Text-Voice Feedback: We plan to introduce natural-sounding text-to-speech feedback so interviews feel like live sessions rather than static text exchanges. After each answer, the system will speak the evaluation aloud, highlight strengths, point out missing details, and coach the candidate on improvement. This creates a more immersive simulation of a real interview environment, helps users practice verbal pacing and presence, and turns the platform into a richer training tool rather than a simple quiz engine.
Battle Royale Mode: We are exploring a competitive mode where groups of candidates enter a shared lobby and face progressive elimination each round. Questions increase in difficulty or shift format as the pool narrows, and custom round types let hosts mix technical, behavioral, rapid-fire, or scenario-based challenges. The mode creates a high-energy, tournament-style experience that’s ideal for events, recruiting fairs, or team assessments, and it highlights top performers under pressure.

Built With

fastapi
git
npm
openai
pydantic
react
react-router
sqlalchemy
tailwind
venv

Submitted to

McGill CodeJam 15

Created by

I built the AI integration for tailored follow-up questions, adaptive scoring, personalized feedback, and role- and level-based question selection. We used an LLM instead of regex so the system actually understands language, not patterns. To keep outputs consistent, I designed structured prompt templates, separate judges for each round, and Pydantic validation. Two users giving the same answer always get the same score. I also tuned the tone of the feedback myself, so if it calls you NGMI or suggests applying to McDonalds, that is my fault.

Since LLMs are slower and cost more, we cache all new roles and questions in the database and use a cheap model with tested templates.

I also built the technical practical tab on the frontend with a Monaco IDE, typing and drawing modes, file downloads, and multi language support, plus the backend routing for that flow.

Chantal Zhang
I worked on the front-end including landing pages and UI design choices

KevinJiayeLiu Liu
Ray Bao
Wired
Malak Oualid