Inspiration
1 in 36 children is diagnosed with autism. Millions more live with social anxiety. The common thread: everyday conversations are the hardest part - not exams, not job skills, but the unscripted social moments that neurotypical people navigate on autopilot.
Existing solutions are either clinical (expensive therapy sessions with months-long waitlists), robotic (scripted social skills apps that feel nothing like real life), or passive (watching videos about what to do). None of them let you actually practice a real conversation safely, get real-time feedback on social dynamics, and review what happened afterward.
We built SocialScript because practice is the most evidence-backed intervention for social skill development, yet there is nowhere safe to do it.
What it does
SocialScript is a conversational simulator where users practice real workplace social scenarios with characters who have names, emotions, and dynamic reactions.
- Navigate a workplace map: walk your character to different rooms (break room, meeting room, desk, hallway) to enter different scenarios
- Set your own goal: describe the situation you want to practice (e.g., "my manager keeps taking credit for my work" or "I need to ask for a deadline extension")
- Have a real conversation: the AI character speaks to you via ElevenLabs voice synthesis, you respond by voice (Web Speech API) or text
- Get live social signals: after each exchange, a real-time mood indicator shows how the character is feeling and why (e.g., "Marcus seems skeptical - he expected you to push back harder")
- Choose from structured, tagged responses: besides speaking, there are structured response options labeled with their communication approach (Direct, Empathetic, Diplomatic, etc.), so users learn why certain responses work
- Review with annotated replay: the reflection page lets you tap through each turn with the social signal from that moment, plus AI-generated feedback on strengths, improvements, and phrases to try
- Track progress in a private journal: all sessions save locally with feedback summaries so users can revisit conversations and track their growth
But why these decisions, specifically? Our social signals are based on Social Thinking methodology (Winner, 2007) - research shows explicitly naming what others are thinking builds social cognition better than memorizing scripts (Crooke et al., 2016). Our strategy-tagged options draw from the PEERS intervention (Laugeson et al., 2012), where explicit strategy labeling transfers to real-world settings significantly better than implicit learning. The CBT-structured reflection mirrors meta-analysis-backed approaches for social anxiety (Mayo-Wilson et al., 2014).
The UI is sensory-conscious by design. Low-contrast warm tones, dark mode, and reduced-motion support reflect research on sensory processing differences in autism populations (Marco et al., 2011). The calm, non-gamified aesthetic is intentional - gamification could trigger performance anxiety in exactly this population (Mazurek et al., 2015). The private localStorage journal with no accounts and "this isn't a score" language on the reflection page exist to reduce the social threat that already makes practice hard.
How we built it
Architecture: React + Vite frontend, FastAPI backend, Google Gemini 2.5 Flash for conversation AI, ElevenLabs for text-to-speech.
The conversation engine is built on structured prompt engineering. Each AI response returns labeled fields (CHARACTER, SCENE, DIALOGUE, MOOD, SIGNAL, OPTIONS) parsed via regex on the backend. This separation lets us:
- Only speak the dialogue (not narration) through TTS
- Drive the character's facial expression from the MOOD field
- Show narration as silent scene-setting text
- Display social signals as coaching overlays
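The label-to-next-label split can be sketched in a few lines of Python. This is a minimal illustration, not the production parser - the exact field layout and error handling in SocialScript's backend may differ:

```python
import re

# Fields the model is instructed to emit in a "FIELD: value" layout.
FIELDS = ["CHARACTER", "SCENE", "DIALOGUE", "MOOD", "SIGNAL", "OPTIONS"]

def parse_response(raw: str) -> dict:
    """Split a labeled model response into named sections.

    Each section runs from its label to the next label (or end of text),
    so multi-line values like SCENE narration survive intact.
    """
    labels = "|".join(FIELDS)
    pattern = re.compile(
        rf"^({labels}):\s*(.*?)(?=^(?:{labels}):|\Z)",
        re.MULTILINE | re.DOTALL,
    )
    return {m.group(1): m.group(2).strip() for m in pattern.finditer(raw)}

raw = """CHARACTER: Marcus
MOOD: skeptical
DIALOGUE: I hear you, but the roadmap is locked.
SIGNAL: Marcus seems skeptical - he expected you to push back harder.
OPTIONS: (Direct) Restate your ask | (Diplomatic) Offer a trade-off"""

parsed = parse_response(raw)
```

With the fields separated, only `parsed["DIALOGUE"]` is sent to TTS, `parsed["MOOD"]` drives the avatar, and `parsed["SIGNAL"]` becomes the coaching overlay.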
Challenges we ran into
AI output was unpredictable: the model kept returning placeholder text like [Your Name], narrating its own stage directions out loud, and suggesting responses to the wrong message. The root cause was swapped conversation labels, where the AI confused the user and the character. We fixed it after multiple rounds of prompt restructuring.
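The gist of the fix is to label every turn in the history with a concrete, unambiguous name before it reaches the model. A hypothetical sketch (the marker strings and function name are illustrative, not SocialScript's actual prompt):

```python
# Hypothetical sketch of history formatting with unambiguous role labels,
# so the model cannot confuse who said what or echo a placeholder back.

def format_history(turns: list[dict], character_name: str) -> str:
    lines = []
    for turn in turns:
        # Label each line with a concrete name, never a placeholder
        # like [Your Name] that the model might repeat verbatim.
        if turn["role"] == "user":
            speaker = "USER (the person practicing)"
        else:
            speaker = f"{character_name} (the character you play)"
        lines.append(f"{speaker}: {turn['text']}")
    return "\n".join(lines)

history = [
    {"role": "user", "text": "Can we talk about the launch credit?"},
    {"role": "character", "text": "Sure, what's on your mind?"},
]
prompt_context = format_history(history, "Marcus")
```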
Browser speech recognition is fragile: the mic would cut out after half a second, transcribe everything three times, or auto-send before the user finished talking. Making the Web Speech API usable required continuous mode, separate tracking of interim vs. final results, and explicit confirmation before triggering send.
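The recognition itself runs in the browser, but the committed-vs-interim bookkeeping is language-agnostic. A minimal Python sketch of the idea (class and method names are illustrative, not SocialScript's real code):

```python
# Sketch of the state management behind interim vs. final speech results.
# Interim results repeat and refine earlier text, so naively appending
# them is what produces "transcribed everything three times".

class TranscriptBuffer:
    """Accumulate speech results without duplicating interim text."""

    def __init__(self) -> None:
        self.committed = ""   # final results, appended exactly once
        self.interim = ""     # best current guess, replaced on every event

    def on_result(self, text: str, is_final: bool) -> None:
        if is_final:
            # A final result is committed and the interim guess cleared.
            self.committed += text
            self.interim = ""
        else:
            # Interim results overwrite each other instead of accumulating.
            self.interim = text

    def display(self) -> str:
        return self.committed + self.interim

buf = TranscriptBuffer()
buf.on_result("can we ", is_final=False)
buf.on_result("can we talk", is_final=False)   # refines, doesn't append
buf.on_result("can we talk ", is_final=True)
buf.on_result("about the deadline", is_final=False)
```

Nothing is sent to the backend until the user explicitly confirms, which is what prevents the premature auto-send.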
The AI wouldn't commit to the scenario: if a user set up "my boss has a crush on me," the AI would just respond professionally. It understood the situation but wouldn't act it out. We had to reframe the prompt so the AI knows its job is to create the situation, not just acknowledge it.
Accomplishments that we're proud of
- Voice in, voice out, with reactive facial expressions: feels like a real conversation, not a chatbot
- Strategy-tagged response options teach communication patterns implicitly: users learn when to be Direct vs. Diplomatic without explicit instruction
- Annotated replay turns every session into a reviewable learning artifact
- Full voice pipeline, dynamic character system, structured AI parsing, session persistence, and reflection
What we learned
- Prompt design matters as much as code: small wording changes in AI prompts caused completely different conversation quality. We treated prompts with the same rigour as application code.
- Voice input needs careful state management: browser speech recognition has quirks like duplicate transcriptions and premature cutoffs. Getting it to feel natural required tracking committed vs. interim text separately.
- Matching AI tone to the scenario makes practice feel real: users engage more when the AI character actually acts out the situation instead of just responding generically. Adding mood, facial expressions, and social signals made a noticeable difference.
What's next for SocialScript
- Multiplayer practice mode: two users paired together with an AI coach observing and providing signals to both sides
- Custom scenario builder: for example, educators can create specific scenarios for their students
- Difficulty progression: AI characters that adapt based on the user's demonstrated skill level across sessions
Built With
- fastapi
- gemini
- python
- react
- typescript
