Inspiration ✨
Networking events move fast. You can meet dozens of people in a day, swap contacts, and still forget the most important part: what made the conversation meaningful. In typical interactions, you keep contact info, but not real context: what you talked about, what you promised, where you met, or even what the person looked like. We built Converge to turn those fleeting moments into durable memory you can actually act on.
What it does 🤝
Converge is an AI-powered personal CRM for professional networking. It records real conversations at events and transforms them in real time into rich, searchable connection profiles.
Converge currently works as a mobile + web experience, but our long-term goal is to bring it to AR glasses so networking can stay hands-free, present, and human. The point isn’t to replace real connection; it’s to protect it by capturing context in the background so you can stay fully engaged in the moment.
With Converge you can:
- Record a networking conversation (audio + video).
- Automatically extract a structured profile: name, company, role, topics discussed, challenges, and follow-up hooks.
- Capture visual memory: appearance + environment, plus a face embedding to help you remember the person later.
- Review and approve a draft profile, editing anything the AI got wrong (especially low-confidence fields).
- Browse your network in a clean connections grid and open a full detail view with conversation context and follow-ups.
Query your network naturally via voice/text like:
- “Who was that VC I talked to first today?”
- “Who did I meet at NexHacks who works in AI?”
- “Person in a blue shirt by the Stripe booth”
How we built it 🏗️
We designed Converge as a real-time capture + parallel processing pipeline.
Mobile
- Recording flow (“New Connection” + event/location context)
- Live recording indicator + preview
Frontend (React)
- Draft review + inline editing UI
- Connections grid + connection detail view
- Auth flow + protected routes
Backend (Express)
- REST API for auth + connections (merge, approve, discard, list, detail)
- WebSocket streaming integration
- JWT middleware for protected endpoints
Audio pipeline (LiveKit)
- Streams audio and produces transcript + extracted profile fields
- Uses speaker diarization to distinguish between different speakers in the conversation
- Returns confidence scores per field (high/medium/low)
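The per-field confidence buckets feed directly into the review flow. The sketch below shows one plausible way to bucket numeric scores and collect the fields needing manual review; the thresholds (0.85 / 0.6) and field shapes are assumptions for illustration, not LiveKit's actual output format.

```javascript
// Map a raw score to the high/medium/low buckets shown in the review UI.
function bucket(score) {
  if (score >= 0.85) return "high";
  if (score >= 0.6) return "medium";
  return "low";
}

// Annotate each extracted field and collect the ones needing review.
function annotate(fields) {
  const annotated = {};
  const needsReview = [];
  for (const [name, { value, score }] of Object.entries(fields)) {
    const confidence = bucket(score);
    annotated[name] = { value, confidence };
    if (confidence === "low") needsReview.push(name);
  }
  return { annotated, needsReview };
}

const { annotated, needsReview } = annotate({
  name: { value: "Ada Lovelace", score: 0.95 },
  company: { value: "Analytical Engines", score: 0.7 },
  role: { value: "VC?", score: 0.3 },
});
console.log(annotated.name.confidence); // "high"
console.log(needsReview); // ["role"]
```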
Video pipeline (Overshoot)
- Streams video frames
- Detects primary face + tracks subject
- Generates a face embedding vector
- Produces appearance + environment descriptions
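The face embedding is what makes "have I met this person before?" answerable: two embeddings of the same face should sit close together in vector space. In production that comparison happens inside Atlas Vector Search; the toy sketch below just illustrates the underlying cosine-similarity math with made-up 3-dimensional vectors (real embeddings are much higher-dimensional).

```javascript
// Cosine similarity: 1.0 means identical direction, 0 means orthogonal.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const seenBefore = [0.1, 0.8, 0.3];  // embedding stored at a past event
const newFace = [0.12, 0.79, 0.31];  // embedding from the current stream
console.log(cosine(seenBefore, newFace) > 0.95); // true: likely the same person
```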
Database (MongoDB Atlas)
- Stores profiles in `connections`
- Stores activity in `interactions`
- Uses Atlas Vector Search for face embedding similarity search (stretch-ready)
- Text indexes for fast search over names, companies, topics, and summaries
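The queries shown earlier ("Who did I meet at NexHacks who works in AI?") resolve against those text indexes. Below is a toy in-memory stand-in for that lookup: matching a term against names, companies, topics, and summaries. The real app delegates this to MongoDB's text search; the sample data here is invented for illustration.

```javascript
// Hypothetical sample connections; the real documents live in MongoDB Atlas.
const connections = [
  { name: "Ada Lovelace", company: "Analytical Engines", topics: ["AI"], summary: "Met at NexHacks." },
  { name: "Grace Hopper", company: "Flow-Matic", topics: ["compilers"], summary: "Talked about debugging." },
];

// Case-insensitive match across all searchable fields.
function search(term) {
  const t = term.toLowerCase();
  return connections.filter((c) =>
    [c.name, c.company, c.summary, ...c.topics].some((f) => f.toLowerCase().includes(t))
  );
}

console.log(search("AI").map((c) => c.name)); // ["Ada Lovelace"]
console.log(search("NexHacks").map((c) => c.name)); // ["Ada Lovelace"]
```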
Merge logic
- Combines LiveKit (audio) + Overshoot (visual) outputs into one draft profile
- Adds context (event, location, timestamp)
- Flags low-confidence fields for manual review
- Saves the result as a `draft` until the user approves
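The merge step above can be sketched as a pure function: combine the audio-derived and video-derived fields with event context into one draft, flagging low-confidence fields for review. Field names and shapes here are illustrative, not our actual schema.

```javascript
// Merge audio (LiveKit) and visual (Overshoot) outputs into one draft profile.
function mergeDraft(audio, visual, context) {
  const fields = { ...audio.fields, ...visual.fields };
  const lowConfidence = Object.entries(fields)
    .filter(([, f]) => f.confidence === "low")
    .map(([name]) => name);
  return {
    status: "draft", // stays a draft until the user approves
    ...context,      // event, location, timestamp
    fields,
    lowConfidence,   // surfaced in the review UI
  };
}

const draft = mergeDraft(
  { fields: { name: { value: "Ada", confidence: "high" }, role: { value: "VC?", confidence: "low" } } },
  { fields: { appearance: { value: "blue shirt", confidence: "medium" } } },
  { event: "NexHacks", location: "Stripe booth", timestamp: "2025-01-18T10:30:00Z" }
);
console.log(draft.status); // "draft"
console.log(draft.lowConfidence); // ["role"]
```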
Challenges we ran into 😤
- Synchronizing audio + video pipelines: Audio and video are streamed live in parallel processes, so we needed a clean way to merge async outputs into one profile without losing context.
- Confidence + review UX: Extracted data isn’t always perfect, so we built a review flow that highlights low-confidence fields and makes edits fast.
- Streaming reliability: Permissions, WebSockets, and stable streaming intervals (250 ms audio, ~200 ms video feedback) can break easily, especially across multiple integrations.
- Schema design under time pressure: We needed a schema that supports MVP today and stretch features we wish to add (follow-ups, analytics, vector search) without a rewrite.
Accomplishments that we're proud of 🏆
- Built an end-to-end system that turns a conversation into a draft, reviewable profile with real context, not just contact fields.
- Integrated parallel AI processing: LiveKit for transcription and the voice agent, and Overshoot for real-time visual analysis.
- Designed a MongoDB schema + indexes that support search, filtering, and future voice + vector features.
- Delivered a demo-ready story: record -> extract -> approve -> remember -> follow up.
What we learned 📚
- The hardest part of “AI apps” isn’t calling a model, it’s building the system around it: streaming, merging outputs, and handling uncertainty gracefully.
- Confidence scores only matter if the UI makes them actionable.
- A strong hackathon MVP comes from a tight pipeline and a clear demo narrative: capture -> memory -> action.
What's next for Converge 🚀
- Smart follow-up suggestions: Generate personalized outreach messages that reference the actual conversation and promised actions, integrated into social platforms
- Analytics dashboard: Track follow-ups, events attended, and how your network evolves over time.
- LinkedIn enrichment: Match and enrich profiles with verified background info (with manual fallback).
- Network graph visualization: Explore your network by clusters (industry, event, shared topics) to spot key relationships faster.
Built With
- deepgram
- express.js
- face-api
- fnjs
- livekit
- mongodb
- node.js
- overshoot
- react