ThirdParty

ThirdParty is a relationship mirror and guided journaling app for TreeHacks 2026.

Stack

Next.js 14 App Router
TypeScript
Anthropic SDK + strict JSON validation with zod
Local JSON storage in data/

Quick start

Install dependencies
```
npm install
```
Environment
```
cp .env.example .env.local
```
Edit .env.local and set the keys you need:
- OPENAI_API_KEY=sk-... (required for Voice tab transcription + diarization)
- ANTHROPIC_API_KEY=... (for mediator/reflections; optional)
Never commit .env.local or paste keys into the repo.
Run the app
```
npm run dev
```
Open http://localhost:3003. Use the Voice tab to upload or record audio and transcribe it with speaker labels.

Optional: real speaker IDs (ECAPA-TDNN)

For persistent “who is this voice?” identification across sessions (not just placeholder labels):
- Install Python 3 and pip, then run the embedder in a second terminal:
```
npm run embedder
```
  Or manually:
```
cd services/speaker_embedder
pip install -r requirements.txt
uvicorn app:app --host 0.0.0.0 --port 5000
```
- In .env.local add (or uncomment):
```
SPEAKER_EMBEDDER_URL=http://localhost:5000/embed
```
- Restart npm run dev. The Voice page will show “ECAPA-TDNN” when the embedder is reachable.

Optional: audio conversion (webm/mp3 → WAV)

The app uses ffmpeg-static (installed with npm install) to convert uploads to 16 kHz mono WAV for best embedder results. No separate ffmpeg install needed. If conversion fails (e.g. unsupported format), transcription still runs; speaker IDs may be less accurate without the embedder.

Run locally (summary)

Same as Quick start above: npm install → copy .env.example to .env.local and set keys → npm run dev → optional npm run embedder + SPEAKER_EMBEDDER_URL.

Conversation awareness and Meta glasses

The app now includes a conversation-awareness detector and recording pipeline:

POST /api/conversationAwareness/listen:
- body: { "listeningEnabled": true | false }
GET /api/conversationAwareness/state:
- returns detector state, recent sessions, and recent events
POST /api/conversationAwareness/ingestSignal:
- body: { "source": "microphone" | "meta_glasses" | "phone_camera", "audioLevel": 0..1, "presenceScore": 0..1, "speakerHints": [{ personTag, speakingScore }] }
POST /api/conversationAwareness/uploadClip:
- body: { "sessionId": "...", "audioBase64": "...", "mimeType": "audio/webm" }
POST /api/metaGlasses/ingest:
- body: { "deviceId": "...", "audioLevel": 0..1, "speakerHints": [{ personTag, speakingScore }] }
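
As a rough illustration of the request bodies listed above, the following TypeScript sketch builds a body for POST /api/conversationAwareness/ingestSignal and clamps every score into the documented 0..1 range before sending. The helper names (clamp01, buildIngestSignalBody, toIngestRequest) are hypothetical and not part of the app, the 0..1 range for speakingScore is an assumption, and the response shape is not documented here, so the sketch stops at preparing the request.

```typescript
// Signal sources listed for POST /api/conversationAwareness/ingestSignal.
type SignalSource = "microphone" | "meta_glasses" | "phone_camera";

interface SpeakerHint {
  personTag: string;
  speakingScore: number; // assumed to be 0..1, like the other scores
}

interface IngestSignalBody {
  source: SignalSource;
  audioLevel: number; // documented as 0..1
  presenceScore: number; // documented as 0..1
  speakerHints: SpeakerHint[];
}

// Hypothetical helper: force a score into the 0..1 range (NaN becomes 0).
function clamp01(value: number): number {
  if (Number.isNaN(value)) return 0;
  return Math.min(1, Math.max(0, value));
}

// Hypothetical helper: return a copy of the body with all scores clamped.
function buildIngestSignalBody(input: IngestSignalBody): IngestSignalBody {
  return {
    source: input.source,
    audioLevel: clamp01(input.audioLevel),
    presenceScore: clamp01(input.presenceScore),
    speakerHints: input.speakerHints.map((hint) => ({
      personTag: hint.personTag,
      speakingScore: clamp01(hint.speakingScore),
    })),
  };
}

// Hypothetical helper: prepare the URL and init object for a fetch call.
function toIngestRequest(baseUrl: string, body: IngestSignalBody) {
  return {
    url: `${baseUrl}/api/conversationAwareness/ingestSignal`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(buildIngestSignalBody(body)),
    },
  };
}
```

The prepared request can then be sent with fetch(request.url, request.init) from a browser or a recent Node.js runtime. Clamping on the client is a defensive choice for this sketch only; the server may well validate the ranges itself.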

Safety behavior

Facial recognition is not implemented.
Identity is based on consented person tags and speaker hints only.
Raw captured audio is stored locally in data/awareness/clips and is not shared by the shared-session flow.
Phone camera mode computes co-presence and motion scores only. It does not identify people and does not persist video frames.

UI flow

Go to /timeline
Tap the gear icon to open /settings
Start listening to activate microphone monitoring, optional phone camera co-presence monitoring, and detector-triggered recording
Use the Meta glasses signal panel to ingest device-side speaker hints

Voice: Transcribe + Speaker Identification

Two pipelines:

OpenAI + speaker memory (recommended)
Uses OpenAI gpt-4o-transcribe-diarize for transcription and diarization (speaker turns), then speaker embeddings and clustering (cosine similarity, centroid updates) to build persistent “who is this voice?” identities across sessions. Azure Speaker Recognition is not needed; speakers are discovered open-world rather than matched against an enrolled set. See docs/voice-pipeline.md.
Pyannote diarization + speaker memory (optional)
Run local pyannote diarization service and set:
- VOICE_DIARIZATION_BACKEND=pyannote
- PYANNOTE_DIARIZER_URL=http://localhost:5010/diarize
This uses pyannote for speaker-turn detection and keeps the same speaker clustering/persistent profiles pipeline.
Google + Azure (optional)
Google Speech-to-Text for diarization; Azure Speaker Recognition to identify enrolled speakers only.
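
The speaker-memory step of the first pipeline (cosine similarity plus centroid updates) can be sketched roughly as follows. This is a minimal illustration and not the app's actual implementation, which is described in docs/voice-pipeline.md: the SpeakerProfile shape, the assignSpeaker name, the speaker_N ID format, and the 0.7 similarity threshold are all assumptions chosen for the example.

```typescript
// Hypothetical profile shape: one running-mean centroid per known voice.
interface SpeakerProfile {
  id: string;
  centroid: number[];
  count: number; // number of embeddings merged into the centroid
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Match an embedding to the closest profile, or create a new one
// (open-world discovery). The threshold value is an assumption.
function assignSpeaker(
  profiles: SpeakerProfile[],
  embedding: number[],
  threshold = 0.7,
): SpeakerProfile {
  let best: SpeakerProfile | undefined;
  let bestScore = -Infinity;
  for (const profile of profiles) {
    const score = cosineSimilarity(profile.centroid, embedding);
    if (score > bestScore) {
      best = profile;
      bestScore = score;
    }
  }
  if (best !== undefined && bestScore >= threshold) {
    // Centroid update: running mean over all embeddings seen for this voice.
    const n = best.count;
    best.centroid = best.centroid.map((c, i) => (c * n + embedding[i]) / (n + 1));
    best.count = n + 1;
    return best;
  }
  const created: SpeakerProfile = {
    id: `speaker_${profiles.length + 1}`,
    centroid: [...embedding],
    count: 1,
  };
  profiles.push(created);
  return created;
}
```

Calling assignSpeaker once per diarized segment, with the profiles array persisted between sessions, is what would give stable IDs over time in this sketch. A lower threshold merges voices more aggressively, while a higher one creates more new speakers.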

Setup (OpenAI pipeline)

See Quick start above. In short:

Set OPENAI_API_KEY in .env.local (never commit it).
Real speaker IDs (optional): Run npm run embedder in a second terminal (or run the Python service manually; see Quick start). Set SPEAKER_EMBEDDER_URL=http://localhost:5000/embed in .env.local.
Details: docs/speaker-embedding-analysis.md.
Audio conversion: The app uses ffmpeg-static (installed with npm) to convert uploads to WAV 16 kHz mono; no separate ffmpeg install needed.

Setup (Pyannote diarization backend)

In a second terminal run:
- npm run diarizer
- or follow services/pyannote/README.md
In .env.local set:
- VOICE_DIARIZATION_BACKEND=pyannote
- PYANNOTE_DIARIZER_URL=http://localhost:5010/diarize
Setting OPENAI_API_KEY is optional here; keep it set if you want fallback behavior when the local pyannote service is unavailable.
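
One way to picture how these settings interact is the sketch below, which picks a diarization backend from the environment variables named above. The chooseBackend function, its return shape, and the exact fallback rule (use OpenAI when pyannote is not selected or not reachable, and only if a key is present) are assumptions made for illustration; the app's real selection logic may differ.

```typescript
type DiarizationBackend = "openai" | "pyannote";

// Only the variables named in this README; all are optional strings.
interface VoiceEnv {
  VOICE_DIARIZATION_BACKEND?: string;
  PYANNOTE_DIARIZER_URL?: string;
  OPENAI_API_KEY?: string;
}

// Hypothetical result shape for this sketch.
interface BackendChoice {
  backend: DiarizationBackend;
  url?: string; // set only for the pyannote backend
  reason: string;
}

// Hypothetical selection rule: prefer pyannote when it is configured and
// reachable, otherwise fall back to OpenAI if a key is available.
function chooseBackend(env: VoiceEnv, pyannoteReachable: boolean): BackendChoice | null {
  const wantsPyannote = env.VOICE_DIARIZATION_BACKEND === "pyannote";
  if (wantsPyannote && env.PYANNOTE_DIARIZER_URL && pyannoteReachable) {
    return {
      backend: "pyannote",
      url: env.PYANNOTE_DIARIZER_URL,
      reason: "pyannote configured and reachable",
    };
  }
  if (env.OPENAI_API_KEY) {
    return {
      backend: "openai",
      reason: wantsPyannote ? "pyannote unavailable, falling back" : "default backend",
    };
  }
  // Nothing usable: the caller should surface a configuration error.
  return null;
}
```

In a Node.js process the first argument could be process.env, with pyannoteReachable coming from a health check against the diarizer service.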

Setup (Google + Azure)

Google Cloud
- Create a project and enable the Speech-to-Text API.
- Create a service account, download a JSON key, and set in .env.local:
  - GOOGLE_APPLICATION_CREDENTIALS=/absolute/path/to/your-key.json
- Or use gcloud auth application-default login and set GOOGLE_CLOUD_PROJECT=your-project-id.
Azure
- Create a Speech resource and in .env.local set:
  - AZURE_SPEECH_KEY=your-key
  - AZURE_SPEECH_REGION=westus (or your region).
Copy .env.example to .env.local and fill in the keys.

Flow

Voice tab: Choose “OpenAI + speaker memory” (default) or “Google + Azure”, upload or record audio, then select “Transcribe & identify”. With OpenAI, segments get stable speaker IDs over time, and you can name speakers via PATCH /api/voice/speakers. With Google + Azure, first enroll each person under People → person → “Enroll voice”, then transcribe to match segments against the enrolled voices.

About

TreeHacks 2026 project
