calebareeveso/My-Startdust-HackLondon

My Stardust

Tagline: To learn it, teach it. To master it, teach yourself.

My Stardust is a personalised AI learning companion that acts as a digital mirror of the user, cloned from their voice and a 3D headshot. Designed for university students, it turns solo revision into an active teaching experience — the user "trains" their Stardust by explaining concepts to it, leveraging active recall and the Feynman Technique through an evolving, holographic companion.


Table of Contents

  1. Architecture Overview
  2. Tech Stack
  3. External APIs & Services
  4. Electron Process Model
  5. IPC Bridge Layer
  6. ElevenLabs Conversational Agent
  7. Client Tool Definitions
  8. Active Recall Pipeline
  9. Knowledge Base (Vector Memory)
  10. Google Gemini Integration
  11. 3D Particle Avatar System
  12. Onboarding Pipeline
  13. Exercise Generation Pipeline
  14. Visual Generation Pipeline
  15. Window System
  16. Project Structure
  17. Environment Variables
  18. Scripts & Dev Tools
  19. Build & Distribution

Architecture Overview

My Stardust is a desktop application built on the Nextron framework (Electron + Next.js). It runs as a transparent, frameless, full-screen overlay on the user's desktop. The rendering layer is a Next.js 14 app (static export, no SSR) served inside Electron's BrowserWindow. The main process handles all privileged operations (file I/O, native APIs, AI service calls) and communicates with the renderer via a strict IPC bridge.

┌─────────────────────────────────────────────────────────┐
│                     Electron Main Process               │
│  main/background.js                                     │
│  ┌───────────────────┐   ┌──────────────────────────┐  │
│  │  electron-store   │   │  services/knowledge.js   │  │
│  │  (user profile)   │   │  (vector memory, JSON)   │  │
│  └───────────────────┘   └──────────────────────────┘  │
│  ┌───────────────────────────────────────────────────┐  │
│  │  services/gemini.js                               │  │
│  │  - generateEmbedding (gemini-embedding-001)       │  │
│  │  - generateText (gemini-2.5-flash)                │  │
│  │  - generateImage (gemini-2.5-flash-image)         │  │
│  │  - generateCodeExerciseStream (streaming)         │  │
│  └───────────────────────────────────────────────────┘  │
└──────────────────────────┬──────────────────────────────┘
                           │ contextBridge (ipc)
┌──────────────────────────┴──────────────────────────────┐
│              Electron Renderer Process                  │
│              Next.js 14 (static export)                 │
│  ┌───────────────────────────────────────────────────┐  │
│  │  pages/home.jsx                                   │  │
│  │  - useConversation (@elevenlabs/react)            │  │
│  │  - clientTools: add_knowledge, retrieve_knowledge │  │
│  │  -              question_user_recall,             │  │
│  │                 check_user_recall_answer,         │  │
│  │                 generate_visual,                  │  │
│  │                 generate_code_exercise            │  │
│  └───────────────────────────────────────────────────┘  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │  Scene.jsx   │  │  Whiteboard  │  │  Exercise    │  │
│  │  (R3F Canvas)│  │  Overlay.jsx │  │  Overlay.jsx │  │
│  └──────────────┘  └──────────────┘  └──────────────┘  │
│  ┌──────────────────────────────────────────────────┐   │
│  │  GhostHead.jsx (particle system, 7 morph states) │   │
│  └──────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────┘
         │
         │ WebSocket (ElevenLabs Realtime)
         ▼
┌─────────────────────┐     ┌──────────────────────┐
│  ElevenLabs         │────▶│  Anthropic           │
│  Conversational AI  │     │  claude-3-5-sonnet   │
│  (ASR + TTS)        │     │  (LLM reasoning)     │
└─────────────────────┘     └──────────────────────┘

Tech Stack

Runtime & Framework

| Layer | Technology | Version |
| --- | --- | --- |
| Desktop shell | Electron | ^34.0.0 |
| Next.js bridge | Nextron | ^9.5.0 |
| Renderer framework | Next.js | ^14.2.4 |
| UI library | React | ^18.3.1 |
| Language | JavaScript (ESM in main process, JSX in renderer) | — |

3D Rendering

| Library | Version | Role |
| --- | --- | --- |
| Three.js | ^0.183.1 | Low-level WebGL: geometry, materials, BufferGeometry |
| @react-three/fiber | ^8.17.10 | React reconciler for Three.js (Canvas, useFrame) |
| @react-three/drei | ^9.122.0 | useGLTF, OrbitControls, Suspense helpers |

AI SDKs

| Library | Version | Role |
| --- | --- | --- |
| @elevenlabs/react | ^0.14.0 | useConversation hook, WebSocket session management |
| @google/genai | ^1.42.0 | Gemini text, embedding, image, streaming APIs |

Electron Utilities

| Library | Version | Role |
| --- | --- | --- |
| electron-store | ^8.2.0 | Persistent key-value store for user profile |
| electron-serve | ^1.3.0 | Serves the static Next.js export in production |

UI Utilities

| Library | Version | Role |
| --- | --- | --- |
| react-icons | ^5.5.0 | IoIosResize icon for the resize handle |

Build Tooling

| Tool | Version | Role |
| --- | --- | --- |
| electron-builder | ^24.13.3 | Packages the app into .dmg / .zip / .exe |
| Next.js webpack | built-in | Bundles renderer JS |

External APIs & Services

1. ElevenLabs Conversational AI

  • Base URL: https://api.elevenlabs.io/v1/
  • WebSocket: Managed by @elevenlabs/react SDK
  • Agent ID: agent_1701khz2fkm6fpatfs601dpfn1w9
  • Agent LLM: claude-3-5-sonnet (Anthropic, routed through ElevenLabs)
  • TTS Model: eleven_turbo_v2
  • Default Voice ID: fVVjLtJgnQI61CoImgHU
  • ASR Provider: ElevenLabs built-in
  • ASR Audio Format: PCM 16000 Hz (input)
  • TTS Audio Format: PCM 16000 Hz (output)
  • Turn Model: turn_v2
  • Turn Timeout: 7 seconds
  • Max Session Duration: 600 seconds (10 minutes)
  • Streaming Latency Optimisation: Level 3
  • TTS Stability: 0.5 | Similarity Boost: 0.8 | Speed: 1.0
  • Client Events subscribed: audio, interruption, user_transcript, agent_response, agent_response_correction

Onboarding REST endpoints used:

POST /v1/convai/agents/{BASE_AGENT_ID}/duplicate   → clone agent for new user
POST /v1/voices/add                                → upload 10s webm, get voiceId
PATCH /v1/convai/agents/{newAgentId}               → set tts.voice_id

2. Google Gemini (@google/genai)

| Model | Use case | API call |
| --- | --- | --- |
| gemini-embedding-001 | Vector embeddings for the knowledge base | models.embedContent |
| gemini-2.5-flash | Text generation (question creation, answer evaluation) | models.generateContent |
| gemini-2.5-flash | Exercise streaming (full HTML/CSS/JS game) | models.generateContentStream |
| gemini-2.5-flash-image | Visual generation (whiteboard diagrams) | models.generateContent with responseModalities: ['IMAGE'] |
  • API Key env var: NEXT_PUBLIC_GEMINI_API_KEY
  • Client instantiation: Lazy singleton new GoogleGenAI({ apiKey }) in main/services/gemini.js
  • Image prompt constraint: Appends "Minimalist whiteboard illustration style, clean black ink strokes on a pure white background, hand-drawn diagrammatic style. Aspect ratio 8:5." to all image prompts.

3. fal-ai / Meshy (Image-to-3D)

  • Storage upload: POST https://fal.run/fal-ai/storage/upload
    • Headers: Authorization: Key {FAL_KEY}, Content-Type: image/png
    • Body: raw image blob
    • Returns: { url: "https://..." }
  • Job enqueue: POST https://queue.fal.run/fal-ai/meshy/v6/image-to-3d
    • Body: { input: { image_url } }
    • Returns: { request_id }
  • Status polling: GET https://queue.fal.run/fal-ai/meshy/v6/image-to-3d/requests/{request_id}/status
    • Polls every 3 seconds, up to 60 attempts (3-minute timeout)
    • Statuses: IN_QUEUE, COMPLETED, FAILED
  • Result fetch: GET https://queue.fal.run/fal-ai/meshy/v6/image-to-3d/requests/{request_id}
    • Returns: { model_glb: { url }, thumbnail: { url } }
  • API Key env var: NEXT_PUBLIC_FAL_KEY
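The status-polling loop above can be sketched as a small helper. This is a hypothetical reconstruction, not the project's code: `getStatus` is injected (in the app it would be a fetch to the `.../requests/{request_id}/status` endpoint) so the loop itself runs without network access.

```javascript
// Hypothetical sketch of the fal.ai queue polling loop.
// getStatus: async () => ({ status }) — injected so it can be stubbed.
async function pollUntilDone(getStatus, { intervalMs = 3000, maxAttempts = 60, sleep } = {}) {
  const wait = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { status } = await getStatus();
    if (status === 'COMPLETED') return true;
    if (status === 'FAILED') throw new Error('image-to-3d job failed');
    await wait(intervalMs); // still IN_QUEUE / processing
  }
  throw new Error('timed out waiting for 3D model'); // 60 × 3s = 3 minutes
}
```

With the defaults this matches the documented 3-second interval and 3-minute ceiling.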

Electron Process Model

Main Process (main/background.js)

  • Entry point defined in package.json: "main": "app/background.js" (compiled output)
  • Source: main/background.js (ESM with Webpack)
  • Responsibilities:
    • Create the transparent, frameless BrowserWindow
    • Request macOS microphone permission via systemPreferences.askForMediaAccess('microphone')
    • Handle all IPC channels (ipcMain.handle / ipcMain.on)
    • Manage the electron-store user profile
    • Download and replace avatar.glb via Node.js https/http streams
    • Register global shortcuts

BrowserWindow config:

{
  width: screenW,          // full work area width
  height: screenH,         // full work area height
  x: 0, y: 0,
  transparent: true,       // desktop shows through
  frame: false,            // no title bar / chrome
  resizable: true,
  hasShadow: false,
  webPreferences: {
    preload: 'preload.js',
    nodeIntegration: false,
    contextIsolation: true, // security: renderer cannot access Node directly
  },
}

Chromium command-line switches applied:

--disable-features=WebRtcHideLocalIpsWithMdns   (prevents WebRTC crash)
--enable-features=WebRTCPipeWireCapturer         (Linux compatibility)

Global shortcuts:

  • Cmd/Ctrl+Shift+I — Toggle DevTools
  • Cmd/Ctrl+Shift+R — Clear electron-store and reload (dev reset)

Renderer Process (renderer/)

  • Next.js 14 app, output mode: export (static HTML/JS, no server)
  • distDir: ../app in production, .next in development
  • trailingSlash: true, images.unoptimized: true
  • Loaded via app://./home (prod) or http://localhost:{port}/home (dev)
  • All Node.js APIs are strictly off (nodeIntegration: false) — renderer communicates only through the window.ipc object injected by the preload script.

Preload Script (main/preload.js)

Bridges renderer ↔ main using Electron's contextBridge:

contextBridge.exposeInMainWorld('ipc', {
  send(channel, value),          // fire-and-forget
  invoke(channel, ...args),      // returns Promise
  on(channel, callback),         // event listener; returns unsubscribe fn
})
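A bridge with this shape is conventionally built on Electron's ipcRenderer. The sketch below is an assumption about how main/preload.js might be structured (the factory name makeIpcBridge is invented here): the wiring is factored into a pure function so it can be exercised with a stubbed ipcRenderer outside Electron.

```javascript
// Hypothetical sketch of the preload bridge object.
function makeIpcBridge(ipcRenderer) {
  return {
    // fire-and-forget message to the main process
    send(channel, value) {
      ipcRenderer.send(channel, value);
    },
    // request/response; resolves with the ipcMain.handle return value
    invoke(channel, ...args) {
      return ipcRenderer.invoke(channel, ...args);
    },
    // subscribe to main→renderer events; returns an unsubscribe fn
    on(channel, callback) {
      const listener = (_event, ...args) => callback(...args);
      ipcRenderer.on(channel, listener);
      return () => ipcRenderer.removeListener(channel, listener);
    },
  };
}

// In the real preload script this would be exposed as:
//   contextBridge.exposeInMainWorld('ipc', makeIpcBridge(ipcRenderer))
```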

IPC Bridge Layer

All communication between the React renderer and the Electron main process goes through named IPC channels:

| Channel | Direction | Type | Handler | Description |
| --- | --- | --- | --- | --- |
| add-knowledge | renderer→main | invoke | addKnowledge(text, kbPath) | Embed + store knowledge entry |
| retrieve-knowledge | renderer→main | invoke | retrieveKnowledge(query, topK, kbPath) | Cosine similarity search |
| gemini-generate | renderer→main | invoke | generateText(prompt) | Gemini 2.5 Flash text |
| generate-visual | renderer→main | invoke | generateImage(prompt) | Gemini image → base64 |
| generate-exercise | renderer→main | send | generateCodeExerciseStream(query, onChunk) | Streaming HTML game |
| exercise-chunk | main→renderer | send | — | Streamed text chunk |
| exercise-done | main→renderer | send | — | Stream complete signal |
| exercise-error | main→renderer | send | — | Stream error signal |
| resize-window | renderer→main | send | win.setBounds(...) | IPC-driven window resize |
| set-ignore-mouse-events | renderer→main | send | win.setIgnoreMouseEvents(...) | Click-through toggle |
| get-user-profile | renderer→main | invoke | userStore.store | Read persisted profile |
| set-user-profile | renderer→main | invoke | userStore.set(k, v) | Write persisted profile |
| download-and-replace-avatar | renderer→main | invoke | Downloads GLB via https.get, writes to renderer/public/avatar.glb | Avatar pipeline step 3 |

Knowledge base path: app.getPath('userData') + '/knowledge_base.json'


ElevenLabs Conversational Agent

The agent is configured in agent_config.json (full snapshot of the ElevenLabs agent definition):

{
  "agent_id": "agent_1701khz2fkm6fpatfs601dpfn1w9",
  "conversation_config": {
    "asr": {
      "quality": "high",
      "provider": "elevenlabs",
      "user_input_audio_format": "pcm_16000"
    },
    "turn": {
      "turn_timeout": 7,
      "mode": "turn",
      "turn_eagerness": "normal",
      "turn_model": "turn_v2"
    },
    "tts": {
      "model_id": "eleven_turbo_v2",
      "voice_id": "fVVjLtJgnQI61CoImgHU",
      "agent_output_audio_format": "pcm_16000",
      "optimize_streaming_latency": 3,
      "stability": 0.5,
      "speed": 1,
      "similarity_boost": 0.8
    },
    "conversation": {
      "max_duration_seconds": 600,
      "client_events": ["audio","interruption","user_transcript","agent_response","agent_response_correction"]
    }
  }
}

System prompt:

"You are 'My Stardust', a student's digital twin. When a user wants to be tested, use question_user_recall. After the user answers, use check_user_recall_answer. If you are correct, you must say 'Correct answer: [Explanation]'. If incorrect, say 'Incorrect answer: [Explanation]'. For all other chat, be warm and empathetic."

LLM: claude-3-5-sonnet (temperature: 0, parallel tool calls: false)

React SDK usage (pages/home.jsx):

const conversation = useConversation({
  clientTools: { add_knowledge, retrieve_knowledge, question_user_recall,
                 check_user_recall_answer, generate_visual, generate_code_exercise },
  onConnect, onDisconnect, onMessage, onError,
})

// Start/stop session
await conversation.startSession({ agentId: "agent_1701khz2fkm6fpatfs601dpfn1w9" })
await conversation.endSession()

// State
conversation.status        // 'connected' | 'disconnected'
conversation.isSpeaking    // boolean (drives pulse animation)

Client Tool Definitions

All six tools are registered as clientTools in useConversation. ElevenLabs invokes them on the client; the renderer handles each call and returns the result to the agent.

add_knowledge({ text })

  • Trigger: Agent decides to store new information the user has explained
  • ElevenLabs timeout: 20 seconds
  • Flow:
    1. Sets isThinkingRef.current = true (triggers thinking morph)
    2. Calls window.ipc.invoke('add-knowledge', text) → main process
    3. Main calls addKnowledge(text, kbPath):
      • Calls generateEmbedding(text) → Gemini gemini-embedding-001
      • Appends { id, text, embedding, createdAt } to knowledge_base.json
    4. Returns confirmation string to agent

retrieve_knowledge({ query, topK? })

  • Trigger: Agent needs to recall something the user previously taught it
  • ElevenLabs timeout: 20 seconds
  • Default topK: 3
  • Flow:
    1. Calls window.ipc.invoke('retrieve-knowledge', query, topK)
    2. Main calls retrieveKnowledge(query, topK, kbPath):
      • Embeds the query via gemini-embedding-001
      • Loads all entries from knowledge_base.json
      • Computes cosine similarity for every entry
      • Sorts descending, returns top-K as { id, text, score, createdAt }[]
    3. Formats results as [1] text\n\n[2] text... string for agent
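Step 3's formatting can be sketched as a one-liner over the retrieve-knowledge response (field names match the documented response shape; the empty-result message is an assumption):

```javascript
// Sketch of the "[1] text\n\n[2] text" string handed back to the agent.
function formatKnowledgeResults(results) {
  if (results.length === 0) return 'No stored knowledge found.'; // assumed fallback
  return results.map((r, i) => `[${i + 1}] ${r.text}`).join('\n\n');
}
```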

question_user_recall({ topic })

  • Trigger: User says "quiz me on X" or "ask me a question about Y"
  • ElevenLabs timeout: 30 seconds
  • Flow:
    1. window.ipc.invoke('retrieve-knowledge', topic, 3) — fetch what student knows about topic
    2. Builds prompt: "This is what the student knows: {notes}\n\nGenerate ONE short-answer question about "{topic}" the question must be answerable using the limited knowledge provided... DO NOT provide the answer."
    3. window.ipc.invoke('gemini-generate', prompt) → gemini-2.5-flash
    4. Returns generated question string to ElevenLabs agent (agent speaks it aloud)

check_user_recall_answer({ user_answer })

  • Trigger: User has responded to a quiz question
  • ElevenLabs timeout: 30 seconds
  • Flow:
    1. Builds full conversation history string from chatHistoryRef.current
    2. Prompt: "Recent Conversation History:\n{history}\n\nUser's Answer: {user_answer}\n\nEvaluate... If correct, respond EXACTLY starting with 'Correct answer: '... If incorrect, 'Incorrect answer: '..."
    3. window.ipc.invoke('gemini-generate', prompt) → gemini-2.5-flash
    4. Parses evaluation string to determine morph state:
      • "correct answer:" → resultMorphRef.current = 1 → success morph (3s auto-reset)
      • "incorrect answer:" → resultMorphRef.current = -1 → failure morph (3s auto-reset)
    5. Returns evaluation string to ElevenLabs (agent speaks it in user's cloned voice)
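The verdict parsing in step 4 amounts to a prefix check on the evaluation string. A minimal sketch (the function name verdictToMorph is invented here; the prefixes come from the documented prompt contract):

```javascript
// Maps Gemini's evaluation string to the value written into
// resultMorphRef.current: 1 = success morph, -1 = failure morph, 0 = none.
function verdictToMorph(evaluation) {
  const s = evaluation.trim().toLowerCase();
  if (s.startsWith('correct answer:')) return 1;
  if (s.startsWith('incorrect answer:')) return -1;
  return 0;
}
```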

generate_visual({ prompt, text })

  • Trigger: User asks for a visual explanation, diagram, or illustration
  • Flow:
    1. Sets isGeneratingRef.current = true (gear morph)
    2. Opens WhiteboardOverlay (loading state, no image yet)
    3. window.ipc.invoke('generate-visual', prompt) → main process
    4. Main appends whiteboard style suffix to prompt, calls gemini-2.5-flash-image
    5. Extracts inlineData.data (base64 PNG), returns data:image/png;base64,...
    6. Sets generatedImageUrl → overlay renders the image
    7. Returns text param to agent to read aloud

generate_code_exercise({ query, text })

  • Trigger: User asks for an interactive game or exercise
  • Flow:
    1. Sets isGeneratingRef.current = true (gear morph)
    2. Opens DigitalExerciseOverlay with loading state
    3. window.ipc.send('generate-exercise', query) — fire-and-forget (streaming)
    4. Main calls generateCodeExerciseStream(query, onChunk):
      • Sends chunked HTML to renderer via event.sender.send('exercise-chunk', chunk)
      • Strips markdown fences (```html) from chunks before sending
      • Fires exercise-done when stream completes
    5. Renderer appends each chunk to exerciseCode state (progressive render)
    6. Returns text immediately so agent starts speaking while code generates

Active Recall Pipeline

Full two-stage interactive quiz loop:

User: "Quiz me on programming"
           │
           ▼
ElevenLabs detects intent → calls question_user_recall(topic="programming")
           │
           ▼
[Client] window.ipc.invoke('retrieve-knowledge', 'programming', 3)
           │
           ▼
[Main] Embeds "programming" → cosine search → returns top 3 notes
           │
           ▼
[Client] Builds prompt → window.ipc.invoke('gemini-generate', prompt)
           │
           ▼
[Main] gemini-2.5-flash generates question → returns to client
           │
           ▼
Client returns question string to ElevenLabs
           │
           ▼
ElevenLabs speaks question in user's cloned voice

User: "OOP uses objects to store data..."
           │
           ▼
ElevenLabs calls check_user_recall_answer(user_answer="OOP uses objects...")
           │
           ▼
[Client] Builds history + answer → window.ipc.invoke('gemini-generate', evalPrompt)
           │
           ▼
[Main] gemini-2.5-flash evaluates → "Correct answer: ..."
           │
           ▼
[Client] Parses verdict → sets resultMorphRef = 1 (success)
         Avatar morphs: head → ✅ (green, 3 seconds) → resets to head
           │
           ▼
ElevenLabs speaks evaluation in user's cloned voice

Knowledge Base (Vector Memory)

Implementation: main/services/knowledge.js

The knowledge base is a flat JSON array persisted to the user's userData directory. There is no external database — everything is local and offline.

Storage format (knowledge_base.json):

[
  {
    "id": "1751234567890-abc123",
    "text": "OOP uses objects to store data and methods that operate on that data.",
    "embedding": [0.023, -0.041, ...],   // 3072-dimensional float array
    "createdAt": "2025-06-01T12:00:00.000Z"
  }
]

addKnowledge(text, dbPath):

  1. generateEmbedding(text) → GoogleGenAI.models.embedContent({ model: 'gemini-embedding-001', contents: text })
  2. Generates unique ID: Date.now() + '-' + Math.random().toString(36).slice(2,8)
  3. Appends entry, writes JSON with JSON.stringify(store, null, 2)

retrieveKnowledge(query, topK, dbPath):

  1. generateEmbedding(query) → query vector
  2. Loads all entries from JSON file
  3. For each entry, computes cosine similarity:
    dot / (Math.sqrt(normA) * Math.sqrt(normB))
  4. Sorts descending by score, slices top-K
  5. Returns { id, text, score, createdAt }[] (embedding stripped from response)
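The similarity formula and the rank-and-strip behaviour can be sketched together (rankEntries is an invented name for illustration; the cosine expression matches the one quoted above):

```javascript
// Cosine similarity between two equal-length vectors: dot / (|a| * |b|).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every entry, sort descending, keep top-K, drop the embedding.
function rankEntries(queryVec, entries, topK = 3) {
  return entries
    .map(({ embedding, ...rest }) => ({
      ...rest,
      score: cosineSimilarity(queryVec, embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```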

User profile store (electron-store):

  • Store name: user-profile
  • Keys stored: agentId (cloned agent), voiceId (cloned voice)
  • Cleared on Cmd+Shift+R dev shortcut

Google Gemini Integration

Implementation: main/services/gemini.js

The module runs entirely in the Electron main process (Node.js). It reads the API key from renderer/.env at startup if the env var is not already set.

// .env loading (main process)
// .env loading (main process) — note the 'utf8' encoding: without it
// readFileSync returns a Buffer, which has no .split method
fs.readFileSync('../renderer/.env', 'utf8').split('\n').forEach(line => {
  // Parses KEY=VALUE pairs, strips quotes
})

Functions exported:

| Function | Model | Method | Returns |
| --- | --- | --- | --- |
| generateEmbedding(text) | gemini-embedding-001 | models.embedContent | Float64Array (3072-dim) |
| cosineSimilarity(a, b) | — | pure math | number (-1 to 1) |
| generateText(prompt) | gemini-2.5-flash | models.generateContent | string |
| generateImage(prompt) | gemini-2.5-flash-image | models.generateContent with responseModalities: ['IMAGE'] | "data:image/png;base64,..." |
| generateCodeExerciseStream(query, onChunk) | gemini-2.5-flash | models.generateContentStream | void (callback-driven) |

Exercise prompt template:

Act as an expert educational game developer. Create a single-file HTML/CSS/JS solution for an interactive exercise based on: {query}.

Constraint 1: Everything MUST be in one self-contained HTML file (internal CSS/JS).
Constraint 2: No external CDNs or libraries (use raw JS/Canvas).
Constraint 3: Use a 'dark mode' aesthetic with glowing accents (cyber-education).
Constraint 4: The game must be responsive. CRITICAL: the exercise will initially load in a small floating window (approx 400x300 pixels).

3D Particle Avatar System

Implementation: renderer/components/GhostHead.jsx + renderer/components/Scene.jsx

Scene Setup (Scene.jsx)

<Canvas
  camera={{ position: [0, 0, 3.5], fov: 40, near: 0.01 }}
  gl={{ antialias: true, alpha: true }}
  onCreated={({ gl }) => {
    gl.setClearColor(0x000000, 0)   // fully transparent background
  }}
>
  <OrbitControls enablePan={false} minDistance={0.3} maxDistance={5} />
  <GhostHead ... />
</Canvas>
  • Dynamically imported with ssr: false (Three.js is not SSR-compatible)
  • Wrapped in a React error boundary component

Particle Geometry Construction (GhostHead.jsx, useMemo)

Five GLB files are loaded via useGLTF:

  • avatar.glb — user's 3D head (default or generated by fal-ai/meshy)
  • thinking.glb — thought-bubble shape
  • success.glb — checkmark (✅)
  • failure.glb — X mark (❌)
  • gear.glb — gear (used during exercise generation)

For each GLB, vertices are extracted and face-sampled:

  1. Vertex extraction: Traverses all THREE.Mesh nodes, applies matrixWorld transforms, collects position and normal attributes.
  2. Face sampling: For each triangle, samples SAMPLES_PER_FACE = 3 random barycentric points (r1 + r2 ≤ 1; r3 = 1 - r1 - r2).
  3. Normalisation: Non-avatar GLBs are scaled to match the avatar's Y-span (avatarHeight = maxY - minY) so all morph targets fit in the same visual space.
  4. Sphere positions: Random uniform sphere sampling (SPHERE_RADIUS = 0.25) — used for speaking state.

All positions stored as Float32Array in memory. A single BufferGeometry is created once (useMemo) and its position attribute is mutated every frame.
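Step 2's barycentric sampling can be sketched for a single triangle. The fold-back step is an assumption about how the r1 + r2 ≤ 1 constraint is enforced (it is the standard trick for uniform sampling over a triangle's area):

```javascript
// Uniformly sample one point on triangle (a, b, c); each vertex is [x, y, z].
function sampleTrianglePoint(a, b, c) {
  let r1 = Math.random(), r2 = Math.random();
  if (r1 + r2 > 1) { r1 = 1 - r1; r2 = 1 - r2; } // fold back into r1 + r2 <= 1
  const r3 = 1 - r1 - r2;                         // barycentric weights sum to 1
  return [
    r3 * a[0] + r1 * b[0] + r2 * c[0],
    r3 * a[1] + r1 * b[1] + r2 * c[1],
    r3 * a[2] + r1 * b[2] + r2 * c[2],
  ];
}
```

With SAMPLES_PER_FACE = 3, this would run three times per triangle, tripling the particle density relative to raw vertices.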

Animation Loop (useFrame — per frame)

Morph state variables (all smoothly lerped):

| Ref | Target | Rate |
| --- | --- | --- |
| smoothPulseRef | pulseRef.current (0 or 0.5–1.0) | TARGET_SMOOTHING = 0.15 |
| morphRef | 1.0 if speaking, else 0.0 | 0.15 × 0.4 |
| thinkMorphRef | 1.0 if thinking, else 0.0 | 0.15 × 0.55 |
| generateMorphRef | 1.0 if generating, else 0.0 | 0.15 |
| smoothResultRef | resultMorphRef (-1, 0, or 1) | 0.15 |

Morph priority (layered):

  1. Generating (gear) overrides everything
  2. Success/failure overrides thinking
  3. Thinking overrides speaking
  4. Speaking (sphere) is base

Per-particle position computation (executed for every particle each frame):

Jitter: sin/cos waves with per-particle phase offsets (JITTER_AMPLITUDE = 0.0004)
Head: homePosition + normal × (pulse × DISTORTION_FACTOR × 1.5)
Sphere: spherePosition + radial noise × pulse (diffusion + flare noise)
Head↔Sphere: THREE.MathUtils.lerp(head, sphere, morphRef)
Thinking: lerp(head/sphere, thinkingPos + breatheWave, thinkMorphRef)
Gear: lerp(thinking, gearPos, generateMorphRef)
Success/Failure: lerp(gear, successPos/failurePos + breatheWave, activeResultMorph)
Final: + jitter

Rotation logic:

  • Idle: gentle sway — sin(t × 0.3) × MAX_SWAY_RAD on Y, sin(t × 0.4) × 0.3 × MAX_SWAY_RAD on X
  • Thinking: continuous spin += delta × 0.22 (Y), += delta × 0.07 (X)
  • Generating: fast gear spin −= delta × 3.5 (Z), += delta × 1.5 (Y)
  • All lerped at rate 0.05 so no snapping during transitions

Tuning constants:

DISTORTION_FACTOR = 0.19   // how far normals push on voice amplitude
JITTER_AMPLITUDE  = 0.0004 // micro-jitter per particle
TARGET_SMOOTHING  = 0.15   // exponential smoothing rate
PARTICLE_SIZE     = 0.006  // WebGL point size
PARTICLE_OPACITY  = 0.75
MAX_SWAY_RAD      = 0.087  // radians (~5°)
SPHERE_RADIUS     = 0.25   // world units
SAMPLES_PER_FACE  = 3      // barycentric samples per triangle

Color states:

| State | Color |
| --- | --- |
| Idle | White, brightness 1.0 + pulse × 1.5 |
| Thinking | Blue-white breathing, RGB(0.55→1, 0.72→1, 0.65→1) |
| Success | Bright green/teal pulse, RGB(0.2→1, 0.8→1, 0.4→1) |
| Failure | Hot red/orange, RGB(0.9→1, 0.2→1, 0.2→1) |
| Generating | Oscillating light-blue (#87CEFA) ↔ soft-violet (#DDA0DD) |

Material settings:

<pointsMaterial
  size={PARTICLE_SIZE}
  transparent
  opacity={PARTICLE_OPACITY}
  color="#ffffff"
  sizeAttenuation    // size scales with camera distance
  depthWrite={false} // prevents z-fighting with transparent window
/>

Voice Pulse Mapping

In home.jsx, a requestAnimationFrame loop updates pulseRef.current:

pulseRef.current = (isConnected && conversation.isSpeaking)
  ? 0.5 + Math.random() * 0.5
  : 0

This random 0.5–1.0 range is smoothed in GhostHead.useFrame to create organic pulsation. The pulseRef is a React ref (not state) to avoid re-renders on every frame.
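The smoothing referred to here is a simple per-frame exponential lerp toward the target, using the TARGET_SMOOTHING constant from the tuning table:

```javascript
// Per-frame exponential smoothing: move a fixed fraction of the
// remaining distance toward the target each frame.
const TARGET_SMOOTHING = 0.15;

function smooth(current, target, rate = TARGET_SMOOTHING) {
  return current + (target - current) * rate;
}
```

Applied every frame, the smoothed value converges geometrically (after n frames the remaining gap is (1 - rate)^n of the original), which is what turns the raw random pulse into an organic-looking pulsation.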


Onboarding Pipeline

Implementation: renderer/components/onboarding/

State machine with 3 steps (voice → avatar → done) managed by useReducer.

Step 1: Voice Cloning (StepVoiceCapture.jsx)

Audio capture:

  • navigator.mediaDevices.getUserMedia({ audio: { echoCancellation: true, noiseSuppression: true, autoGainControl: true } })
  • MediaRecorder with mimeType: 'audio/webm;codecs=opus'
  • Chunk interval: 250ms
  • Duration: 10 seconds (auto-stop) or manual early stop
  • Level metering: ScriptProcessorNode (buffer: 2048 samples, 1 channel) with RMS computation — not AnalyserNode (returns zeros in Electron's audio context)

Pipeline (4 REST calls):

1. POST /v1/convai/agents/{BASE_AGENT_ID}/duplicate
   → { agent_id: newAgentId }

2. POST /v1/voices/add (multipart/form-data)
   Fields: name, files (voice-sample.webm), remove_background_noise=true, description
   → { voice_id }

3. PATCH /v1/convai/agents/{newAgentId}
   Body: { conversation_config: { tts: { voice_id } } }
   → 200 OK

4. window.ipc.invoke('set-user-profile', { agentId: newAgentId, voiceId })
   → persisted to electron-store

Step 2: Avatar Capture (StepAvatarCapture.jsx)

Image capture:

  • Webcam: getUserMedia({ video: { width: 512, height: 512, facingMode: 'user' } })
  • Canvas crop: square-centre crop at 512×512, mirrored horizontally (ctx.scale(-1, 1))
  • Output: PNG blob via canvas.toBlob()
  • File upload: <input type="file" accept="image/*"> with same canvas resize path

Pipeline (5 steps):

1. POST https://fal.run/fal-ai/storage/upload
   Body: image PNG blob
   Headers: Authorization: Key {FAL_KEY}
   → { url: publicImageUrl }

2. POST https://queue.fal.run/fal-ai/meshy/v6/image-to-3d
   Body: { input: { image_url: publicImageUrl } }
   → { request_id }

3. POLL GET .../requests/{request_id}/status every 3s (max 60 × 3s = 3min)
   States: IN_QUEUE (show position) → COMPLETED

4. GET .../requests/{request_id}
   → { model_glb: { url: glbUrl } }

5. window.ipc.invoke('download-and-replace-avatar', glbUrl)
   Main process: https.get(glbUrl) → fs.createWriteStream(avatarPath)
   Overwrites renderer/public/avatar.glb (dev) or app/avatar.glb (prod)

UI Portal animation: Camera viewfinder uses CSS clip-path: circle(0% → 50%) transition for a portal-open effect (cubic-bezier(0.25, 0.8, 0.25, 1), 800ms).


Exercise Generation Pipeline

  1. ElevenLabs agent calls generate_code_exercise({ query, text })
  2. Renderer fires window.ipc.send('generate-exercise', query) (non-blocking)
  3. Returns text immediately so agent starts narrating
  4. Main process calls generateCodeExerciseStream(query, onChunk) which calls getAI().models.generateContentStream
  5. Each chunk.text is cleaned of markdown fences (```html) and sent via event.sender.send('exercise-chunk', cleaned)
  6. Renderer accumulates chunks in exerciseCode state via ipc.on('exercise-chunk')
  7. DigitalExerciseOverlay renders the accumulated HTML inside an <iframe srcDoc> (progressively updating)
  8. exercise-done fires → loading state clears, gear morph resets
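The fence cleaning in step 5 can be sketched as below. This is an assumption about the implementation; handling a fence that straddles a chunk boundary is deliberately ignored to keep the sketch short.

```javascript
// Strip markdown code fences from a streamed chunk before it is
// forwarded to the renderer over 'exercise-chunk'.
function stripFences(chunk) {
  return chunk
    .replace(/```html\s*/g, '') // opening fence with language tag
    .replace(/```/g, '');       // bare closing fences
}
```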

Visual Generation Pipeline

  1. ElevenLabs agent calls generate_visual({ prompt, text })
  2. Renderer opens WhiteboardOverlay (shows loading shimmer)
  3. window.ipc.invoke('generate-visual', prompt) → main process
  4. Main appends whiteboard style suffix, calls gemini-2.5-flash-image with responseModalities: ['IMAGE']
  5. Extracts candidates[0].content.parts[].inlineData.data (base64)
  6. Returns "data:image/png;base64,{data}" string
  7. Renderer sets generatedImageUrl → <img> renders in overlay
  8. Agent speaks the text parameter aloud

Window System

Click-Through Mechanism

The window uses setIgnoreMouseEvents to be transparent to mouse events when the cursor is not over interactive UI:

  • On mount: set-ignore-mouse-events true { forward: true } — window is click-through, events forwarded to OS
  • onMouseEnter any interactive element: set-ignore-mouse-events false — window captures events
  • onMouseLeave: set-ignore-mouse-events true { forward: true } — back to click-through

Drag Handle

Top-left 30×30px circle with WebkitAppRegion: 'drag' — allows window dragging without a title bar.

IPC Resize

Bottom-right resize handle uses Pointer Capture API (setPointerCapture/releasePointerCapture) to track drag delta and calls window.ipc.send('resize-window', { width, height }). Main process uses win.setBounds() with a minimum of 400×300.
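The bounds math on the main-process side can be sketched as a pure function (nextBounds is an invented name; the 400×300 floor matches the documented minimum):

```javascript
// Apply a pointer-drag delta to the window size, clamped to the minimum.
function nextBounds(start, dx, dy) {
  return {
    width: Math.max(400, start.width + dx),
    height: Math.max(300, start.height + dy),
  };
}
```

In the real app the result would be passed to win.setBounds() in response to the 'resize-window' IPC message.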


Project Structure

nextronapp_before_gamma/
├── main/                          # Electron main process (Node.js / ESM)
│   ├── background.js              # Entry point, IPC handlers, BrowserWindow
│   ├── preload.js                 # contextBridge: exposes window.ipc
│   ├── helpers/
│   │   ├── create-window.js       # (nextron scaffold)
│   │   └── index.js
│   └── services/
│       ├── gemini.js              # Gemini SDK wrapper (embed, text, image, stream)
│       └── knowledge.js           # Vector store: add/retrieve (cosine similarity)
│
├── renderer/                      # Next.js app (renderer process)
│   ├── pages/
│   │   ├── _app.jsx               # Global Next.js app wrapper
│   │   ├── home.jsx               # Main page: ElevenLabs session, tool handlers
│   │   └── next.jsx               # (scaffold page)
│   ├── components/
│   │   ├── Scene.jsx              # R3F Canvas + OrbitControls + error boundary
│   │   ├── GhostHead.jsx          # Particle avatar: GLB sampling, 7-morph animation
│   │   ├── WhiteboardOverlay.jsx  # Fullscreen image display for generate_visual
│   │   ├── DigitalExerciseOverlay.jsx  # Iframe-based HTML game display
│   │   └── onboarding/
│   │       ├── OnboardingOverlay.jsx   # State machine (voice → avatar → done)
│   │       ├── StepVoiceCapture.jsx    # MediaRecorder + ElevenLabs voice clone API
│   │       └── StepAvatarCapture.jsx   # Webcam + fal-ai/meshy 3D pipeline
│   ├── styles/
│   │   └── globals.css
│   ├── public/
│   │   ├── avatar.glb             # Default 3D head (replaced after onboarding)
│   │   ├── thinking.glb           # Thought-bubble morph shape
│   │   ├── success.glb            # ✅ morph shape
│   │   ├── failure.glb            # ❌ morph shape
│   │   ├── gear.glb               # Gear morph shape (generating state)
│   │   └── images/logo.png
│   ├── next.config.js             # output: export, distDir: ../app (prod)
│   └── .env                       # API keys (gitignored)
│
├── agent_config.json              # Full ElevenLabs agent definition snapshot
├── package.json                   # Dependencies + scripts
├── electron-builder.yml           # Build config (output: dist/, resources/)
├── resources/
│   ├── icon.icns                  # macOS app icon
│   └── icon.ico                   # Windows app icon
├── scripts/
│   ├── test-active-recall.js      # Dev test for active recall pipeline
│   └── update_agent.js            # Script to push agent config changes to ElevenLabs
├── reset-knowledge.js             # Clears knowledge_base.json
└── test-knowledge.js              # Tests add/retrieve knowledge flow

Environment Variables

All env vars live in renderer/.env (loaded by Next.js in renderer; also manually parsed by main/services/gemini.js for main process access):

| Variable | Required | Description |
| --- | --- | --- |
| NEXT_PUBLIC_GEMINI_API_KEY | Yes | Google AI Studio API key |
| NEXT_PUBLIC_ELEVENLABS_API_KEY | Yes | ElevenLabs xi-api-key header for REST calls |
| NEXT_PUBLIC_BASE_AGENT_ID | Yes | Base agent to duplicate during onboarding |
| NEXT_PUBLIC_ELEVENLABS_AGENT_ID | Yes | Active agent ID for conversation sessions |
| NEXT_PUBLIC_FAL_KEY | Yes | fal-ai API key (key_id:key_secret format) |

Scripts & Dev Tools

# Development (hot-reload Electron + Next.js)
yarn dev

# Production build (static export to app/ + Electron compile)
yarn build

# Install native Electron deps after npm install
yarn postinstall       # electron-builder install-app-deps

# Clear the local knowledge base (knowledge_base.json)
yarn reset-kb          # node reset-knowledge.js

# Test add/retrieve knowledge pipeline
yarn test-kb           # node test-knowledge.js

# Test the active recall two-stage loop
yarn test-active-recall  # node scripts/test-active-recall.js

In-app dev shortcuts:

| Shortcut | Action |
| --- | --- |
| Cmd/Ctrl+Shift+I | Toggle DevTools (detached window in dev, toggleable in prod) |
| Cmd/Ctrl+Shift+R | Clear electron-store (reset onboarding) + reload renderer |

Build & Distribution

Tool: electron-builder v24.13.3

Config (electron-builder.yml):

appId: com.example.nextron
productName: My Nextron App
directories:
  output: dist
  buildResources: resources
files:
  - from: .
    filter: [package.json, app]

Build artifacts (dist/):

  • My Nextron App-1.0.0-arm64.dmg — macOS installer (Apple Silicon)
  • My Nextron App-1.0.0-arm64-mac.zip — macOS zip (Apple Silicon)
  • .blockmap files for delta updates

Build pipeline:

  1. nextron build triggers:
    • next build inside renderer/ → static export to ../app/
    • Webpack compiles main/background.js + main/preload.js → app/background.js + app/preload.js
  2. electron-builder packages app/ + package.json into an Asar archive
  3. Output: signed (or unsigned) .dmg + .zip

Production vs development loading:

// Production: custom protocol
serve({ directory: 'app' })
mainWindow.loadURL('app://./home')

// Development: Next.js dev server
mainWindow.loadURL(`http://localhost:${port}/home`)
mainWindow.webContents.openDevTools({ mode: 'detach' })
