feat(#187 partial): real llama.cpp + whisper.cpp backends#238
Merged
Conversation
Replaces NullEmbeddedTextPort/NullEmbeddedSttPort stubs with subprocess
backends spawning llama-cli and whisper-cli. Validated end-to-end against
Homebrew-installed binaries (/opt/homebrew/bin/{llama,whisper}-cli).
LlamaCppTextBackend: spawns llama-cli with --no-display-prompt + -no-cnv so
stdout contains only generated tokens; strips end-of-text markers; configurable
timeout (default 120s); rejects with last 5 stderr lines on non-zero exit.
WhisperCppSttBackend: writes audio to mkdtemp dir, runs whisper-cli with -oj
JSON output, joins transcription segments with single spaces, cleans up temp
dir in finally.
createEmbeddedTextPort/SttPort factories pick real vs Null based on
detectEmbeddedRuntimes(); model resolution: explicit override → SKYTWIN_*_MODEL
env var → first matching file in modelDir → Null fallback.
36 new unit tests; package total 58 tests, all passing. Tests mock spawn + fs
so they never touch real binaries.
Closes runtime portion of #187 AC#1 (text gen) and AC#3 (STT).
Bundling, downloader UI, Piper TTS, and prompt-eval parity remain open.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds real llama.cpp and whisper.cpp subprocess-backed implementations to @skytwin/embedded-llm, replacing the previous null/stub behavior and providing factories to select real vs. null ports based on runtime detection.
Changes:
- Introduces
LlamaCppTextBackendandWhisperCppSttBackendthat spawnllama-cli/whisper-cli, parse output, handle timeouts, and surface stderr tails on failures. - Adds
createEmbeddedTextPort()/createEmbeddedSttPort()factories with model resolution (override → env var → modelDir scan → null fallback). - Adds unit tests for the new backends/factory and documents the feature in
CHANGELOG.md.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/embedded-llm/src/whisper-cpp-backend.ts | Implements whisper.cpp STT backend (temp files, spawn, JSON parse, cleanup) + model scan helper |
| packages/embedded-llm/src/llama-cpp-backend.ts | Implements llama.cpp text backend (spawn, marker stripping) + GGUF model scan helper |
| packages/embedded-llm/src/index.ts | Re-exports new backends, helpers, and factories |
| packages/embedded-llm/src/factory.ts | Adds factories to choose real vs. null ports and resolve model paths |
| packages/embedded-llm/src/tests/whisper-cpp-backend.test.ts | Adds unit tests for whisper backend + helpers |
| packages/embedded-llm/src/tests/llama-cpp-backend.test.ts | Adds unit tests for llama backend + helper |
| packages/embedded-llm/src/tests/factory.test.ts | Adds unit tests for factories and resolution precedence |
| CHANGELOG.md | Documents the new embedded backends and usage |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| export interface LlamaCppBackendOptions { | ||
| binaryPath: string; | ||
| modelPath: string; | ||
| contextWindow?: number; |
| this.capabilities = { | ||
| available: true, | ||
| modelName: basename(opts.modelPath), | ||
| contextWindow: opts.contextWindow ?? DEFAULT_CONTEXT_WINDOW, |
Comment on lines
+45
to
+53
| const args = [ | ||
| '-m', this.modelPath, | ||
| '-p', prompt, | ||
| '-n', String(maxTokens), | ||
| '--temp', String(temperature), | ||
| '--no-display-prompt', | ||
| '--no-warmup', | ||
| '-no-cnv', | ||
| ]; |
| } | ||
|
|
||
| const DEFAULT_TIMEOUT_MS = 120_000; | ||
| const SUPPORTED_FORMATS = ['wav', 'mp3', 'flac', 'ogg']; |
Comment on lines
+53
to
+55
| const audioPath = join(work, 'input.wav'); | ||
| const outBase = join(work, 'output'); | ||
| writeFileSync(audioPath, audio); |
Comment on lines
+64
to
+69
| const timer = setTimeout(() => { | ||
| if (settled) return; | ||
| settled = true; | ||
| child.kill('SIGKILL'); | ||
| reject(new Error(`llama-cli timed out after ${this.timeoutMs}ms`)); | ||
| }, this.timeoutMs); |
Comment on lines
+91
to
+96
| const timer = setTimeout(() => { | ||
| if (settled) return; | ||
| settled = true; | ||
| child.kill('SIGKILL'); | ||
| reject(new Error(`whisper-cli timed out after ${this.timeoutMs}ms`)); | ||
| }, this.timeoutMs); |
Comment on lines
+68
to
+70
| if (this.threads !== null) { | ||
| args.push('-t', String(this.threads)); | ||
| } |
Comment on lines
+91
to
+104
| const timer = setTimeout(() => { | ||
| if (settled) return; | ||
| settled = true; | ||
| child.kill('SIGKILL'); | ||
| reject(new Error(`whisper-cli timed out after ${this.timeoutMs}ms`)); | ||
| }, this.timeoutMs); | ||
|
|
||
| child.stderr?.on('data', (chunk) => { stderr += chunk.toString('utf8'); }); | ||
| child.on('error', (err) => { | ||
| if (settled) return; | ||
| settled = true; | ||
| clearTimeout(timer); | ||
| reject(new Error(`failed to spawn whisper-cli: ${err.message}`)); | ||
| }); |
Comment on lines
+13
to
+16
| export interface CreatePortOverrides { | ||
| binaryPath?: string; | ||
| modelPath?: string; | ||
| } |
This was referenced May 9, 2026
jayzalowitz
added a commit
that referenced
this pull request
May 9, 2026
…+ voice STT route Five real implementations: - High-contrast theme variant (4th in the dropdown): pure b/w with bold yellow/blue accents, 2px borders, 3px focus rings. WCAG AAA-aimed. - Text-scale slider: 100/125/150/200% via data-text-scale on <html>; every rem-based size reflows. - Reduced-motion override: respects prefers-reduced-motion by default, user can force on/off via Settings. - Voice-first toggle: persists to localStorage, sets body class so CSS surfaces mic affordances. - /api/voice/transcribe + /api/voice/capabilities/:userId backed by createEmbeddedSttPort() from #238. Real Whisper STT exposed as a reusable HTTP endpoint (mobile #179 will consume it once recording lands; desktop voice-first does today). Sweep: - Skip-link injected as first body child (CSS hides until focused). - Global focus-visible ring across buttons, links, inputs. - prefers-reduced-motion: reduce honored across all themes. - New Accessibility card in Settings with text-size, animations, voice. 8 new voice route tests; api total 423 passing (24 skipped). Out of scope: mobile recording UI (#179 hardware-blocked), axe-core CI gate, multi-language i18n (separate slice). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jayzalowitz
added a commit
that referenced
this pull request
May 9, 2026
…+ voice STT route (#244) Five real implementations: - High-contrast theme variant (4th in the dropdown): pure b/w with bold yellow/blue accents, 2px borders, 3px focus rings. WCAG AAA-aimed. - Text-scale slider: 100/125/150/200% via data-text-scale on <html>; every rem-based size reflows. - Reduced-motion override: respects prefers-reduced-motion by default, user can force on/off via Settings. - Voice-first toggle: persists to localStorage, sets body class so CSS surfaces mic affordances. - /api/voice/transcribe + /api/voice/capabilities/:userId backed by createEmbeddedSttPort() from #238. Real Whisper STT exposed as a reusable HTTP endpoint (mobile #179 will consume it once recording lands; desktop voice-first does today). Sweep: - Skip-link injected as first body child (CSS hides until focused). - Global focus-visible ring across buttons, links, inputs. - prefers-reduced-motion: reduce honored across all themes. - New Accessibility card in Settings with text-size, animations, voice. 8 new voice route tests; api total 423 passing (24 skipped). Out of scope: mobile recording UI (#179 hardware-blocked), axe-core CI gate, multi-language i18n (separate slice). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
NullEmbeddedTextPort/NullEmbeddedSttPortstubs in@skytwin/embedded-llmwith real subprocess backends spawningllama-cliandwhisper-cli./opt/homebrew/bin/{llama,whisper}-cli) — confirmed real binary spawn, arg construction, exit-code handling, stderr-tail propagation.child_process.spawnandfsso they never touch real binaries.What ships
LlamaCppTextBackend:--no-display-prompt -no-cnvso stdout contains only generated tokens; strips[end of text]/<|im_end|>/<|endoftext|>/</s>markers; configurabletimeoutMs(default 120s) andthreads; non-zero exit → reject with last 5 stderr lines.WhisperCppSttBackend: writes audio buffer tomkdtempdir, runswhisper-cliwith-ojJSON output, joinstranscription[].textwith single spaces, cleans up temp dir infinally(even on failure).createEmbeddedTextPort()/createEmbeddedSttPort()factories: pick real vs Null based on detection. Model resolution: explicit override →SKYTWIN_LLAMA_MODEL/SKYTWIN_WHISPER_MODELenv var → first matching file inmodelDir→ Null fallback.Closes for #187
Bundling default GGUF, downloader UI, Piper TTS, and prompt-eval parity remain open.
Test plan
pnpm --filter @skytwin/embedded-llm buildcleanpnpm --filter @skytwin/embedded-llm test— 58 passing/opt/homebrew/bin/llama-cli+ bogus model path: surfacesllama_model_load: error loading modelfrom binary stderr correctly🤖 Generated with Claude Code