feat(macos): ExecuTorch Parakeet-TDT STT for Talk Mode + model-plugin runtime #50051
seyeong-han wants to merge 35 commits into
Greptile Summary
This PR adds optional on-device ExecuTorch Parakeet-TDT STT for macOS Talk Mode, refactors the executorch extension into a model-plugin architecture (registry, types, parakeet plugin), fixes the
The overall architecture is solid; the Swift actor (
Key issues found:
Confidence Score: 2/5
Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/executorch/index.ts
Line: 93-97
Comment:
**Bare `require()` in ESM module always throws `ReferenceError`**
`package.json` sets `"type": "module"`, so this file is treated as ESM and `require` is not available as a global. At runtime this will throw `ReferenceError: require is not defined`, which is caught by the surrounding `try/catch`, producing the misleading warning *"Native addon not available — on-device STT will not work until the addon is built"* even when the addon is properly built.
The practical consequence is that `openclaw executorch status` / `setup` always show the spurious warning, and the gateway-start log suggests the addon is broken even when it isn't. The addon still loads lazily on first transcription (because `RunnerManager.native` uses the already-imported `loadNativeExecuTorchAddon`), so STT works — but the diagnostics are always wrong.
`loadNativeExecuTorchAddon` is already imported (as a type-only import) from `./src/native-addon.js`. Fix by importing the value and calling it directly, or make the hook async and use `await import(...)`:
```suggestion
api.registerHook("gateway_start", async () => {
  api.logger.info(
    `[executorch] Registered embedded STT provider (modelPlugin=${modelPlugin.id}, backend=${backend}, library=${runtimeLibraryPath}, models=${modelDir})`,
  );
  try {
    const { loadNativeExecuTorchAddon } = await import("./src/native-addon.js");
    loadNativeExecuTorchAddon();
    api.logger.info("[executorch] Native addon loaded successfully");
  } catch {
    api.logger.warn(
      "[executorch] Native addon not available — on-device STT will not work until the addon is built. " +
        "See extensions/executorch/README.md for setup instructions.",
    );
  }
});
```
How can I resolve this? If you propose a fix, please make it concise.
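One way to keep the gateway-start diagnostic honest is to classify the failure rather than report every exception as "addon not built". The following is a minimal sketch only: `probeAddon` and `ProbeResult` are hypothetical names that do not appear in the extension, and the loader is passed in as a callback so the snippet does not depend on `./src/native-addon.js`. The error messages in the usage lines are made up for illustration.

```typescript
// Hypothetical helper (not part of the PR): report why a loader failed.
type ProbeResult = { ok: true } | { ok: false; reason: string };

function probeAddon(load: () => unknown): ProbeResult {
  try {
    load();
    return { ok: true };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // A ReferenceError points at the calling code (for example a bare
    // `require` in an ES module), not at a missing or unbuilt addon.
    const kind = err instanceof ReferenceError ? "loader bug" : "addon unavailable";
    return { ok: false, reason: `${kind}: ${message}` };
  }
}

// Usage: simulate the two failure modes discussed in the comment.
const esmMistake = probeAddon(() => {
  throw new ReferenceError("require is not defined");
});
const notBuilt = probeAddon(() => {
  throw new Error("addon binary not found");
});
console.log(esmMistake, notBuilt);
```

With a split like this, the warning about building the addon would only be logged for the second case, and the first would surface as a bug in the plugin itself.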
---
This is a comment left during a code review.
Path: extensions/executorch/src/runner-manager.ts
Line: 79-104
Comment:
**`readyPromise` guard bypassed by concurrent `ensureReady()` calls**
`launch()` calls `this.stop()` synchronously before its first `await`. `stop()` sets `this.readyPromise = null`. This means that once `launch()` suspends at `await this.validatePaths()`, any concurrent call to `ensureReady()` will see `this._state === "unloaded"` and `this.readyPromise === null`, bypassing the dedup guard and starting a second concurrent launch.
Sequence that triggers the bug:
1. Call A → `ensureReady()` → sets `this.readyPromise = this.launch()`, `launch()` calls `stop()` (nulls `readyPromise`), suspends at `await validatePaths()`.
2. Call B → `ensureReady()` → `readyPromise` is `null`, `_state` is `"unloaded"` → starts *another* `launch()`.
3. Call B's `launch()` calls `stop()` which destroys the handle being set up by call A, leaving both in an inconsistent state.
In the Talk Mode polling loop (`startOfflinePollTask`) and the finalize path (`forceFinalOfflineDecodeDelta`), concurrent `transcribe()` → `ensureReady()` calls can realistically interleave here.
Fix: capture the launch promise before mutating `readyPromise` so concurrent callers can share it, or use an internal flag that `stop()` does not clear:
```typescript
async ensureReady(): Promise<void> {
  if (this._state === "ready" && this.isAlive) return;
  if (this.readyPromise) return this.readyPromise;
  const p = this.launch();
  this.readyPromise = p; // assign BEFORE first await so stop() inside launch() can't clear it for new callers
  try {
    await p;
  } finally {
    if (this.readyPromise === p) this.readyPromise = null;
  }
}
```
Additionally, `stop()` should not unconditionally null `readyPromise` — only `launch()` / `ensureReady()` should manage its lifecycle.
How can I resolve this? If you propose a fix, please make it concise.
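To check the dedup behavior in isolation, the pattern can be exercised with a small stand-in class. This is a sketch under stated assumptions: `SingleFlightLauncher`, `doLaunch`, and the `launches` counter are hypothetical and only model the promise bookkeeping, not the real `RunnerManager`. Its `stop()` deliberately leaves the in-flight promise alone, as the comment recommends.

```typescript
// Hypothetical stand-in for RunnerManager: only the dedup logic is modeled.
class SingleFlightLauncher {
  launches = 0; // how many times the underlying launch actually started
  private ready = false;
  private inFlight: Promise<void> | null = null;
  private readonly doLaunch: () => Promise<void>;

  constructor(doLaunch: () => Promise<void>) {
    this.doLaunch = doLaunch;
  }

  // This stop() resets state but does not touch `inFlight`.
  stop(): void {
    this.ready = false;
  }

  async ensureReady(): Promise<void> {
    if (this.ready) return;
    if (this.inFlight) return this.inFlight;
    this.launches += 1;
    const p = this.launch();
    this.inFlight = p; // published before this function first awaits
    try {
      await p;
    } finally {
      if (this.inFlight === p) this.inFlight = null;
    }
  }

  private async launch(): Promise<void> {
    this.stop(); // mirrors launch() calling stop() before its first await
    await this.doLaunch();
    this.ready = true;
  }
}

const launcher = new SingleFlightLauncher(() => Promise.resolve());
const callA = launcher.ensureReady();
const callB = launcher.ensureReady(); // joins callA's launch instead of starting another
void Promise.all([callA, callB]).then(() => console.log("launches:", launcher.launches));
```

Because `launch()` runs synchronously up to its first `await`, the second caller already sees the shared promise and no second launch is started.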
---
This is a comment left during a code review.
Path: extensions/executorch/src/runner-manager.ts
Line: 153-191
Comment:
**Redundant `fs.access` checks for model and tokenizer paths**
`resolveFirstExisting` already calls `fs.access` on each candidate to find the first existing path. When a path is found, it is added to `required` and then `fs.access` is called on it *again* in the loop at lines 179–185. This doubles the filesystem calls for the common success path and can cause a TOCTOU-style false positive if the file is deleted between the two checks (it would be added to `missing` despite `resolvedModelPath` being truthy).
Consider only adding `this.runtimeLibraryPath` (which is not covered by `resolveFirstExisting`) to `required`, and skip the loop entries for already-resolved model/tokenizer paths:
```typescript
// Only check runtime library in the loop; model/tokenizer are already verified by resolveFirstExisting
const required = [this.runtimeLibraryPath];
if (this.dataPath) required.push(this.dataPath);
for (const p of required) {
  try {
    await fs.access(p);
  } catch {
    missing.push(p);
  }
}
```
How can I resolve this? If you propose a fix, please make it concise.
Last reviewed commit: "Talk Mode: show only..."
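The single-probe idea from the last comment can be sketched end to end. Everything here is an assumption made for illustration: `PathPlan`, `findMissing`, and the file paths are hypothetical, and the existence check is an injected synchronous predicate so the snippet runs without touching disk (the reviewed code awaits `fs.access` instead).

```typescript
// Hypothetical types and names; `exists` is injected and synchronous here
// only to keep the sketch self-contained.
type Exists = (path: string) => boolean;

interface PathPlan {
  modelCandidates: string[];
  tokenizerCandidates: string[];
  runtimeLibraryPath: string;
  dataPath?: string;
}

// Returns the first candidate that exists, or null if none do.
function resolveFirstExisting(candidates: string[], exists: Exists): string | null {
  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }
  return null;
}

function findMissing(plan: PathPlan, exists: Exists): string[] {
  const missing: string[] = [];
  // Candidate lists are probed once by resolveFirstExisting and not re-checked.
  if (resolveFirstExisting(plan.modelCandidates, exists) === null) missing.push("model");
  if (resolveFirstExisting(plan.tokenizerCandidates, exists) === null) missing.push("tokenizer");
  // Only paths without a resolver need a direct check.
  const direct = [plan.runtimeLibraryPath];
  if (plan.dataPath) direct.push(plan.dataPath);
  for (const p of direct) {
    if (!exists(p)) missing.push(p);
  }
  return missing;
}

// Usage with a fake filesystem that counts probes per path (paths are made up).
const probes = new Map<string, number>();
const onDisk = new Set(["/models/a.pte", "/models/tok.model", "/lib/rt.dylib"]);
const fakeExists: Exists = (p) => {
  probes.set(p, (probes.get(p) ?? 0) + 1);
  return onDisk.has(p);
};
const missingPaths = findMissing(
  {
    modelCandidates: ["/models/a.pte"],
    tokenizerCandidates: ["/models/tok.model"],
    runtimeLibraryPath: "/lib/rt.dylib",
    dataPath: "/models/a.ptd",
  },
  fakeExists,
);
console.log(missingPaths, Array.from(probes.values()));
```

Each path is probed at most once, so a file deleted between two checks can no longer be reported as both resolved and missing.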
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1f26cb9152
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3bd0f391d3
Replace Voxtral-realtime wiring with an embedded Parakeet-TDT Metal runtime across the executorch plugin and macOS Talk Mode so setup, status, and on-device transcription run through one stable path with simpler artifact management. Made-with: Cursor
Treat overlapping Parakeet polling outputs as competing hypotheses instead of append-only tails so revised phrases do not concatenate into duplicated transcript lines, and cover the regression with focused merge tests. Made-with: Cursor
Gate finalize-time deltas on recent voiced activity and simplify finalization to a single vetted tail pass so silence and post-TTS boundaries no longer append phantom words. Made-with: Cursor
Remove the optional voice-agent demo and revert unrelated plugin/permission tweaks so this branch stays focused on required Parakeet STT integration and Talk Mode behavior.
Document the ExecuTorch Parakeet example repository alongside the model artifact URL so setup references are easier to trace.
Added a link to user attachments in the README.
…ctions into prompt
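The duplicated-line problem that the polling-merge commit describes can be illustrated with a simple word-overlap merge. This is not the PR's algorithm, which is not shown on this page. It is one basic strategy, with the hypothetical name `mergeHypothesis`, that drops the words a new hypothesis shares with the end of the transcript. It does not handle revisions that change earlier words.

```typescript
// Hypothetical helper: merge a new polling hypothesis into the transcript
// by dropping the words it shares with the transcript's tail.
function mergeHypothesis(transcript: string, hypothesis: string): string {
  const prev = transcript.split(/\s+/).filter(Boolean);
  const next = hypothesis.split(/\s+/).filter(Boolean);
  const same = (a: string, b: string) => a.toLowerCase() === b.toLowerCase();

  // Find the longest suffix of `prev` that equals a prefix of `next`.
  let overlap = 0;
  for (let k = Math.min(prev.length, next.length); k > 0; k--) {
    const tail = prev.slice(prev.length - k);
    if (tail.every((word, i) => same(word, next[i] ?? ""))) {
      overlap = k;
      break;
    }
  }
  // Append only the part of the hypothesis that is actually new.
  return [...prev, ...next.slice(overlap)].join(" ");
}

const mergedLine = mergeHypothesis("turn on the lights", "the lights in the kitchen");
console.log(mergedLine); // "turn on the lights in the kitchen"
```

A plain append would have produced "turn on the lights the lights in the kitchen", which is the kind of duplication the commit message refers to.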
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cdf01faef0
private fun ChatMessageBody(role: String, content: List<ChatMessageContent>, textColor: Color) {
  Column(verticalArrangement = Arrangement.spacedBy(8.dp)) {
    for (part in content) {
      when (part.type) {
Strip Talk Mode preamble in Android user message rendering
ChatMessageBody now receives role but never uses it, so Android still renders raw user text for Talk Mode turns instead of the transcript-only display introduced elsewhere (TalkPromptBuilder.displayText). In practice, Talk Mode messages synced to Android will continue showing the full "Talk Mode active..." instruction block and directive hints, which is inconsistent with macOS/iOS behavior and defeats the UX change this commit is implementing.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 140491593f
std::string g_last_error;

void set_last_error(const std::string& message) {
  g_last_error = message;
Eliminate global error buffer shared by async workers
createRunner now runs on libuv worker threads, but failures still write through the process-global g_last_error string. If two createRunner calls fail concurrently (for example from separate RunnerManager instances), unsynchronized writes to the same std::string are a data race and can corrupt memory or crash the process. Keep error state per work item (or make it thread-local/locked) instead of using shared mutable global storage.
get isAlive(): boolean {
  return this.handle !== null;
}
Relaunch after stop instead of awaiting stale ready promise
When stop() is called during an in-flight cold load, ensureReady() can still return the old readyPromise via this early return. A subsequent caller that arrives before that promise clears will await a launch that was intentionally invalidated by generation checks, then continue with handle === null and fail transcription with Runtime handle is not initialized instead of starting a fresh launch. ensureReady() should detect that the awaited promise did not produce a ready handle and retry launch.
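The retry behavior requested here can be sketched with a generation counter and a bounded loop. The class below is a hypothetical stand-in, not the PR's `RunnerManager`: `GenerationGuardedLauncher`, `load`, and `maxAttempts` are names chosen for illustration, and the handle is a plain object.

```typescript
// Hypothetical stand-in for RunnerManager; `load` represents the cold load.
class GenerationGuardedLauncher {
  private generation = 0;
  private handle: object | null = null;
  private inFlight: Promise<void> | null = null;
  private readonly load: () => Promise<object>;

  constructor(load: () => Promise<object>) {
    this.load = load;
  }

  get isAlive(): boolean {
    return this.handle !== null;
  }

  // Invalidate any in-flight launch by bumping the generation.
  stop(): void {
    this.generation += 1;
    this.handle = null;
  }

  private async launch(): Promise<void> {
    const startedAt = this.generation;
    const loaded = await this.load();
    // Drop the result if stop() ran while the load was pending.
    if (startedAt === this.generation) this.handle = loaded;
  }

  async ensureReady(maxAttempts = 3): Promise<void> {
    // Re-check after every awaited launch: a stale launch leaves no handle.
    for (let attempt = 0; attempt < maxAttempts && !this.isAlive; attempt++) {
      if (this.inFlight === null) {
        const p: Promise<void> = this.launch().finally(() => {
          if (this.inFlight === p) this.inFlight = null;
        });
        this.inFlight = p;
      }
      await this.inFlight;
    }
    if (!this.isAlive) throw new Error("launch was invalidated on every attempt");
  }
}

// Usage: stop() lands while the first load is still pending.
let loadCount = 0;
const guarded = new GenerationGuardedLauncher(async () => {
  loadCount += 1;
  return {};
});
const pending = guarded.ensureReady();
guarded.stop(); // invalidates the launch that is still loading
void pending.then(() => console.log("alive:", guarded.isAlive, "loads:", loadCount));
```

The caller that awaited the invalidated launch starts a fresh one instead of continuing with a null handle, and the attempt limit keeps a repeatedly stopped runner from looping forever.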
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 31293430e2
Codex review: needs real behavior proof before merge.
Reviewed June 9, 2026, 12:55 AM ET / 04:55 UTC.
Summary
Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.
Review metrics: none identified.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Risk before merge
Maintainer options:
Next step before merge
Review details
Best possible solution: Retry the Codex review after fixing the execution failure.
Do we have a high-confidence way to reproduce the issue? Unclear. The review failed before ClawSweeper could establish a reproduction path.
Is this the best way to solve the issue? Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.
AGENTS.md: unclear because the file could not be read completely.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 6fcc9457020e.
Label changes:
Label justifications:
Evidence reviewed
What I checked:
Likely related people:
This pull request has been automatically marked as stale due to inactivity.
Summary
openclaw-parakeet-metal.mp4
`UNUserNotificationCenter.current()` was used outside an app bundle (e.g. Xcode run).
`TalkPromptBuilder.displayText(fromPrompt:)`. (4) PermissionManager guards notification APIs with `canQueryUserNotificationCenter` (main bundle is `.app`).
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
`defaults write … openclaw.talkSttBackend executorch`). CLI: `openclaw executorch setup|status|transcribe`.
Security Impact (required)
Repro + Verification
Environment
`plugins.entries.executorch.enabled`, `openclaw.talkSttBackend` (macOS defaults).
Steps
`openclaw executorch setup` then `openclaw executorch status` (all OK).
`defaults write <bundleId> openclaw.talkSttBackend executorch`; relaunch app; use Talk Mode and speak.
Expected
Actual
Evidence
Human Verification (required)
Review Conversations
Compatibility / Migration
`openclaw.talkSttBackend` on macOS).
Failure Recovery (if this breaks)
`openclaw.talkSttBackend` to default/Apple; revert PR.
`plugins.entries.executorch.enabled false`; remove or unset `openclaw.talkSttBackend`.
`extensions/executorch/README.md` setup.
Risks and Mitigations
`executorch status`; README troubleshooting; plugin can be disabled without affecting rest of app.