Skip empty sherpa structured transcripts #84667

Merged
clawsweeper[bot] merged 4 commits into openclaw:main from TurboTheTurtle:fix/sherpa-empty-transcripts-84660
May 21, 2026

Conversation

@TurboTheTurtle

Contributor

Summary

  • Treat structured sherpa-onnx-offline JSON with an empty text field as an empty transcript instead of falling back to raw stdout.
  • Preserve normal sherpa JSON extraction for non-empty transcript output, including final-line JSON after startup logs.
  • Add direct CLI-runner and auto-detected audio regression coverage.

Closes #84660.

Real behavior proof

Behavior or issue addressed:
Moonshine/sherpa can emit a structured JSON object where text is empty. Before this patch, OpenClaw treated that as unrecognized sherpa output, fell back to the raw JSON string, and could pass that JSON to the LLM as a voice transcript.
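
The distinction this patch relies on can be sketched as follows. This is a minimal illustration, not the repository's code: `matchSherpaJson`, `resolveTranscript`, and the `SherpaMatch` type are hypothetical names, and the sketch assumes the sherpa JSON object appears alone on a line of stdout. The point is that "structured JSON was found, and its text is empty" has to be kept separate from "no structured JSON was found", because only the second case should fall back to raw stdout.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not OpenClaw's actual code.
type SherpaMatch =
  | { matched: true; text: string } // a JSON object with a string `text` field was found
  | { matched: false }; // stdout held no recognizable sherpa JSON

// Scan from the last line upward so startup logs printed before the JSON are ignored.
function matchSherpaJson(stdout: string): SherpaMatch {
  const lines = stdout
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  for (let i = lines.length - 1; i >= 0; i--) {
    const line = lines[i] ?? "";
    if (!line.startsWith("{")) continue;
    try {
      const parsed: unknown = JSON.parse(line);
      if (typeof parsed === "object" && parsed !== null && "text" in parsed) {
        const text = (parsed as { text: unknown }).text;
        // An empty string still counts as a match; that is the case the bug mishandled.
        if (typeof text === "string") return { matched: true, text: text.trim() };
      }
    } catch {
      // Not valid JSON; keep scanning earlier lines.
    }
  }
  return { matched: false };
}

// Fall back to raw stdout only when nothing structured matched.
function resolveTranscript(stdout: string): string | null {
  const match = matchSherpaJson(stdout);
  const candidate = match.matched ? match.text : stdout.trim();
  return candidate.length > 0 ? candidate : null;
}
```

With this shape, `resolveTranscript('{"text": ""}')` yields `null` rather than the raw JSON string, while output that is not sherpa JSON at all still passes through unchanged.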

Real environment tested:
Local macOS worktree with bundled Node.js v24.14.0, repository dependencies, and a fake executable named sherpa-onnx-offline invoked through the real CLI media runner.

Exact steps or command run after this patch:

PATH=/Users/andy/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH node node_modules/tsx/dist/cli.mjs /private/tmp/openclaw-84660-proof/proof.ts
PATH=/Users/andy/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH node scripts/run-vitest.mjs src/media-understanding/runner.cli-audio.test.ts src/media-understanding/apply.test.ts
git diff --check upstream/main..HEAD
git log --format='%h %an <%ae> %s' upstream/main..HEAD

Evidence after fix:
The standalone runtime proof invokes the real runCliEntry() path with a fake sherpa-onnx-offline binary. Empty structured JSON no longer returns a transcript, while non-empty structured JSON still extracts normally:

{
  "emptySherpaJsonReturnedTranscript": false,
  "emptySherpaJsonText": null,
  "nonEmptySherpaJsonText": "hello from sherpa",
  "nonEmptySherpaJsonKind": "audio.transcription"
}

Targeted tests:

Test Files  2 passed (2)
Tests       52 passed (52)

Observed result after fix:
Empty sherpa/moonshine JSON is skipped before it can become ctx.Transcript or an audio.transcription output, so the serialized Discord voice lane will not enqueue an agent turn for that empty segment. Non-empty sherpa JSON remains accepted.

What was not tested:
I did not run a live Discord voice session or the full repository test suite. The proof uses the real media CLI runner with a fake sherpa executable and targeted media-understanding tests for the affected normalization and auto-detect paths.

Attribution

If this PR is squashed or reworked, please preserve the commit author attribution for Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>, or include:

Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>

@openclaw-barnacle openclaw-barnacle Bot added size: S proof: supplied External PR includes structured after-fix real behavior proof. labels May 20, 2026
@clawsweeper

clawsweeper Bot commented May 20, 2026

Contributor

Codex review: passed.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR changes sherpa-onnx CLI audio parsing so structured JSON with an empty text field becomes no transcript, while preserving non-empty JSON extraction and adding direct plus auto-detect regression coverage.

Reproducibility: yes. Source inspection on current main shows empty sherpa structured JSON misses extraction, falls back to raw stdout, and remains non-empty before Discord voice can skip empty transcripts; I did not run a live Discord reproduction in this read-only review.

PR rating
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Summary: Focused bug-fix PR with sufficient terminal proof, targeted regression coverage, and no blocking finding from this review.

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Sufficient (terminal): The PR body includes after-fix terminal proof through the real CLI media runner path with a fake sherpa executable, showing empty JSON returns no transcript while non-empty JSON still extracts normally.

Risk before merge

  • No live Discord voice/moonshine session or full repository suite is attached; exact-head CI should still gate broader regressions before merge.

Maintainer options:

  1. Decide the mitigation before merge
    Land this shared CLI media-understanding parser fix after exact-head CI/automerge gates, and leave broader voice queue-depth or stale-segment policy as a separate follow-up if maintainers want that resilience work.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge
No repair lane is needed because the exact-head PR has no discrete actionable blocking finding; CI, automerge, and normal maintainer review should gate the merge.

Security
Cleared: The diff changes media-output parsing, focused tests, a changelog entry, and a package override guardrail assertion only; it adds no dependencies, scripts, workflow permissions, secret handling, or new code-execution surface.

Review details

Best possible solution:

Land this shared CLI media-understanding parser fix after exact-head CI/automerge gates, and leave broader voice queue-depth or stale-segment policy as a separate follow-up if maintainers want that resilience work.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection on current main shows empty sherpa structured JSON misses extraction, falls back to raw stdout, and remains non-empty before Discord voice can skip empty transcripts; I did not run a live Discord reproduction in this read-only review.

Is this the best way to solve the issue?

Yes. Updating the shared sherpa CLI output normalization before runCliEntry() emits an audio transcription is the narrowest maintainable fix, and the PR covers both direct CLI and auto-detected audio paths.

Label justifications:

  • P1: The linked bug affects an active Discord voice workflow by letting empty STT segments consume serialized agent turns and delay useful replies.
  • rating: 🦞 diamond lobster: Current PR rating is 🦞 diamond lobster because proof is 🦞 diamond lobster, patch quality is 🦞 diamond lobster, and this is a focused bug-fix PR with sufficient terminal proof, targeted regression coverage, and no blocking finding from this review.
  • status: 🚀 automerge armed: This PR is in ClawSweeper's automerge lane. Sufficient (terminal): The PR body includes after-fix terminal proof through the real CLI media runner path with a fake sherpa executable, showing empty JSON returns no transcript while non-empty JSON still extracts normally.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix terminal proof through the real CLI media runner path with a fake sherpa executable, showing empty JSON returns no transcript while non-empty JSON still extracts normally.

What I checked:

  • Current main bug path: On current main, extractSherpaOnnxText() returns null when a sherpa JSON object's text field is an empty string, and resolveCliOutput() then falls back to params.stdout.trim(), so the raw JSON remains a non-empty transcript candidate. (src/media-understanding/runner.entries.ts:99, c0312748c4f7)
  • Voice path impact: Discord voice transcription only skips after normalizeOptionalString(result.text) is falsy, and segment processing returns early only for a falsy transcript; a raw JSON string therefore reaches the agent-turn path. (extensions/discord/src/voice/tts.ts:88, c0312748c4f7)
  • PR implementation: At PR head, sherpa output parsing distinguishes matched from non-empty text, so a structured empty text value returns an empty string; runCliEntry() trims that to empty and returns null instead of an audio transcription output. (src/media-understanding/runner.entries.ts:99, ac03171cfc3d)
  • Regression coverage: The PR adds a direct CLI-runner test for empty sherpa structured JSON and a final-line JSON preservation test, plus an auto-detect test that verifies empty sherpa output leaves Transcript unset and appliedAudio false. (src/media-understanding/runner.cli-audio.test.ts:85, ac03171cfc3d)
  • Auto-detect contract: OpenClaw docs and source both list sherpa-onnx-offline as the first local CLI audio auto-detect option when SHERPA_ONNX_MODEL_DIR has encoder/decoder/joiner/tokens, which matches the linked issue's runtime surface. Public docs: docs/nodes/audio.md. (docs/nodes/audio.md:19, c0312748c4f7)
  • History provenance: git blame shows the current sherpa parser, local sherpa auto-detect, and Discord voice transcription path all date to befb0f3d39bd33410cee85d0e193834af96d2773 (feat(discord): follow configured users in voice), while earlier media auto-detect and runner-entry refactors were carried by 2dfbd1c1f6a6ce197396f2b92b157bd6a3914f6a and 2a1f8b2615b1acbd0084bdd1d1c126d80d39284c. (src/media-understanding/runner.entries.ts:99, befb0f3d39bd)
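
The voice path impact noted above can be illustrated with a small sketch. It assumes `normalizeOptionalString` trims its input and returns `undefined` for anything empty or non-string; the real helper's implementation is not shown in this thread, so the version below is a hypothetical stand-in, as is `wouldEnqueueAgentTurn`.

```typescript
// Hypothetical stand-in for the normalizeOptionalString helper named in the review.
function normalizeOptionalString(value: unknown): string | undefined {
  if (typeof value !== "string") return undefined;
  const trimmed = value.trim();
  return trimmed.length > 0 ? trimmed : undefined;
}

// Hypothetical segment handler: returns true when an agent turn would be enqueued.
function wouldEnqueueAgentTurn(result: { text?: string | null }): boolean {
  const transcript = normalizeOptionalString(result.text);
  // The guard only catches falsy transcripts, so any non-empty string gets through.
  if (!transcript) return false;
  return true;
}

// Before the fix, the parser handed the raw JSON string downstream.
const rawJsonFallback = '{"text": ""}';
console.log(wouldEnqueueAgentTurn({ text: rawJsonFallback })); // raw JSON is truthy, so it passes the guard
console.log(wouldEnqueueAgentTurn({ text: "" })); // a truly empty transcript is skipped
```

Under these assumptions the guard itself behaves correctly. The segment slipped through because the upstream parser had already turned an empty transcript into a non-empty string, which is why the fix belongs in the shared parser rather than in the voice lane.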

Likely related people:

  • fuller-stack-dev: git blame ties the current sherpa parser, local sherpa auto-detect, and Discord voice STT bridge to befb0f3d39bd33410cee85d0e193834af96d2773. (role: introduced current behavior; confidence: high; commits: befb0f3d39bd; files: src/media-understanding/runner.entries.ts, src/media-understanding/runner.ts, extensions/discord/src/voice/tts.ts)
  • steipete: Recent media auto-detect and runner-entry history includes the main auto-detect feature/refactor commits and a release-test alignment commit touching this surface. (role: recent area contributor; confidence: high; commits: 2dfbd1c1f6a6, 2a1f8b2615b1, 384a161bbc2e; files: src/media-understanding/runner.ts, src/media-understanding/runner.entries.ts, docs/nodes/audio.md)
  • vincentkoc: History around transcribeAudioFile shows adjacent media-runtime contract work in 6e74d77a42e9a682fbbba6644c74e6435491c24f, relevant to the shared runtime boundary used by Discord voice. (role: adjacent owner; confidence: medium; commits: 6e74d77a42e9; files: src/media-understanding, extensions/discord/src/voice)

Codex review notes: model gpt-5.5, reasoning high; reviewed against c0312748c4f7.

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P1 High-priority user-facing bug, regression, or broken workflow. labels May 20, 2026
@clawsweeper

clawsweeper Bot commented May 20, 2026

Contributor

ClawSweeper PR egg

✨ Hatched: 🥚 common Pearl Signal Puff

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: collects tiny proofs.
Image traits: location branch lighthouse; accessory CI status badge; palette seafoam, black, and opal; mood focused; pose nestled inside a glowing shell; shell polished stone shell; lighting warm desk-lamp glow; background smooth stones and checkmarks.
Share on X: post this hatch
Copy: My PR egg hatched a 🥚 common Pearl Signal Puff in ClawSweeper.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@openclaw-barnacle openclaw-barnacle Bot added scripts Repository scripts commands Command implementations extensions: openrouter labels May 21, 2026
@TurboTheTurtle

Contributor Author

Pushed 3d6911bd44 to clear the stale CI blockers on the latest head.

What changed:

  • Updated the control UI i18n Pi fallback package version to match the workspace dependency (0.75.3).
  • Made the Bedrock runtime override guardrail compare against the Amazon provider package dependency instead of a stale hard-coded SDK version.
  • Removed now-linted empty object fallbacks in OpenRouter provider-routing spreads.
  • Updated the status summary redaction fixture for the current model-selection fields.
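
For context on the OpenRouter spread cleanup, here is a minimal sketch of why an empty object fallback inside an object spread is redundant. It assumes the removed fallbacks had the form `...(value ?? {})`; the actual code in `extensions/openrouter/provider-routing.ts` is not shown in this thread, and the `Routing` type and both function names are hypothetical.

```typescript
// Hypothetical shapes for illustration only.
type Routing = { order?: string[]; allowFallbacks?: boolean };

// Redundant form: falls back to {} before spreading.
function withFallback(base: { model: string }, routing?: Routing) {
  return { ...base, ...(routing ?? {}) };
}

// Equivalent form: spreading undefined into an object literal adds no properties.
function withoutFallback(base: { model: string }, routing?: Routing) {
  return { ...base, ...routing };
}

const base = { model: "example-model" };
console.log(JSON.stringify(withFallback(base)) === JSON.stringify(withoutFallback(base)));
console.log(
  JSON.stringify(withFallback(base, { order: ["a"] })) ===
    JSON.stringify(withoutFallback(base, { order: ["a"] })),
);
```

Both comparisons hold because object spread ignores `undefined` and `null` sources, so dropping the fallback does not change the resulting object.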

Local validation:

  • pnpm exec vitest run src/media-understanding/apply.test.ts src/media-understanding/runner.cli-audio.test.ts --reporter=verbose — 2 files / 52 tests passed
  • pnpm exec vitest run src/scripts/control-ui-i18n.test.ts test/scripts/root-package-overrides.test.ts — 2 files / 6 tests passed
  • pnpm exec vitest run src/commands/status.summary.redaction.test.ts — passed
  • pnpm run lint — passed
  • pnpm run lint:extensions:bundled — passed
  • pnpm run check:test-types — passed
  • pnpm exec oxfmt --check extensions/openrouter/provider-routing.ts scripts/control-ui-i18n.ts src/commands/status.summary.redaction.test.ts test/scripts/root-package-overrides.test.ts — passed
  • git diff --check — passed

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 21, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@openclaw-barnacle openclaw-barnacle Bot added cli CLI command changes and removed scripts Repository scripts commands Command implementations extensions: openrouter labels May 21, 2026
@TurboTheTurtle

Contributor Author

Merged current origin/main into the branch and pushed 521a6cd310; #84667 should no longer be blocked by the dirty merge state once GitHub refreshes.

Conflict resolution:

  • Kept upstream status.summary.redaction.test.ts fixture values while preserving the required model-selection fields.
  • Kept the dynamic Bedrock override guardrail so it tracks the Amazon provider package dependency instead of drifting again.

Post-merge local validation:

  • focused PR/stale-CI tests: 5 files / 59 tests passed
  • pnpm run lint — passed
  • pnpm run check:test-types — passed
  • git diff --check — passed

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 21, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@Takhoffman

Contributor

@clawsweeper automerge

@clawsweeper clawsweeper Bot added the clawsweeper:automerge Maintainer opted this PR into bounded ClawSweeper-reviewed automerge label May 21, 2026
@clawsweeper

clawsweeper Bot commented May 21, 2026

Contributor

🦞🧹
ClawSweeper automerge is enabled.

  • Head: ac03171cfc3d
  • Label: clawsweeper:automerge
  • Action: exact-head review queued (workflow sweep.yml, event repository_dispatch).
  • Flow: review this head, repair/rebase only if needed, then re-review the exact repaired head before merge.

Draft PRs stay fix-only until GitHub marks them ready for review. Pause with /clawsweeper stop.

Automerge progress:

  • 2026-05-21 04:05:00 UTC review passed 9d131a689461 (structured ClawSweeper verdict: pass (sha=9d131a6894619ae82023377876999080c0c13...)
  • 2026-05-21 04:15:32 UTC review queued ac03171cfc3d (after repair)
  • 2026-05-21 04:23:49 UTC review passed ac03171cfc3d (structured ClawSweeper verdict: pass (sha=ac03171cfc3d230a241c85deed74f9e25eadc...)
  • 2026-05-21 04:24:02 UTC merged ac03171cfc3d (merged by ClawSweeper automerge)
  • 2026-05-21 04:24:05 UTC review queued ac03171cfc3d (queued)

@clawsweeper clawsweeper Bot force-pushed the fix/sherpa-empty-transcripts-84660 branch from 521a6cd to 9d131a6 Compare May 21, 2026 03:57
@openclaw-barnacle openclaw-barnacle Bot removed the cli CLI command changes label May 21, 2026
@clawsweeper clawsweeper Bot added status: 🚀 automerge armed This PR is in ClawSweeper's automerge lane. and removed status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels May 21, 2026
@clawsweeper clawsweeper Bot force-pushed the fix/sherpa-empty-transcripts-84660 branch from 9d131a6 to ac03171 Compare May 21, 2026 04:15
@clawsweeper clawsweeper Bot merged commit 46030f5 into openclaw:main May 21, 2026
97 checks passed
@TurboTheTurtle TurboTheTurtle deleted the fix/sherpa-empty-transcripts-84660 branch May 21, 2026 04:44
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
Summary:
- The PR changes sherpa-onnx CLI audio parsing so structured JSON with an empty `text` field becomes no transcript, while preserving non-empty JSON extraction and adding direct plus auto-detect regression coverage.
- Reproducibility: yes. Source inspection on current main shows empty sherpa structured JSON misses extraction ... scord voice can skip empty transcripts; I did not run a live Discord reproduction in this read-only review.

Automerge notes:
- PR branch already contained follow-up commit before automerge: Fix stale CI guardrails for sherpa transcript PR
- PR branch already contained follow-up commit before automerge: Skip empty sherpa structured transcripts

Validation:
- ClawSweeper review passed for head ac03171.
- Required merge gates passed before the squash merge.

Prepared head SHA: ac03171
Review: openclaw#84667 (comment)

Co-authored-by: Andy Ye <35905412+TurboTheTurtle@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
galiniliev pushed a commit to galiniliev/openclaw that referenced this pull request May 25, 2026
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
Labels

clawsweeper:automerge Maintainer opted this PR into bounded ClawSweeper-reviewed automerge P1 High-priority user-facing bug, regression, or broken workflow. proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. size: S status: 🚀 automerge armed This PR is in ClawSweeper's automerge lane.

Development

Successfully merging this pull request may close these issues.

[Bug] Voice STT: empty moonshine transcripts passed as raw JSON to LLM, clogging serialized processing queue

2 participants