feat(media): add parakeet-mlx CLI output support #9177
Merged
steipete merged 1 commit into openclaw:main from Mar 2, 2026
This pull request has been automatically marked as stale due to inactivity.
Landed via temp rebase onto main.
Thanks @mac-110!
dawi369 pushed a commit to dawi369/davis that referenced this pull request (Mar 3, 2026)
OWALabuy pushed a commit to kcinzgg/openclaw that referenced this pull request (Mar 4, 2026)
zooqueen pushed a commit to hanzoai/bot that referenced this pull request (Mar 6, 2026)
Summary
Add native support for reading parakeet-mlx output files in resolveCliOutput.

Problem
Parakeet-mlx is a fast, local speech-to-text model (based on NVIDIA Parakeet) that runs on Apple Silicon via MLX. It writes transcripts to --output-dir/filename.txt, but OpenClaw's resolveCliOutput only supported whisper, whisper-cli, gemini, and sherpa-onnx-offline.

Without this fix, users need a wrapper script that outputs to stdout instead.

Solution
Add a resolveParakeetOutputPath() helper function that:
- reads the --output-dir argument (note: hyphen, not underscore like whisper)
- resolves the transcript path as outputDir/mediaBasename.txt

Config Example
    {
      "tools": {
        "media": {
          "audio": {
            "models": [
              {
                "type": "cli",
                "command": "parakeet-mlx",
                "args": ["{{MediaPath}}", "--output-format", "txt", "--output-dir", "{{OutputDir}}"]
              }
            ]
          }
        }
      }
    }

Note
Previous issue #7552 was incorrectly auto-closed as a duplicate of #7536 (Windows path bug), which is unrelated. This PR addresses the original feature request.
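The helper described under Solution could look roughly like the sketch below. This is not the merged code from src/media-understanding/runner.ts: the signature (already-templated args plus the media path), the extension-stripping rule for mediaBasename, and the choice to handle only an explicit --output-format txt (as in the config example) are all assumptions made for illustration.

```typescript
import path from "node:path";

// Hypothetical sketch of resolveParakeetOutputPath(); the real helper may differ.
// `args` is assumed to be the CLI argument list after template substitution.
function resolveParakeetOutputPath(args: string[], mediaPath: string): string | null {
  // parakeet-mlx spells the flag with a hyphen (--output-dir),
  // unlike whisper's underscore form.
  const dirIndex = args.indexOf("--output-dir");
  if (dirIndex === -1) return null;
  const outputDir = args[dirIndex + 1];
  if (!outputDir) return null;

  // Assumption: only the explicit "--output-format txt" case is handled here.
  const formatIndex = args.indexOf("--output-format");
  if (formatIndex === -1 || args[formatIndex + 1] !== "txt") return null;

  // Assumption: mediaBasename is the media file name without its extension.
  const mediaBasename = path.parse(mediaPath).name;
  return path.join(outputDir, `${mediaBasename}.txt`);
}
```

With the args from the config example and a media path of /tmp/in/voice.ogg, this sketch resolves to voice.txt inside the output directory; any argument list without the hyphenated flag yields null so the caller can try other providers.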
Greptile Overview
Greptile Summary
This PR extends resolveCliOutput in src/media-understanding/runner.ts to support parakeet-mlx by resolving the transcript file path from CLI arguments (using --output-dir/--output-format) and reading {{OutputDir}}/{{mediaBasename}}.txt when present.

The change fits the existing CLI-provider pattern already used for whisper/whisper-cli, where transcript output may be written to a temp output directory and then ingested back into the media understanding pipeline.

Confidence Score: 5/5
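The "reading ... when present" behavior mentioned in the summary can be illustrated with a minimal sketch. The function name readTranscriptOrStdout and the fallback to the captured stdout are assumptions for illustration only; the actual resolveCliOutput logic in runner.ts is not shown in this PR description.

```typescript
import { existsSync, readFileSync } from "node:fs";

// Hypothetical helper: prefer the transcript file if the CLI wrote one,
// otherwise fall back to whatever the command printed to stdout.
function readTranscriptOrStdout(outputPath: string | null, stdout: string): string {
  if (outputPath && existsSync(outputPath)) {
    try {
      const text = readFileSync(outputPath, "utf8").trim();
      // An empty file is treated the same as a missing one.
      if (text) return text;
    } catch {
      // Unreadable file: fall through to stdout.
    }
  }
  return stdout.trim();
}
```

Falling back instead of throwing keeps wrapper scripts that print to stdout working alongside the file-based path.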