feat(webchat): add server-side dictation #76021

Merged
clawsweeper[bot] merged 9 commits into main from codex/webchat-server-dictation
May 2, 2026

Conversation

@steipete
Contributor

@steipete steipete commented May 2, 2026

Summary

Adds server-side microphone dictation for Control UI/WebChat chat drafts.

  • Adds authenticated chat.transcribeAudio Gateway RPC that decodes browser audio, writes a temporary local file, runs the existing tools.media.audio transcription pipeline, and returns draft text without starting an agent run.
  • Adds a composer mic button using MediaRecorder, with safe stop/cancel handling on disconnect and client-side oversized-audio rejection before WebSocket send.
  • Documents the WebChat dictation path and adds changelog coverage for [Feature Request] Add server-side STT option for webchat voice input #47311.
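The server-side flow in the first bullet (decode, temp file, transcribe, return draft text) could look roughly like the sketch below. This is not the code from the PR: the parameter and dependency names (`TranscribeAudioParams`, `TranscribeDeps`, `writeTemp`, `removeTemp`, `transcribeFile`) are hypothetical, base64 transport is an assumption, and `transcribeFile` only stands in for the existing `tools.media.audio` pipeline.

```typescript
// Hypothetical request shape; the real chat.transcribeAudio params may differ.
interface TranscribeAudioParams {
  audioBase64: string; // assumed: browser recording, base64-encoded for the WebSocket frame
  mimeType: string; // e.g. "audio/webm"
}

// Hypothetical injected dependencies, so the sketch has no file-system imports.
interface TranscribeDeps {
  maxBytes: number;
  writeTemp(bytes: Uint8Array, mimeType: string): Promise<string>; // resolves to a temp path
  removeTemp(path: string): Promise<void>;
  transcribeFile(path: string): Promise<string>; // stand-in for the audio pipeline
}

// Decode a base64 string into raw bytes using the global atob.
function decodeBase64(b64: string): Uint8Array {
  const binary = atob(b64);
  const out = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) out[i] = binary.charCodeAt(i);
  return out;
}

// Returns draft text only; nothing here starts an agent run.
async function transcribeAudio(
  params: TranscribeAudioParams,
  deps: TranscribeDeps,
): Promise<{ text: string }> {
  const bytes = decodeBase64(params.audioBase64);
  if (bytes.byteLength === 0) throw new Error("empty audio payload");
  if (bytes.byteLength > deps.maxBytes) throw new Error("audio payload too large");

  const path = await deps.writeTemp(bytes, params.mimeType);
  try {
    const text = await deps.transcribeFile(path);
    return { text: text.trim() };
  } finally {
    // Clean up the temp file whether transcription succeeded or failed.
    await deps.removeTemp(path);
  }
}
```

Putting the cleanup in `finally` is one way to keep the temp file scoped to a single request, even when the transcription step throws.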

Validation

  • pnpm test src/gateway/server-methods/chat-transcribe-audio.test.ts
  • pnpm --dir ui test src/ui/app-chat.test.ts src/ui/views/chat.test.ts
  • pnpm tsgo:test:ui
  • pnpm tsgo:prod
  • Testbox: OPENCLAW_TESTBOX=1 pnpm check:changed

Fixes #47311

@openclaw-barnacle openclaw-barnacle Bot added the docs, app: web-ui, gateway, size: L, and maintainer labels May 2, 2026
@steipete steipete force-pushed the codex/webchat-server-dictation branch from 05c682a to 520e7fe Compare May 2, 2026 09:36
@steipete
Contributor Author

steipete commented May 2, 2026

@clawsweeper automerge

@clawsweeper
Contributor

clawsweeper Bot commented May 2, 2026

Codex review: passed.

Summary
This PR adds WebChat server-side dictation through a new authenticated chat.transcribeAudio Gateway RPC, MediaRecorder composer controls, docs/changelog updates, and focused gateway/UI tests.
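As a rough illustration of the composer-side behavior summarized here (recording, stop/cancel on disconnect, and client-side oversized-audio rejection), the recording lifecycle can be modeled as a small state reducer. This is a sketch under assumptions, not the PR's UI code: the state and event names and the `MAX_AUDIO_BYTES` value are hypothetical, and the actual `MediaRecorder` wiring is left out so the logic stays testable outside a browser.

```typescript
// Hypothetical dictation states for the composer mic button.
type DictationState =
  | { phase: "idle" }
  | { phase: "recording"; bytes: number }
  | { phase: "transcribing" }
  | { phase: "error"; message: string };

// Hypothetical events; "chunk" would be fed from MediaRecorder dataavailable events.
type DictationEvent =
  | { type: "start" }
  | { type: "chunk"; size: number }
  | { type: "stop" }
  | { type: "disconnect" }
  | { type: "transcribed" };

// Assumed limit for illustration; the real bound is not stated in this thread.
const MAX_AUDIO_BYTES = 20 * 1024 * 1024;

function reduceDictation(
  state: DictationState,
  event: DictationEvent,
  maxBytes: number = MAX_AUDIO_BYTES,
): DictationState {
  switch (event.type) {
    case "start":
      // Only begin recording from a resting state.
      return state.phase === "idle" || state.phase === "error"
        ? { phase: "recording", bytes: 0 }
        : state;
    case "chunk":
      return state.phase === "recording"
        ? { phase: "recording", bytes: state.bytes + event.size }
        : state;
    case "stop":
      if (state.phase !== "recording") return state;
      if (state.bytes === 0) return { phase: "idle" };
      // Reject oversized audio before anything is sent over the WebSocket.
      if (state.bytes > maxBytes) {
        return { phase: "error", message: "Recording is too large to send" };
      }
      return { phase: "transcribing" };
    case "disconnect":
      // Losing the connection cancels any recording or pending transcription.
      return { phase: "idle" };
    case "transcribed":
      return state.phase === "transcribing" ? { phase: "idle" } : state;
  }
}
```

Keeping the transitions in a pure function means the disconnect and size-limit paths can be unit-tested without microphone access.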

Reproducibility: yes. Current main reproduces the missing feature by inspection: the Gateway method list, write scopes, docs, and WebChat voice-control test have no chat.transcribeAudio server-dictation path.

Next step before merge
No repair job is needed: the automerge-labeled PR has no actionable diff finding; the remaining step is the normal exact-head CI and mergeability gate.

Security
Cleared: The diff adds no dependencies, lockfile changes, workflow changes, or broader permissions; the new audio path is authenticated, size-bounded, temp-file scoped, and cleans up after transcription.
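One way to keep an upload size-bounded is to estimate the decoded size from the encoded length before allocating any buffer. The helper below is a minimal sketch that assumes the audio arrives as a base64 string. The thread does not state the encoding, and `estimateDecodedBytes` and `isWithinLimit` are hypothetical names rather than functions from the PR.

```typescript
// Estimate how many bytes a base64 string decodes to, without decoding it.
// Works for both padded input and input with the "=" padding stripped.
function estimateDecodedBytes(b64: string): number {
  const len = b64.length;
  if (len === 0) return 0;
  const padding = b64.endsWith("==") ? 2 : b64.endsWith("=") ? 1 : 0;
  return Math.floor((len * 3) / 4) - padding;
}

// Hypothetical guard: true when the payload is non-empty and fits the limit.
function isWithinLimit(b64: string, maxBytes: number): boolean {
  const size = estimateDecodedBytes(b64);
  return size > 0 && size <= maxBytes;
}
```

A check like this can run on the raw request before the decode step, so an oversized payload is rejected without a second large allocation.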

Review details

Best possible solution:

Land this PR through the automerge path after exact-head CI is green, then let the linked WebChat STT feature request close through the merge.

Do we have a high-confidence way to reproduce the issue?

Yes. Current main reproduces the missing feature by inspection: the Gateway method list, write scopes, docs, and WebChat voice-control test have no chat.transcribeAudio server-dictation path.

Is this the best way to solve the issue?

Yes, pending CI. Reusing the existing tools.media.audio pipeline on the Gateway is the narrow maintainable path because it keeps provider credentials server-side, avoids browser Web Speech, bounds uploads, and returns draft text without starting an agent run.

What I checked:

Likely related people:

  • steipete: Current-main history ties this person to browser realtime Talk transports, Gateway method surfaces, WebChat state helpers, and media-provider diagnostics touched by the PR. (role: recent maintainer and adjacent owner; confidence: high; commits: 93bbbe5e3732, b22926601fca, fa7de462610f; files: ui/src/ui/views/chat.ts, ui/src/ui/app.ts, src/gateway/server-methods-list.ts)
  • BunsDev: Recent current-main commits touch the same chat composer, session continuity, slash-menu accessibility, and app state surfaces where dictation UI is added. (role: recent Control UI chat maintainer; confidence: high; commits: 20cbc1f21683, 37aebf612b83, cb9d7884cca3; files: ui/src/ui/views/chat.ts, ui/src/ui/app.ts, ui/src/ui/app-chat.ts)
  • vincentkoc: Recent current-main work split media-understanding runtime contracts and maintained WebChat attachment surfaces used by the new transcription path. (role: media/runtime boundary maintainer; confidence: medium; commits: 6e74d77a42e9, 37bde69c17c6, 6b6dcafcee9c; files: src/media-understanding/runtime.ts, src/media-understanding/runtime-types.ts, ui/src/ui/views/chat.ts)

Remaining risk / open question:

  • Exact-head CI still had check-additional-boundaries in progress when checked, so merge should wait for that result.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 15bbf4f2f304.

@clawsweeper clawsweeper Bot added the clawsweeper:automerge label (Maintainer opted this PR into bounded ClawSweeper-reviewed automerge) May 2, 2026
@clawsweeper
Contributor

clawsweeper Bot commented May 2, 2026

ClawSweeper 🐠 automerge status

ClawSweeper took another look; no safe branch change was available on this pass.

Executor outcome: no planned fix actions.
Worker summary: this PR is already merged into current main at 68359ca, and linked issue #47311 is already closed by that fix path. No repair PR, close, or merge action remains for this cluster.

Worker actions:

  • keep_closed on this PR: skipped - Already merged and closed; blocked_actions also prohibit merge, so no mutating action is valid.
  • keep_closed on #47311: skipped - Already closed by the merged canonical PR; no close action is valid for an already-closed issue.

No push, rebase, replacement PR, merge, or ClawSweeper re-review happened on this pass.

fish notes: model gpt-5.5, reasoning high.

Automerge progress:

  • 2026-05-02 09:36:30 UTC review queued [`1d2dd49601bb`](https://github.com/openclaw/openclaw/commit/1d2dd49601bba498eafeb7b0fdcfa0f79a92f344) (queued)
  • 2026-05-02 21:43:23 UTC repair queued [`1d2dd49601bb`](https://github.com/openclaw/openclaw/commit/1d2dd49601bba498eafeb7b0fdcfa0f79a92f344) (autonomous) Run: https://github.com/openclaw/clawsweeper/actions/runs/25262609195
  • 2026-05-02 21:44:16 UTC review passed [`1d2dd49601bb`](https://github.com/openclaw/openclaw/commit/1d2dd49601bba498eafeb7b0fdcfa0f79a92f344) (structured ClawSweeper verdict: pass (sha=1d2dd49601bba498eafeb7b0fdcfa0f79a92f...)
  • 2026-05-02 21:47:29 UTC review queued [`1d2dd49601bb`](https://github.com/openclaw/openclaw/commit/1d2dd49601bba498eafeb7b0fdcfa0f79a92f344) (queued)
  • 2026-05-02 21:58:07 UTC repair completed [`2513eab6e00a`](https://github.com/openclaw/openclaw/commit/2513eab6e00ad0df70fee2f96a4a307208702329) (branch updated) in 12m 26s Run: https://github.com/openclaw/clawsweeper/actions/runs/25262609195 initial automerge rebase is delegated to Codex repair
  • 2026-05-02 21:58:07 UTC review queued [`2513eab6e00a`](https://github.com/openclaw/openclaw/commit/2513eab6e00ad0df70fee2f96a4a307208702329) (after repair)
  • 2026-05-02 09:36:30 UTC review queued [`2513eab6e00a`](https://github.com/openclaw/openclaw/commit/2513eab6e00ad0df70fee2f96a4a307208702329) (queued)
  • 2026-05-02 22:01:12 UTC repair queued [`2513eab6e00a`](https://github.com/openclaw/openclaw/commit/2513eab6e00ad0df70fee2f96a4a307208702329) (autonomous) Run: https://github.com/openclaw/clawsweeper/actions/runs/25262945998
  • 2026-05-02 22:02:03 UTC review passed [`2513eab6e00a`](https://github.com/openclaw/openclaw/commit/2513eab6e00ad0df70fee2f96a4a307208702329) (structured ClawSweeper verdict: pass (sha=2513eab6e00ad0df70fee2f96a4a307208702...)
  • 2026-05-02 22:49:19 UTC review queued [`2513eab6e00a`](https://github.com/openclaw/openclaw/commit/2513eab6e00ad0df70fee2f96a4a307208702329) (queued)
  • 2026-05-02 22:49:41 UTC repair queued [`2513eab6e00a`](https://github.com/openclaw/openclaw/commit/2513eab6e00ad0df70fee2f96a4a307208702329) (autonomous) Run: https://github.com/openclaw/clawsweeper/actions/runs/25263839465
  • 2026-05-02 23:04:51 UTC repair completed [`850571380af0`](https://github.com/openclaw/openclaw/commit/850571380af047343b61a3dc58a0bc1d85773201) (branch updated) in 12m 34s Run: https://github.com/openclaw/clawsweeper/actions/runs/25263839465 initial automerge rebase is delegated to Codex repair
  • 2026-05-02 23:04:51 UTC review queued [`850571380af0`](https://github.com/openclaw/openclaw/commit/850571380af047343b61a3dc58a0bc1d85773201) (after repair)
  • 2026-05-02 23:09:10 UTC review passed [`850571380af0`](https://github.com/openclaw/openclaw/commit/850571380af047343b61a3dc58a0bc1d85773201) (structured ClawSweeper verdict: pass (sha=850571380af047343b61a3dc58a0bc1d85773...)
  • 2026-05-02 23:09:26 UTC merged [`850571380af0`](https://github.com/openclaw/openclaw/commit/850571380af047343b61a3dc58a0bc1d85773201) (merged by ClawSweeper automerge)
  • 2026-05-02 23:11:34 UTC repair completed (no branch change) in 1s Run: https://github.com/openclaw/clawsweeper/actions/runs/25264114218 no planned fix actions

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 05c682ae3d

Comment thread ui/src/ui/app.ts Outdated
@clawsweeper clawsweeper Bot force-pushed the codex/webchat-server-dictation branch 3 times, most recently from 89ae12f to 1d2dd49 Compare May 2, 2026 21:40
@clawsweeper clawsweeper Bot requested a review from a team as a code owner May 2, 2026 21:40
@openclaw-barnacle openclaw-barnacle Bot added the scripts label May 2, 2026
@clawsweeper clawsweeper Bot force-pushed the codex/webchat-server-dictation branch from 1d2dd49 to 2513eab Compare May 2, 2026 21:58
@openclaw-barnacle openclaw-barnacle Bot added the cli label May 2, 2026
@clawsweeper clawsweeper Bot force-pushed the codex/webchat-server-dictation branch from 2513eab to 8505713 Compare May 2, 2026 23:04
@openclaw-barnacle openclaw-barnacle Bot removed the cli label May 2, 2026
@clawsweeper clawsweeper Bot merged commit 68359ca into main May 2, 2026
103 checks passed
@clawsweeper clawsweeper Bot deleted the codex/webchat-server-dictation branch May 2, 2026 23:09
lxe pushed a commit to lxe/openclaw that referenced this pull request May 6, 2026
Summary:
- This PR adds WebChat server-side dictation through a new authenticated `chat.transcribeAudio` Gateway RPC, MediaRecorder composer controls, docs/changelog updates, and focused gateway/UI tests.
- Reproducibility: yes. Current main reproduces the missing feature by inspection: the Gateway method list, write scopes, docs, and WebChat voice-control test have no `chat.transcribeAudio` server-dictation path.

ClawSweeper fixups:
- Included follow-up commit: feat(webchat): add server-side dictation
- Included follow-up commit: fix(clawsweeper): address review for automerge-openclaw-openclaw-7602…

Validation:
- ClawSweeper review passed for head 8505713.
- Required merge gates passed before the squash merge.

Prepared head SHA: 8505713
Review: openclaw#76021 (comment)

Co-authored-by: Peter Steinberger <steipete@gmail.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
Labels

app: web-ui, clawsweeper:automerge, docs, gateway, maintainer, scripts, size: L

Development

Successfully merging this pull request may close these issues.

[Feature Request] Add server-side STT option for webchat voice input

1 participant