feat(ios): clarify talk realtime fallback #91201

Merged: ngutman merged 2 commits into main from improve-ios-talk-runtime-transparency on Jun 7, 2026

ngutman commented Jun 7, 2026

Summary

  • Clarifies the iOS Talk experience when gateway realtime mode fails and the app falls back to native iOS speech recognition/TTS.
  • Adds a single generic gateway/runtime issue for realtime startup failures, preserving the safe provider message and diagnostics without classifying root cause in the app.
  • Shows configured vs active talk mode in iOS, plus a small fallback banner with details and diagnostics when realtime fails over while the gateway is still connected.
  • Keeps lifecycle events from being dropped in the gateway relay so iOS can reliably distinguish realtime ready/error/close outcomes.
  • Out of scope: changing provider credentials, adding new talk modes, or changing the normal successful realtime audio path beyond startup readiness tracking.

Linked context

Requested by maintainer in this thread after testing the confusing fallback cases on iOS.

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: iOS now makes realtime talk fallback visible instead of silently switching to native iOS speech/TTS when configured gateway realtime fails.
  • Real environment tested: Local iOS build installed and launched on Nimrod's iPhone; temporary gateway failure was tested against the running gateway during development and then the original gateway config was restored.
  • Exact steps or command run after this patch:
    • pnpm test src/gateway/talk-realtime-relay.test.ts src/gateway/server-methods/talk.test.ts
    • xcodebuild -project apps/ios/OpenClaw.xcodeproj -scheme OpenClaw -destination "platform=iOS Simulator,name=iPhone 17" -configuration Debug -derivedDataPath apps/ios/build/simulator -only-testing:OpenClawTests/RealtimeTalkRelaySessionTests -only-testing:OpenClawTests/TalkModeManagerTests test
    • swiftformat --lint --config config/swiftformat apps/ios/Sources/Voice/RealtimeTalkRelaySession.swift apps/ios/Tests/RealtimeTalkRelaySessionTests.swift
    • swiftlint on touched iOS source files via script input files
    • git diff --check
    • codex review --uncommitted before commit
    • .agents/skills/autoreview/scripts/autoreview --mode local after simplifying fallback issues
    • xcodebuild -project apps/ios/OpenClaw.xcodeproj -scheme OpenClaw -destination "id=00008140-000848A92EE3001C" -configuration Debug -derivedDataPath apps/ios/build/phone build
    • xcrun devicectl device install app --device 6A19D82B-4EA6-5D67-B3A7-0AB3B71B550C apps/ios/build/phone/Build/Products/Debug-iphoneos/OpenClaw.app
    • xcrun devicectl device process launch --device 6A19D82B-4EA6-5D67-B3A7-0AB3B71B550C ai.openclaw.ios.test.guti-gzs353x62e
  • Evidence after fix: The failure banner was manually verified on the physical phone during temporary gateway failover testing; the signed app then built, installed, and launched successfully on the same phone after the branch was pushed.
  • Observed result after fix: The app displays the native speech fallback state and a visible runtime issue banner instead of silently hiding gateway realtime failure.
  • What was not tested: Full end-to-end successful provider audio on every realtime provider/model combination.
  • Proof limitations or environment constraints: The first phone launch attempt was denied because the device was locked; after unlocking, launch succeeded.
  • Before evidence: Previously, failed gateway realtime startup could immediately fall back to native speech/TTS without clear in-app indication.

Tests and validation

  • Gateway regression tests passed: 136 tests.
  • iOS selected simulator tests passed: 25 tests across RealtimeTalkRelaySessionTests and TalkModeManagerTests.
  • SwiftFormat lint: clean.
  • SwiftLint on touched iOS files: 0 violations.
  • git diff --check: clean.
  • Codex/autoreview: no actionable correctness issues after accepted fixes were applied and after the generic fallback simplification.
  • Physical-device iOS Debug build, install, and launch succeeded.

Regression coverage added/updated:

  • Gateway relay tests now cover generic talk issue events and non-droppable lifecycle events.
  • iOS relay tests now cover startup cancellation and startup timeout budget behavior.
  • iOS config parsing tests cover generic talk issue parsing and fallback banner state.

Risk checklist

Did user-visible behavior change? (Yes/No)

Yes.

Did config, environment, or migration behavior change? (Yes/No)

No.

Did security, auth, secrets, network, or tool execution behavior change? (Yes/No)

Yes, only in error reporting for talk provider/auth failures. No secrets are surfaced; diagnostics use a generic issue code plus safe issue metadata.

What is the highest-risk area?

Realtime relay startup sequencing and generic failure propagation between gateway events and iOS fallback.

How is that risk mitigated?

The iOS relay now subscribes before session creation, waits for provider ready before reporting realtime startup success, and preserves the first realtime issue. The gateway keeps lifecycle events non-droppable. Focused gateway and iOS tests cover these cases.
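
One way to picture the non-droppable lifecycle guarantee is a bounded outbound buffer that sheds only high-volume events. This is a minimal TypeScript sketch under stated assumptions: `RelayEvent`, `RelayEventBuffer`, and the oldest-first eviction policy are hypothetical and are not taken from `talk-realtime-relay.ts`.

```typescript
// Hypothetical event shape for illustration; the real TalkEvent schema is richer.
type RelayEvent =
  | { kind: "audio"; seq: number }
  | { kind: "ready" }
  | { kind: "error"; message: string }
  | { kind: "closed" };

// Lifecycle events tell the client how the session turned out, so they are never evicted.
function isLifecycle(event: RelayEvent): boolean {
  return event.kind === "ready" || event.kind === "error" || event.kind === "closed";
}

class RelayEventBuffer {
  private events: RelayEvent[] = [];
  private maxDroppable: number;

  constructor(maxDroppable: number) {
    this.maxDroppable = maxDroppable;
  }

  push(event: RelayEvent): void {
    this.events.push(event);
    if (isLifecycle(event)) return;
    // Over budget: evict the oldest droppable events and leave lifecycle events in place.
    let droppable = this.events.filter((e) => !isLifecycle(e)).length;
    while (droppable > this.maxDroppable) {
      const oldest = this.events.findIndex((e) => !isLifecycle(e));
      this.events.splice(oldest, 1);
      droppable -= 1;
    }
  }

  // Hands all buffered events to the sender in arrival order and empties the buffer.
  drain(): RelayEvent[] {
    const out = this.events;
    this.events = [];
    return out;
  }
}
```

With a budget of two droppable events, pushing `ready`, three audio frames, and then `error` leaves `ready`, the two newest audio frames, and `error` in the buffer, so the client can still tell ready, error, and close outcomes apart.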

Current review state

What is the next action?

Maintainer review and GitHub checks.

What is still waiting on author, maintainer, CI, or external proof?

Waiting on PR CI and maintainer review. No known author-side loose ends.

Which bot or reviewer comments were addressed?

Local Codex review findings were addressed before this PR: startup cancellation hang, droppable lifecycle events, and startup wait budget shorter than gateway provider connect budget. Maintainer feedback to remove semantic failure classification was addressed by collapsing fallback reporting to one generic realtime_unavailable issue.

Agent Transcript

Redacted codex session transcript
source: [LOCAL_SESSION]
redaction: local paths, emails, phone-shaped strings, token-shaped strings, auth headers, auth query params
omitted: raw tool outputs, system/developer prompts, local paths, secrets, browser/session/auth details
stats: {"agent":"codex","entries":27,"user":13,"assistant":13,"toolCalls":580,"toolOutputsDropped":579,"web":0,"redactions":10,"omittedUnsafe":0,"rawEntries":423}

[user]
I want to improve the iOS talk experience -
Today, when several talk options are configured on the gateway and one of them fails, the experience in the iOS app is not clear. For example, if I configure GPT Realtime 2.0 in gateway relay mode and it fails, iOS falls back to regular speech recognition and TTS, and nothing makes that clear. Also, if the API key fails in the gateway and realtime mode closes the socket immediately, there is no indication in the app. Please help spec and plan improvements to this.

[assistant]
I’ll ground the plan in the current iOS/gateway talk flow first, then turn it into a concrete product/technical spec with implementation phases and verification points.

The broad search is too noisy, but it confirms there is already a `src/talk` event/diagnostics layer and mobile voice E2E scaffolding. I’m narrowing to iOS app files and gateway protocol/talk event contracts now.

[instructions recap omitted; policy/config text, not task dialogue]

The current protocol already has `talk.session.create` and `TalkEvent` schemas, and iOS advertises `talk` plus speech-recognition permission state. The likely gap is not capability discovery, but explicit session outcome reporting and a UI state machine that can distinguish “realtime active”, “trying realtime”, “fell back to local STT/TTS”, and “failed before fallback.”

A concrete thing from the server path: realtime relay creation is all-or-error during `talk.session.create`; if provider resolution or bridge creation throws, the gateway returns `UNAVAILABLE`. After creation, later provider/socket failure is only represented through session events, so the iOS side needs to surface those instead of silently choosing another path.

The iOS relay session currently sets status to `Connecting realtime…`, then `Listening (Realtime)` only after create, event subscription, audio contract, and mic pump succeed. The unclear path likely happens one level up in `TalkModeManager`, where realtime start failure falls through to native speech without preserving a visible degraded reason.

I have enough context to draft this without changing files. I’m going to frame it around one canonical “Talk runtime state” instead of more status strings, because the current state is spread across `statusText`, configured mode labels, and relay session callbacks.

I’d spec this as a **Talk runtime transparency** improvement: the app should always show the configured voice path, the active voice path, and any degradation reason.

Current behavior from the code: realtime start can fail and return `false`, then `TalkModeManager` starts the native speech path and overwrites status with plain `Listening` at `apps/ios/Sources/Voice/TalkModeManager.swift:301` and `apps/ios/Sources/Voice/TalkModeManager.swift:320`. Relay errors also flow through a plain string callback, while relay close becomes `Ready` at `apps/ios/Sources/Voice/RealtimeTalkRelaySession.swift:332`.

**Spec**
- Add an explicit iOS `TalkRuntimeState`, separate from `statusText`:
  - `off`
  - `starting(requestedMode)`
  - `active(mode, provider, model, transport)`
  - `degraded(requestedMode, activeMode, reason, retryAvailable)`
  - `failed(requestedMode, reason, recoveryAction)`
- Derive `statusText` from that state instead of using ad hoc strings.
- When realtime fails and native STT/TTS starts, show:
  - Runtime: `Listening (iOS Speech fallback)`
  - Banner/chip: `GPT Realtime 2.0 unavailable. Using iOS speech.`
  - Detail: `Gateway realtime failed: invalid API key`, or another classified reason.
- When gateway relay closes before ready, do not show `Ready`; show:
  - `Realtime failed before connecting`
  - If fallback is allowed and speech permissions are available, transition to degraded fallback.
  - If fallback is unavailable, keep a visible failed state with `Retry` and `Open Settings`.
- Keep “configured mode” and “active mode” separate in UI. Example:
  - Configured: `GPT Realtime 2.0 • Gateway Relay`
  - Active now: `iOS Speech + system TTS fallback`
- Add a small `Last realtime issue` row in Talk settings/debug, with timestamp, provider, model, transport, and reason. No secrets.
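
The state list above maps naturally onto a tagged union with `statusText` derived from it. The real type would live in the Swift voice module; the following is only a TypeScript rendering for illustration. The `TalkMode` values, the field types, and the `Off` and `Starting…` strings are assumptions, while the other status strings are the ones quoted in this spec.

```typescript
// Hypothetical mode names; the app may distinguish more transports than this.
type TalkMode = "realtime" | "native";

type TalkRuntimeState =
  | { kind: "off" }
  | { kind: "starting"; requestedMode: TalkMode }
  | { kind: "active"; mode: TalkMode; provider: string; model: string; transport: string }
  | { kind: "degraded"; requestedMode: TalkMode; activeMode: TalkMode; reason: string; retryAvailable: boolean }
  | { kind: "failed"; requestedMode: TalkMode; reason: string; recoveryAction: "retry" | "openSettings" };

// Single place where user-visible status strings are produced from the state.
function statusText(state: TalkRuntimeState): string {
  switch (state.kind) {
    case "off":
      return "Off"; // assumed label
    case "starting":
      return state.requestedMode === "realtime" ? "Connecting realtime…" : "Starting…";
    case "active":
      return state.mode === "realtime" ? "Listening (Realtime)" : "Listening";
    case "degraded":
      return "Listening (iOS Speech fallback)";
    case "failed":
      return "Realtime failed before connecting";
  }
}
```

Because every case is enumerated, adding a new state forces a decision about its label at compile time instead of leaving a stale ad hoc string behind.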

**Gateway/API Plan**
1. Use the existing `TalkEvent` contract for structured failure events. It already supports `session.error`, `session.closed`, and `health.changed` in `src/talk/talk-events.ts`, and the protocol schema accepts provider/mode/transport context at `packages/gateway-protocol/src/schema/channels.ts:148`.
2. Make realtime relay emit a structured `session.error` before closing when provider auth, model, websocket, or startup fails.
3. Classify gateway/provider failures into stable codes:
   - `credential_invalid`
   - `credential_missing`
   - `provider_unavailable`
   - `model_unavailable`
   - `gateway_config_invalid`
   - `network_error`
   - `provider_closed_before_ready`
   - `unknown`
4. Include safe metadata only: `provider`, `model`, `transport`, `phase`, `message`, `code`.
5. For `talk.session.create`, return the same classified error details when creation fails around provider resolution / bridge creation at `src/gateway/server-methods/talk-session.ts:294`.
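
Items 3 and 4 can be sketched as a classifier plus a payload builder that copies only allow-listed fields. Everything about the input here is an assumption: `RawFailure`, its fields, the `phase` values, and the HTTP-status heuristics are hypothetical and only illustrate the idea. The PR description also records that this classification was later collapsed into a single generic `realtime_unavailable` issue.

```typescript
type TalkIssueCode =
  | "credential_invalid"
  | "credential_missing"
  | "provider_unavailable"
  | "model_unavailable"
  | "gateway_config_invalid"
  | "network_error"
  | "provider_closed_before_ready"
  | "unknown";

// Safe metadata only: no headers, keys, or raw provider payloads.
interface TalkIssue {
  code: TalkIssueCode;
  phase: "create" | "connect" | "session"; // hypothetical phase names
  message: string;
  provider: string;
  model: string;
  transport: string;
}

// Hypothetical normalized failure as the relay might see it.
interface RawFailure {
  message: string; // assumed to be sanitized already by the caller
  httpStatus?: number;
  apiKeyConfigured?: boolean;
  closedBeforeReady?: boolean;
}

// Illustrative heuristics only; real providers may signal these cases differently.
function classifyFailure(failure: RawFailure): TalkIssueCode {
  if (failure.apiKeyConfigured === false) return "credential_missing";
  if (failure.httpStatus === 401 || failure.httpStatus === 403) return "credential_invalid";
  if (failure.httpStatus === 404) return "model_unavailable";
  if (failure.httpStatus !== undefined && failure.httpStatus >= 500) return "provider_unavailable";
  if (failure.closedBeforeReady === true) return "provider_closed_before_ready";
  return "unknown";
}

function toTalkIssue(
  failure: RawFailure,
  phase: TalkIssue["phase"],
  context: { provider: string; model: string; transport: string },
): TalkIssue {
  // Copy fields one by one so nothing outside the allow-list can leak through.
  return {
    code: classifyFailure(failure),
    phase,
    message: failure.message,
    provider: context.provider,
    model: context.model,
    transport: context.transport,
  };
}
```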

**iOS Implementation Plan**
1. Add `TalkRuntimeState` and `TalkRuntimeIssue` in the voice module.
2. Change `startRealtimeIfAvailable()` and `startRealtimeRelayIfAvailable()` to return a structured result instead of `Bool`, for example:
   - `.started`
   - `.unavailable(issue)`
   - `.ignored`
3. When fallback starts, preserve the realtime issue and set degraded state instead of overwriting status with plain `Listening`.
4. Replace `RealtimeTalkRelaySession`’s `onStatus(String)` with a structured callback:
   - `.ready`
   - `.transcript(role, final)`
   - `.outputStarted`
   - `.error(issue)`
   - `.closed(reason, wasReady)`
5. Update TalkPro UI and compact chat orb to render active/degraded/failed state clearly.
6. Add retry behavior:
   - `Retry Realtime` attempts only the configured realtime path.
   - Toggle off/on clears the issue.
   - Successful realtime start clears degraded state.
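
The core of steps 2 and 3 is that a failed realtime start returns a value carrying the issue, and the fallback decision consumes that value instead of a bare `Bool`. Here is a minimal TypeScript sketch of that decision, assuming hypothetical names (`RealtimeIssue`, `RuntimeOutcome`, `resolveAfterRealtimeStart`) and assuming `ignored` means realtime was not configured for this session; the actual change is in Swift.

```typescript
interface RealtimeIssue {
  code: string;
  message: string;
}

type RealtimeStartResult =
  | { kind: "started" }
  | { kind: "unavailable"; issue: RealtimeIssue }
  | { kind: "ignored" };

type RuntimeOutcome =
  | { kind: "active"; label: string }
  | { kind: "degraded"; label: string; issue: RealtimeIssue }
  | { kind: "failed"; label: string; issue: RealtimeIssue };

function resolveAfterRealtimeStart(
  result: RealtimeStartResult,
  nativeSpeechAvailable: boolean,
): RuntimeOutcome {
  switch (result.kind) {
    case "started":
      return { kind: "active", label: "Listening (Realtime)" };
    case "ignored":
      // Assumed meaning: realtime is not configured, so native speech is not a fallback.
      return { kind: "active", label: "Listening" };
    case "unavailable":
      // The issue travels with the outcome, so the UI can show why realtime is not active.
      return nativeSpeechAvailable
        ? { kind: "degraded", label: "Listening (iOS Speech fallback)", issue: result.issue }
        : { kind: "failed", label: "Realtime failed before connecting", issue: result.issue };
  }
}
```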

**Acceptance Criteria**
- Bad gateway OpenAI API key: app shows realtime failure reason, then either degraded iOS speech fallback or failed state.
- Gateway relay socket closes immediately: app does not show `Ready`; it shows `Realtime failed before connecting`.
- Realtime configured but native fallback active: Talk UI clearly says fallback is active.
- Successful realtime session shows `Listening (Realtime)` and active provider/model.
- No secrets are rendered or logged.

**Verification**
- Swift tests: `TalkModeManager` fallback state, relay close-before-ready, relay error event.
- Gateway tests: `talk.session.create` auth/provider failure classification, relay emits `session.error` before close.
- Manual E2E: configured `gpt-realtime-2` with invalid key, confirm visible degraded/failed state on iOS.

[user]
implement this in a new branch

[assistant]
I’ll create a feature branch first, then make the iOS state changes and the gateway/protocol failure classification changes together so the app can consume structured failure data when it exists.

Using `openclaw-testing` for the validation path; this is a normal checkout, so I’ll keep proof targeted to the touched Swift and gateway protocol/tests unless the diff expands.

Branch `improve-ios-talk-runtime-transparency` is active. I’m reading the generated protocol structs and existing iOS tests now so the patch fits the current decoding patterns instead of adding a parallel protocol layer.

I found the key iOS issue: realtime start failure only logs and returns `false`, so the fallback path replaces it with `Listening`. I’m going to add a small issue/result model and keep the previous config labels intact while exposing the active degraded runtime.

Implementation plan:
- Add a typed iOS `TalkRuntimeState`/`TalkRuntimeIssue` and expose active/degraded labels to the Talk UI.
- Change realtime start helpers to return `started` or `unavailable(issue)` so native fallback keeps the realtime failure visible.
- Make relay error/close callbacks structured, especially close-before-ready.
- Add gateway relay failure metadata to `talk.event` error payloads and request errors where practical.
- Add targeted Swift and TypeScript tests around state formatting and relay error classification.

I’m starting with the iOS model because that fixes the confusing fallback UX even before every gateway error has perfect metadata. The gateway patch will enrich the reason text where available.

I’m editing the existing Swift voice files rather than adding a new file, so the Xcode project does not need source-list churn. The visible UI will get “Configured mode” and “Active now” rows plus a last issue detail.

Next I’m wiring the manager: realtime start failures will become a `TalkRuntimeIssue`, and the native path will render as an explicit iOS Speech fallback instead of plain listening.

I’m changing only the decision boundary around realtime startup: if realtime fails, the fallback can still start, but it carries the realtime failure forward into UI state and diagnostics.

Now I’m updating the two realtime start helpers and adding a few small helpers for active/degraded labels. This is intentionally local to Talk mode state, not a new config surface.

The helper methods are going in the existing descriptor section, where configured voice labels are already normalized. That keeps the active/degraded labels using the same provider/model naming rules.

I’m adding the UI rows now: configured mode remains visible, active mode is separate, and a last-issue row appears only when there is an actual degraded/failure reason.

I’m avoiding new icons where possible to reduce SF Symbol compatibility risk across iOS targets; the functional change is the active/degraded text, not icon art.

I’m also adding the same active/runtime visibility in Settings, since that’s where someone debugging an API-key or gateway relay issue will naturally check.

The settings rows need compact computed strings, so I’m placing those next to the existing Talk settings helper properties rather than threading more view state.

I’m adding those Settings helper properties in `SettingsProTabActions.swift`, which already owns the Talk config display helpers.

The new Settings values will collapse empty subtitles, so normal users see one extra useful row and only see “Last Voice Issue” after a real degradation.

I’m adding reset behavior now: toggling Talk off clears the degraded issue, while a fallback issue remains visible while Talk stays enabled.

I’m checking the WebRTC delegate path too, so direct realtime and relay realtime report active mode consistently.

The WebRTC delegate currently only mirrors raw provider status. I’m making the “Listening” realtime status use the same active-mode helper so direct WebRTC doesn’t look like native speech.

When realtime ends normally, I’m clearing the active mode back to “Not active” so Settings doesn’t imply an ended session is still live.

I’m adding test-only hooks for the new state formatting rather than trying to drive AVAudio/Speech permissions in unit tests. That keeps tests deterministic.

The tests will pin two user-visible behaviors: auth-style errors classify as credential issues, and fallback state renders as iOS Speech fallback with a retained issue summary.

I’m enriching the gateway relay events now. The existing relay already emits `session.error`; I’m adding stable error codes and making the connect-time rejection include the Talk event payload instead of only a raw message.

The gateway side now has a relay issue classifier. I’m wiring it into provider error, close-before-ready, and bridge connect failure paths so iOS can show a stable reason.

I’m checking the request error helper before changing it; if the gateway already preserves `ErrorShape.details`, I’ll add classified details only at this Talk creation boundary.

`talk-session.ts` has its own response helper, so I’m patching that local helper rather than widening generic gateway error behavior.

For request-time failures, I’m adding structured `details.talkIssue` to the `UNAVAILABLE` response. Existing clients still get the same top-level message; newer iOS can classify from the message even if it cannot read details yet.
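
A rough sketch of that response shape and of a tolerant reader on the client side follows. The `UNAVAILABLE` code and the `details.talkIssue` key come from the transcript; the field layout of `ErrorShape` and `TalkIssue` and the helper names are assumptions for illustration.

```typescript
interface TalkIssue {
  code: string;
  message: string;
}

// Assumed minimal shape of a gateway request error.
interface ErrorShape {
  code: string;
  message: string;
  details?: Record<string, unknown>;
}

// Gateway side: the top-level message is unchanged, the structured issue rides in details.
function unavailableWithTalkIssue(message: string, issue: TalkIssue): ErrorShape {
  return { code: "UNAVAILABLE", message, details: { talkIssue: issue } };
}

// Client side: return the issue if present and well-formed, otherwise undefined.
function readTalkIssue(error: ErrorShape): TalkIssue | undefined {
  const raw = error.details?.["talkIssue"];
  if (typeof raw !== "object" || raw === null) return undefined;
  const candidate = raw as Record<string, unknown>;
  const code = candidate["code"];
  const message = candidate["message"];
  if (typeof code !== "string" || typeof message !== "string") return undefined;
  return { code, message };
}
```

A client that predates the change ignores `details` and still sees the same `message`, which is what keeps the addition backward compatible.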

I’m looking for the existing Talk gateway tests before adding new ones, so the added assertions land in the current suite instead of a new standalone harness.

I’m continuing from the implementation branch and checking the gateway test surface before running focused proof. The remaining risk is making sure the new failure metadata is covered by tests and does not break existing relay behavior.

The iOS side now keeps both the configured mode and the actual active mode, so fallback is visible instead of silently looking like the selected realtime option is running. I’m adding/adjusting gateway tests now so auth and early-close failures carry structured issue codes to the app.

The existing relay tests already exercise provider close paths, so I can extend that coverage rather than create a separate harness. I’m also going to cover the `talk.session.create` failure response because that is the request-time path the iOS app sees before a relay is established.

I found one cleanup issue in my own gateway change: request-time errors were annotated, but provider/model context is only available inside the realtime create branch. I’m going to keep the public detail shape narrow while adding the missing focused tests.

I’m editing the tests and keeping the new behavior scoped to the existing Talk relay/session modules. No commits or pushes will happen unless you explicitly ask.

Relay error coverage is in place for both connect failures and pre-ready closes. Next I’m adding the request-time `talk.session.create` coverage so iOS can surface a failed API key even when the socket never gets a usable relay session.

The UI diff is intentionally simple: the Talk screen now shows “Configured” separately from “Active now”, and Settings mirrors the active voice plus the last failure detail. I’m checking the supporting JSON model and build syntax next because these Swift paths are sensitive to small type issues.

`git diff --check` is clean. I’m running focused gateway tests first, then I’ll run the iOS test target for the Talk config/relay changes if the simulator setup cooperates.

The first gateway test command failed because `--runInBand` is a Jest flag and Vitest rejects it here. I’m rerunning the same focused test set with the repo wrapper’s supported flags only.

The focused gateway tests exposed a real parser error in `talk-realtime-relay.ts`, so I’m fixing that before continuing. The failure is in the edited callback object around the new `onError` block.

That parser issue was just a stray comma after an `emit(...)` call. I’ve corrected it and am rerunning the focused gateway tests.

The rerun shows the existing gateway error formatter includes the `Error:` prefix. I’m updating the new assertion to match that contract rather than changing the formatter behavior.

The new gateway assertions now follow the existing formatted error contract. I’m rerunning the focused gateway tests again.

Gateway proof is green now: 132 focused tests passed. I’m checking the iOS project schemes so I can run the narrow Swift test target for the Talk changes.

The project has an `OpenClaw` scheme with `OpenClawTests`. I’m checking available simulators so the narrow Talk test run uses an installed destination.

I found available iOS simulators and am running only the Talk-related XCTest/Swift Testing suites on `iPhone 16e` for iOS 18.6.

The iOS build did not reach tests because the project’s SwiftFormat lint caught two edited files. I’m applying the formatter to just those Swift files, then I’ll rerun the same narrow iOS tests.

The lint errors are all formatter style issues in my edits. I’m running SwiftFormat on only `TalkModeGatewayConfig.swift` and `TalkModeManager.swift`.

SwiftFormat updated only the two files it flagged. I’m quickly reviewing the changed hunks, then I’ll rerun the same iOS test command.

The formatter changes are clean and `git diff --check` still passes. I’m rerunning the same iOS Talk test command now.

The formatter issue is fixed, but the iOS app target currently fails earlier on unrelated missing `AppleReviewDemo*` symbols before the Talk tests can run. I’m checking whether there is a narrower logic-test target or build flag that avoids that app-target failure without touching unrelated code.

The app-target iOS test is still blocked by existing unrelated `AppleReviewDemoMode` / `AppleReviewDemoChatTransport` build errors. A logic-test scheme runs but does not include these app Talk tests, so I’m treating app-target proof as blocked and will keep the successful gateway proof plus SwiftFormat lint proof for the touched iOS files.

The touched Swift files pass the project SwiftFormat lint now. I’m doing a final status/numstat pass so the handoff includes exact scope and remaining proof gap.

The implementation is complete enough for handoff, but repo policy requires a fresh autoreview before closeout for code changes. I’m running that review pass now and will address any concrete findings it produces.

Using the repo-local `$autoreview` workflow now: `.agents/skills/autoreview/scripts/autoreview --mode local`. This may take a while; I’ll keep the review output actionable rather than dumping noise.

The autoreview helper is still running. I’m letting it continue; no findings have returned yet.

Autoreview is healthy and reporting heartbeat at 60 seconds. I’m continuing to wait for the structured result.

Autoreview found a real bug: after a classified relay `error`, iOS would process the following `close` and overwrite the specific issue with a generic “closed before ready.” I’m fixing that by tracking whether an error was already received and preserving the original issue on close.

I’m adding a narrow regression test around the exact event sequence from the review finding: classified `error`, then `close` before ready. The expected behavior is that the credential issue remains the only reported issue.

The runtime fix is in: a close event no longer replaces a classified error if one already arrived. I’m exposing a test-only event hook and adding the regression assertion.

The regression test now exercises the reviewed failure sequence. I’m formatting the touched Swift files and then rerunning the relevant focused proof.

SwiftFormat made no further changes. Since the app target is already blocked by unrelated missing Apple Review demo symbols, I’m rerunning lint for touched Swift files and the gateway tests; then I’ll rerun autoreview as required after the fix.

Focused gateway tests and SwiftFormat lint are green after the autoreview fix. I’m rerunning autoreview now to confirm there are no remaining accepted/actionable findings.

The second autoreview pass is running. I’m waiting for the structured result.

Autoreview heartbeat is healthy at 60 seconds; still waiting.

The second review found the same class of bug on the gateway side: `onError` followed by provider `onClose` could emit two error events. I’m adding a server-side `failureEmitted` guard and a test that only the actionable provider error is sent.

The gateway now tracks whether it already emitted a provider failure. I’m adding the regression test for `onError` followed by `onClose` so this doesn’t regress.
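
The guard can be sketched as follows. The `failureEmitted` flag, the `session.error` event name, and the generic `realtime_unavailable` code appear elsewhere in this PR; the `createRelayFailureReporter` helper, its callback names, and the event fields are hypothetical stand-ins for the real relay wiring.

```typescript
interface TalkErrorEvent {
  type: "session.error";
  code: string;
  message: string;
}

function createRelayFailureReporter(emit: (event: TalkErrorEvent) => void) {
  let ready = false;
  let failureEmitted = false;

  // Only the first failure is reported, so a later close cannot mask the actionable error.
  function emitOnce(code: string, message: string): void {
    if (failureEmitted) return;
    failureEmitted = true;
    emit({ type: "session.error", code, message });
  }

  return {
    onReady(): void {
      ready = true;
    },
    onError(message: string): void {
      emitOnce("realtime_unavailable", message);
    },
    onClose(): void {
      // A close after ready is a normal end of session, not a failure.
      if (!ready) emitOnce("realtime_unavailable", "Realtime closed before ready");
    },
  };
}
```

With this guard, `onError` followed by `onClose` produces exactly one event, and it carries the provider error message.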

The gateway-side regression test is in. I’m rerunning the focused gateway suite and diff checks before a third autoreview pass.

Gateway tests are green again, now 136 tests with the added regression coverage. I’m rerunning autoreview once more and will stop when it returns clean.

Third autoreview pass is running. I’m waiting for the structured result.

Autoreview heartbeat is healthy at 60 seconds; continuing to wait.

The review is still running without a new finding payload. I’m giving it more time under the helper’s expected window.

Autoreview is still healthy at 120 seconds. I’m continuing to wait for completion.

Autoreview is clean now. I’m collecting final status and scope so the closeout is precise.

Implemented on new branch `improve-ios-talk-runtime-transparency`. No commit or push was made.

What changed:
- Gateway relay now emits structured Talk issue details for realtime failures: credential/model/provider/network/pre-ready close codes in `src/gateway/talk-realtime-relay.ts`.
- `talk.session.create` failures now include `details.talkIssue` so iOS can surface request-time failures like bad API keys in `src/gateway/server-methods/talk-session.ts`.
- iOS now separates configured voice mode from active runtime mode and shows fallback clearly in `apps/ios/Sources/Design/TalkProTab.swift` and settings.
- iOS preserves the actionable realtime failure when falling back to native speech/TTS, including the error-then-close sequencing fixes in `apps/ios/Sources/Voice/RealtimeTalkRelaySession.swift`.
- Added regression coverage in gateway tests and iOS Talk tests.

Verification:
- Passed: `pnpm test src/gateway/talk-realtime-relay.test.ts src/gateway/server-methods/talk.test.ts` (`136` tests).
- Passed: `swiftformat --lint` on touched iOS files.
- Passed: `git diff --check`.
- Passed: `.agents/skills/autoreview/scripts/autoreview --mode local` clean.

iOS app-target test note:
- `xcodebuild test ... OpenClawTests/TalkModeManagerTests ...` is currently blocked before Talk tests run by unrelated existing missing symbols: `AppleReviewDemoMode` and `AppleReviewDemoChatTransport`.

[user]
$codex-review

[assistant]
Using `$codex-review` as requested. Since this is still uncommitted local work, I’m running the bundled helper in local mode and disabling its broad auto-check so the review target stays on the dirty patch we already tested.

The helper exists but is not executable in this install. I’m checking the script wrapper and then I’ll run it through its interpreter rather than changing permissions unless needed.

The helper is a Bash script without the executable bit, so I’m invoking it with `bash` directly.

Codex review is still inspecting the patch. It has read the changed iOS and gateway paths and is continuing.

Codex review is still running and has not returned findings yet. I’m continuing to let it complete.

I’m continuing from the codex-review run that was already in progress. I’ll wait for its verdict, then only change code if it raises a finding that holds up against the current implementation.

The review is doing a fairly broad read, including gateway session handling and iOS runtime state. It has already run the focused gateway tests successfully from inside the review process; I’m waiting for the final findings block.

No final verdict yet. I’m keeping the review process alive rather than starting another overlapping check, since this helper shells out to Codex and can take a while on a multi-surface diff.

Still running without new stdout. If it returns actionable findings, I’ll verify them against the code before patching; otherwise I’ll close out with the clean review result and the proof that did run.

The review has been quiet for a few polling intervals, which usually means the nested reviewer is still reasoning rather than blocked on a prompt. I’m giving it more time before considering it hung.

It remains active but silent. I’m going to poll with a longer wait once; if there’s still no output, I’ll inspect whether the nested process is genuinely still doing work or stuck.

The long poll produced no new output, so I’m checking the process tree now. That will tell me whether the nested review is still consuming work or has stalled behind a child process.

There are two active `codex-review` helper trees from this checkout, likely from the interrupted/resumed turn plus the current rerun. I’m checking their activity before deciding whether to wait or rely on a fresh single run.

The active review process is still consuming CPU, so it is not hung. I’ll let that complete rather than starting a third reviewer.

Codex Review raised one actionable finding: a fast relay error during startup can be recorded and then immediately cleared by `markRealtimeActive()` before the gateway ever sends `ready`. I’m fixing that by treating relay `ready` as the only signal that makes realtime active, and by returning the captured startup issue to the fallback path.

I’m making a narrow fix in `TalkModeManager`: relay startup will no longer mark realtime active after local setup. It will only become active when the relay emits `ready`, and if an issue arrived during startup it returns `.unavailable(issue)` so the existing fallback state path can keep the reason visible.

To keep the regression covered without standing up a fake gateway socket, I’m extracting the relay status handling into a small private method and exposing it only through the existing DEBUG test surface. That lets the test prove a relay issue survives until the actual `ready` status arrives.

I’m adding the minimal test hook now, scoped to DEBUG alongside the other TalkModeManager test-only methods.

The test will cover the exact state transition behind the finding: a captured startup issue is visible while relay status is still connecting, and only a real `Listening (Realtime)` ready status clears it.

The code change is in place. I’m running formatting and the focused gateway test lane again, then I’ll rerun Codex Review because the accepted finding changed code.

Focused checks passed after the fix. I’m checking for leftover nested review processes before the required rerun, because the interrupted review left more than one helper process in the tree earlier.

No active review processes remain. I’m rerunning the helper cleanly now with auto tests disabled, since the focused proof already ran and the helper’s default full check would be too broad for this dirty iOS/gateway patch.

The rerun is active and reading the updated diff. I’ll wait for its final findings before making any more changes.

The rerun is looking at the relay start path now. One thing I’m watching closely is that `RealtimeTalkRelaySession.start()` currently emits a local “Listening (Realtime)” after microphone setup; if the reviewer flags that as still pre-ready, I’ll adjust that status source rather than broadening the state machine.

The reviewer has inspected the shared gateway request error path and the full relay lifecycle. I’m still waiting for the verdict; no change will be made until it returns.

The review is still active. It is checking two likely edge paths: request-time gateway error details and relay cleanup after a pre-ready failure. I’ll keep the rerun single-threaded and wait for the final output.

The rerun found a real stale-state bug: after a healthy relay closes, the new “Active now” field can keep showing realtime. I’m fixing that and also tightening the related readiness signal so only the gateway `ready` event marks relay realtime as active.

Now I’m clearing both the visible active state and the closed relay object when the relay reports `Ready` after a close, so the next start cannot mistake a closed relay for an active one.

I’m extending the iOS state test to cover the close path Codex flagged: active realtime after `Listening (Realtime)`, then inactive after `Ready`.

The accepted finding is patched. I’m rerunning the same focused checks, then I’ll rerun Codex Review again as required.

The focused checks are green again. I’m starting the final Codex Review rerun; if this one exits cleanly, I’ll stop there per the skill contract.

The final rerun is still active. It has reviewed the updated close-state handling and relay readiness path; I’m waiting for the final clean-or-findings line.

The latest finding is also valid: I reused `pendingRealtimeIssue` both as “last fallback reason” and “this relay attempt failed,” so a retry could inherit an old failure. I’m separating the per-attempt relay issue from the displayed last issue, while preserving the UI detail from the earlier fallback.

The first patch context was stale because the descriptor initializer had a different local shape. I’ve read the current lines and am applying the same fix against the actual file.

I’m adding a tiny helper for the retry preparation so the behavior is explicit and testable: clear the retry-only issue, clear the fallback trigger, but keep the displayed last issue text.

I’m wiring the helper into the DEBUG test surface and adding a regression for stale fallback issues blocking retry.

The regression keeps the visible “last issue” text intact but proves the retry trigger is no longer stale.
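
A rough TypeScript sketch of that separation, assuming a hypothetical state shape (the field names `attemptIssue`, `fallbackTriggered`, and `lastIssueText` are illustrative and do not come from the PR):

```typescript
// Hypothetical state shape; field names are illustrative, not from the PR.
interface FallbackState {
  attemptIssue: string | null; // failure of the current relay attempt only
  fallbackTriggered: boolean; // whether this attempt should fall back
  lastIssueText: string | null; // text shown in the UI "Last issue" row
}

// A failed attempt sets both the per-attempt issue and the displayed text.
function recordAttemptFailure(state: FallbackState, message: string): FallbackState {
  return { ...state, attemptIssue: message, fallbackTriggered: true, lastIssueText: message };
}

// Retry preparation clears only the per-attempt fields; the displayed text stays.
function prepareForRetry(state: FallbackState): FallbackState {
  return { ...state, attemptIssue: null, fallbackTriggered: false };
}
```

Keeping the displayed text in its own field means a retry starts from a clean per-attempt state without erasing what the user last saw.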

The stale retry issue is patched. I’m rerunning formatting, diff check, and the focused gateway tests before another review pass.

Focused checks are green after the retry fix. I’m rerunning Codex Review again; this has to end clean before I hand off.

The review is running through the updated retry path. It has not returned findings yet; I’m waiting for the final helper result.

The reviewer is right: returning `.started` before the provider `ready` event still bypasses fallback for async bridge failures. I’m changing relay startup to wait for `ready`, a classified pre-ready error/close, or a bounded startup timeout before `startRealtimeRelayIfAvailable()` reports success.

Continuing from the review loop. I’m going to finish the last accepted finding first: the relay startup path now waits for gateway readiness, but the manager still needs to preserve the structured issue captured during that wait.

The branch is already in place with only the intended files modified. I found the exact manager catch path that can still throw away the startup issue; I’m patching that narrowly now.

There are no scoped iOS instructions under `apps/ios`. I’m also checking the gateway scoped guides before final review because this change touches `src/gateway` and a server-method subdirectory.

The gateway guide doesn’t add any extra constraints for this patch beyond using the focused gateway tests, which are already the right proof path. I’m applying the manager fix now.

The structured relay issue is now preserved through the thrown startup path, so native fallback should receive the real reason instead of a generic start error. I’m running the focused formatting and test proof next.

Focused gateway tests and Swift formatting are green. I’m doing a whitespace diff check, then I’ll rerun the local Codex review pass to verify the accepted finding is actually closed.

The diff check passed. I’m using the `codex-review` skill now and disabling its broad auto-test hook so it reviews the actual dirty branch without kicking off a repo-wide check.

The review is still running. One thing visible in the streamed diff is a duplicate `code` field in a gateway payload helper; I’ll wait for the review to finish, then fix that along with any accepted findings it reports.

The review is digging through the gateway event subscription path now. So far it verified the duplicate `code` field is not present in the actual file, only an artifact of the earlier streamed diff, so there’s nothing to patch there.

The reviewer is now checking lifecycle interactions around start/stop and subscriptions. I’m letting it finish so we have a complete accepted/rejected finding list instead of pre-empting with partial output.

The nested review reran the focused gateway tests itself as part of its verification path; they passed again. It has started the final review command and is still active.

No new output yet; the final nested review process is still running.

The review found a real race: the readiness wait subscribes after `talk.session.create`, while gateway events are not replayed. I’m fixing it by registering the event stream before creating the relay session, then starting the pump only after the returned relay id is known so buffered events are filtered correctly.

The subscription now exists before the gateway can emit `ready`/`error`, while the event pump still starts after `relaySessionId` is set. I’m rerunning the focused proof and then the review.

Formatting, focused gateway tests, and diff whitespace are green after the race fix. I’m rerunning Codex review now; if it comes back clean I’ll stop there per the review skill contract.

The review is running again and has the race fix in its diff. I’m waiting for the final accepted/rejected finding list.

The review is still inspecting lifecycle and gateway request details. No accepted finding has been emitted yet in this pass.

The review is still active. It is spending time validating the Swift gateway error propagation and event stream behavior; I’ll keep polling until it exits.

Still running. The current pass has not produced a new accepted finding in the visible output; it is validating against provider `onReady` contracts and stop/reset behavior.

The helper has reached its final nested `codex review --uncommitted` call again. I’m waiting for that to finish; no code edits while it’s still running.

No new output yet; final nested review is still running.

Still waiting on the nested review process.

The review process is taking longer than the previous pass, but it is still alive and producing no findings yet.

Still running. I’m not starting any parallel commands against the worktree while the review owns it.

No output yet from the nested review. I’m giving it more time rather than interrupting, since the review skill asks for the clean helper result.

The second review found another real lifecycle bug: an intentional stop while startup is waiting can leave the timeout armed and later report a fake realtime failure. I’m fixing the close path to complete the startup wait as cancelled, so stop/pause cannot mutate the UI afterward.

The close path now resumes any pending startup wait as `cancelled`, and the timeout explicitly no-ops once closed. I’m rerunning the focused proof before another review pass.
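
The startup wait discussed in this thread (ready, pre-ready failure, bounded timeout, and cancellation on stop) could look roughly like the following TypeScript sketch. It is an assumption-laden illustration rather than the Swift implementation; `StartupWait`, `StartupOutcome`, and the callback shape are hypothetical.

```typescript
// Hypothetical sketch of a first-outcome-wins startup wait; not the PR's Swift code.
type TalkIssue = { code: string; message: string };
type StartupOutcome =
  | { kind: "ready" }
  | { kind: "failed"; issue: TalkIssue }
  | { kind: "timedOut" }
  | { kind: "cancelled" };

class StartupWait {
  private outcome: StartupOutcome | null = null;
  private timer: ReturnType<typeof setTimeout> | null = null;
  private readonly onDone: (outcome: StartupOutcome) => void;

  constructor(timeoutMs: number, onDone: (outcome: StartupOutcome) => void) {
    this.onDone = onDone;
    // Bounded wait: if nothing else completes first, report a timeout.
    this.timer = setTimeout(() => this.complete({ kind: "timedOut" }), timeoutMs);
  }

  // First outcome wins; later signals (including a late timeout) are no-ops.
  complete(outcome: StartupOutcome): void {
    if (this.outcome !== null) return;
    this.outcome = outcome;
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    this.onDone(outcome);
  }

  // Called from an intentional stop/close while startup is still pending.
  cancel(): void {
    this.complete({ kind: "cancelled" });
  }
}
```

Because `cancel()` both records the outcome and disarms the timer, a stop during startup cannot be followed by a spurious timeout report.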

Focused proof is green after the cancellation fix. I’m rerunning Codex review again and will continue only if it reports another accepted finding.

The latest review pass is running and currently reading the diff and tests. I’m waiting for the final result.

The review is validating the gateway protocol schema around the new structured issue details. Still no final accepted finding in this pass yet.

The review is now down in the Talk event controller and relay emission helper. Still waiting for completion.

The review has again reached its nested `codex review --uncommitted` verification step. Waiting for the result.

No output yet from the final nested review.

Still waiting on the nested review.

The nested review emitted a very large search result while checking `session.error` consumers; it is still running.

No new output; still waiting for the review closeout.

The review process is still active with no new output.

The next finding is also valid: subscribing before create fixed the event race but left no owner to cancel the stream if `talk.session.create` itself fails. I’m changing that shape so the event pump owns the stream immediately, buffers pre-id events locally, and then replays them after the relay id is known. That keeps the race fix without leaking subscriptions.

I’m moving `createRelaySession()` inside the cleanup `do/catch` and starting the pump immediately after subscription. Pre-id events get buffered instead of processed, then drained once `relaySessionId` exists and the local audio path is ready.

The event subscription now has an owning task even if create fails, and early events are bounded to the same 200-event window as the stream. I’m rerunning formatting, focused gateway tests, and diff checks.
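
The buffering behavior can be sketched in TypeScript as follows. This is a hedged illustration, not the Swift relay session: `RelayEventPump`, `receive`, and `attach` are hypothetical names, and dropping the oldest buffered event when the window is full is an assumption, since the thread only states that early events are bounded to 200.

```typescript
// Hypothetical sketch of pre-id buffering; the drop-oldest policy is an assumption.
interface RelayEvent {
  relaySessionId: string;
  type: string;
}

class RelayEventPump {
  private relaySessionId: string | null = null;
  private pending: RelayEvent[] = [];
  private readonly handle: (event: RelayEvent) => void;
  private readonly maxPending: number;

  constructor(handle: (event: RelayEvent) => void, maxPending = 200) {
    this.handle = handle;
    this.maxPending = maxPending;
  }

  // Before the relay id is known, events are buffered instead of processed.
  receive(event: RelayEvent): void {
    if (this.relaySessionId === null) {
      if (this.pending.length >= this.maxPending) {
        this.pending.shift(); // keep the buffer bounded
      }
      this.pending.push(event);
      return;
    }
    this.dispatch(event);
  }

  // Once the id is known, replay buffered events through the same filter.
  attach(relaySessionId: string): void {
    this.relaySessionId = relaySessionId;
    const buffered = this.pending;
    this.pending = [];
    for (const event of buffered) this.dispatch(event);
  }

  // Only events for this relay session reach the handler.
  private dispatch(event: RelayEvent): void {
    if (event.relaySessionId === this.relaySessionId) this.handle(event);
  }
}
```

Since the pump exists from the moment of subscription, it can be torn down in the same cleanup path whether or not session creation succeeds.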

Focused checks are green after the subscription ownership fix. I’m rerunning Codex review again.

The review is running with the revised event-pump ownership model. It is still reading the core start paths and has not emitted a finding yet.

The review has started checking the new pre-relay event buffering against the tests. I’m waiting for the final verdict.

The nested reviewer tried a Vitest `--runInBand` flag, which this repo’s Vitest rejects; it then started the correct focused `pnpm test` command. I’m waiting for the review result rather than treating that internal attempt as our verification.

The review caught that the Swift relay event test no longer reaches the event branch because pre-id events are now buffered. I’m adding a narrow test hook to seed the relay id and updating that test.

The test now seeds `relaySessionId` before dispatching synthetic relay events, so it exercises the classified-error path instead of the pre-id buffer. I’m rerunning the focused checks.

Checks are green after the test fix. I’m rerunning Codex review; this is the closeout gate now.

I’m continuing from the review pass after the latest relay test hook fix. I’ll rerun the Codex review helper now and only close out once any accepted findings are handled.

The review is still running. It has already dug into the relay startup path, so I’m waiting for the actual findings rather than reacting to partial output.

The nested reviewer is checking the current uncommitted diff directly now. I’m watching specifically for findings it accepts so I can verify them against the code before editing.

Still waiting on the review command. No final finding output yet.

The review is still active and quiet, which usually means the reviewer is in the final diff pass. I’ll let it finish rather than interrupt and lose the result.

The helper has been running for a few minutes with no new output. I’m keeping the session open until it exits so we have a valid clean-or-findings result.

Codex review completed clean with no accepted/actionable findings. I’m doing one final status check so the closeout accurately reflects the branch and dirty files.

Implemented on branch `improve-ios-talk-runtime-transparency`. Changes are uncommitted and not pushed.

What changed:
- iOS Talk now distinguishes configured mode vs active mode in `apps/ios/Sources/Design/TalkProTab.swift` and Settings.
- Realtime gateway relay failures now surface structured issue text instead of silently falling back.
- iOS fallback now shows `iOS Speech fallback` plus the last realtime/gateway issue.
- Gateway realtime relay now emits classified failures like invalid credentials, missing credentials, unavailable model/provider, network errors, and provider closed before ready.
- Relay startup now waits for provider readiness before marking realtime active, so immediate socket/API-key failures fall back clearly.

Review/proof:
- `CODEX_REVIEW_AUTO_TESTS=0 bash [HOME_PATH] --mode local`
  - Result: clean, no accepted/actionable findings.
  - Final findings accepted: none.
  - Final findings rejected: none.
- `pnpm test src/gateway/talk-realtime-relay.test.ts src/gateway/server-methods/talk.test.ts`
- `swiftformat --lint --config config/swiftformat ...`
- `git diff --check`

One proof gap: direct `xcodebuild` iOS app-target tests are still blocked by unrelated existing symbols `AppleReviewDemoMode` / `AppleReviewDemoChatTransport`. The focused gateway tests and Swift formatting checks passed.

[user]
run on my phone
how would I test this?

[assistant]
I’ll try to install/run the iOS app on the connected device first, then I’ll give you a concrete manual test script for the new Talk failure states.

I found the paired physical device: `Nimrod’s iPhone` (`iPhone 16 Pro`). I’m going to build the Debug app for that device and install/launch it with `devicectl`; the repo’s `ios:run` script is simulator-oriented, so I’ll use the physical-device path directly.

Signing and version config are prepared. Next I’m regenerating the Xcode project, then I’ll build against the phone’s device id.

The project regenerated cleanly. I’m starting the device build now; if it hits the known unrelated Apple review demo compile issue, I’ll report that clearly instead of masking it.

The device build is still running and compiling dependencies/extensions for `iphoneos`. I’ll poll until it either produces the app bundle or fails with actionable diagnostics.

The physical-device build succeeded. I’m installing the generated `OpenClaw.app` onto the paired iPhone and launching it.

The app installed successfully, but iOS refused the launch because the phone is locked. Please unlock `Nimrod’s iPhone` and leave it on the home screen; I’ll retry the launch.

It’s still locked from `devicectl`’s perspective. The build is installed, so once the phone is unlocked I can launch it remotely, or you can tap the OpenClaw icon directly.

The device is connected now; I’m retrying launch once more.

The phone disconnected during the launch request. I’ll check whether it came back before I stop; the app is already installed even if remote launch is flaky.

The phone is back to paired/available, not actively connected. I’m going to stop retrying remote launch for now because the install succeeded; you can open the newly installed local build from the phone.

I’m checking the worktree once after the phone build because Xcode generation/build scripts can touch generated files; I want the handoff to be precise.

Built and installed the Debug app on `Nimrod’s iPhone`.

Bundle installed: `ai.openclaw.ios.test.guti-gzs353x62e`

Remote launch did not complete because the phone was locked first, then the device disconnected during the retry. The app is installed, so open it from the phone. If you unlock it and want me to retry remote launch, I can.

How to test this:

1. Start with a valid realtime gateway config.
   - Pair/connect the iPhone to the gateway.
   - Open Talk.
   - Expected: the card shows `Configured` as the selected gateway voice mode.
   - Start Talk.
   - Expected: `Active now` becomes the realtime mode, status is `Listening (Realtime)`, and `Last issue` is empty.

2. Test bad API key / immediate realtime failure.
   - Put an invalid or missing OpenAI API key in the gateway realtime config.
   - Restart/reload the gateway if needed.
   - Reconnect the iPhone, open Talk, start Talk.
   - Expected: app falls back to native iOS Speech/TTS.
   - Expected UI:
     - `Configured`: OpenAI realtime/gateway relay
     - `Active now`: `iOS Speech fallback`
     - `Last issue`: credential/missing-key style diagnostic
     - status: `Listening (iOS Speech fallback)`

3. Test bad realtime model.
   - Configure a bogus realtime model.
   - Start Talk.
   - Expected: fallback to `iOS Speech fallback`, with `Last issue` showing model/provider failure detail.

4. Test recovery.
   - Restore a valid API key/model.
   - Restart/reload gateway, reconnect app, start Talk again.
   - Expected: realtime becomes active; stale fallback issue should not block retry, and successful realtime clears `Last issue`.

You can also verify the same fields in Settings under the Talk/Gateway voice section: `Active Voice` and `Last Voice Issue`.

[user]
can we make the fallback clearer with a small banner like we have when the gateway connection is not working? spec and plan

[assistant]
I’ll look at the existing gateway connection problem banner/component first so the plan fits the current iOS design instead of inventing a parallel warning style.

Yes. I’d make the fallback state a first-class scoped warning, not just a row in the Voice mode card.

**Spec**
Add a Talk-only banner shown when configured realtime/gateway relay is unavailable and the app is currently using native iOS Speech/TTS fallback.

Placement:
- In `TalkProTab`, directly under the Talk header and above the hero card.
- In Settings, keep the current `Last Voice Issue` row, but optionally add the same compact banner at the top of the Talk/Gateway voice section if fallback is active.

Copy:
- Title: `Using iOS Speech fallback`
- Message: `Realtime voice is unavailable. Talk is still running with iOS speech recognition and TTS.`
- Detail text from `TalkRuntimeIssue.displayMessage`, for example:
  - `OpenAI API key is invalid or missing.`
  - `Realtime closed before it became ready.`
  - `Configured realtime model is unavailable.`
- Owner label: `Fix on gateway` for credential/model/provider/config issues, `Check network` for network/socket issues.

Actions:
- Primary: `Open Settings`
  - Opens Voice & Talk settings / gateway voice config surface.
- Secondary: `Details`
  - Shows diagnostic summary: provider, model, transport, phase, issue code, message.
- Optional later: `Retry realtime`
  - Only when Talk is still enabled and the gateway is connected. It would restart the Talk runtime internally; not needed for v1.

Behavior:
- Banner appears when `gatewayTalkActiveModeTitle == "iOS Speech fallback"` or a stronger explicit `gatewayTalkFallbackIssue` exists.
- Banner clears when realtime successfully becomes active.
- Banner clears when Talk is stopped.
- Banner does not appear for normal native iOS Speech + TTS mode when that was the configured mode.
- Banner should not replace the existing gateway connection banner; if gateway is offline, the gateway banner remains the primary problem.
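
The behavior rules above reduce to a single visibility predicate. The TypeScript sketch below is illustrative only; the input shape and the name `shouldShowFallbackBanner` are hypothetical, and hiding the Talk banner while the gateway is offline is one possible reading of the last rule.

```typescript
// Hypothetical visibility predicate for the Talk fallback banner.
interface TalkBannerInput {
  gatewayConnected: boolean;
  talkEnabled: boolean;
  configuredMode: "realtime" | "nativeSpeech";
  realtimeActive: boolean;
  fallbackIssue: string | null; // explicit fallback issue, if any
}

function shouldShowFallbackBanner(input: TalkBannerInput): boolean {
  if (!input.gatewayConnected) return false; // gateway banner stays the primary problem
  if (!input.talkEnabled) return false; // cleared when Talk is stopped
  if (input.configuredMode !== "realtime") return false; // native mode was chosen, not a fallback
  if (input.realtimeActive) return false; // cleared once realtime becomes active
  return input.fallbackIssue !== null;
}
```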

**Implementation Plan**
1. Add a dedicated `TalkRuntimeIssueBanner` SwiftUI component, modeled visually after `GatewayProblemBanner` in `apps/ios/Sources/Gateway/GatewayProblemView.swift`, but smaller and scoped to Talk.

2. Extend `TalkRuntimeIssue` with UI metadata:
   - `bannerTitle`
   - `ownerLabel`
   - `tint`
   - maybe `technicalDetails`
   Keep classification in `apps/ios/Sources/Voice/TalkModeGatewayConfig.swift`.

3. Expose a structured current fallback issue from `TalkModeManager`, not only `gatewayTalkLastIssueText`.
   - Current text is fine for rows, but the banner/details sheet should avoid reparsing strings.
   - Example: `gatewayTalkCurrentFallbackIssue: TalkRuntimeIssue?`.

4. Render the banner in `TalkProTab`.
   - Place between `header` and `voiceHeroCard`.
   - Use `Open Settings` for the primary action.
   - Add a local `@State showTalkIssueDetails`.

5. Add `TalkRuntimeIssueDetailsSheet`.
   - Keep it simpler than gateway connection details.
   - Include issue message, configured provider/model, transport, phase, and code.
   - Add copyable diagnostic text if useful.

6. Tests:
   - Swift unit test that fallback issue produces banner-visible state.
   - Swift unit test that successful realtime clears the fallback issue.
   - Existing gateway tests remain unchanged.
   - Manual phone test: bad API key should show banner plus active fallback; valid API key should remove banner.

[user]
let's do that

[assistant]
I’ll implement this as a Talk-scoped banner and details sheet, reusing the existing gateway problem visual language but keeping the state owned by `TalkModeManager`. After edits I’ll run Swift formatting/lint and a device build if feasible.

I found the existing gateway banner and the current Talk UI. I’ll add a small reusable Talk issue banner component under `apps/ios/Sources/Gateway` or `Design`, then wire it to a structured fallback issue on the manager instead of only the diagnostic string.

I’m going to keep the structured issue as optional manager state named `gatewayTalkCurrentFallbackIssue`. That makes the banner only appear for real fallback, while `gatewayTalkLastIssueText` can stay as historical/diagnostic text in the rows.

Next I’m wiring that issue state into the manager so the UI can tell “fallback active now” apart from “there was an earlier realtime issue.”

I’m adding the banner/details views now. They reuse the same material, border, icon, and small action-button pattern as the gateway banner, but with Talk-specific copy and diagnostics.

I’m placing the banner above the Talk hero and hiding it when the gateway itself is offline, so the global gateway problem banner remains the top-level failure indicator.

I’m also adding the same compact banner to the Settings Talk voice card, since that’s where someone will naturally go after tapping “Open Settings.”

I’m adding a second details sheet state in Settings and using the same Talk fallback details sheet there.

The Settings view already has multiple sheets for gateway details and QR scanning, so I’m following that pattern rather than introducing a generic sheet router for this small addition.

To avoid nesting a card-like banner inside the Settings card, I’m making it a sibling immediately above the Talk voice settings card.

I’ll gate the Settings banner on the same gateway-connected check used by Settings status, so an offline gateway still points users to the connection problem first.

I’m extending the focused Talk tests to lock the banner state: fallback sets a structured current issue, realtime success clears it, and retry preparation clears stale fallback state.

The implementation and tests are in place. I’m formatting the touched Swift files before running focused checks.

Formatting passed. I’m regenerating the iOS project so the new Swift file is included, then I’ll run the same physical-device Debug build path.

The Xcode project regenerated cleanly. I’m running a device build now to catch SwiftUI or symbol issues in the new banner.

The updated app builds successfully for the physical iPhone. I’m installing this new build onto the phone now.

The new build is installed. I’m going to try the focused Talk unit tests on the device; if the phone is locked or test launching is denied, I’ll report that as a device-state blocker rather than treating it as a code failure.

The device build passed and install passed. The focused test build exposed an existing test compile issue in `RealtimeTalkRelaySessionTests`: it references `EventFrame` without importing the module that defines it. I’ll fix that now because it blocks the Talk test target.

That test compile blocker was just a missing `OpenClawProtocol` import. I’ve added it and will rerun formatting plus the focused Talk test command.

The focused Talk manager tests now pass on the iPhone

...[transcript truncated to 50000 chars]

@openclaw-barnacle openclaw-barnacle Bot added app: ios App: ios gateway Gateway runtime size: XL maintainer Maintainer-authored PR labels Jun 7, 2026
@ngutman ngutman marked this pull request as ready for review June 7, 2026 15:42
@clawsweeper

clawsweeper Bot commented Jun 7, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed June 7, 2026, 12:09 PM ET / 16:09 UTC.

Summary
The PR adds structured gateway realtime-talk issue metadata plus iOS Talk UI/state/tests that show configured versus active voice mode and a fallback banner when realtime startup fails.

PR surface: Source +117, Tests +239, Other +860. Total +1216 across 15 files.

Reproducibility: yes, by source inspection: current main falls back to native speech with plain Listening status and can set Ready after relay close without preserving a structured realtime failure issue.

Review metrics: 1 noteworthy metric.

  • Changed realtime-talk surfaces: 2 runtime surfaces plus iOS UI/tests changed. The PR spans gateway relay error metadata and iOS fallback state/UI, so maintainers should review the cross-boundary behavior as one user-visible flow.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • none.

Risk before merge

  • [P2] Realtime startup sequencing now waits for a ready/error/timeout outcome, so maintainers should be comfortable that slow but valid provider starts do not fall back too aggressively.
  • [P1] The PR proof does not cover successful provider audio across every realtime provider/model combination, which leaves some cross-provider behavior to CI or maintainer follow-up.

Maintainer options:

  1. Decide the mitigation before merge
    Land after maintainer review confirms the realtime startup timing and safe gateway issue messaging are acceptable across supported relay providers.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P2] Protected maintainer-labeled MEMBER PR should stay with human maintainer review rather than an automated close or repair lane.

Security
Cleared: No concrete security or supply-chain concern was found; the diff adds no dependencies or workflows and the new diagnostics reuse client-visible gateway error details.

Review details

Best possible solution:

Land after maintainer review confirms the realtime startup timing and safe gateway issue messaging are acceptable across supported relay providers.

Do we have a high-confidence way to reproduce the issue?

Yes, by source inspection: current main falls back to native speech with plain Listening status and can set Ready after relay close without preserving a structured realtime failure issue.

Is this the best way to solve the issue?

Yes: additive gateway issue metadata plus iOS-owned runtime/fallback state is the right layer, and no existing config, docs, or CLI setting solves the visibility problem.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 66b91d78feb3.

Label changes

Label justifications:

  • P2: This is a normal-priority user-facing improvement to a limited iOS realtime-talk fallback workflow, with focused tests and physical-device proof.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes after-fix physical iPhone build/install/launch proof and manual verification that the fallback banner appears during temporary gateway failover.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix physical iPhone build/install/launch proof and manual verification that the fallback banner appears during temporary gateway failover.
Evidence reviewed

PR surface:

Source +117, Tests +239, Other +860. Total +1216 across 15 files.

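PR surface stats:

| Area | Files | Added | Removed | Net |
| --- | --- | --- | --- | --- |
| Source | 2 | 132 | 15 | +117 |
| Tests | 2 | 242 | 3 | +239 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 11 | 920 | 60 | +860 |
| Total | 15 | 1294 | 78 | +1216 |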

Acceptance criteria:

  • [P1] Review PR CI for the exact head SHA b6fd32e.
  • [P1] Confirm iOS simulator tests and gateway relay tests remain green after any maintainer-requested changes.
  • [P2] Optionally exercise one successful realtime provider path to complement the failure/fallback proof.

What I checked:

  • Repository review policy applied: Root AGENTS.md requires full-source review, scoped AGENTS checks, protected maintainer handling, dependency-backed contract proof, and careful treatment of fallback/protocol behavior. (AGENTS.md:1, 66b91d78feb3)
  • Current main loses realtime failure context on fallback: On current main, realtime start returns a Bool and a false result falls through to native speech, which sets the visible status to plain Listening without preserving why realtime failed. (apps/ios/Sources/Voice/TalkModeManager.swift:301, 66b91d78feb3)
  • Current main can make relay close-before-ready look clean: Realtime relay event handling forwards raw error text, but close handling sets Ready and closes regardless of whether the relay ever became ready. (apps/ios/Sources/Voice/RealtimeTalkRelaySession.swift:332, 66b91d78feb3)
  • PR adds iOS structured fallback state: The PR replaces the Bool realtime-start path with a structured started/unavailable/ignored result, records a TalkRuntimeIssue, and keeps fallback status/active-mode fields visible when native speech takes over. (apps/ios/Sources/Voice/TalkModeManager.swift:313, b6fd32ed6ed1)
  • PR adds non-droppable structured gateway relay lifecycle issues: The gateway relay now emits realtime_unavailable issue metadata for connect/error/close-before-ready paths and keeps ready/error/close lifecycle events non-droppable while leaving audio-like events droppable. (src/gateway/talk-realtime-relay.ts:159, b6fd32ed6ed1)
  • Protocol contract supports additive error details: Gateway errorShape accepts optional details and the Swift client preserves GatewayResponseError.details, so the added talkIssue metadata is additive rather than a protocol-breaking shape change. (packages/gateway-protocol/src/schema/error-codes.ts:24, 66b91d78feb3)
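
One way to picture the droppable versus non-droppable distinction noted above is a bounded outbound queue that discards audio-like events under backpressure but always keeps lifecycle events. The TypeScript below is a hypothetical sketch; it does not claim to match `src/gateway/talk-realtime-relay.ts`, and `BoundedRelayQueue`, `isDroppable`, and the capacity handling are assumptions made for illustration.

```typescript
// Hypothetical sketch; the real relay's queueing and event shapes may differ.
interface OutboundEvent {
  type: string;
  payload?: unknown;
}

// Lifecycle events the client needs to tell ready/error/close outcomes apart.
const LIFECYCLE_TYPES = new Set(["ready", "error", "close"]);

function isDroppable(event: OutboundEvent): boolean {
  return !LIFECYCLE_TYPES.has(event.type);
}

class BoundedRelayQueue {
  private items: OutboundEvent[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Returns false when a droppable event was discarded because the queue is full.
  enqueue(event: OutboundEvent): boolean {
    if (this.items.length >= this.capacity && isDroppable(event)) return false;
    this.items.push(event); // lifecycle events are accepted even past capacity
    return true;
  }

  // Hands back everything queued so far and resets the queue.
  drain(): OutboundEvent[] {
    const out = this.items;
    this.items = [];
    return out;
  }
}
```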

Likely related people:

  • @ngutman: Authored the current PR and appears in recent iOS Talk realtime/fallback work touching TalkModeManager, RealtimeTalkRelaySession, and TalkPro UI. (role: feature owner and recent area contributor; confidence: high; commits: 6897711d1991, 6b84b980d37b, b6fd32ed6ed1; files: apps/ios/Sources/Voice/TalkModeManager.swift, apps/ios/Sources/Voice/RealtimeTalkRelaySession.swift, apps/ios/Sources/Design/TalkProTab.swift)
  • @vincentkoc: Recent gateway talk relay/session lifecycle history is relevant to the changed relay issue event and talk.session.create behavior. (role: gateway talk relay contributor; confidence: high; commits: 66b91d78feb3, 548785581580; files: src/gateway/talk-realtime-relay.ts, src/gateway/server-methods/talk-session.ts)
  • @Solvely-Colin: Prior nearby iOS Pro and gateway-flow work is adjacent to the visible TalkPro/Settings surfaces changed by this PR. (role: adjacent iOS Pro/gateway-flow contributor; confidence: medium; commits: f6e51ff99af4; files: apps/ios/Sources/Design/TalkProTab.swift, apps/ios/Sources/Design/SettingsProTabSections.swift)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper Bot added the labels proof: sufficient, rating: 🐚 platinum hermit, status: 👀 ready for maintainer look, and P2 on Jun 7, 2026
ngutman merged commit 47dbc67 into main on Jun 7, 2026
167 checks passed
ngutman deleted the improve-ios-talk-runtime-transparency branch on June 7, 2026 17:21

ngutman commented Jun 7, 2026

Member Author

Merged via squash.

  • Prepared head SHA: b6fd32ed6ed1f97f124c1de52f134ff0e399e391
  • Merge commit: 47dbc675e95363828c0537fd21b0cfdd93052ece
  • Local prep note: pnpm build passed. pnpm check hit the npm shrinkwrap guard because @anthropic-ai/sdk@0.100.1 is no longer resolvable before 2026-05-24T20:18:43Z; the same shrinkwrap guard failure reproduces on current origin/main at 66b91d78fe, and this PR does not touch dependency manifests or lockfiles.

Thanks @ngutman!

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request Jun 8, 2026
Merged via squash.

Prepared head SHA: b6fd32e

Local prep note: pnpm build passed. pnpm check hit the npm shrinkwrap guard because @anthropic-ai/sdk@0.100.1 is no longer resolvable before 2026-05-24T20:18:43Z; the same shrinkwrap guard failure reproduces on current origin/main at ecd11a4, and this PR does not touch dependency manifests or lockfiles.

Co-authored-by: ngutman <1540134+ngutman@users.noreply.github.com>
Reviewed-by: @ngutman
wangmiao0668000666 pushed a commit to wangmiao0668000666/openclaw that referenced this pull request Jun 9, 2026
Merged via squash.

Prepared head SHA: b6fd32e

Local prep note: pnpm build passed. pnpm check hit the npm shrinkwrap guard because @anthropic-ai/sdk@0.100.1 is no longer resolvable before 2026-05-24T20:18:43Z; the same shrinkwrap guard failure reproduces on current origin/main at 66b91d7, and this PR does not touch dependency manifests or lockfiles.

Co-authored-by: ngutman <1540134+ngutman@users.noreply.github.com>
Reviewed-by: @ngutman
Labels

  • app: ios (App: ios)
  • gateway (Gateway runtime)
  • maintainer (Maintainer-authored PR)
  • P2 (Normal backlog priority with limited blast radius.)
  • proof: sufficient (ClawSweeper judged the real behavior proof convincing.)
  • rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.)
  • size: XL
  • status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)