fix(ios): stabilize chat streaming layout and session flow by eulicesl · Pull Request #50483 · openclaw/openclaw

eulicesl · 2026-03-19T15:11:39Z

Summary

Problem: iOS chat streaming could attach to a synthetic placeholder lane, churn during initial history/bootstrap, and reparse streaming content in ways that caused flicker or session drift.
Why it matters: active conversations can show unstable streaming UI, lose the correct lane/session association, or visibly flicker while tokens arrive.
What changed: scoped streaming to the active session key, hardened stream acceptance/initial load behavior, and stabilized layout parsing/visibility updates.
What did NOT change (scope boundary): no gateway protocol changes, no new model behavior. This fix lives in the shared OpenClawChatUI layer used by both iOS and macOS. The bug (cross-run UI drift during concurrent session-key agent events) affects any client using ChatViewModel; macOS uses the same chat view and benefits equally, but the fix is not platform-specific — it guards shared streaming state.
AI-assisted: yes (AI-assisted implementation, contributor-reviewed).
Testing level: targeted shared-package tests plus iOS build validation.

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Closes #
Related fix(ios): use plain Text during streaming to eliminate markdown flicker [AI-assisted] #40985
This PR fixes a bug or regression

Root Cause (if applicable)

Root cause: streaming state was not consistently tied to the active session key, and downstream UI/layout updates re-parsed or re-evaluated visibility in ways that amplified bootstrap churn into visible instability.
Missing detection / guardrail: there was insufficient coverage around session-scoped streaming updates plus layout stability under rapid token updates/history refresh.
Prior context (git blame, prior PR, issue, or refactor if known): this branch supersedes fix(ios): use plain Text during streaming to eliminate markdown flicker [AI-assisted] #40985 by fixing the broader lane/acceptance/layout causes instead of only the earlier visible symptom.
Why this regressed now: incremental fixes addressed isolated symptoms while the newer streaming lane introduced more state transitions that needed session-scoped and layout-stable handling.
If unknown, what was ruled out: the issue was not isolated to one text-rendering widget; it involved session scoping, bootstrap timing, and layout parsing together.

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file:
- apps/shared/OpenClawKit/Tests/OpenClawKitTests/ChatViewModelTests.swift
- apps/shared/OpenClawKit/Tests/OpenClawKitTests/AssistantTextParserTests.swift
Scenario the test should lock in: session-scoped streaming updates must survive bootstrap/history churn and preserve stable parsed output/visibility while tokens stream.
Why this is the smallest reliable guardrail: the bug lives in shared chat state/parsing logic, so package-level tests cover the core behavior without requiring a full end-to-end harness.
Existing test that already covers this (if any): targeted parser/view-model tests were added/expanded for this branch.
If no new test is added, why not: N/A.

User-visible / Behavior Changes

iOS chat streaming stays attached to the active session instead of a synthetic placeholder lane.
Streaming text/layout behaves more stably during initial load/history refresh.
Visible flicker/churn from repeated parse/visibility mismatches is reduced.

Diagram (if applicable)

N/A

Security Impact (required)

New permissions/capabilities? (Yes/No): No
Secrets/tokens handling changed? (Yes/No): No
New/changed network calls? (Yes/No): No
Command/tool execution surface changed? (Yes/No): No
Data access scope changed? (Yes/No): No
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: macOS local dev environment
Runtime/container: shared OpenClawKit package + iOS app build
Model/provider: N/A
Integration/channel (if any): iOS chat UI / streaming flow
Relevant config (redacted): local iOS simulator build

Steps

Build the shared package and iOS app from this branch.
Open a chat session and trigger a streaming reply.
Observe initial history/bootstrap plus active token streaming.

Expected

Streaming binds to the active session.
Bootstrap/history churn does not destabilize the active stream.
Layout remains visually stable while tokens arrive.

Actual

Matched expected in targeted tests and local build validation.

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

Verified scenarios:
- swift test --package-path apps/shared/OpenClawKit --filter ChatViewModelTests
- swift test --package-path apps/shared/OpenClawKit --filter AssistantTextParserTests
- xcodegen generate
- xcodebuild -project apps/ios/OpenClaw.xcodeproj -scheme OpenClaw -destination 'generic/platform=iOS Simulator' CODE_SIGNING_ALLOWED=NO build
Edge cases checked:
- stream acceptance during bootstrap/history load
- parser stability under repeated streaming updates
What you did not verify:
- full repository CI matrix
- non-iOS clients

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
- All 19 review threads (P1/P2 from chatgpt-codex-connector[bot] + copilot-pull-request-reviewer) resolved in this update.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

Backward compatible? (Yes/No): Yes
Config/env changes? (Yes/No): No
Migration needed? (Yes/No): No
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: chat-state timing bugs can reappear under bootstrapping or external-final sync edge cases.
- Mitigation: this patch scopes state to the active session and adds targeted parser/view-model coverage around the churn paths.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a1f9853bc9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

greptile-apps · 2026-03-19T15:23:57Z

Greptile Summary

This PR addresses three root causes behind iOS chat instability: session mis-scoping, stream acceptance / initial-load churn, and repeated layout re-parses. The changes are well-motivated and the majority of the implementation is sound — the stable AssistantTextSegment IDs, the shouldAcceptAgentEvent / matchesCurrentSessionKey routing logic, the load() idempotency guard, the typing-indicator visibility fix, and the streaming scroll debounce all look correct and are backed by good test coverage.

Key findings:

Logic issue — errored run can permanently append partial content without history reconciliation: In handleChatEvent, the "error" state shares the same case arm as "final" and "aborted". When a run errors mid-stream, appendFinalAssistantMessage falls through to streamedAssistantMessage(), which synthesizes a message from the partial streamingAssistantText. Because appendFinalAssistantMessage returns true, refreshHistoryAfterRun() is never called. The partial, incomplete response is now permanently in messages with no server reconciliation — on next reload it would silently vanish, creating a visible inconsistency. The "error" branch should be handled separately and should always trigger a history refresh.
Style note — hardcoded session-key alias: matchesCurrentSessionKey only handles the "agent:main:main" ↔ "main" alias pair. Other configurations following the same agent:<name>:<name> ↔ <name> convention would silently drop events.

Confidence Score: 3/5

Mostly safe to merge, but the error-state partial-message bug can cause silent client/server state drift on run failures with partial streaming content.
The bulk of the PR is well-implemented and tested. The one logic issue — errored runs bypassing refreshHistoryAfterRun() when partial streaming content exists — is a behavioural regression that creates data inconsistency, though it only fires on the specific path of (run errors) + (partial visible streaming text buffered). All other changes are safe and the new tests cover the happy paths well.
apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift — specifically the "error" state arm of handleChatEvent (around line 874) and the matchesCurrentSessionKey alias logic.

Comments Outside Diff (2)

apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift, line 874-889 (link)

Errored run skips history refresh when partial streaming content exists

For the "error" state, appendFinalAssistantMessage falls through to streamedAssistantMessage() when chat.message is nil (typical for errored runs). If any partial streaming text was buffered, streamedAssistantMessage() creates a synthesized message and returns true. This causes appendFinalAssistantMessage to return true, which suppresses refreshHistoryAfterRun().

The result: partial, incomplete content from a failed run is permanently appended to self.messages as if it were a finalized assistant message, and the client never reconciles with the server. On next reload the partial message would disappear (since the server history for an errored run won't include it), creating a visible inconsistency.

The "final" and "aborted" cases are intentional here — those runs legitimately complete. "error" is different: the server hasn't produced a final message, so skipping the history reconciliation is incorrect.

Consider separating the error branch so it always triggers a history refresh, even when partial content exists:

case "final", "aborted":
    let appendedFinalMessage = self.appendFinalAssistantMessage(from: chat)
    self.pendingToolCallsById = [:]
    self.streamingAssistantText = nil
    if !appendedFinalMessage {
        Task { await self.refreshHistoryAfterRun() }
    }
case "error":
    self.errorText = chat.errorMessage ?? "Chat failed"
    if let runId = chat.runId {
        self.clearPendingRun(runId)
    } else if self.pendingRuns.count <= 1 {
        self.clearPendingRuns(reason: nil)
    }
    self.pendingToolCallsById = [:]
    self.streamingAssistantText = nil
    Task { await self.refreshHistoryAfterRun() }

Prompt To Fix With AI

This is a comment left during a code review.
Path: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift
Line: 874-889

Comment:
**Errored run skips history refresh when partial streaming content exists**

For the `"error"` state, `appendFinalAssistantMessage` falls through to `streamedAssistantMessage()` when `chat.message` is nil (typical for errored runs). If any partial streaming text was buffered, `streamedAssistantMessage()` creates a synthesized message and returns `true`. This causes `appendFinalAssistantMessage` to return `true`, which suppresses `refreshHistoryAfterRun()`.

The result: partial, incomplete content from a failed run is permanently appended to `self.messages` as if it were a finalized assistant message, and the client never reconciles with the server. On next reload the partial message would disappear (since the server history for an errored run won't include it), creating a visible inconsistency.

The `"final"` and `"aborted"` cases are intentional here — those runs legitimately complete. `"error"` is different: the server hasn't produced a final message, so skipping the history reconciliation is incorrect.

Consider separating the error branch so it always triggers a history refresh, even when partial content exists:
```swift
case "final", "aborted":
    let appendedFinalMessage = self.appendFinalAssistantMessage(from: chat)
    self.pendingToolCallsById = [:]
    self.streamingAssistantText = nil
    if !appendedFinalMessage {
        Task { await self.refreshHistoryAfterRun() }
    }
case "error":
    self.errorText = chat.errorMessage ?? "Chat failed"
    if let runId = chat.runId {
        self.clearPendingRun(runId)
    } else if self.pendingRuns.count <= 1 {
        self.clearPendingRuns(reason: nil)
    }
    self.pendingToolCallsById = [:]
    self.streamingAssistantText = nil
    Task { await self.refreshHistoryAfterRun() }
```

How can I resolve this? If you propose a fix, please make it concise.

apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift, line 895-908 (link)

Hardcoded alias pair may not generalize to other session configurations

matchesCurrentSessionKey only handles one specific canonical/alias pair: "agent:main:main" ↔ "main". Any other gateway session key following a similar convention (e.g., "agent:work:work" ↔ "work", or "agent:staging:staging" ↔ "staging") would fail the match and silently drop events.

If the alias pattern is always agent:<name>:<name> ↔ <name>, a general rule would be more resilient:

// Generalize: "agent:<name>:<name>" is the canonical form of alias "<name>"
let parts = incomingNormalized.split(separator: ":", maxSplits: 2)
if parts.count == 3 && parts[0] == "agent" && parts[1] == parts[2] && parts[2] == currentNormalized {
    return true
}
let currentParts = currentNormalized.split(separator: ":", maxSplits: 2)
if currentParts.count == 3 && currentParts[0] == "agent" && currentParts[1] == currentParts[2] && currentParts[2] == incomingNormalized {
    return true
}

If "main" truly is the only valid alias in production today, a comment to that effect would help future maintainers understand the constraint.

Prompt To Fix With AI

This is a comment left during a code review.
Path: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift
Line: 895-908

Comment:
**Hardcoded alias pair may not generalize to other session configurations**

`matchesCurrentSessionKey` only handles one specific canonical/alias pair: `"agent:main:main"` ↔ `"main"`. Any other gateway session key following a similar convention (e.g., `"agent:work:work"` ↔ `"work"`, or `"agent:staging:staging"` ↔ `"staging"`) would fail the match and silently drop events.

If the alias pattern is always `agent:<name>:<name>` ↔ `<name>`, a general rule would be more resilient:
```swift
// Generalize: "agent:<name>:<name>" is the canonical form of alias "<name>"
let parts = incomingNormalized.split(separator: ":", maxSplits: 2)
if parts.count == 3 && parts[0] == "agent" && parts[1] == parts[2] && parts[2] == currentNormalized {
    return true
}
let currentParts = currentNormalized.split(separator: ":", maxSplits: 2)
if currentParts.count == 3 && currentParts[0] == "agent" && currentParts[1] == currentParts[2] && currentParts[2] == incomingNormalized {
    return true
}
```
If `"main"` truly is the only valid alias in production today, a comment to that effect would help future maintainers understand the constraint.

How can I resolve this? If you propose a fix, please make it concise.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift
Line: 874-889

Comment:
**Errored run skips history refresh when partial streaming content exists**

For the `"error"` state, `appendFinalAssistantMessage` falls through to `streamedAssistantMessage()` when `chat.message` is nil (typical for errored runs). If any partial streaming text was buffered, `streamedAssistantMessage()` creates a synthesized message and returns `true`. This causes `appendFinalAssistantMessage` to return `true`, which suppresses `refreshHistoryAfterRun()`.

The result: partial, incomplete content from a failed run is permanently appended to `self.messages` as if it were a finalized assistant message, and the client never reconciles with the server. On next reload the partial message would disappear (since the server history for an errored run won't include it), creating a visible inconsistency.

The `"final"` and `"aborted"` cases are intentional here — those runs legitimately complete. `"error"` is different: the server hasn't produced a final message, so skipping the history reconciliation is incorrect.

Consider separating the error branch so it always triggers a history refresh, even when partial content exists:
```swift
case "final", "aborted":
    let appendedFinalMessage = self.appendFinalAssistantMessage(from: chat)
    self.pendingToolCallsById = [:]
    self.streamingAssistantText = nil
    if !appendedFinalMessage {
        Task { await self.refreshHistoryAfterRun() }
    }
case "error":
    self.errorText = chat.errorMessage ?? "Chat failed"
    if let runId = chat.runId {
        self.clearPendingRun(runId)
    } else if self.pendingRuns.count <= 1 {
        self.clearPendingRuns(reason: nil)
    }
    self.pendingToolCallsById = [:]
    self.streamingAssistantText = nil
    Task { await self.refreshHistoryAfterRun() }
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift
Line: 895-908

Comment:
**Hardcoded alias pair may not generalize to other session configurations**

`matchesCurrentSessionKey` only handles one specific canonical/alias pair: `"agent:main:main"` ↔ `"main"`. Any other gateway session key following a similar convention (e.g., `"agent:work:work"` ↔ `"work"`, or `"agent:staging:staging"` ↔ `"staging"`) would fail the match and silently drop events.

If the alias pattern is always `agent:<name>:<name>` ↔ `<name>`, a general rule would be more resilient:
```swift
// Generalize: "agent:<name>:<name>" is the canonical form of alias "<name>"
let parts = incomingNormalized.split(separator: ":", maxSplits: 2)
if parts.count == 3 && parts[0] == "agent" && parts[1] == parts[2] && parts[2] == currentNormalized {
    return true
}
let currentParts = currentNormalized.split(separator: ":", maxSplits: 2)
if currentParts.count == 3 && currentParts[0] == "agent" && currentParts[1] == currentParts[2] && currentParts[2] == incomingNormalized {
    return true
}
```
If `"main"` truly is the only valid alias in production today, a comment to that effect would help future maintainers understand the constraint.

How can I resolve this? If you propose a fix, please make it concise.

_{Last reviewed commit: "fix(ios): stabilize ..."}

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/25408115124
Updated: 2026-05-05T23:37:58.279Z

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1a08fd3040

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 21aad88f23

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5df7c10821

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

P1 review thread fixes for openclaw#50483: 1. History sync after local finals: appendFinalAssistantMessage now always calls refreshHistoryAfterRun() instead of only when the append failed. This ensures tool results from local runs are persisted to canonical history. 2. Multi-client reconciliation: When external client finals arrive while this client has a pending run, immediately append the message AND schedule history refresh instead of deferring refresh until pendingRuns.isEmpty. This prevents cross-client history gaps. Added test cases: - appendsFinalAssistantMessageImmediatelyAndRefreshesHistory: validates history sync after local finals - externalFinalMessageRefreshesHistoryEvenDuringLocalPendingRun: validates multi-client reconciliation

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aa25cd5e5c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

eulicesl · 2026-04-06T17:55:18Z

Maintainer Review Update — 2026-04-06

All open review threads resolved. CI is passing.

CI Status

✅ All checks passing

Validation

✅ iOS Simulator build (Swift 6.3 / Xcode 26.4)
✅ iOS device build (iPhone 17 Pro Max, iOS 26.2.0)

Ready for maintainer review.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6f520a890b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

P1 review thread fixes for openclaw#50483: 1. History sync after local finals: appendFinalAssistantMessage now always calls refreshHistoryAfterRun() instead of only when the append failed. This ensures tool results from local runs are persisted to canonical history. 2. Multi-client reconciliation: When external client finals arrive while this client has a pending run, immediately append the message AND schedule history refresh instead of deferring refresh until pendingRuns.isEmpty. This prevents cross-client history gaps. Added test cases: - appendsFinalAssistantMessageImmediatelyAndRefreshesHistory: validates history sync after local finals - externalFinalMessageRefreshesHistoryEvenDuringLocalPendingRun: validates multi-client reconciliation

Address copilot-pull-request-reviewer feedback on openclaw#50483: - Track optimistically-appended assistant messages by UUID in pendingOptimisticMessageIds so they are not dropped during the race window between append and history refresh. - reconcileRunRefreshMessages now accepts pendingOptimisticIds and preserves any previous messages matching those IDs that are absent from incoming history. - refreshHistoryAfterRun clears pendingOptimisticIds for messages now confirmed by the incoming history payload. - bootstrap also passes pendingOptimisticIds to reconcileRunRefreshMessages so bootstrap history loads are consistent with the same logic. - appendFinalAssistantMessage tracks new message IDs on append.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8bfa960895

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1768568561

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bbd9bbd46c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 89dab2506c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 151405b3ff

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-27T13:56:19Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ff1ecc4249

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f5a032f220

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-27T14:44:41Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

clawsweeper · 2026-04-28T02:20:46Z

Codex review: needs real behavior proof before merge. Reviewed June 8, 2026, 1:18 AM ET / 05:18 UTC.

Summary
The PR changes shared OpenClawChatUI Swift chat streaming/session handling, assistant text parsing, chat view scroll/typing behavior, related Swift tests, and CHANGELOG.md.

PR surface: Docs +1, Other +396. Total +397 across 7 files.

Reproducibility: yes. for the review blockers: PR-head source shows the recovered-bootstrap flag gap and the stale-refresh assistant-final path. The original iOS flicker/session-flow symptom still lacks an inspectable runtime artifact in this review.

Review metrics: 1 noteworthy metric.

Release-owned changelog entries: 1 added. Normal PRs should not edit CHANGELOG.md because release generation owns that file.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🦪 silver shellfish
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

Rebase on current main and preserve the newer ChatViewModel session/history flow.
[P1] Fix the recovered-bootstrap and optimistic-assistant-final refresh defects.
Attach redacted real behavior proof for the iOS streaming/session scenario.

Proof guidance:

[P1] Needs real behavior proof before merge: No inspectable after-fix runtime artifact is attached; add a redacted recording, terminal output, copied live output, runtime logs, or linked artifact, then update the PR body to trigger re-review or ask a maintainer to comment @clawsweeper re-review.

Mantis proof suggestion
The user-visible iOS streaming/layout behavior would benefit from a short real app visual proof after the branch is rebased. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

visual task: verify iOS chat streaming stays attached to the active session and remains visually stable during bootstrap/history refresh.

Risk before merge

[P1] The branch is dirty against current main, and current main has newer generation-scoped ChatViewModel/session handling that conflict resolution must preserve.
[P1] Missing inspectable after-fix runtime proof is still a merge gate for this external user-visible iOS chat behavior.
[P1] The diff changes event acceptance, bootstrap, refresh, and single-lane streaming state, so a bad merge can regress session association or message delivery.
[P1] The PR edits CHANGELOG.md even though release generation owns that file.

Maintainer options:

Rebase and repair before merge (recommended)
Refresh the branch on current main, resolve ChatViewModel conflicts in favor of the newer session/history flow, and fix the two state defects before review continues.
Require real iOS proof
Keep the PR unmerged until a redacted recording, terminal output, runtime log, or linked artifact shows the after-fix streaming/session behavior in a real app setup.
Close if the stale branch cannot be narrowed
If rebasing would discard most of the branch, maintainers can close this PR and ask for a new narrow PR with only the still-missing UI/parser fixes.

Next step before merge

[P1] Human/contributor follow-up is needed because the branch is dirty, real behavior proof is missing, and conflict resolution must preserve current main's newer ChatViewModel behavior.

Security
Cleared: The diff is limited to Swift chat UI/model/tests and a changelog entry; no concrete security or supply-chain concern was found.

Review findings

[P2] Mark recovered bootstraps as loaded — apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift:112-113
[P2] Preserve optimistic assistant finals through refresh — apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift:1130-1136
[P3] Remove the changelog entry — CHANGELOG.md:92

Review details

Best possible solution:

Rebase onto current main, preserve the newer ChatViewModel session/history flow, fix the remaining stale-refresh and load-recovery defects, remove the changelog entry, and attach redacted real behavior proof for the iOS streaming scenario.

Do we have a high-confidence way to reproduce the issue?

Yes for the review blockers: PR-head source shows the recovered-bootstrap flag gap and the stale-refresh assistant-final path. The original iOS flicker/session-flow symptom still lacks an inspectable runtime artifact in this review.

Is this the best way to solve the issue?

No: the branch is a plausible bug fix, but not the best merge shape against current main until it is rebased, preserves current main's newer session/history implementation, and fixes the remaining state defects.

Full review comments:

[P2] Mark recovered bootstraps as loaded — apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift:112-113
After an initial bootstrap failure, a successful manual refresh() calls bootstrap() but never flips hasLoadedInitialState back to true. The next onAppear can run load() again and start a full bootstrap, which clears pending runs, tool calls, and streamingAssistantText; set the flag after any successful bootstrap, not only before load() starts one.
Confidence: 0.9
[P2] Preserve optimistic assistant finals through refresh — apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift:1130-1136
appendFinalAssistantMessage appends a decoded or streamed assistant final, and handleChatEvent immediately refreshes history. If that first history response is stale but nonempty, reconcileRunRefreshMessages carries forward only local user messages, so the assistant final can appear and disappear until a later refresh; track or merge optimistic assistant finals until history confirms them.
Confidence: 0.86
[P3] Remove the changelog entry — CHANGELOG.md:92
CHANGELOG.md is release-owned in this repo. Keep the release-note context in the PR body or squash message instead, otherwise this normal PR can create release-generation churn and conflicts.
Confidence: 0.96

Overall correctness: patch is incorrect
Overall confidence: 0.86

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against b8adc11977ab.

Label changes

Label justifications:

P2: The PR targets a normal-priority shared chat UI/session-state bug with user-visible streaming instability but limited blast radius.
merge-risk: 🚨 session-state: The diff changes bootstrap, refresh, session switching, and pending-run state in the shared ChatViewModel.
merge-risk: 🚨 message-delivery: The diff changes which chat and agent streaming/final events are accepted, ignored, appended, or reconciled.
rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🦪 silver shellfish.
status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: No inspectable after-fix runtime artifact is attached; add a redacted recording, terminal output, copied live output, runtime logs, or linked artifact, then update the PR body to trigger re-review or ask a maintainer to comment @clawsweeper re-review.

Evidence reviewed

PR surface:

Docs +1, Other +396. Total +397 across 7 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	0	0	0	0
Tests	0	0	0	0
Docs	1	1	0	+1
Config	0	0	0	0
Generated	0	0	0	0
Other	6	415	19	+396
Total	7	416	19	+397

Acceptance criteria:

[P1] swift test --package-path apps/shared/OpenClawKit --filter ChatViewModelTests.
[P1] swift test --package-path apps/shared/OpenClawKit --filter AssistantTextParserTests.
[P1] xcodebuild -project apps/ios/OpenClaw.xcodeproj -scheme OpenClaw -destination 'generic/platform=iOS Simulator' CODE_SIGNING_ALLOWED=NO build.

What I checked:

Repository policy applied: Root AGENTS.md was read fully; no scoped apps/AGENTS.md exists for the touched Swift paths, and the changelog/proof/review guidance applies here. (AGENTS.md:33, b8adc11977ab)
Release-owned changelog policy: Repository policy says CHANGELOG.md is release-only and normal PRs should keep release-note context in the PR body or commit message instead. (AGENTS.md:230, b8adc11977ab)
PR head load/recovery path: At PR head, load sets hasLoadedInitialState, refresh calls bootstrap without setting it, and bootstrap only clears the flag on failure while clearing pending runs/tools/streaming at the start. (apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift:110, 29f6bd56d938)
PR head stale-refresh path: appendFinalAssistantMessage appends decoded or streamed assistant finals before refreshHistoryAfterRun, while reconcileRunRefreshMessages only carries forward local user messages from stale nonempty history responses. (apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift:1129, 29f6bd56d938)
Current main has newer session flow: Current main has generation-scoped bootstrap/history handling and session-message handling that are materially newer than the PR branch, so conflict resolution must preserve current main behavior. (apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift:385, b8adc11977ab)
History provenance sample: Local history shows related current-main work around canonical session keys, reconnect resync, model/session UI, and recent ChatViewModel edits. (apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift:385, b8adc11977ab)

Likely related people:

steipete: Current-line blame for the newer ChatViewModel bootstrap/session handling in this checkout points to Peter Steinberger on the grafted main commit. (role: recent area contributor; confidence: medium; commits: bab18d567b0c; files: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift)
Mariano: Local history shows canonical main-session alias and reconnect-resync work in the same ChatViewModel area. (role: adjacent feature contributor; confidence: medium; commits: fe3f0759b5c4, 42d11a3ec5f4; files: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift, apps/shared/OpenClawKit/Tests/OpenClawKitTests/ChatViewModelTests.swift)
Nachx639: Local history shows canonical session-key and reset-trigger changes touching ChatViewModel, ChatTransport, and tests shortly before this PR. (role: adjacent feature contributor; confidence: medium; commits: e951a42bcb86; files: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift, apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatTransport.swift, apps/shared/OpenClawKit/Tests/OpenClawKitTests/ChatViewModelTests.swift)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

eulicesl · 2026-05-05T23:33:55Z

Rebuilt this branch on top of current main and force-pushed a clean refresh.

What changed:

kept the chat/session-flow fixes
dropped the unrelated Windows installer / CI retrigger commits
added the missing changelog entry

Local verification:

swift test in apps/shared/OpenClawKit ✅

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 29f6bd56d9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-05T23:37:33Z

            await self.fetchModels()
            self.errorText = nil
        } catch {
+            self.hasLoadedInitialState = false


Mark recovered refreshes as loaded

When the initial load() bootstrap fails, this resets hasLoadedInitialState, but a later successful manual refresh() calls bootstrap() without ever setting the flag back to true. In that recovery path, the next view onAppear will run load() again and start a full bootstrap, which clears pending runs, tool calls, and streamingAssistantText; navigating away and back during a new in-flight reply after recovering from a startup error can therefore drop the active stream. Set the flag on any successful bootstrap, not only before load() starts one.

Useful? React with 👍 / 👎.

clawsweeper · 2026-05-19T23:51:37Z

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?

The egg game starts only after the PR passes the real-behavior proof check.
Before that, no creature or rarity is rolled. The treat waits for real proof.
This is still just collectible flavor: proof affects review readiness, not creature quality.

openclaw-barnacle · 2026-06-08T05:08:56Z

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

eulicesl mentioned this pull request Mar 19, 2026

fix(ios): use plain Text during streaming to eliminate markdown flicker [AI-assisted] #40985

Closed

openclaw-barnacle Bot added the size: M label Mar 19, 2026

chatgpt-codex-connector Bot reviewed Mar 19, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated

chatgpt-codex-connector Bot reviewed Mar 19, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated

chatgpt-codex-connector Bot reviewed Mar 19, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated

eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from 21aad88 to 5df7c10 Compare March 19, 2026 18:19

chatgpt-codex-connector Bot reviewed Mar 19, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift

eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from 5df7c10 to aa25cd5 Compare March 19, 2026 20:52

chatgpt-codex-connector Bot reviewed Mar 19, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated

eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from aa25cd5 to 830fb0c Compare March 20, 2026 13:03

eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from 830fb0c to ad77e13 Compare March 20, 2026 15:01

eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from ad77e13 to 3d5e074 Compare March 21, 2026 21:10

eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from 0c32438 to 01acd67 Compare March 23, 2026 02:52

eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from 55779a7 to 6f520a8 Compare April 14, 2026 12:44

chatgpt-codex-connector Bot reviewed Apr 14, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated

eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from 6f520a8 to 8bfa960 Compare April 17, 2026 12:29

chatgpt-codex-connector Bot reviewed Apr 17, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated

chatgpt-codex-connector Bot reviewed Apr 26, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift

chatgpt-codex-connector Bot reviewed Apr 26, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift

chatgpt-codex-connector Bot reviewed Apr 26, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift

chatgpt-codex-connector Bot reviewed Apr 26, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated

chatgpt-codex-connector Bot reviewed Apr 27, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Tests/OpenClawKitTests/ChatViewModelTests.swift Outdated

chatgpt-codex-connector Bot reviewed Apr 27, 2026

View reviewed changes

Comment thread apps/shared/OpenClawKit/Tests/OpenClawKitTests/ChatViewModelTests.swift Outdated

eulicesl added 2 commits May 5, 2026 19:31

fix(ios): stabilize chat streaming layout and session flow

369420a

chore: add changelog entry for iOS chat streaming fix

29f6bd5

chatgpt-codex-connector Bot reviewed May 5, 2026

View reviewed changes

This was referenced May 15, 2026

App freezes when text input bar changes vertical size #50551

Closed

fix(ios): picker adaptive onChange loop + OSLog string concatenation #40878

Closed

Uh oh!

Conversation

eulicesl commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Root Cause (if applicable)

Regression Test Plan (if applicable)

User-visible / Behavior Changes

Diagram (if applicable)

Security Impact (required)

Repro + Verification

Environment

Steps

Expected

Actual

Evidence

Human Verification (required)

Review Conversations

Compatibility / Migration

Risks and Mitigations

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

greptile-apps Bot commented Mar 19, 2026 • edited by clawsweeper Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Comments Outside Diff (2)

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

eulicesl commented Apr 6, 2026

Maintainer Review Update — 2026-04-06

CI Status

Validation

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

eulicesl commented Mar 19, 2026 •

edited

Loading

greptile-apps Bot commented Mar 19, 2026 •

edited by clawsweeper Bot

Loading

clawsweeper Bot commented Apr 28, 2026 •

edited

Loading

clawsweeper Bot commented May 19, 2026 •

edited

Loading