Skip to content

fix(ios): stabilize chat streaming layout and session flow#50483

Open
eulicesl wants to merge 2 commits into
openclaw:mainfrom
eulicesl:upstream/fix-ios-streaming-layout-flicker
Open

fix(ios): stabilize chat streaming layout and session flow#50483
eulicesl wants to merge 2 commits into
openclaw:mainfrom
eulicesl:upstream/fix-ios-streaming-layout-flicker

Conversation

@eulicesl

@eulicesl eulicesl commented Mar 19, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Problem: iOS chat streaming could attach to a synthetic placeholder lane, churn during initial history/bootstrap, and reparse streaming content in ways that caused flicker or session drift.
  • Why it matters: active conversations can show unstable streaming UI, lose the correct lane/session association, or visibly flicker while tokens arrive.
  • What changed: scoped streaming to the active session key, hardened stream acceptance/initial load behavior, and stabilized layout parsing/visibility updates.
  • What did NOT change (scope boundary): no gateway protocol changes, no new model behavior. This fix lives in the shared OpenClawChatUI layer used by both iOS and macOS. The bug (cross-run UI drift during concurrent session-key agent events) affects any client using ChatViewModel; macOS uses the same chat view and benefits equally, but the fix is not platform-specific — it guards shared streaming state.
  • AI-assisted: yes (AI-assisted implementation, contributor-reviewed).
  • Testing level: targeted shared-package tests plus iOS build validation.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

  • Root cause: streaming state was not consistently tied to the active session key, and downstream UI/layout updates re-parsed or re-evaluated visibility in ways that amplified bootstrap churn into visible instability.
  • Missing detection / guardrail: there was insufficient coverage around session-scoped streaming updates plus layout stability under rapid token updates/history refresh.
  • Prior context (git blame, prior PR, issue, or refactor if known): this branch supersedes fix(ios): use plain Text during streaming to eliminate markdown flicker [AI-assisted] #40985 by fixing the broader lane/acceptance/layout causes instead of only the earlier visible symptom.
  • Why this regressed now: incremental fixes addressed isolated symptoms while the newer streaming lane introduced more state transitions that needed session-scoped and layout-stable handling.
  • If unknown, what was ruled out: the issue was not isolated to one text-rendering widget; it involved session scoping, bootstrap timing, and layout parsing together.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • apps/shared/OpenClawKit/Tests/OpenClawKitTests/ChatViewModelTests.swift
    • apps/shared/OpenClawKit/Tests/OpenClawKitTests/AssistantTextParserTests.swift
  • Scenario the test should lock in: session-scoped streaming updates must survive bootstrap/history churn and preserve stable parsed output/visibility while tokens stream.
  • Why this is the smallest reliable guardrail: the bug lives in shared chat state/parsing logic, so package-level tests cover the core behavior without requiring a full end-to-end harness.
  • Existing test that already covers this (if any): targeted parser/view-model tests were added/expanded for this branch.
  • If no new test is added, why not: N/A.

User-visible / Behavior Changes

  • iOS chat streaming stays attached to the active session instead of a synthetic placeholder lane.
  • Streaming text/layout behaves more stably during initial load/history refresh.
  • Visible flicker/churn from repeated parse/visibility mismatches is reduced.

Diagram (if applicable)

N/A

Security Impact (required)

  • New permissions/capabilities? (Yes/No): No
  • Secrets/tokens handling changed? (Yes/No): No
  • New/changed network calls? (Yes/No): No
  • Command/tool execution surface changed? (Yes/No): No
  • Data access scope changed? (Yes/No): No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS local dev environment
  • Runtime/container: shared OpenClawKit package + iOS app build
  • Model/provider: N/A
  • Integration/channel (if any): iOS chat UI / streaming flow
  • Relevant config (redacted): local iOS simulator build

Steps

  1. Build the shared package and iOS app from this branch.
  2. Open a chat session and trigger a streaming reply.
  3. Observe initial history/bootstrap plus active token streaming.

Expected

  • Streaming binds to the active session.
  • Bootstrap/history churn does not destabilize the active stream.
  • Layout remains visually stable while tokens arrive.

Actual

  • Matched expected in targeted tests and local build validation.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios:
    • swift test --package-path apps/shared/OpenClawKit --filter ChatViewModelTests
    • swift test --package-path apps/shared/OpenClawKit --filter AssistantTextParserTests
    • xcodegen generate
    • xcodebuild -project apps/ios/OpenClaw.xcodeproj -scheme OpenClaw -destination 'generic/platform=iOS Simulator' CODE_SIGNING_ALLOWED=NO build
  • Edge cases checked:
    • stream acceptance during bootstrap/history load
    • parser stability under repeated streaming updates
  • What you did not verify:
    • full repository CI matrix
    • non-iOS clients

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
    • All 19 review threads (P1/P2 from chatgpt-codex-connector[bot] + copilot-pull-request-reviewer) resolved in this update.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes/No): Yes
  • Config/env changes? (Yes/No): No
  • Migration needed? (Yes/No): No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: chat-state timing bugs can reappear under bootstrapping or external-final sync edge cases.
    • Mitigation: this patch scopes state to the active session and adds targeted parser/view-model coverage around the churn paths.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a1f9853bc9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated
@greptile-apps

greptile-apps Bot commented Mar 19, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR addresses three root causes behind iOS chat instability: session mis-scoping, stream acceptance / initial-load churn, and repeated layout re-parses. The changes are well-motivated and the majority of the implementation is sound — the stable AssistantTextSegment IDs, the shouldAcceptAgentEvent / matchesCurrentSessionKey routing logic, the load() idempotency guard, the typing-indicator visibility fix, and the streaming scroll debounce all look correct and are backed by good test coverage.

Key findings:

  • Logic issue — errored run can permanently append partial content without history reconciliation: In handleChatEvent, the "error" state shares the same case arm as "final" and "aborted". When a run errors mid-stream, appendFinalAssistantMessage falls through to streamedAssistantMessage(), which synthesizes a message from the partial streamingAssistantText. Because appendFinalAssistantMessage returns true, refreshHistoryAfterRun() is never called. The partial, incomplete response is now permanently in messages with no server reconciliation — on next reload it would silently vanish, creating a visible inconsistency. The "error" branch should be handled separately and should always trigger a history refresh.
  • Style note — hardcoded session-key alias: matchesCurrentSessionKey only handles the "agent:main:main""main" alias pair. Other configurations following the same agent:<name>:<name><name> convention would silently drop events.

Confidence Score: 3/5

  • Mostly safe to merge, but the error-state partial-message bug can cause silent client/server state drift on run failures with partial streaming content.
  • The bulk of the PR is well-implemented and tested. The one logic issue — errored runs bypassing refreshHistoryAfterRun() when partial streaming content exists — is a behavioural regression that creates data inconsistency, though it only fires on the specific path of (run errors) + (partial visible streaming text buffered). All other changes are safe and the new tests cover the happy paths well.
  • apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift — specifically the "error" state arm of handleChatEvent (around line 874) and the matchesCurrentSessionKey alias logic.

Comments Outside Diff (2)

  1. apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift, line 874-889 (link)

    P1 Errored run skips history refresh when partial streaming content exists

    For the "error" state, appendFinalAssistantMessage falls through to streamedAssistantMessage() when chat.message is nil (typical for errored runs). If any partial streaming text was buffered, streamedAssistantMessage() creates a synthesized message and returns true. This causes appendFinalAssistantMessage to return true, which suppresses refreshHistoryAfterRun().

    The result: partial, incomplete content from a failed run is permanently appended to self.messages as if it were a finalized assistant message, and the client never reconciles with the server. On next reload the partial message would disappear (since the server history for an errored run won't include it), creating a visible inconsistency.

    The "final" and "aborted" cases are intentional here — those runs legitimately complete. "error" is different: the server hasn't produced a final message, so skipping the history reconciliation is incorrect.

    Consider separating the error branch so it always triggers a history refresh, even when partial content exists:

    case "final", "aborted":
        let appendedFinalMessage = self.appendFinalAssistantMessage(from: chat)
        self.pendingToolCallsById = [:]
        self.streamingAssistantText = nil
        if !appendedFinalMessage {
            Task { await self.refreshHistoryAfterRun() }
        }
    case "error":
        self.errorText = chat.errorMessage ?? "Chat failed"
        if let runId = chat.runId {
            self.clearPendingRun(runId)
        } else if self.pendingRuns.count <= 1 {
            self.clearPendingRuns(reason: nil)
        }
        self.pendingToolCallsById = [:]
        self.streamingAssistantText = nil
        Task { await self.refreshHistoryAfterRun() }
    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift
    Line: 874-889
    
    Comment:
    **Errored run skips history refresh when partial streaming content exists**
    
    For the `"error"` state, `appendFinalAssistantMessage` falls through to `streamedAssistantMessage()` when `chat.message` is nil (typical for errored runs). If any partial streaming text was buffered, `streamedAssistantMessage()` creates a synthesized message and returns `true`. This causes `appendFinalAssistantMessage` to return `true`, which suppresses `refreshHistoryAfterRun()`.
    
    The result: partial, incomplete content from a failed run is permanently appended to `self.messages` as if it were a finalized assistant message, and the client never reconciles with the server. On next reload the partial message would disappear (since the server history for an errored run won't include it), creating a visible inconsistency.
    
    The `"final"` and `"aborted"` cases are intentional here — those runs legitimately complete. `"error"` is different: the server hasn't produced a final message, so skipping the history reconciliation is incorrect.
    
    Consider separating the error branch so it always triggers a history refresh, even when partial content exists:
    ```swift
    case "final", "aborted":
        let appendedFinalMessage = self.appendFinalAssistantMessage(from: chat)
        self.pendingToolCallsById = [:]
        self.streamingAssistantText = nil
        if !appendedFinalMessage {
            Task { await self.refreshHistoryAfterRun() }
        }
    case "error":
        self.errorText = chat.errorMessage ?? "Chat failed"
        if let runId = chat.runId {
            self.clearPendingRun(runId)
        } else if self.pendingRuns.count <= 1 {
            self.clearPendingRuns(reason: nil)
        }
        self.pendingToolCallsById = [:]
        self.streamingAssistantText = nil
        Task { await self.refreshHistoryAfterRun() }
    ```
    
    How can I resolve this? If you propose a fix, please make it concise.
  2. apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift, line 895-908 (link)

    P2 Hardcoded alias pair may not generalize to other session configurations

    matchesCurrentSessionKey only handles one specific canonical/alias pair: "agent:main:main""main". Any other gateway session key following a similar convention (e.g., "agent:work:work""work", or "agent:staging:staging""staging") would fail the match and silently drop events.

    If the alias pattern is always agent:<name>:<name><name>, a general rule would be more resilient:

    // Generalize: "agent:<name>:<name>" is the canonical form of alias "<name>"
    let parts = incomingNormalized.split(separator: ":", maxSplits: 2)
    if parts.count == 3 && parts[0] == "agent" && parts[1] == parts[2] && parts[2] == currentNormalized {
        return true
    }
    let currentParts = currentNormalized.split(separator: ":", maxSplits: 2)
    if currentParts.count == 3 && currentParts[0] == "agent" && currentParts[1] == currentParts[2] && currentParts[2] == incomingNormalized {
        return true
    }

    If "main" truly is the only valid alias in production today, a comment to that effect would help future maintainers understand the constraint.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift
    Line: 895-908
    
    Comment:
    **Hardcoded alias pair may not generalize to other session configurations**
    
    `matchesCurrentSessionKey` only handles one specific canonical/alias pair: `"agent:main:main"``"main"`. Any other gateway session key following a similar convention (e.g., `"agent:work:work"``"work"`, or `"agent:staging:staging"``"staging"`) would fail the match and silently drop events.
    
    If the alias pattern is always `agent:<name>:<name>``<name>`, a general rule would be more resilient:
    ```swift
    // Generalize: "agent:<name>:<name>" is the canonical form of alias "<name>"
    let parts = incomingNormalized.split(separator: ":", maxSplits: 2)
    if parts.count == 3 && parts[0] == "agent" && parts[1] == parts[2] && parts[2] == currentNormalized {
        return true
    }
    let currentParts = currentNormalized.split(separator: ":", maxSplits: 2)
    if currentParts.count == 3 && currentParts[0] == "agent" && currentParts[1] == currentParts[2] && currentParts[2] == incomingNormalized {
        return true
    }
    ```
    If `"main"` truly is the only valid alias in production today, a comment to that effect would help future maintainers understand the constraint.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift
Line: 874-889

Comment:
**Errored run skips history refresh when partial streaming content exists**

For the `"error"` state, `appendFinalAssistantMessage` falls through to `streamedAssistantMessage()` when `chat.message` is nil (typical for errored runs). If any partial streaming text was buffered, `streamedAssistantMessage()` creates a synthesized message and returns `true`. This causes `appendFinalAssistantMessage` to return `true`, which suppresses `refreshHistoryAfterRun()`.

The result: partial, incomplete content from a failed run is permanently appended to `self.messages` as if it were a finalized assistant message, and the client never reconciles with the server. On next reload the partial message would disappear (since the server history for an errored run won't include it), creating a visible inconsistency.

The `"final"` and `"aborted"` cases are intentional here — those runs legitimately complete. `"error"` is different: the server hasn't produced a final message, so skipping the history reconciliation is incorrect.

Consider separating the error branch so it always triggers a history refresh, even when partial content exists:
```swift
case "final", "aborted":
    let appendedFinalMessage = self.appendFinalAssistantMessage(from: chat)
    self.pendingToolCallsById = [:]
    self.streamingAssistantText = nil
    if !appendedFinalMessage {
        Task { await self.refreshHistoryAfterRun() }
    }
case "error":
    self.errorText = chat.errorMessage ?? "Chat failed"
    if let runId = chat.runId {
        self.clearPendingRun(runId)
    } else if self.pendingRuns.count <= 1 {
        self.clearPendingRuns(reason: nil)
    }
    self.pendingToolCallsById = [:]
    self.streamingAssistantText = nil
    Task { await self.refreshHistoryAfterRun() }
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift
Line: 895-908

Comment:
**Hardcoded alias pair may not generalize to other session configurations**

`matchesCurrentSessionKey` only handles one specific canonical/alias pair: `"agent:main:main"``"main"`. Any other gateway session key following a similar convention (e.g., `"agent:work:work"``"work"`, or `"agent:staging:staging"``"staging"`) would fail the match and silently drop events.

If the alias pattern is always `agent:<name>:<name>``<name>`, a general rule would be more resilient:
```swift
// Generalize: "agent:<name>:<name>" is the canonical form of alias "<name>"
let parts = incomingNormalized.split(separator: ":", maxSplits: 2)
if parts.count == 3 && parts[0] == "agent" && parts[1] == parts[2] && parts[2] == currentNormalized {
    return true
}
let currentParts = currentNormalized.split(separator: ":", maxSplits: 2)
if currentParts.count == 3 && currentParts[0] == "agent" && currentParts[1] == currentParts[2] && currentParts[2] == incomingNormalized {
    return true
}
```
If `"main"` truly is the only valid alias in production today, a comment to that effect would help future maintainers understand the constraint.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "fix(ios): stabilize ..."

Re-review progress:

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1a08fd3040

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated
Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 21aad88f23

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated
@eulicesl eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from 21aad88 to 5df7c10 Compare March 19, 2026 18:19

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5df7c10821

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift
Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift
@eulicesl eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from 5df7c10 to aa25cd5 Compare March 19, 2026 20:52
eulicesl added a commit to eulicesl/openclaw that referenced this pull request Mar 19, 2026
P1 review thread fixes for openclaw#50483:

1. History sync after local finals: appendFinalAssistantMessage now always calls refreshHistoryAfterRun() instead of only when the append failed. This ensures tool results from local runs are persisted to canonical history.

2. Multi-client reconciliation: When external client finals arrive while this client has a pending run, immediately append the message AND schedule history refresh instead of deferring refresh until pendingRuns.isEmpty. This prevents cross-client history gaps.

Added test cases:
- appendsFinalAssistantMessageImmediatelyAndRefreshesHistory: validates history sync after local finals
- externalFinalMessageRefreshesHistoryEvenDuringLocalPendingRun: validates multi-client reconciliation

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aa25cd5e5c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated
@eulicesl eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from aa25cd5 to 830fb0c Compare March 20, 2026 13:03
@eulicesl eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from 830fb0c to ad77e13 Compare March 20, 2026 15:01
@eulicesl eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from ad77e13 to 3d5e074 Compare March 21, 2026 21:10
@eulicesl eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from 0c32438 to 01acd67 Compare March 23, 2026 02:52
@eulicesl

eulicesl commented Apr 6, 2026

Copy link
Copy Markdown
Contributor Author

Maintainer Review Update — 2026-04-06

All open review threads resolved. CI is passing.

CI Status

  • ✅ All checks passing

Validation

  • ✅ iOS Simulator build (Swift 6.3 / Xcode 26.4)
  • ✅ iOS device build (iPhone 17 Pro Max, iOS 26.2.0)

Ready for maintainer review.

@eulicesl eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from 55779a7 to 6f520a8 Compare April 14, 2026 12:44

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6f520a890b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated
@eulicesl eulicesl force-pushed the upstream/fix-ios-streaming-layout-flicker branch from 6f520a8 to 8bfa960 Compare April 17, 2026 12:29
eulicesl added a commit to eulicesl/openclaw that referenced this pull request Apr 17, 2026
P1 review thread fixes for openclaw#50483:

1. History sync after local finals: appendFinalAssistantMessage now always calls refreshHistoryAfterRun() instead of only when the append failed. This ensures tool results from local runs are persisted to canonical history.

2. Multi-client reconciliation: When external client finals arrive while this client has a pending run, immediately append the message AND schedule history refresh instead of deferring refresh until pendingRuns.isEmpty. This prevents cross-client history gaps.

Added test cases:
- appendsFinalAssistantMessageImmediatelyAndRefreshesHistory: validates history sync after local finals
- externalFinalMessageRefreshesHistoryEvenDuringLocalPendingRun: validates multi-client reconciliation
eulicesl added a commit to eulicesl/openclaw that referenced this pull request Apr 17, 2026
Address copilot-pull-request-reviewer feedback on openclaw#50483:
- Track optimistically-appended assistant messages by UUID in
  pendingOptimisticMessageIds so they are not dropped during the
  race window between append and history refresh.
- reconcileRunRefreshMessages now accepts pendingOptimisticIds and
  preserves any previous messages matching those IDs that are absent
  from incoming history.
- refreshHistoryAfterRun clears pendingOptimisticIds for messages
  now confirmed by the incoming history payload.
- bootstrap also passes pendingOptimisticIds to reconcileRunRefreshMessages
  so bootstrap history loads are consistent with the same logic.
- appendFinalAssistantMessage tracks new message IDs on append.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8bfa960895

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1768568561

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bbd9bbd46c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 89dab2506c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 151405b3ff

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift Outdated
@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ff1ecc4249

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Tests/OpenClawKitTests/ChatViewModelTests.swift Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f5a032f220

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread apps/shared/OpenClawKit/Tests/OpenClawKitTests/ChatViewModelTests.swift Outdated
@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@clawsweeper

clawsweeper Bot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge. Reviewed June 8, 2026, 1:18 AM ET / 05:18 UTC.

Summary
The PR changes shared OpenClawChatUI Swift chat streaming/session handling, assistant text parsing, chat view scroll/typing behavior, related Swift tests, and CHANGELOG.md.

PR surface: Docs +1, Other +396. Total +397 across 7 files.

Reproducibility: yes. for the review blockers: PR-head source shows the recovered-bootstrap flag gap and the stale-refresh assistant-final path. The original iOS flicker/session-flow symptom still lacks an inspectable runtime artifact in this review.

Review metrics: 1 noteworthy metric.

  • Release-owned changelog entries: 1 added. Normal PRs should not edit CHANGELOG.md because release generation owns that file.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🦪 silver shellfish
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Rebase on current main and preserve the newer ChatViewModel session/history flow.
  • [P1] Fix the recovered-bootstrap and optimistic-assistant-final refresh defects.
  • Attach redacted real behavior proof for the iOS streaming/session scenario.

Proof guidance:

  • [P1] Needs real behavior proof before merge: No inspectable after-fix runtime artifact is attached; add a redacted recording, terminal output, copied live output, runtime logs, or linked artifact, then update the PR body to trigger re-review or ask a maintainer to comment @clawsweeper re-review.

Mantis proof suggestion
The user-visible iOS streaming/layout behavior would benefit from a short real app visual proof after the branch is rebased. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

visual task: verify iOS chat streaming stays attached to the active session and remains visually stable during bootstrap/history refresh.

Risk before merge

  • [P1] The branch is dirty against current main, and current main has newer generation-scoped ChatViewModel/session handling that conflict resolution must preserve.
  • [P1] Missing inspectable after-fix runtime proof is still a merge gate for this external user-visible iOS chat behavior.
  • [P1] The diff changes event acceptance, bootstrap, refresh, and single-lane streaming state, so a bad merge can regress session association or message delivery.
  • [P1] The PR edits CHANGELOG.md even though release generation owns that file.

Maintainer options:

  1. Rebase and repair before merge (recommended)
    Refresh the branch on current main, resolve ChatViewModel conflicts in favor of the newer session/history flow, and fix the two state defects before review continues.
  2. Require real iOS proof
    Keep the PR unmerged until a redacted recording, terminal output, runtime log, or linked artifact shows the after-fix streaming/session behavior in a real app setup.
  3. Close if the stale branch cannot be narrowed
    If rebasing would discard most of the branch, maintainers can close this PR and ask for a new narrow PR with only the still-missing UI/parser fixes.

Next step before merge

  • [P1] Human/contributor follow-up is needed because the branch is dirty, real behavior proof is missing, and conflict resolution must preserve current main's newer ChatViewModel behavior.

Security
Cleared: The diff is limited to Swift chat UI/model/tests and a changelog entry; no concrete security or supply-chain concern was found.

Review findings

  • [P2] Mark recovered bootstraps as loaded — apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift:112-113
  • [P2] Preserve optimistic assistant finals through refresh — apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift:1130-1136
  • [P3] Remove the changelog entry — CHANGELOG.md:92
Review details

Best possible solution:

Rebase onto current main, preserve the newer ChatViewModel session/history flow, fix the remaining stale-refresh and load-recovery defects, remove the changelog entry, and attach redacted real behavior proof for the iOS streaming scenario.

Do we have a high-confidence way to reproduce the issue?

Yes for the review blockers: PR-head source shows the recovered-bootstrap flag gap and the stale-refresh assistant-final path. The original iOS flicker/session-flow symptom still lacks an inspectable runtime artifact in this review.

Is this the best way to solve the issue?

No: the branch is a plausible bug fix, but not the best merge shape against current main until it is rebased, preserves current main's newer session/history implementation, and fixes the remaining state defects.

Full review comments:

  • [P2] Mark recovered bootstraps as loaded — apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift:112-113
    After an initial bootstrap failure, a successful manual refresh() calls bootstrap() but never flips hasLoadedInitialState back to true. The next onAppear can run load() again and start a full bootstrap, which clears pending runs, tool calls, and streamingAssistantText; set the flag after any successful bootstrap, not only before load() starts one.
    Confidence: 0.9
  • [P2] Preserve optimistic assistant finals through refresh — apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift:1130-1136
    appendFinalAssistantMessage appends a decoded or streamed assistant final, and handleChatEvent immediately refreshes history. If that first history response is stale but nonempty, reconcileRunRefreshMessages carries forward only local user messages, so the assistant final can appear and disappear until a later refresh; track or merge optimistic assistant finals until history confirms them.
    Confidence: 0.86
  • [P3] Remove the changelog entry — CHANGELOG.md:92
    CHANGELOG.md is release-owned in this repo. Keep the release-note context in the PR body or squash message instead, otherwise this normal PR can create release-generation churn and conflicts.
    Confidence: 0.96

Overall correctness: patch is incorrect
Overall confidence: 0.86

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against b8adc11977ab.

Label changes

Label justifications:

  • P2: The PR targets a normal-priority shared chat UI/session-state bug with user-visible streaming instability but limited blast radius.
  • merge-risk: 🚨 session-state: The diff changes bootstrap, refresh, session switching, and pending-run state in the shared ChatViewModel.
  • merge-risk: 🚨 message-delivery: The diff changes which chat and agent streaming/final events are accepted, ignored, appended, or reconciled.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🦪 silver shellfish.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: No inspectable after-fix runtime artifact is attached; add a redacted recording, terminal output, copied live output, runtime logs, or linked artifact, then update the PR body to trigger re-review or ask a maintainer to comment @clawsweeper re-review.
Evidence reviewed

PR surface:

Docs +1, Other +396. Total +397 across 7 files.

View PR surface stats
Area Files Added Removed Net
Source 0 0 0 0
Tests 0 0 0 0
Docs 1 1 0 +1
Config 0 0 0 0
Generated 0 0 0 0
Other 6 415 19 +396
Total 7 416 19 +397

Acceptance criteria:

  • [P1] swift test --package-path apps/shared/OpenClawKit --filter ChatViewModelTests.
  • [P1] swift test --package-path apps/shared/OpenClawKit --filter AssistantTextParserTests.
  • [P1] xcodebuild -project apps/ios/OpenClaw.xcodeproj -scheme OpenClaw -destination 'generic/platform=iOS Simulator' CODE_SIGNING_ALLOWED=NO build.

What I checked:

Likely related people:

  • steipete: Current-line blame for the newer ChatViewModel bootstrap/session handling in this checkout points to Peter Steinberger on the grafted main commit. (role: recent area contributor; confidence: medium; commits: bab18d567b0c; files: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift)
  • Mariano: Local history shows canonical main-session alias and reconnect-resync work in the same ChatViewModel area. (role: adjacent feature contributor; confidence: medium; commits: fe3f0759b5c4, 42d11a3ec5f4; files: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift, apps/shared/OpenClawKit/Tests/OpenClawKitTests/ChatViewModelTests.swift)
  • Nachx639: Local history shows canonical session-key and reset-trigger changes touching ChatViewModel, ChatTransport, and tests shortly before this PR. (role: adjacent feature contributor; confidence: medium; commits: e951a42bcb86; files: apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatViewModel.swift, apps/shared/OpenClawKit/Sources/OpenClawChatUI/ChatTransport.swift, apps/shared/OpenClawKit/Tests/OpenClawKitTests/ChatViewModelTests.swift)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@eulicesl

eulicesl commented May 5, 2026

Copy link
Copy Markdown
Contributor Author

Rebuilt this branch on top of current main and force-pushed a clean refresh.

What changed:

  • kept the chat/session-flow fixes
  • dropped the unrelated Windows installer / CI retrigger commits
  • added the missing changelog entry

Local verification:

  • swift test in apps/shared/OpenClawKit

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 29f6bd56d9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

await self.fetchModels()
self.errorText = nil
} catch {
self.hasLoadedInitialState = false

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Mark recovered refreshes as loaded

When the initial load() bootstrap fails, this resets hasLoadedInitialState, but a later successful manual refresh() calls bootstrap() without ever setting the flag back to true. In that recovery path, the next view onAppear will run load() again and start a full bootstrap, which clears pending runs, tool calls, and streamingAssistantText; navigating away and back during a new in-flight reply after recovering from a startup error can therefore drop the active stream. Set the flag on any successful bootstrap, not only before load() starts one.

Useful? React with 👍 / 👎.

@clawsweeper

clawsweeper Bot commented May 19, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

@openclaw-barnacle

Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. P2 Normal backlog priority with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: M stale Marked as stale due to inactivity status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants