fix(tui): resync streaming watchdog after reconnect#74224
fix(tui): resync streaming watchdog after reconnect#74224fabianwilliams merged 3 commits intoopenclaw:mainfrom
Conversation
|
Codex review: needs maintainer review before merge. Keep this PR open for normal maintainer review. Current main still lacks the TUI reconnect/watchdog resync behavior for #69081, while this PR is a focused implementation candidate with tests and no obvious supply-chain-sensitive changes. Maintainer follow-up before merge: Keep this PR open and review it as the focused implementation for #69081. If the approach is sound, land the narrow TUI fix that treats active-run tool and non-terminal lifecycle events as proof-of-life, pauses watchdog timers while disconnected, rearms reconnect recovery without dropping legitimately in-flight runs, and reloads history when reconnect-stale runs time out. Best possible solution: Keep this PR open and review it as the focused implementation for #69081. If the approach is sound, land the narrow TUI fix that treats active-run tool and non-terminal lifecycle events as proof-of-life, pauses watchdog timers while disconnected, rearms reconnect recovery without dropping legitimately in-flight runs, and reloads history when reconnect-stale runs time out. Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against a1197b907524. |
Greptile SummaryThis PR fixes the streaming watchdog in the TUI to survive reconnects: it pauses the watchdog on disconnect, rearms it (preserving Confidence Score: 4/5Safe to merge; both findings are narrow edge-case P2s that don't affect the common reconnect path. No P0/P1 findings. Two P2 issues: a possible double
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/tui/tui-event-handlers.ts
Line: 123-128
Comment:
**Potential double `loadHistory` call in reconnect watchdog path**
`flushPendingHistoryRefreshIfIdle()` is called unconditionally on line 123 before the `reconnectPendingRunId` guard on line 124. If `pendingHistoryRefresh` happens to be `true` at this point (e.g. a local BTW run finalized while the reconnect-pending run was still active), both `flushPendingHistoryRefreshIfIdle` and the explicit `void loadHistory?.()` on line 126 will fire, issuing two concurrent history fetches. Consider moving the `reconnectPendingRunId` check before `flushPendingHistoryRefreshIfIdle`, or resetting `pendingHistoryRefresh` inside the reconnect branch to prevent the duplicate call.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/tui/tui-event-handlers.ts
Line: 236-240
Comment:
**`activityStatus` not reset when stale `activeChatRunId` is dropped on reconnect**
`state.activeChatRunId` is set to `null` on line 238 before `clearStaleStreamingRunIfNoTrackedRunRemains()` is called. That helper's first check is `!activeRunId`, so it returns early immediately — leaving `state.activityStatus` stuck at `"streaming"` and `flushPendingHistoryRefreshIfIdle()` never called. After reconnect the TUI can keep displaying a streaming indicator for a run that no longer exists. The fix is to call `setActivityStatus("idle")` (and `flushPendingHistoryRefreshIfIdle()`) explicitly in this branch instead of delegating to a helper that will no-op on the already-nulled field.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix(tui): keep reconnect history fallbac..." | Re-trigger Greptile |
c0120b9 to
215f687
Compare
fabianwilliams
left a comment
There was a problem hiding this comment.
LGTM. Watchdog state-machine fix done correctly: (1) treats active-run tool events as proof-of-life so legitimate long-running tool calls don't flip to false-idle, (2) pause/rearm semantics around reconnect preserve activeChatRunId instead of dropping it, (3) reload-history-on-stale-reconnect-timeout closes the loop on missed final events. Tests cover all four scenarios distinctly. Changelog attribution to @EenvoudJasper is the right shape.
* fix(tui): resync streaming watchdog after reconnect * fix(tui): keep reconnect history fallback armed * fix(tui): tighten reconnect watchdog recovery
* fix(tui): resync streaming watchdog after reconnect * fix(tui): keep reconnect history fallback armed * fix(tui): tighten reconnect watchdog recovery
Summary
activeChatRunIdFixes #69081.
Validation
pnpm test src/tui/tui-event-handlers.test.ts src/tui/tui-session-actions.test.ts src/tui/tui-command-handlers.test.ts src/tui/embedded-backend.test.tspnpm exec oxfmt --check --threads=1 src/tui/tui-event-handlers.ts src/tui/tui.ts src/tui/tui-event-handlers.test.ts CHANGELOG.mdgit diff --checkNotes
activeChatRunId.pnpm check:changedcurrently fails in unrelated pre-existing core typecheck drift outside this patch, includingsrc/agents/openai-transport-stream.ts,src/config/types.models.ts, andsrc/plugin-sdk/provider-*.