fix #87699: [Bug]: [BUG] UI shows agent "running" after conversation ends β requires manual page refresh every time#89727
Conversation
|
Codex review: needs maintainer review before merge. Reviewed June 3, 2026, 4:22 AM ET / 08:22 UTC. Summary PR surface: Source +11, Tests +52. Total +63 across 4 files. Reproducibility: yes. Source inspection shows current Review metrics: none identified. Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Risk before merge
Maintainer options:
Next step before merge
Security Review detailsBest possible solution: Land the focused Control UI lifecycle-helper fix after normal maintainer and CI review, without changing gateway protocol or adding a second run-state cleanup path. Do we have a high-confidence way to reproduce the issue? Yes. Source inspection shows current Is this the best way to solve the issue? Yes. Routing ACK completion through the existing run lifecycle helper is the narrowest maintainable fix; changing gateway/session protocol or adding another cleanup branch would duplicate the owner path. AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against 34c3827290a7. Label changesLabel changes:
Label justifications:
Evidence reviewedPR surface: Source +11, Tests +52. Total +63 across 4 files. View PR surface stats
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
a7db911 to
fb7a332
Compare
|
Maintainer verification after rebasing onto current
|
Summary
chat.sendACKs as completed runs in both direct and queued Control UI send paths.sessions.listactive row.check-test-typesfailure by making the test-onlyloadSessionscast go throughunknown.Root Cause
Root cause: When
chat.sendreturned a terminalokACK, the Control UI only cleared a few local stream fields. It did not publish a terminal run status or arm the existing local terminal reconciliation window. A latersessions.listresponse could therefore reintroduce a stale active row and make the composer showStop/In progressafter the conversation had already finished.Why this is root-cause fix: The existing lifecycle helper is already the single place that clears local run state, writes terminal session state, publishes terminal UI status, and arms stale-row suppression. Routing ACK
okthrough that helper makes the synchronous completion path follow the same invariant as other terminal run paths instead of only hiding part of the local stream state.Verification
node scripts/run-vitest.mjs ui/src/ui/app-chat.test.ts ui/src/ui/controllers/chat.test.ts ui/src/ui/chat/run-lifecycle.test.tsβ 3 files passed, 184 tests passed.node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfoβ passed on follow-up commita7db91173c, covering the CI failure reported bycheck-test-types.node scripts/run-vitest.mjs ui/src/ui/app-gateway.sessions.node.test.tsβ 1 file passed, 29 tests passed.Regression Test Plan
Target test file:
ui/src/ui/app-chat.test.ts,ui/src/ui/controllers/chat.test.ts,ui/src/ui/chat/run-lifecycle.test.ts, andui/src/ui/app-gateway.sessions.node.test.ts.Patch quality notes: The production change is limited to the two Control UI
chat.sendACKokbranches. It reuses the existing run lifecycle helper instead of adding a second cleanup path. The new regression test simulates the race that matters for [Bug]: [BUG] UI shows agent "running" after conversation ends β requires manual page refresh every timeΒ #87699: terminal ACK first, then stale activesessions.listdata. The follow-up commit is test-only and fixes the TypeScript cast shape used by that regression test.Fix classification: Root-cause UI session-state bug fix plus test-only CI type fix.
Maintainer-ready confidence: High for the touched desktop Control UI chat path because the fix is small, covered by targeted regression tests, verified through a live local Control UI run, and the current head has the failing test-types shard reproduced locally.
Real behavior proof
Behavior or issue addressed: The Control UI should not keep showing
Stop/In progressafter a chat turn has completed.Why it matters / User impact: Before this fix, users could see an agent as still running after the visible conversation ended and had to refresh the page before sending reliably again. The fix restores the expected idle composer state without a manual refresh.
Real environment tested: Local dev gateway serving the patched Control UI from this branch, opened in headless Chromium through Playwright.
Exact steps or command run after this patch: Started a local dev gateway on an isolated port, opened
/chatin Chromium, sentReply exactly: OC_LIVE_OK_87699, waited for an assistant response containingOC_LIVE_OK_87699, then sampled the app state immediately and 8 seconds later.Evidence after fix: After the assistant response appeared, the browser state had
chatRunId=null,chatStream=null, sessionhasActiveRun=false,Stopbutton count0, and noIn progresstext.Copied live output:
Observed result after fix: The composer returned to the normal Send state without requiring a manual page refresh; 8 seconds later
Stopwas still absent andIn progresswas still absent.What was not tested: I did not run the full UI suite or remote CI locally. The follow-up commit only changes a test cast, so I re-ran the failed typecheck shard and the related UI chat tests instead of repeating the browser live proof.
Review findings addressed
RF-001 (P1, ClawSweeper session-state risk): Fixed/covered on current head. Terminal ACK
okhandling passes the gateway ACKrunIdand the currentsessionKeyinto the existing lifecycle reconciliation helper, so the stale active-row suppression is scoped through the same session lifecycle path as other terminal outcomes. The regression test covers terminal ACK completion followed by stale activesessions.listdata, and the live Control UI proof showschatRunId=null,hasActiveRun=false, noStopbutton, and noIn progresstext after completion.RF-002 (P2, no repair lane needed): Addressed on current head. The repair lane only added a test-only
unknowncast to clear the PR-introducedcheck-test-typesfailure; the production fix remains the same focused Control UI session-state reconciliation change.Linked issue live-proof request: Satisfied by the local dev gateway + headless Chromium proof copied above. The follow-up commit is test-only, so I re-ran the failed typecheck shard and related UI chat regression tests instead of repeating the browser proof.
Merge risk
Risk labels considered:
merge-risk:session-stateis relevant because this changes Control UI run-state reconciliation. Auth-provider, automation, availability, compatibility, message-delivery, security-boundary, protocol, public API, public config, and schema risks are not applicable to this PR because the committed diff only touches Control UI run-state handling and UI tests.Risk explanation: The behavior change is intentionally narrow: only terminal ACK
okhandling changes. Non-terminalstarted/in_flightACKs still adopt the run id as before, and gateway protocol/config/auth/channel behavior is unchanged.Why acceptable: The existing lifecycle helper already owns terminal run cleanup and stale active-row suppression. Using it for ACK
okremoves a one-off cleanup branch and makes the completion path more consistent with the rest of the Control UI.