fix(tui-gateway): re-arm WS orphan reap while a detached session is mid-turn #44102
Open
AIalliAI wants to merge 1 commit into
…id-turn

The disconnect-path reap timer was one-shot: if the grace window expired while the detached session still had a turn in flight (long tool calls, context compression, the post-turn background review pass), _ws_session_is_orphaned spared it once and nothing ever re-checked. The session stayed parked on the drop transport with its DB row un-ended until the multi-hour idle reaper, so the compression continuation child showed up as a separate "active" Desktop/TUI chat.

_reap now re-arms itself while the session is detached, un-finalized, and running; a reconnect that re-binds a live transport still cancels the chain, and _finalize_session already ends the rotated continuation session id (NousResearch#20001), so the reap lands on the right DB row.

Fixes NousResearch#44045

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Requesting maintainer review: this is ready to land from my side. Standalone fork CI is pending first-run approval here; the rollup branch in #44061 carrying this session's batch is fully green on upstream CI (all test shards, typecheck, e2e).
Summary
The Desktop/TUI websocket disconnect path parks a detached session on the drop transport and schedules a one-shot grace timer (`_schedule_ws_orphan_reap`). `_ws_session_is_orphaned` deliberately spares sessions that are mid-turn, but when the in-flight turn outlives the grace window (long tool calls, context compression, the post-turn background review pass), the timer fired once, saw `running=True`, returned, and nothing ever re-checked. The session then sat detached with its DB row un-ended until the multi-hour idle reaper (`_SESSION_TTL_S` = 6h default), so the compression continuation child showed up as a separate "active" Desktop chat alongside the user's real workflow: the exact `ended_at = NULL` ghost row in the report.

`_reap` now re-arms itself while the session is detached, un-finalized, and still running. A reconnect or `session.resume` that re-binds a live transport still cancels the chain exactly as before, and since `_finalize_session` already ends `agent.session_id` (the rotated continuation id, #20001), the reap lands on the right DB row with `end_reason=ws_orphan_reap`.

Fixes #44045
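For reviewers who want the control flow in one place, here is a minimal standalone sketch of the re-arming reap described above. It is not the gateway code: `Session`, `OrphanReaper`, `grace_s`, and `timer_factory` are hypothetical names, the sketch assumes a `threading.Timer`-style timer, and it omits locking and the real `_finalize_session` DB work. Only the decision order (stop if re-bound or finalized, re-arm if still running, otherwise reap with `ws_orphan_reap`) is taken from this PR.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Session:
    """Hypothetical stand-in for the gateway's per-session state."""
    sid: str
    detached: bool = True    # parked on the drop transport after a WS disconnect
    running: bool = False    # a turn is still in flight
    finalized: bool = False
    end_reason: Optional[str] = None


class OrphanReaper:
    """Illustrative re-arming grace timer (no locking, for brevity)."""

    def __init__(self, grace_s: float, timer_factory=threading.Timer):
        self.grace_s = grace_s
        self._timer_factory = timer_factory  # injectable so tests need no real timers
        self.sessions: Dict[str, Session] = {}

    def schedule(self, sid: str):
        # Arm one grace-window timer for this session.
        timer = self._timer_factory(self.grace_s, self._reap, args=(sid,))
        timer.daemon = True
        timer.start()
        return timer

    def _reap(self, sid: str) -> None:
        session = self.sessions.get(sid)
        # Gone, already ended, or re-bound to a live transport: the chain stops.
        if session is None or session.finalized or not session.detached:
            return
        # Mid-turn: spare it, but re-arm instead of giving up after one check.
        if session.running:
            self.schedule(sid)
            return
        # Detached and idle: end it now rather than waiting for the idle reaper.
        session.finalized = True
        session.end_reason = "ws_orphan_reap"
        self.sessions.pop(sid, None)
```

Before this change, the equivalent of the `session.running` branch simply returned, which is why a turn that outlived the grace window left the session parked. Injecting the timer factory is a choice made for this sketch so the chain can be driven by hand in a test.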
Why this addresses the reported shape
- `end_reason='compression'`, child `ended_at=NULL`: compression rotates the session id mid-turn; the client had already disconnected, so the spared-once reap never came back for the continuation.
- `active_sessions: 1`: the ghost is not running anything; it's only parked in `_sessions` and projected as active via its un-ended DB row.

Testing
Two new tests in `tests/test_tui_gateway_server.py` (pattern-matched to the existing orphan-reap tests, no real timers):

- `test_ws_orphan_reap_rearms_while_detached_session_is_mid_turn`: grace expires mid-turn → spared and re-armed; turn ends → re-armed timer reaps with `ws_orphan_reap`; no further re-arm. Fails on `main` (timer never re-arms), passes with the fix.
- `test_ws_orphan_reap_does_not_rearm_after_reattach`: a mid-turn session that re-bound a live transport stops the chain (guards the reconnect behavior).

`tests/test_tui_gateway_server.py` + `tests/test_tui_gateway_ws.py`: 260 passed, 1 pre-existing environment-dependent failure (`test_browser_manage_connect_default_local_reports_launch_hint`, fails identically on clean `main`).