fix(desktop): rebind sessions after websocket reconnect (salvage of #41740) #43004
Merged
The reconnect fix turns on two subtle conditions with no inline rationale: `seenGatewayStateRef` suppresses a spurious "became open" on the first effect run (so a session mounting with the gateway already open doesn't double-resume), and the `gatewayBecameOpen ||` arm forces a re-resume even when the route looks `alreadyActive` because the cached runtime id can be stale after the gateway rebinds/reaps the session. Comment both so the next reader doesn't "simplify" them back into the original bug. No behavior change.
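The two conditions can be modelled as a small pure function. This is a minimal sketch, not the code in `use-route-resume.ts`: only `seenGatewayStateRef`, `gatewayBecameOpen`, and `alreadyActive` come from the description above, while `GatewayState`, `Ref`, and `shouldResume` are hypothetical names introduced for illustration.

```typescript
// Hypothetical model of the reconnect-resume decision.
// GatewayState, Ref and shouldResume are illustrative names, not the hook's API.
type GatewayState = "connecting" | "open" | "closed";

interface Ref<T> {
  current: T;
}

function shouldResume(
  seenGatewayStateRef: Ref<GatewayState | null>,
  gatewayState: GatewayState,
  alreadyActive: boolean,
): boolean {
  const previous = seenGatewayStateRef.current;
  seenGatewayStateRef.current = gatewayState;

  // On the first run there is no previous state, so an already-open gateway
  // is not treated as a transition. This is what prevents a double-resume
  // when a session mounts with the gateway already open.
  const gatewayBecameOpen =
    previous !== null && previous !== "open" && gatewayState === "open";

  // After a real reconnect the cached runtime id may be stale, so resume
  // even if the route still looks active.
  return gatewayBecameOpen || !alreadyActive;
}
```

Dropping the `previous !== null` check would make every mount look like a reconnect, and dropping the `gatewayBecameOpen ||` arm would let a stale cached id short-circuit the resume, which is the original bug.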
Verification comment — reviewed the diff and found no issues.
The check-attribution CI job requires contributor emails from PR commits to appear in scripts/release.py AUTHOR_MAP.
wachoo pushed a commit to wachoo/hermes-agent that referenced this pull request on Jun 10, 2026
…ousResearch#41740) (NousResearch#43004) * fix(desktop): rebind sessions after websocket reconnect * docs(desktop): explain the reconnect-resume guard in use-route-resume The reconnect fix turns on two subtle conditions with no inline rationale: `seenGatewayStateRef` suppresses a spurious "became open" on the first effect run (so a session mounting with the gateway already open doesn't double-resume), and the `gatewayBecameOpen ||` arm forces a re-resume even when the route looks `alreadyActive` because the cached runtime id can be stale after the gateway rebinds/reaps the session. Comment both so the next reader doesn't "simplify" them back into the original bug. No behavior change. --------- Co-authored-by: Josh Dow <josh.dow@prepad.io>
changman pushed a commit to changman/hermes-agent that referenced this pull request on Jun 10, 2026
alt-glitch pushed a commit that referenced this pull request on Jun 14, 2026
Salvages #41740 by @joshuadow and adds a comment-only follow-up. Supersedes #41740.
Problem
When Windows Desktop talks to a WSL-hosted gateway (with or without a reverse proxy), the WebSocket transport can drop while the backend process and its stored sessions stay alive.
`session.activate` returned the parked session without rebinding its transport, so it stayed pointed at `_detached_ws_transport` and the grace-windowed orphan reaper tore it down. Desktop, meanwhile, trusted its cached runtime id after reconnect, so it kept using a stale id and eventually surfaced `session not found` until restart.

Fix

- `session.activate` now passes `transport=current_transport() or _stdio_transport` to `_live_session_payload`, which rebinds `session["transport"]`. This is the same reattach `session.resume` already does, so an activate after reconnect cancels the orphan reap instead of racing it.
- `useRouteResume` re-runs the resume when the gateway transitions back to open, even if the selected stored session still has a cached runtime id. A `seenGatewayStateRef` guard keeps the initial mount from being mistaken for a reconnect.

Soundness check

- `_ws_session_is_orphaned` keys solely off `session["transport"] is _detached_ws_transport`, so rebinding a live transport is exactly what clears the orphan state (`_schedule_ws_orphan_reap` re-checks under the resume lock).
- `_live_session_payload` already accepts `transport=` and rebinds under `history_lock`; activate now uses it the same way resume does.
- `seenGatewayStateRef` prevents a double-resume on mount.
- The `tsconfig.json` ES2023 bump is already on `main`, so the cherry-pick dropped that one redundant file; the substantive changes all landed and the merge cleanly composed with main's `stuckOnRoutedSession` self-heal.

Commits

- `fix(desktop): rebind sessions after websocket reconnect`: @joshuadow's fix, cherry-picked (authorship preserved).
- `docs(desktop): explain the reconnect-resume guard in use-route-resume`: comment-only follow-up documenting why `seenGatewayStateRef` and the `gatewayBecameOpen ||` arm exist, so they don't get "simplified" back into the bug.

Test plan

- `scripts/run_tests.sh tests/tui_gateway/test_protocol.py`: 59 passed
- `apps/desktop` `npm run type-check`: clean
- `apps/desktop` `npm run test:ui -- --run src/app/session/hooks/use-route-resume.test.tsx`: 4 passed
- `npx eslint src/app/session/hooks/use-route-resume.ts`: clean
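
For readers following the soundness argument, the rebind-versus-reap interaction can be modelled roughly as below. This is a simplified TypeScript model, not the gateway code (the gateway side appears to be Python, and the real functions run under locks that are not modelled here). `detachedWsTransport` stands in for `_detached_ws_transport`; `Transport`, `Session`, `isOrphaned`, `activate`, and `reapIfStillOrphaned` are hypothetical names.

```typescript
// Simplified, single-threaded model of the orphan check and rebind.
// All names are illustrative; locking and the grace window are omitted.
interface Transport {
  readonly name: string;
}

// Sentinel standing in for _detached_ws_transport.
const detachedWsTransport: Transport = { name: "detached" };

interface Session {
  transport: Transport;
  reaped: boolean;
}

// Orphan state is decided purely by identity with the sentinel.
function isOrphaned(session: Session): boolean {
  return session.transport === detachedWsTransport;
}

// After the fix, activate rebinds the live transport, as resume already did.
function activate(session: Session, liveTransport: Transport): void {
  session.transport = liveTransport;
}

// The reaper re-checks before tearing down, so a rebound session survives.
function reapIfStillOrphaned(session: Session): boolean {
  if (!isOrphaned(session)) {
    return false;
  }
  session.reaped = true;
  return true;
}

// Usage: a parked session that is activated before the reaper fires.
const parked: Session = { transport: detachedWsTransport, reaped: false };
activate(parked, { name: "ws-after-reconnect" });
const wasReaped = reapIfStillOrphaned(parked);
```

Because the orphan check is an identity comparison against the sentinel, rebinding any live transport is sufficient to make the later re-check a no-op.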