fix(desktop): recoverable boot error after prolonged gateway drop (escape hatch) #147
Merged
…escape hatch)

When the (remote) gateway dropped post-boot and every reconnect kept failing, useGatewayBoot looped the backoff forever with boot.error=null, so the CONNECTING screen covered the app with no recovery surface (the dead-end CONNECTING combo, especially after a remote VPS/tunnel goes away). Two "FIX:"-prefixed specs in use-gateway-boot.test.tsx asserted the intended behavior and were red.

Now: once the reconnect backoff passes the 6th attempt (~45s of sustained failure, post-boot only), raise a RECOVERABLE failDesktopBoot error so BootFailureOverlay (Use local gateway / Sign in / Retry) becomes reachable. The backoff keeps running underneath; a later successful reconnect calls completeDesktopBoot to clear the error and hide the overlay. Below the threshold the drop is still treated as transient (no error), preserving the quiet sleep/wake reconnect.

- use-gateway-boot.ts: RECONNECT_ESCALATION_ATTEMPTS=6; raise in scheduleReconnect past the threshold; clear in onState 'open'.
- i18n: boot.errors.gatewayUnreachable (en/zh/ja/zh-hant + types).

Fixes the 2 use-gateway-boot escape-hatch tests (now 4/4).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
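
For reviewers who want the control flow at a glance, the escalation described in this commit can be sketched roughly as below. This is a simplified, hypothetical model and not the actual hook: the ReconnectEscalator class, its method names, and the in-memory state are assumptions made for illustration. Only RECONNECT_ESCALATION_ATTEMPTS = 6 and the boot.errors.gatewayUnreachable key come from the commit message, and "passes the 6th attempt" is read here as "attempt count greater than 6".

```typescript
// Hypothetical sketch of the escalation logic; the real use-gateway-boot.ts
// is not reproduced here and may be structured differently.
const RECONNECT_ESCALATION_ATTEMPTS = 6; // value stated in the commit message

class ReconnectEscalator {
  private reconnectAttempt = 0;
  private bootCompleted = false;      // true once the initial boot succeeded
  private error: string | null = null; // recoverable error key, or null

  // Assumed helper: marks the initial boot as done (post-boot phase begins).
  markBootCompleted(): void {
    this.bootCompleted = true;
  }

  // Called whenever another reconnect attempt is scheduled after a failure.
  scheduleReconnect(): void {
    this.reconnectAttempt += 1;
    const pastThreshold = this.reconnectAttempt > RECONNECT_ESCALATION_ATTEMPTS;
    // Raise only post-boot, only past the threshold, and only once per episode.
    if (this.bootCompleted && pastThreshold && this.error === null) {
      this.error = "boot.errors.gatewayUnreachable";
    }
    // The real hook would arm the next backoff timer here; the loop keeps going.
  }

  // Called when the connection reports 'open' again.
  onOpen(): void {
    this.reconnectAttempt = 0;
    this.error = null; // clears the recoverable error, hiding the overlay
  }

  getError(): string | null {
    return this.error;
  }
}
```

In this model a drop that recovers within six attempts never sets an error, which matches the "quiet sleep/wake reconnect" behavior the commit preserves, while a longer outage sets the error exactly once until the next successful open.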
…ale since original commit)

The 'uses widthOverride from the store' test set an override on a non-resizable pane, but trackForPane only applies overrides to resizable panes (an override can only originate from a drag-resize). The test predates that gating (untouched since 51c68d4). Mark the pane resizable so the test exercises the real path.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
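
The gating this commit refers to can be illustrated with a minimal sketch. The Pane shape, the widthOverrides map, and the function signature below are assumptions for illustration only; the real trackForPane is not shown in this PR and may differ.

```typescript
// Hypothetical shapes; only the names trackForPane and widthOverride
// appear in the commit message.
interface Pane {
  id: string;
  width: string;       // declared track size, e.g. "240px"
  resizable?: boolean; // overrides are honored only when this is true
}

function trackForPane(pane: Pane, widthOverrides: Record<string, number>): string {
  const override = widthOverrides[pane.id];
  // An override comes from drag-resize, so a non-resizable pane ignores it.
  if (pane.resizable && override !== undefined) {
    return `${override}px`;
  }
  return pane.width;
}
```

Under these assumptions, a test that sets an override on a non-resizable pane would see the declared width rather than the override, which is why the fixture had to be marked resizable.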
Why
When the gateway dropped after a healthy boot and every reconnect kept failing (classic case: a remote gateway / tunnel goes away), useGatewayBoot looped the backoff forever with boot.error still null, so the CONNECTING screen covered the whole app with no recovery surface. Two "FIX:"-prefixed specs in use-gateway-boot.test.tsx documented the intended escape hatch and were red on main (part of the baseline-red masking; hermes-desktop-preexisting-test-failures-20260610).

What changed
- apps/desktop/src/app/gateway/hooks/use-gateway-boot.ts: new RECONNECT_ESCALATION_ATTEMPTS = 6. In scheduleReconnect, once reconnectAttempt passes 6 (~45s of sustained failure, post-boot only; initial-boot failure has its own path) and no error is set yet, raise a recoverable failDesktopBoot(boot.errors.gatewayUnreachable) so BootFailureOverlay (Use local gateway / Sign in / Retry) replaces the dead-end spinner. The backoff loop keeps running; onState('open') calls completeDesktopBoot() to clear the error and hide the overlay when a reconnect finally succeeds. Below the threshold a drop stays transient (no error), so the quiet sleep/wake reconnect is preserved.
- i18n: boot.errors.gatewayUnreachable in en/zh/ja/zh-hant + types.

How to review
- use-gateway-boot.ts: the constant, the raise in scheduleReconnect, the clear in onState('open').
- Guards on the raise: bootCompleted (don't fight the initial-boot error path) and !$desktopBoot.get().error (raise once per episode).

Evidence
"FIX: after the prolonged drop the hook raises a recoverable boot error" and "FIX: a successful reconnect clears the recoverable error" now pass; the "dead-end CONNECTING combo" test still passes (sub-threshold stays null).

Verification
apps/desktop: tsc -b: 0 errors; vitest use-gateway-boot.test.tsx: 4/4. Full vitest src: 480 passed / 5 failed, down from 7; the 5 remaining are the other pre-existing failures in untouched files (model-settings, toolset-config, pane-shell, streaming, use-prompt-actions sleep/wake), tracked separately.

Risks / gaps
hermes-desktop-preexisting-test-failures-20260610.