Skip to content

feat(stability): manual Resync action for Passive Mode sources (#3122)#3126

Merged
Yeraze merged 1 commit into
mainfrom
fix/issue-3122-manual-resync
May 21, 2026
Merged

feat(stability): manual Resync action for Passive Mode sources (#3122)#3126
Yeraze merged 1 commit into
mainfrom
fix/issue-3122-manual-resync

Conversation

@Yeraze

@Yeraze Yeraze commented May 21, 2026

Copy link
Copy Markdown
Owner

Follow-up to #3125. Implements the Manual Resync action the reporter called out as item 1 in their feedback on the issue.

Why

With Passive Mode enabled, automatic reconnects skip `want_config_id` while the cached config is "fresh" (< 4 h). That's the right default for stability, but it leaves the operator without a way to force a refresh when they know the device config changed (remote channel rekey, hwModel swap, role change). This adds a per-source action that forces exactly one full sync, with guards so the forced sync can't accidentally become a new failure loop.

What

  • `MeshtasticManager.requestManualResync()` — public method that sends a single `want_config_id` regardless of staleness.
    • Single-flight: only one resync in flight per source.
    • Cooldown: 30 s between attempts (`MANUAL_RESYNC_COOLDOWN_MS`).
    • Watchdog: 120 s max in-flight (`MANUAL_RESYNC_WATCHDOG_MS`). If the device never reaches `configComplete`, the watchdog clears the latch so the button isn't stuck disabled.
    • Recovery latch (`suppressNextAutoSync`): if the forced sync causes the node to close the socket, the next connect re-uses the cached config and does not auto-retry the sync. This is the reporter's "do not let a forced sync turn into a new failure loop" requirement.
  • HTTP routes in `sourceRoutes.ts`:
    • `POST /api/sources/:id/resync` → starts a resync. 200 with `{started, inFlight, cooldownExpiresAt}`; 409 with a `reason` when rejected (in-flight / cooldown / not-connected / send-failed); 400 for non-Meshtastic sources.
    • `GET /api/sources/:id/resync` → reads current `{inFlight, cooldownExpiresAt}` for the UI.
  • UI: "Resync" kebab item on connected meshtastic_tcp source cards. Click POSTs and re-fetches source status so the dot updates.

What's not in this PR

  • A live cooldown countdown / disabled-button state polled from `GET /resync`. v1 surfaces cooldown via the server's 409 response; if the operator clicks during cooldown the request is rejected server-side without ill effect. Worth a follow-up if the silent rejection feels unfriendly.
  • Toast / popover feedback for success vs cooldown. The kebab close + status refetch is the current UX. Easy to layer on later.

Test plan

  • `npx tsc --noEmit` clean
  • `src/server/meshtasticManager.manualResync.test.ts` (11 tests): happy path, suppress latch set, capture state reset, single-flight rejection, cooldown rejection, post-cooldown allow, not-connected rejection, send-failed cleanup, watchdog timeout, state-query getter
  • Full suite: 5275 pass / 3 pre-existing fails (`mqttBrokerManager` zero-hop encode failures; reproduce on clean main, unrelated)
  • Manual smoke against a real meshtastic_tcp source

Fixes the manual-Resync follow-up from #3122.

🤖 Generated with Claude Code

…3122)

Follow-up to #3125. Adds an operator-initiated full config refresh for
Meshtastic TCP sources, useful for Passive Mode where the automatic
reconnect path skips want_config_id while the cached config is fresh.

* MeshtasticManager.requestManualResync(): forces a single want_config_id
  regardless of staleness; single-flight latch; 30s cooldown; 120s
  watchdog so a stuck "in progress" never disables the button forever.
* Recovery latch: if the forced sync triggers a node-side socket close,
  the next connect re-uses the cached config and skips another full
  sync — prevents the forced-sync-becomes-loop failure mode the reporter
  called out in the issue.
* HTTP routes: POST /api/sources/:id/resync to start, GET to read state.
  Returns {inFlight, cooldownExpiresAt, reason?} so the UI can render.
* UI: new "Resync" kebab item on connected meshtastic_tcp source cards.
  Refetches source status after the request so the dot updates.

Tests: 11 new tests covering happy path, single-flight rejection,
cooldown rejection, post-cooldown allow, not-connected rejection,
send-failed cleanup, watchdog timeout, and state-query getter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented May 21, 2026

Copy link
Copy Markdown

Claude finished @Yeraze's task —— View job


PR Review for Manual Resync Feature

Completed comprehensive review of the manual resync implementation for Passive Mode sources. Overall, this is a well-designed and thoroughly implemented feature with excellent safety guards.

Todo List

  • Read and analyze core implementation in meshtasticManager.ts
  • Review test coverage in meshtasticManager.manualResync.test.ts
  • Analyze HTTP routes in sourceRoutes.ts
  • Check UI components for proper integration
  • Evaluate overall architecture and provide feedback

Strengths

Robust Safety Architecture

  • Single-flight protection: Prevents concurrent resyncs via manualResyncInFlight flag
  • Cooldown mechanism: 30s between attempts prevents rapid-fire abuse
  • Watchdog timeout: 120s timeout prevents stuck states if device never completes config
  • Recovery latch: suppressNextAutoSync flag prevents failure loops when resync causes disconnection

Excellent Test Coverage

The test suite in meshtasticManager.manualResync.test.ts is comprehensive with 11 test cases covering:

  • Happy path scenarios
  • All guard conditions (in-flight, cooldown, not-connected, send-failed)
  • Edge cases (watchdog timeout, state recovery)
  • Proper cleanup on failures

Clean API Design

  • HTTP routes follow RESTful patterns with appropriate status codes
  • Consistent JSON response shapes for success/error cases
  • Proper permission scoping (sources:write for POST, sources:read for GET)

UI Integration

  • Proper conditional rendering based on source type and connection state
  • Good UX with tooltips explaining the feature
  • Integrates cleanly with existing kebab menu pattern

🔍 Areas for Improvement

Minor Code Quality Issues

  1. Magic Numbers: While constants are defined, consider making them configurable via settings for different deployment scenarios:

    // Could be made configurable for different environments
    private static readonly MANUAL_RESYNC_COOLDOWN_MS = 30_000;
    private static readonly MANUAL_RESYNC_WATCHDOG_MS = 120_000;
  2. Error Handling: The send-failed case in requestManualResync() could provide more specific error details:

    // Current: generic "send-failed" 
    // Better: include specific error type/code for debugging
    reason: `send-failed: ${msg}`

Security Considerations

Positive: Permission checks are properly implemented with requirePermission('sources', 'write') for the POST endpoint.

Positive: The feature is scoped to meshtastic_tcp sources only, preventing misuse on other source types.

⚠️ Minor: Consider rate limiting at the HTTP level for additional DoS protection, though the 30s cooldown provides good protection already.

Performance Considerations

Positive: The implementation is efficient with minimal overhead when not in use.

Positive: Single-flight protection prevents resource exhaustion.

Observation: The watchdog timer could theoretically accumulate if many manual resyncs are initiated across different sources, but this is unlikely to be an issue in practice.

Potential Edge Cases

  1. Race Condition: If the manager is stopped/restarted while a resync is in-flight, the watchdog timer may leak. Consider clearing timers in the stop() method:

    async stop(): Promise<void> {
      // Clear any active watchdog timers
      if (this.manualResyncWatchdog) {
        clearTimeout(this.manualResyncWatchdog);
        this.manualResyncWatchdog = null;
      }
      // ... existing stop logic
    }
  2. State Persistence: Manual resync state is not persisted across server restarts. While this is probably the desired behavior, it's worth documenting.


🏆 Overall Assessment

This is a high-quality implementation that demonstrates:

  • Thoughtful design addressing the specific problem of Passive Mode cache management
  • Comprehensive safety guards preventing common failure modes
  • Excellent test coverage with realistic scenarios
  • Clean integration with existing architecture patterns
  • Proper security with appropriate permission checks

Code Quality: A-

  • Well-structured with clear separation of concerns
  • Good error handling and logging
  • Follows established patterns in the codebase
  • Minor opportunities for configurability improvements

Security: A

  • Proper permission checks implemented
  • Rate limiting via cooldown mechanism
  • Scoped to appropriate source types only

Test Coverage: A+

  • Comprehensive test suite covering all scenarios
  • Good use of mocks and fake timers
  • Tests verify both positive and negative cases

Performance: A

  • Minimal overhead when inactive
  • Efficient guard mechanisms
  • No obvious performance bottlenecks

Recommendation

APPROVE - This PR successfully addresses the requirements from issue #3122 with a well-designed, safe, and thoroughly tested implementation. The minor suggestions above are optional improvements rather than blocking issues.

The implementation strikes an excellent balance between functionality and safety, which is crucial for a feature that directly interacts with hardware devices.


Job Run

@Yeraze Yeraze merged commit 72d795e into main May 21, 2026
19 checks passed
@Yeraze Yeraze deleted the fix/issue-3122-manual-resync branch May 21, 2026 18:55
Yeraze added a commit that referenced this pull request May 21, 2026
#3128)

Follow-up to #3125 / #3126. Closes the third item from the reporter's
feedback on #3122.

On large/fragile TCP nodes the *first* config-sync session frequently
closes mid-stream but the second attempt works cleanly. Under normal
exponential backoff that recovery would wait 8–16s before retrying;
this short startup-grace window cuts that gap.

* TcpTransport.setStartupGraceReconnect(graceMs, fastDelayMs): for the
  next graceMs after the call, scheduleReconnect uses fastDelayMs
  instead of the exponential backoff. After the window expires, normal
  backoff resumes automatically. Disabled by default (graceMs=0).
* MeshtasticManager opt-in: passive-mode sources get a 2-minute grace
  window with a 3-second reconnect delay during initial startup. Other
  sources keep the legacy backoff.

Tests: 6 new tcpTransport tests cover default-disabled, in-window
fast delay regardless of attempt count, post-window exponential
fallback, explicit disable, sanity, and that the reconnect fires
after the configured delay.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Yeraze added a commit that referenced this pull request May 21, 2026
#3122) (#3129)

Follow-up to #3125 / #3126 / #3128. Closes item 2 of the reporter's
feedback on #3122 (advanced per-source setting for the staleness
window).

* SourceConfig gains optional passiveResyncStaleMs (ms). Falls back to
  the 4h class default when absent.
* MeshtasticManager.effectivePassiveResyncStaleMs() resolves the active
  threshold, clamping out-of-range overrides ([1 min, 7 days]) so a 0
  or astronomical value can't disable the safeguard.
* Plumbed through all 5 MeshtasticManager construction sites in
  sourceRoutes; a change in the value triggers a transport restart
  alongside the existing passiveMode toggle.
* UI: numeric "Resync staleness window (hours)" input appears under
  the Passive Mode toggle when enabled. Blank = use 4h default.

Tests: 9 new tests cover default, constructor override, configureSource
override + clear, below-floor / above-ceiling rejection, exact
boundary acceptance, and NaN rejection.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant