Skip to content

fix(connection): abort module config fetch on connectivity loss (#3637)#3638

Merged
Yeraze merged 1 commit into
mainfrom
claude/great-dijkstra-xnde8r
Jun 22, 2026
Merged

fix(connection): abort module config fetch on connectivity loss (#3637)#3638
Yeraze merged 1 commit into
mainfrom
claude/great-dijkstra-xnde8r

Conversation

@Yeraze

@Yeraze Yeraze commented Jun 22, 2026

Copy link
Copy Markdown
Owner

Summary

Fixes #3637 — random node disconnects / TX failures since 4.11.x.

  • Root cause: requestAllModuleConfigs() (added in 4.11.0 for backup capability) silently swallowed "Not connected to Meshtastic node" errors inside its 15-request loop. If the device dropped the TCP connection mid-fetch, all remaining requests failed quietly, then moduleConfigsEverFetched = true was set anyway. On every subsequent reconnect the fetch was skipped ("already fetched this session"), leaving the flag stuck and the module configs permanently incomplete until a process restart.
  • Fix: Added a pre-iteration isConnected guard that throws early if the connection is gone, and changed the catch block to re-throw connectivity errors instead of swallowing them. The caller's try/catch sees the failure, moduleConfigsEverFetched stays false, and the fetch retries cleanly on the next reconnection.

The log fingerprint in the issue (✅ All module config requests sent immediately before TX failures) is caused by the loop completing after all 15 requests silently failed — that misleading success log now only appears when all requests actually succeed.

Changes

  • src/server/meshtasticManager.ts: 15-line change to requestAllModuleConfigs() loop error handling

Test plan

  • Verify TypeScript compiles cleanly (tsc --noEmit) ✅ confirmed before PR
  • Full Vitest suite passes (CI)
  • Manual: connect a node, observe 📦 Requesting all module configs for backup... in logs ~50s after connect; simulate a disconnect mid-fetch and confirm moduleConfigsEverFetched is not set (next reconnect re-attempts the fetch rather than skipping it)

🤖 Generated with Claude Code

https://claude.ai/code/session_01XQytw5q5QAjuivoaDqdzPt


Generated by Claude Code

…event stuck moduleConfigsEverFetched (#3637)

requestAllModuleConfigs() silently swallowed "Not connected" errors in
its loop, so moduleConfigsEverFetched was set to true even when all 15
requests failed due to a mid-fetch disconnect. On subsequent reconnects
the fetch was skipped entirely, leaving users unable to TX until a full
process restart.

Propagate connectivity errors out of the loop (both via a pre-iteration
isConnected guard and by re-throwing "Not connected" from the catch),
so the caller never sets moduleConfigsEverFetched=true on a failed run
and the fetch retries cleanly on the next reconnection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XQytw5q5QAjuivoaDqdzPt
@github-actions

github-actions Bot commented Jun 22, 2026

Copy link
Copy Markdown

Claude finished @Yeraze's task —— View job


PR Review

  • Read changed file and analyze the fix
  • Review code quality and best practices
  • Check for potential bugs or issues
  • Assess test coverage
  • Post final review

Summary

The fix is correct and well-scoped. The root cause analysis in the PR description is accurate: moduleConfigsEverFetched was being set to true even when all 15 requests silently failed mid-disconnect, permanently blocking retry on subsequent reconnects.

What changed (src/server/meshtasticManager.ts:13174–13196)

Before: The loop caught all errors internally and continued regardless of cause. Connectivity errors were indistinguishable from transient per-config errors.

After: Two layers of connectivity detection were added:

  1. A pre-iteration guard (isConnected check at the top of each loop iteration) — lines 13179–13182
  2. Re-throw in the catch block when the error message matches 'Not connected to Meshtastic node' — lines 13188–13193

Code Quality

Strengths:

  • The fix is minimal and surgical — 15 lines, 1 file.
  • The pre-iteration guard (if (!this.isConnected || !this.transport)) runs before requestModuleConfig() so a connection drop between requests is caught immediately, not only after a failed send.
  • The catch re-throw preserves the original error object (throw error), so the caller's stack trace and error type are retained.
  • Log messages clearly describe the "will retry on reconnect" behavior, which aids future debugging.
  • The comment at line 13175 (// Abort early if we lost the connection mid-fetch (#3637)) explicitly links the guard to the issue, which is good practice for large files.

Minor observations:

  1. String-matching for error type detection (lines 13188–13189) — the connectivity error is identified by comparing error.message to the literal string 'Not connected to Meshtastic node'. This is a fragile pattern (a typo anywhere in that string across the ~40+ throw sites would silently regress the fix). The existing codebase uses this message uniformly, so it works today, but a typed ConnectivityError class or a isConnectivityError(e) helper would be more robust long-term. This is a pre-existing pattern, not introduced by this PR, and addressing it would be out of scope here.

  2. Duplicate warning log — when requestModuleConfig() throws a connectivity error, two warning lines are logged: one from the catch block at line 13191, and potentially one from within requestModuleConfig itself if it logs before throwing. Looking at requestModuleConfig (line 12592), it delegates to transport.send and only logs on error via logger.error inside the method — the inner error path is not duplicated. No issue here, but worth keeping in mind if requestModuleConfig is ever changed to log before re-throwing.

  3. refreshModuleConfigs() (line 13216) — this method resets moduleConfigsEverFetched = false, calls requestAllModuleConfigs(), then sets moduleConfigsEverFetched = true directly after. With the new propagation behavior, if requestAllModuleConfigs() throws (connectivity lost mid-refresh), the moduleConfigsEverFetched = true at line 13221 will not execute, which is correct. No bug here.

  4. refreshNodeDatabase() (line 14288) — calls requestAllModuleConfigs() inside a setTimeout with a simple try/catch that logs the error. A connectivity-loss throw here is caught and logged, which is fine — the caller doesn't set moduleConfigsEverFetched, so the next reconnect will still retry properly.

Potential Issues

None identified. The fix is logically correct and the two code paths that call requestAllModuleConfigs() elsewhere both handle the new propagated throw appropriately.

Test Coverage

No new tests are included. Given the complexity of mocking a mid-loop TCP drop in the existing unit test framework, this is acceptable for a targeted fix. The manual test plan in the PR description is reasonable. If a regression test is desired, a unit test that mocks this.isConnected flipping to false mid-iteration (or requestModuleConfig throwing the connectivity error) would verify that moduleConfigsEverFetched remains false after the throw.

Verdict

Approved. The fix correctly addresses the described bug with minimal, targeted changes. The pre-iteration guard is a nice defensive addition beyond just the catch re-throw. No blocking issues.


@Yeraze Yeraze merged commit 7e0bbfc into main Jun 22, 2026
19 checks passed
@Yeraze Yeraze deleted the claude/great-dijkstra-xnde8r branch June 22, 2026 11:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] random node disconnects since 4.11.x

2 participants