fix(meshcore): stop virtual node server during transport teardown by Yeraze · Pull Request #3706 · Yeraze/meshmonitor

Yeraze · 2026-06-24T12:27:23Z

Summary

Fixes #3705 — MeshCore virtual node stuck in "Connecting" when the real node connection drops.

When the real MeshCore node loses connectivity, teardownTransportOnly() sets isConnected() = false but left the virtual node server running. Every AppStart from the mobile app got BadState → app looped in "Connecting".
Added await this.stopVirtualNodeServer() at the top of teardownTransportOnly(), mirroring what the full disconnect() path already does. The server restarts in connect() once the real node is back.

Test plan

Verify existing virtual node unit tests still pass (src/server/meshcoreVirtualNodeServer.test.ts)
Manual: connect mobile app to virtual node with a stable real-node connection → should work as before
Manual: simulate real-node disconnect (unplug/stop the device) → virtual node server stops (app drops); reconnect real node → virtual node restarts; app can connect again

🤖 Generated with Claude Code

https://claude.ai/code/session_01LBEuSU2sHGZS3xNiXshTxm

Generated by Claude Code

) When the real MeshCore node connection drops and the manager enters reconnect state via teardownTransportOnly(), the virtual node server was left running while isConnected() returned false. Any AppStart from a connecting app client got a BadState error, causing the app to loop in "Connecting" indefinitely. Mirror what disconnect() already does: stop the virtual node server before marking the transport as down. The server is restarted in connect() once the real node is back. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01LBEuSU2sHGZS3xNiXshTxm

github-actions · 2026-06-24T12:27:38Z

Claude finished @Yeraze's task —— View job

PR Review

Gather context
Read changed files
Analyze fix logic
Provide review feedback

Summary

This is a clean, minimal, and well-targeted fix. The root cause is clear: teardownTransportOnly() mirrored the native backend teardown from disconnect() but omitted the stopVirtualNodeServer() call, leaving the virtual node server accepting mobile app connections while isConnected() was false. The fix correctly mirrors the disconnect() pattern.

Code Quality

src/server/meshcoreManager.ts:4064-4087 — The fix is correct and idiomatic:

Placement at the top of teardownTransportOnly() mirrors disconnect() (line 841) exactly.
The inline comment (lines 4065–4068) clearly explains the "why" — the BadState loop — which will help future readers.
stopVirtualNodeServer() is already null-guarded (if (!this.virtualNodeServer) return; at line 796), so calling it during a reconnect where no virtual node is configured is harmless.

Potential Issues / Edge Cases

1. Reconnect gap vs. virtual node restart

When beginReconnect() fires (line 4057), the sequence is:

teardownTransportOnly() → stops virtual node server
scheduleNextReconnect() → waits delay ms (exponential backoff, up to 60 s by default)
attemptReconnect() → calls connect() → calls startVirtualNodeServer()

During the backoff window the virtual node server is down, so any mobile app that tries to connect will get a TCP connection refused rather than a BadState loop. This is the correct UX — it's actually an improvement over the pre-fix behavior (infinite BadState loop). Worth noting in release notes, but not a blocker.

2. void this.teardownTransportOnly().then(...) (line 4057)

The void here suppresses the promise. If stopVirtualNodeServer() throws an unhandled error, it will be a silent failure that won't prevent scheduleNextReconnect() from being called (since the .then() is attached). The stopVirtualNodeServer() implementation (lines 795–802) has a try/catch that swallows errors, so this is already handled — no issue.

3. disconnect() redundancy

disconnect() (line 841) also calls stopVirtualNodeServer(). Since teardownTransportOnly() now stops it first, calling disconnect() on a node already in the reconnecting state would call stopVirtualNodeServer() twice. The null-guard at line 796 (if (!this.virtualNodeServer) return;) makes this idempotent — no issue.

Performance / Security

No performance or security concerns. This is a state-machine correctness fix with no new I/O paths or privilege changes.

Test Coverage

The PR description lists manual test steps but no automated test changes. The existing meshcoreVirtualNodeServer.test.ts suite validates the server's own start/stop behavior. A unit test for teardownTransportOnly() verifying that stopVirtualNodeServer() is called (e.g. via a spy) would strengthen the regression coverage, but given the simplicity and risk profile of this change, absence of a new test is not a blocker.

Verdict

✅ Looks good to merge. The fix is correct, minimal, well-commented, and consistent with the existing disconnect() pattern. No blocking issues found.

Adds a focused regression test for the PR #3706 fix (issue #3705): when the real node connection drops, teardownTransportOnly() must stop the Virtual Node server. Without this, the VN server kept accepting MeshCore mobile-app connections while isConnected() was false, returning BadState to every AppStart and trapping the app in a "Connecting" loop. The test spies on stopVirtualNodeServer and asserts it is invoked during teardownTransportOnly(), verifies it runs before the native backend teardown (ordering matters — the VN server reads connection state to answer AppStart), and exercises the real stop() delegation. Confirmed to fail without the one-line fix and pass with it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4

github-actions · 2026-06-24T14:19:47Z

Claude finished @Yeraze's task —— View job

PR Review

Gather context
Read changed files
Analyze fix logic and tests
Provide review feedback

Summary

This PR adds 6 lines to fix the root cause (stopping the VN server in teardownTransportOnly()) and then backs it with a dedicated test file covering all meaningful paths. Both changes are clean and correct.

Fix (`src/server/meshcoreManager.ts`)

The one-liner fix at meshcoreManager.ts:4069 is exactly right:

await this.stopVirtualNodeServer();

Mirrors the disconnect() code path identically.
stopVirtualNodeServer() (lines 795–803) is null-guarded and swallows stop errors, so calling it when no VN server is configured is a no-op.
The inline comment (lines 4065–4068) clearly explains the BadState loop — good for future readers.
No performance or security concerns.

Tests (`src/server/meshcoreManager.virtualNode.test.ts`)

The new test file adds three focused regression tests. All three are well-structured:

Test 1 — "stops the Virtual Node server when tearing down transport" ✅
Verifies the primary regression: stopVirtualNodeServer() is called exactly once. Clean and minimal.

Test 2 — "tears down native backend after stopping VN server" ✅
This is the strongest test — it verifies ordering using invocationCallOrder, ensuring stopVirtualNodeServer runs before nativeBackend.disconnect. It also validates post-teardown state (connected === false, nativeBackend === null). Good depth.

Test 3 — "delegates to VN server stop() through stopVirtualNodeServer()" ✅
Reaches through to the actual stop() call on the underlying server object and verifies virtualNodeServer is nulled out afterward. This tests the concrete teardown contract, not just the spy boundary.

One minor observation on test 3: The test sets isRunning: () => true on the mock but stopVirtualNodeServer() doesn't actually check isRunning() before calling stop() (it only checks if (!this.virtualNodeServer)). This is harmless — the property is ignored — but it could slightly mislead future readers. Not a blocker.

Verdict

✅ Looks good to merge. The fix is correct, minimal, and well-documented. The new test file addresses the gap noted in the prior review and validates both the call order and the post-teardown state. No blocking issues.

…ter-disconnect (#3705) (#3794) * fix(meshcore): detect socket drops + add user-configurable heartbeat (#3705) Two gaps left a MeshCore source unable to recover from a dropped real-node link without a manual reconnect or container restart: 1. The manager ignored the native backend's own 'disconnected' event (it only logged it), so a socket/serial-level drop left `connected = true`. isConnected() kept returning true, so the Virtual Node server answered AppStart with a stale SelfInfo while real sends silently failed, and nothing recovered. handleUnexpectedDisconnect() now reflects reality (drops to disconnected + stops the VN server) or, when auto-reconnect is enabled, hands off to the existing backoff machinery. An intentionalTeardown guard plus a stale-backend-instance check keep it from firing on our own teardowns. 2. Heartbeat/auto-reconnect was unreachable: neither the source config type nor meshcoreConfigFromSource() carried heartbeatIntervalSeconds, and there was no UI for it — so startHeartbeat() always early-returned and the reconnect path (incl. PR #3706's fix) never ran. Added a heartbeat field to the MeshCore source form (mirroring Meshtastic), plumbed heartbeatIntervalSeconds through MeshCoreSourceConfig and meshcoreConfigFromSource(). The update route already restarts the manager on any config change, so editing it reconnects. Regression tests cover the drop-detection branches and the config passthrough. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4 * fix(meshcore): keep manager registered on manual disconnect (#3705) The UI Disconnect (POST /api/sources/:id/disconnect) called meshcoreManagerRegistry.remove(), deleting the manager from the registry. Every /api/sources/:id/meshcore/* route is behind a guard that 404s with "No MeshCore manager for source <id>" when the manager is absent — so after a manual disconnect the page's status/read polling errored out and the source couldn't be driven again without a container restart. Disconnect now tears down the device link via manager.disconnect() (which already stops the VN server, heartbeat and schedulers) while leaving the manager registered, so /meshcore/* keeps serving a clean disconnected state and /connect re-establishes against the existing manager. Mirrors the intent of the autoConnect=false stop/start workflow without making the source unaddressable. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4 * docs(meshcore): document source heartbeat field + changelog (#3705) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4 * fix(meshcore): release dead backend on unrecovered drop (review #3794) Addresses Claude review finding #2: when a socket drop is handled with auto-reconnect disabled, handleUnexpectedDisconnect() left this.nativeBackend pointing at the closed connection, so sendBridgeCommand()'s `!nativeBackend` guard never tripped and callers got a write-to-closed error instead of a clean "disconnected" until /connect ran. Now tears down and nulls the dead backend (connectionState is set to 'disconnected' first so any re-emitted event short-circuits). Added a regression test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4 --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Yeraze merged commit b41594e into main Jun 24, 2026
25 of 29 checks passed

Yeraze deleted the claude/great-dijkstra-5nz4pv branch June 24, 2026 15:06

This was referenced Jun 24, 2026

[BUG] MeshCore: Cannot connect to virtual node #3705

Closed

fix(meshcore): heartbeat/auto-reconnect, drop detection, reconnect-after-disconnect (#3705) #3794

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(meshcore): stop virtual node server during transport teardown#3706

fix(meshcore): stop virtual node server during transport teardown#3706
Yeraze merged 2 commits into
mainfrom
claude/great-dijkstra-5nz4pv

Yeraze commented Jun 24, 2026

Uh oh!

github-actions Bot commented Jun 24, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 24, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Yeraze commented Jun 24, 2026

Summary

Test plan

Uh oh!

github-actions Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review

Summary

Code Quality

Potential Issues / Edge Cases

Performance / Security

Test Coverage

Verdict

Uh oh!

github-actions Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review

Summary

Fix (src/server/meshcoreManager.ts)

Tests (src/server/meshcoreManager.virtualNode.test.ts)

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions Bot commented Jun 24, 2026 •

edited

Loading

github-actions Bot commented Jun 24, 2026 •

edited

Loading

Fix (`src/server/meshcoreManager.ts`)

Tests (`src/server/meshcoreManager.virtualNode.test.ts`)