feat(meshcore): query only 0-hop repeaters for region discovery + route the query direct (#3743) #3765
Run a 0-hop discovery sweep first and query only the repeaters/room-servers that answered, in arrival order. Retry the sweep once if it finds nothing; surface 'no 0-hop repeaters found' rather than falling back to querying every known repeater. Cuts region discovery from ~N x 20s (all known repeaters, e.g. 62 -> ~21min) to a single ~8s sweep plus quick replies from nearby repeaters. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4
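The sweep, retry, and query-in-arrival-order flow described in this commit can be sketched roughly as below. This is a minimal illustration, not MeshMonitor's actual `discoverRegions()`: the `RegionDeps` interface, the `sweep` and `queryRegions` callbacks, and the choice to skip a repeater whose query fails are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; not MeshMonitor's real types.
type PublicKey = string;

interface RegionDeps {
  // Runs one 0-hop sweep and resolves with responders' keys in arrival order.
  sweep: () => Promise<PublicKey[]>;
  // Queries a single repeater and resolves with the regions it reports.
  queryRegions: (key: PublicKey) => Promise<string[]>;
}

interface RegionSweepResult {
  regions: string[];
  noZeroHopRepeaters: boolean;
}

async function discoverRegionsSketch(deps: RegionDeps): Promise<RegionSweepResult> {
  let responders = await deps.sweep();
  if (responders.length === 0) {
    // Retry the sweep exactly once before giving up.
    responders = await deps.sweep();
  }
  if (responders.length === 0) {
    // Report the empty result instead of falling back to every known repeater.
    return { regions: [], noZeroHopRepeaters: true };
  }

  const regions: string[] = [];
  for (const key of responders) {
    try {
      for (const region of await deps.queryRegions(key)) {
        if (regions.indexOf(region) === -1) regions.push(region);
      }
    } catch {
      // Assumption: one silent repeater should not abort the remaining queries.
    }
  }
  return { regions, noZeroHopRepeaters: false };
}
```

Injecting the sweep and query as callbacks keeps the control flow testable without a radio attached.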
v1.15+ repeaters answer a regions ANON_REQ only when it arrives via a DIRECT route (firmware simple_repeater onAnonDataRecv gates the REGIONS branch on isRouteDirect(); flooded requests are silently dropped — login is the lone flood-exception, which is why admin CLI worked but Discover Regions didn't). The companion floods whenever the contact has no installed out_path. Since the 0-hop discovery sweep only hears direct-range repeaters, each swept repeater is a direct neighbour, so install a zero-hop direct out_path before querying it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4
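The routing rule in this commit message can be modelled as a toy decision table. The sketch below is not firmware or MeshMonitor code: the type names, the `OUT_PATH_UNKNOWN` placeholder value, and representing a zero-hop direct path as length 0 are assumptions chosen to make the rule concrete.

```typescript
// Toy model of the routing rule; all names and values here are illustrative.
type RequestKind = "login" | "regions";
type Route = "direct" | "flood";

// Placeholder sentinel; the firmware's real constant value is not given in this PR.
const OUT_PATH_UNKNOWN = -1;

interface ContactSketch {
  outPathLen: number;
}

// Companion side: with no installed out_path, the request is flooded.
function chooseRoute(contact: ContactSketch): Route {
  return contact.outPathLen === OUT_PATH_UNKNOWN ? "flood" : "direct";
}

// Repeater side: a regions request is answered only on a direct route;
// login is the one request that is also accepted when flooded.
function repeaterAnswers(kind: RequestKind, route: Route): boolean {
  return kind === "login" || route === "direct";
}

// Assumption: a zero-hop direct path is a known path with no intermediate hops.
function installZeroHopPath(contact: ContactSketch): ContactSketch {
  return { ...contact, outPathLen: 0 };
}

const fresh: ContactSketch = { outPathLen: OUT_PATH_UNKNOWN };
console.log(repeaterAnswers("regions", chooseRoute(fresh)));                     // false
console.log(repeaterAnswers("login", chooseRoute(fresh)));                       // true
console.log(repeaterAnswers("regions", chooseRoute(installZeroHopPath(fresh)))); // true
```

The three printed cases correspond to the symptom (regions query dropped), the reason admin CLI still worked (login accepted when flooded), and the fix (install the path first).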
…covery The CMD_ADD_UPDATE_CONTACT write lands even when its Ok ack is lost in the post-sweep radio chatter (meshcore.js resolves on Ok, so it reports a timeout while the route is actually installed). Use a short 3s window and treat a missed ack as non-fatal — the request_regions reply is the real success signal — so a working install no longer waits 12s or logs a scary 'may flood' warning. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4
A set_out_path timeout means meshcore.js didn't see the CMD_ADD_UPDATE_CONTACT Ok ack, not that the device is unreachable — the write applies regardless. Log it at debug instead of a misleading 'check serial/TCP connection' warning; genuine rejections still warn. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4
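The best-effort ack handling from the two commits above might look roughly like this. It is a sketch under stated assumptions: `withTimeout`, `installPathBestEffort`, the `Logger` interface, and the log wording are hypothetical, and detecting a timeout by matching the word "timeout" in the error message mirrors the string-match dependency the review mentions rather than any documented meshcore.js contract.

```typescript
// Hypothetical logger shape for this sketch.
interface Logger {
  debug(msg: string): void;
  warn(msg: string): void;
}

// Rejects with a "timeout" error if the promise has not settled within `ms`.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("timeout")), ms);
    p.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Assumption: timeouts are recognised by error wording, so this check
// breaks if the upstream library changes its message text.
function isTimeout(err: unknown): boolean {
  return err instanceof Error && /timeout/i.test(err.message);
}

type InstallOutcome = "acked" | "unacked" | "rejected";

async function installPathBestEffort(
  setOutPath: () => Promise<void>,
  log: Logger,
  ackWindowMs = 3000, // the short 3s window from the commit message
): Promise<InstallOutcome> {
  try {
    await withTimeout(setOutPath(), ackWindowMs);
    return "acked";
  } catch (err) {
    if (isTimeout(err)) {
      // A missed ack is non-fatal: the write may have landed anyway,
      // and the later regions reply is the real success signal.
      log.debug("set_out_path ack not seen; continuing with the query");
      return "unacked";
    }
    log.warn("set_out_path rejected: " + String(err));
    return "rejected";
  }
}
```

A caller would proceed to the regions query on both `"acked"` and `"unacked"`; whether to proceed after `"rejected"` is a design choice this sketch leaves open.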
Claude finished @Yeraze's task

PR Review: feat(meshcore): query only 0-hop repeaters for region discovery + route the query direct (#3743)

Overall Assessment

This is a well-executed fix for a subtle firmware-level routing issue. The root cause analysis is thorough, the fix is well-targeted, and the test coverage is comprehensive. The PR is in good shape to merge. I have a few observations, mostly minor:

Core Logic (

| Area | Status |
|---|---|
| Root cause fix (direct routing) | ✅ Correct |
| Sweep → query → retry flow | ✅ Clean |
| Timeout handling (best-effort) | ✅ Appropriate |
| `discoverNodes` return type extension | ✅ Non-breaking |
| Frontend UX feedback | ✅ Good, minor copy note |
| Test coverage | ✅ Comprehensive |
| Security | ✅ No concerns |
| Performance | ✅ Significant improvement (N×20s → ~8s sweep + fast replies) |
Recommend merge. The only actionable suggestion is optionally updating the UI hint text in MeshCoreSettingsView.tsx:269 to reflect that the button now runs its own sweep internally.
…out-string dependency - The UI hint no longer says 'Run Discover Repeaters first' — the button now runs its own 0-hop sweep internally. - Comment the isTimeout string-match so future maintainers know it depends on meshcore.js's error wording. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4
Claude finished @Yeraze's task

PR Review: feat(meshcore): query only 0-hop repeaters for region discovery + route the query direct (#3743)

Overall Assessment

This is an excellent, well-reasoned fix that solves a subtle firmware routing requirement. The root cause is clearly documented, the implementation is surgical, and the test coverage is thorough. Recommend merge. A few observations below, all minor.

| Area | Status |
|---|---|
| Root cause fix (direct routing via `set_out_path`) | ✅ Correct |
| Sweep → query → retry flow | ✅ Clean |
| Timeout handling (best-effort, debug log) | ✅ Appropriate |
| `discoverNodes` return type extension | ✅ Non-breaking |
| Frontend UX feedback (3 states) | ✅ Correct |
| Updated hint text | ✅ Accurate now |
| Test coverage | ✅ Comprehensive |
| Security | ✅ No concerns |
| Performance | ✅ Significant improvement (N×20s → ~8s sweep + fast replies) |
Recommend merge. The observations above are all informational or minor style suggestions — none are blockers.
Thanks for the review! Addressed in
On the "repeater in |
… OWNER (#3820) "Discover Repeaters" left repeaters as "Unknown" because NODE_DISCOVER_RESP carries no name and a zero-hop repeater may not advert for a long time (observed >30 min on live hardware), so the advert-refresh path (prior commit) could wait indefinitely. Firmware (ripplebiz/MeshCore simple_repeater `handleAnonOwnerReq`) answers an UNAUTHENTICATED ANON_REQ OWNER (CMD_SEND_ANON_REQ sub-type 0x02) with "node_name\nowner_info" — the same anon-request transport, and the same isRouteDirect() gating, as the region discovery shipped in #3743/#3765. So we can pull a discovered repeater's real name without admin login. - meshcoreNativeBackend: new `request_owner` bridge command — builds the [57][pubkey][0x02][0x00] frame, awaits the 0x8C BinaryResponse (serialized via runExclusiveRadioOp, mirroring request_regions), parses clock + first line of "node_name\nowner_info". - meshcoreManager: new `fetchOwnerName()` — installs a zero-hop direct out_path (firmware drops flooded OWNER reqs), issues request_owner, writes the name onto the contact and broadcasts it. discoverNodes() gains an opt-in `fetchNames` flag that fetches names for nameless repeaters/room-servers after the sweep. - meshcoreRoutes: the user-facing POST /discover opts in (fetchNames=true); the internal region-discovery sweep does not (it has its own per-repeater pass). Verified live: delete "Yeraze Repeater" -> Discover Repeaters -> name populates as "Yeraze Repeater" within the discovery window (~12 s), no admin login, no advert required. The advert-refresh (Option A) remains as a zero-airtime complement for repeaters that do advert. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4
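The `[57][pubkey][0x02][0x00]` frame and the `"node_name\nowner_info"` reply described above could be handled along these lines. This is a sketch only: the function names are hypothetical, the leading `57` is read as a decimal command byte (an assumption based on how the commit message writes it), the public key is copied at whatever length the caller supplies, and the clock field of the response is not parsed because its layout is not given here.

```typescript
// Byte values as written in the commit message; 57 is assumed to be decimal.
const CMD_SEND_ANON_REQ = 57;
const ANON_REQ_OWNER = 0x02;

// Builds [cmd][pubkey...][sub-type][0x00]. The key length is not validated here.
function buildOwnerRequestFrame(pubkey: Uint8Array): Uint8Array {
  const frame = new Uint8Array(1 + pubkey.length + 2);
  frame[0] = CMD_SEND_ANON_REQ;
  frame.set(pubkey, 1);
  frame[1 + pubkey.length] = ANON_REQ_OWNER;
  frame[2 + pubkey.length] = 0x00;
  return frame;
}

// Takes the text part of the reply and returns its first line (the node name),
// or null when that line is empty.
function parseOwnerName(text: string): string | null {
  const [firstLine = ""] = text.split("\n");
  const name = firstLine.trim();
  return name.length > 0 ? name : null;
}

console.log(Array.from(buildOwnerRequestFrame(new Uint8Array([1, 2, 3])))); // [57, 1, 2, 3, 2, 0]
console.log(parseOwnerName("Yeraze Repeater\nowner info"));                 // "Yeraze Repeater"
```

Returning `null` for an empty first line lets the caller leave the contact untouched rather than overwrite it with a blank name.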
… no admin login (#3820) (#3825) * fix(meshcore): refresh known-but-nameless contacts so discovered repeaters get their name (#3820) A repeater added via "Discover Repeaters" stays "Unknown" in the node list until an unrelated refresh (e.g. opening its admin panel) happens to run — even after it sends a zero-hop advert. Root cause (firmware-verified): NODE_DISCOVER_RESP carries no name, so discovery pre-creates a *nameless* device contact. The repeater's later zero-hop advert DOES carry the name and the firmware DOES store it on the device — but because the contact already exists, the firmware pushes a pubkey-only 0x80 advert (not the full 0x8A new-advert record), so MeshMonitor's contact_advertised event arrives with no adv_name. The device-record re-read (schedulePathRefresh -> get_contacts) was gated behind `if (!wasKnown)`, so it never ran for the already-known stub, leaving it "Unknown". Fix: move the missing-name/type refresh trigger out of the `!wasKnown` branch so it also fires for known-but-nameless contacts. Zero airtime — firmware drops nameless adverts, so every contact_advertised event means the device just stored a real name; the refresh is a local debounced get_contacts read and stops re-firing once the name is pulled. #3756's `||`-not-`??` guard can only protect a name already held; it cannot acquire one never pulled, so it was insufficient for the fresh-discovery case. Plan/analysis: docs/internal/dev-notes/MESHCORE_UNNAMED_NODE_3820_PLAN.md Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4 * feat(meshcore): actively fetch discovered repeater names via ANON_REQ OWNER (#3820) "Discover Repeaters" left repeaters as "Unknown" because NODE_DISCOVER_RESP carries no name and a zero-hop repeater may not advert for a long time (observed >30 min on live hardware), so the advert-refresh path (prior commit) could wait indefinitely. 
Firmware (ripplebiz/MeshCore simple_repeater `handleAnonOwnerReq`) answers an UNAUTHENTICATED ANON_REQ OWNER (CMD_SEND_ANON_REQ sub-type 0x02) with "node_name\nowner_info" — the same anon-request transport, and the same isRouteDirect() gating, as the region discovery shipped in #3743/#3765. So we can pull a discovered repeater's real name without admin login. - meshcoreNativeBackend: new `request_owner` bridge command — builds the [57][pubkey][0x02][0x00] frame, awaits the 0x8C BinaryResponse (serialized via runExclusiveRadioOp, mirroring request_regions), parses clock + first line of "node_name\nowner_info". - meshcoreManager: new `fetchOwnerName()` — installs a zero-hop direct out_path (firmware drops flooded OWNER reqs), issues request_owner, writes the name onto the contact and broadcasts it. discoverNodes() gains an opt-in `fetchNames` flag that fetches names for nameless repeaters/room-servers after the sweep. - meshcoreRoutes: the user-facing POST /discover opts in (fetchNames=true); the internal region-discovery sweep does not (it has its own per-repeater pass). Verified live: delete "Yeraze Repeater" -> Discover Repeaters -> name populates as "Yeraze Repeater" within the discovery window (~12 s), no admin login, no advert required. The advert-refresh (Option A) remains as a zero-airtime complement for repeaters that do advert. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4 --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
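The first commit in this squash (the known-but-nameless refresh) rests on two small pieces of logic that are easy to show in isolation. Both functions below are illustrative guesses at the shape of the change, not the actual MeshMonitor code: `ContactStub`, `needsRecordRefresh`, and `keepHeldName` are hypothetical names.

```typescript
// Hypothetical contact shape; a discovery stub has neither field set.
interface ContactStub {
  name?: string;
  type?: number;
}

// Previously the re-read was only considered for contacts that were not
// already known. Moving the missing-name/type check out of that branch
// means an already-known stub also triggers a refresh.
function needsRecordRefresh(wasKnown: boolean, contact: ContactStub): boolean {
  const incomplete = !contact.name || contact.type === undefined;
  return !wasKnown || incomplete;
}

// `||` rather than `??`: an empty-string incoming name is falsy but not
// nullish, so `??` would let it overwrite a held name while `||` does not.
function keepHeldName(
  incoming: string | undefined,
  held: string | undefined,
): string | undefined {
  return incoming || held;
}

console.log(needsRecordRefresh(true, {}));             // true
console.log(keepHeldName("", "Yeraze Repeater"));      // "Yeraze Repeater"
console.log(keepHeldName(undefined, undefined));       // undefined
```

The last line shows why the guard alone was insufficient for fresh discoveries: with no held name there is nothing to protect, so the name has to be pulled actively.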
Summary
Closes #3743. Makes MeshCore region discovery actually return regions, in two parts: querying only 0-hop repeaters, and routing the query direct. The first part cuts region discovery from N × 20s (e.g. 62 known repeaters → ~21 min) to a single ~8s sweep plus quick replies from nearby repeaters.

Root cause (firmware-verified against `repeater-v1.15.0`)

A MeshCore repeater answers a regions `ANON_REQ` only when it arrives via a direct route: `simple_repeater`'s `onAnonDataRecv` gates the regions branch on `packet->isRouteDirect()` and silently drops flooded ones. (Login is the lone flood-exception, which is why remote admin/CLI worked but Discover Regions didn't.) The companion floods whenever the contact has no installed `out_path` (`sendAnonReq`: `out_path_len == OUT_PATH_UNKNOWN`). MeshMonitor's `discover_path` only produces the diagnostic path (push `0x8D`); it never installs the routing `out_path`, so the request kept flooding and timing out.

What changed

- `discoverRegions()` now:
  - runs a 0-hop discovery sweep (`discover_nodes`, filter `0x0C` = repeater + room-server), retries once if it finds nothing, and reports `noZeroHopRepeaters` instead of silently querying everyone (addresses @m0urs's retry request on the issue).
  - installs a zero-hop direct `out_path` (`setContactOutPath`) before each `request_regions`, so the `ANON_REQ` routes direct and the repeater replies. Since a 0-hop sweep only hears direct-range repeaters, each one is a direct neighbour.
- `setContactOutPath()` gains an optional timeout; a `set_out_path` ack timeout is now treated as benign (the `CMD_ADD_UPDATE_CONTACT` write lands even when meshcore.js loses its `Ok` ack in post-sweep radio chatter) and is logged at debug instead of a misleading "check serial/TCP connection" warning.
- `discoverNodes()` now returns the `seen` public-key set so the sweep result can drive selection.
- `useMeshCore` + `MeshCoreSettingsView` surface "no nearby (0-hop) repeaters found" vs "repeaters reported no regions".

Testing

- `meshcoreManager.scope.test.ts` (36 pass): `0x0C` sweep, query-only-swept-repeaters, arrival order, retry-then-succeed, two-empty-sweeps → `noZeroHopRepeaters`, and an assertion that a zero-hop direct `out_path` is installed before querying.
- `tsc` clean.

🤖 Generated with Claude Code
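
The distinction the frontend now surfaces, between finding no nearby repeaters and finding repeaters that report no regions, can be sketched as a small mapping from result to message. The `RegionDiscoveryOutcome` shape and `describeOutcome` name are hypothetical, and the wording of the success message is invented for the example; only the two quoted failure messages come from the description above.

```typescript
// Hypothetical result shape; only the noZeroHopRepeaters flag is named in the PR.
interface RegionDiscoveryOutcome {
  regions: string[];
  noZeroHopRepeaters: boolean;
}

// Maps an outcome to one of three user-facing states.
function describeOutcome(outcome: RegionDiscoveryOutcome): string {
  if (outcome.noZeroHopRepeaters) {
    return "No nearby (0-hop) repeaters found";
  }
  if (outcome.regions.length === 0) {
    return "Repeaters reported no regions";
  }
  // Success wording is illustrative only.
  return "Found " + outcome.regions.length + " region(s): " + outcome.regions.join(", ");
}

console.log(describeOutcome({ regions: [], noZeroHopRepeaters: true }));
console.log(describeOutcome({ regions: [], noZeroHopRepeaters: false }));
console.log(describeOutcome({ regions: ["eu", "us"], noZeroHopRepeaters: false }));
```

Checking the flag before the empty-list case matters, because an empty sweep also yields an empty region list.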