Skip to content

feat(meshcore): channel "heard repeaters" via self-echo correlation (#3700)#3707

Merged
Yeraze merged 1 commit into
mainfrom
feat/3700-meshcore-heard-repeaters
Jun 24, 2026
Merged

feat(meshcore): channel "heard repeaters" via self-echo correlation (#3700)#3707
Yeraze merged 1 commit into
mainfrom
feat/3700-meshcore-heard-repeaters

Conversation

@Yeraze

@Yeraze Yeraze commented Jun 24, 2026

Copy link
Copy Markdown
Owner

Summary

MeshCore channel (broadcast) messages carry no protocol-level ACK, so nothing told you whether any repeater actually relayed an outgoing channel post (DMs get meshcore:send-confirmed; channel sends don't). This adds a best-effort "heard repeaters" signal via self-echo correlation.

When a nearby repeater re-floods one of our GRP_TXT (0x05) channel packets, our device hears the re-flood as an inbound OTA packet (LogRxDataota_packet) whose relay-hash chain (path_hops) names the relaying repeaters. We attribute those hashes to the most recent matching outgoing channel send within a 30s window. It's best-effort by design (we don't have the outgoing payload bytes to prove identity) and bounded by window + GRP_TXT type + a pending channel send — matching the issue's guidance and the original MeshCore app's heard-repeater list.

What's included

  • New per-source side table meshcore_heard_repeaters (migration 102, SQLite/PostgreSQL/MySQL, idempotent). One row per (message, repeater hash), unique on (sourceId, messageId, repeaterHash), tracks max SNR. Scoped by sourceId per the multi-source rules.
  • Correlation runs on raw ota_packet data BEFORE the opt-in packet-monitor gate, so it works regardless of the monitor setting.
  • Relay hashes resolved to repeater names best-effort via the existing contact-prefix resolver; raw hash shown when unknown.
  • Repository methods (recordHeardRepeater, getHeardRepeatersForMessage, getHeardRepeatersForMessages) using Drizzle query builders; outgoing channel messages enriched with heardBy[].
  • New incremental meshcore:channel-heard WS event (mirrors meshcore:send-confirmed).
  • UI: count badge (📡 N) + expandable repeater/SNR list in the MeshCore message stream.

Tests

  • findEchoMatch (pure): match / window-expiry / dedup / type + path gating.
  • Integration correlation via handleBridgeEvent('ota_packet'): records repeaters and emits the WS event with the packet monitor OFF; ignores no-match and non-GRP_TXT (DM) packets.
  • Repository per-source isolation + dedup / max-SNR / name-fill / grouping.
  • Migration registry count/last-name and migrationTables coverage updated.

Full Vitest suite green (7399 passed, 0 failed). Frontend build + server tsc clean.

Closes #3700

🤖 Generated with Claude Code

https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4

…3700)

MeshCore channel (broadcast) messages carry no protocol ACK, so nothing
told you whether any repeater relayed an outgoing channel post. We now
infer it best-effort by self-echo correlation: when a nearby repeater
re-floods one of our GRP_TXT channel packets, our device hears the
re-flood as an inbound `ota_packet` whose relay-hash chain names the
relaying repeaters; we attribute those hashes to the most recent matching
outgoing channel send within a 30s window.

- New per-source side table `meshcore_heard_repeaters` (migration 102,
  all three backends, idempotent) — one row per (message, repeater hash),
  unique on (sourceId, messageId, repeaterHash), tracks max SNR.
- Correlation runs on raw ota_packet data BEFORE the opt-in packet-monitor
  gate, so it works regardless of the monitor setting.
- Relay hashes resolved to repeater names best-effort via the contact
  prefix resolver; raw hash shown when unknown.
- Outgoing channel messages enriched with `heardBy[]`; incremental
  `meshcore:channel-heard` WS event (mirrors send-confirmed).
- UI: count badge (📡 N) + expandable repeater list in the message stream.
- Tests: pure findEchoMatch (match/window/dedup/type-gate), integration
  correlation (monitor-off, no-match, DM-ignored), repo per-source
  isolation + dedup/max-SNR/grouping.

Closes #3700

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4
@Yeraze Yeraze added the system-test Trigger hardware system tests label Jun 24, 2026
@github-actions

github-actions Bot commented Jun 24, 2026

Copy link
Copy Markdown

Claude finished @Yeraze's task —— View job


PR Review: feat(meshcore): channel "heard repeaters" via self-echo correlation

  • Read and analyze changed files
  • Review migration and schema
  • Review repository layer
  • Review manager correlation logic
  • Review tests
  • Review UI components
  • Compile and post final review

Overall Assessment

This is a well-structured, thoughtfully implemented feature. The self-echo correlation approach is sound given the MeshCore protocol's lack of channel ACKs, and the implementation correctly treats it as best-effort throughout. The code quality is high overall with a few observations worth addressing.


Potential Bug: recordHeardRepeater has a TOCTOU race

src/db/repositories/meshcore.ts:938–966

The recordHeardRepeater method does a SELECT-then-UPDATE using Drizzle, without any transaction wrapper. On a busy channel with rapid echoes coming in, two concurrent calls with the same (sourceId, messageId, repeaterHash) could both read existing = undefined, then both try to INSERT — one will hit the unique constraint and throw. The correlateChannelEcho caller swallows the error gracefully, so this won't crash, but the second echo's SNR and name updates would be silently lost.

Recommendation: Wrap the select+insert/update in a transaction, or use a database-level INSERT ... ON CONFLICT DO UPDATE (upsert) instead of manual select+branch. An UPSERT is also simpler and faster:

// SQLite example
await this.db
  .insert(meshcoreHeardRepeaters)
  .values({ sourceId, messageId, repeaterHash, repeaterName, snr, heardAt, createdAt: now })
  .onConflictDoUpdate({
    target: [meshcoreHeardRepeaters.sourceId, meshcoreHeardRepeaters.messageId, meshcoreHeardRepeaters.repeaterHash],
    set: {
      snr: sql`MAX(excluded.snr, ${meshcoreHeardRepeaters.snr})`,
      repeaterName: sql`COALESCE(excluded.repeaterName, ${meshcoreHeardRepeaters.repeaterName})`,
      heardAt: sql`excluded.heardAt`,
    },
  });

The max-SNR and name-fill logic in the current SELECT branch is correct (using the maxNullable helper at line 147) — just not race-safe without an UPSERT or transaction.


Minor: heardBy variable is unused after population

src/server/meshcoreManager.ts:1329–1344

The local heardBy array is assembled during the loop but then used only for heardBy.length === 0 at line 1345. The actual broadcast uses fullHeardBy (the re-read from DB at line 1349–1353). This is intentional (re-read ensures consistency with concurrent echoes), but the per-hop push to heardBy at line 1342 serves no purpose since the full set is re-fetched anyway. The heardBy array could be replaced with a simple count check on match.pathHops. This is a minor readability concern, not a bug.


Design: pendingChannelSends not cleaned up after match

src/server/meshcoreManager.ts:1374–1406

findEchoMatch returns the most-recent matching send but doesn't remove it from pendingChannelSends. This is intentional (multiple repeaters can echo the same message independently), but it means a single send can be matched by many inbound echoes over the 30s window. The PR description acknowledges this. One edge case worth noting: if two different channel messages are sent within 30s (before the first expires), all echoes for both messages will be attributed to the most-recent one. This is documented as best-effort, but for a chatty channel it could cause mis-attribution. The current design is reasonable; just flagging the known limitation.


Design: No cleanup of meshcore_heard_repeaters rows over time

There is no retention/pruning mechanism for the meshcore_heard_repeaters table. For long-running deployments with high channel traffic, this table will grow indefinitely. Existing tables like meshcore_messages have deleteMessagesOlderThan. Consider either:

  • Adding a pruning method and hooking it into the existing message-pruning path, or
  • Adding a note that rows are expected to accumulate (if the volume is acceptable — repeater echoes are much rarer than messages)

Test: Integration test uses double-flush but doesn't pin timing

src/server/meshcoreManager.channelHeard.test.ts:63–67

The flush() helper uses 3 await Promise.resolve() calls to drain async chains. This is fragile in general (it works here because the async chains are shallow), but since the test has been passing in the full Vitest suite, it's acceptable. A vi.waitFor() or exposing a test-only drain method would be more robust long-term.


Schema: MySQL VARCHAR lengths may be undersized

src/db/schema/meshcoreHeardRepeaters.ts:61

repeaterHash is VARCHAR(16) in MySQL. The PR comment says relay hashes are "hex strings", and the tests use 2-char values like 'a3'. The MeshCore protocol uses varying hash sizes (the path_len_raw byte packs hash-size in the top 2 bits: 1, 2, or 3 bytes = 2, 4, or 6 hex chars). 16 chars is generous enough for the current wire format, but worth a schema comment confirming the max observed hash length in practice.


Nit: CSS uses emoji directly in button content

src/components/MeshCore/MeshCoreMessageStream.tsx:289

{' 📡 '}{m.heardBy.length}

Rendering an emoji directly in a button is fine functionally, but aria-label or title only covers screen readers — the emoji has no semantic meaning in the accessibility tree. The title prop already covers the tooltip. Consider wrapping the emoji in <span aria-hidden="true"> to mark it as decorative, though this is very minor.


Positive Highlights

  • The findEchoMatch static method is pure (no I/O) — excellent choice for testability. The unit tests cover all the important edge cases (window expiry, type gating, zero-hop, dedup, case normalization).
  • Correlation runs before the opt-in packet-monitor gate (meshcoreManager.ts:1233–1236) — this is the right call so users get heard-repeater info without needing to enable the packet monitor.
  • Per-source scoping is correct throughoutsourceId on every insert/query, source isolation tested in meshcoreHeardRepeaters.perSource.test.ts.
  • The batched getHeardRepeatersForMessages enrichment on getChannelMessages (line 4103) is the right pattern — one DB round-trip instead of N.
  • WS event carries the full current heard-by set (not just the delta) so the client can replace state idempotently without tracking partial updates.
  • Migration is correctly idempotent across all three backends with appropriate use of IF NOT EXISTS / information_schema guards.
  • Error swallowing in correlateChannelEcho (line 1359) is appropriate — correlation failures must never affect the message pipeline.

Summary

The main actionable item is the TOCTOU race in recordHeardRepeater — a concurrent double-echo for the same repeater could cause a unique-constraint violation that silently swallows the second SNR update. Everything else is minor or informational. The feature is well-conceived, the test coverage is solid, and the architecture choices (per-source scoping, best-effort semantics, pre-gate correlation) are all correct.

@github-actions

Copy link
Copy Markdown
System Test Results

MeshMonitor System Test Results

Test Run: 2026-06-24 08:40:11 EDT

❌ Overall Result: FAILED

Aborted on first failing test (fail-fast mode). Remaining tests skipped.

Failed Test

  • Quick Start Test

@github-actions

Copy link
Copy Markdown
System Test Results

MeshMonitor System Test Results

Test Run: 2026-06-24 08:49:01 EDT

❌ Overall Result: FAILED

Aborted on first failing test (fail-fast mode). Remaining tests skipped.

Failed Test

  • Reverse Proxy Test

@Yeraze Yeraze merged commit bc1fc00 into main Jun 24, 2026
19 of 22 checks passed
@Yeraze Yeraze deleted the feat/3700-meshcore-heard-repeaters branch June 24, 2026 12:56
@github-actions

Copy link
Copy Markdown
System Test Results

MeshMonitor System Test Results

Test Run: 2026-06-24 08:59:07 EDT

❌ Overall Result: FAILED

Aborted on first failing test (fail-fast mode). Remaining tests skipped.

Failed Test

  • Reverse Proxy Test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

system-test Trigger hardware system tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEAT] MeshCore: Sending a channel message should indicate heard repeaters

1 participant