Skip to content

fix(nodes): guard lastHeard against replayed/stale packets#3569

Merged
Yeraze merged 1 commit into
mainfrom
fix/replay-guard-lastheard
Jun 19, 2026
Merged

fix(nodes): guard lastHeard against replayed/stale packets#3569
Yeraze merged 1 commit into
mainfrom
fix/replay-guard-lastheard

Conversation

@Yeraze

@Yeraze Yeraze commented Jun 19, 2026

Copy link
Copy Markdown
Owner

Summary

A node powered off for weeks could keep appearing "recently heard" in MeshMonitor, with a continually-refreshed Last Heard time. The cause is a replayed frame — typically a retained MQTT telemetry message, or an MQTT→LoRa bridge re-injecting an offline node's cached reading onto the mesh with a new packet id (the payload is byte-identical, so packet-id dedup never catches it). Every packet attributed to a node otherwise stamps lastHeard = now, which resurrects the dead node on each replay. This PR adds a conservative defensive guard so replayed/stale frames can no longer advance lastHeard.

The replay itself originates upstream (a retained/looping broker message); this change stops MeshMonitor from amplifying it.

Changes

  • New helper src/server/utils/replayGuard.ts (+ 10 unit tests):
    • isStaleReplayRxTime(rxTimeSec, nowSec) — true when a packet's own origin timestamp (rx_time) is a plausible absolute unix time (≥ 2020) but more than 6 hours old.
    • resolveLastHeardSec(rxTimeSec, nowMs) — returns now for live packets, undefined for a stale replay.
  • No upsert changes needed. Both upsertNode paths already preserve the stored value when lastHeard is omitted — async (nodes.ts:382 ?? existingNode.lastHeard) and sync SQLite (nodes.ts:1761 setIfProvided). Callers simply pass lastHeard: undefined for a stale replay, so a replay can neither resurrect a dead node nor drag a live one backward, on every backend.
  • Wired into every lastHeard-refresh site:
    • meshtasticManager.ts — generic packet handler (the reported path: transport=1 / decryptedBy=node) and the NodeInfo handler.
    • mqttIngestion.ts — resolved once per envelope and applied across NODEINFO / POSITION / TELEMETRY / generic, plus the traceroute, neighbor (threaded as a param), paxcounter, and store-forward helpers.
  • CHANGELOG.md — Bug Fixes entry.

Known limitation (documented in the helper): if the locally connected node's own clock is wrong by >6h, its packets would be misread as stale. In practice that node is time-synced (MeshMonitor pushes time to it), and the 2020 floor catches the common "clock reads ~0" failure mode.

Issues Resolved

None (reported via user diagnostics; no GitHub issue filed).

Documentation Updates

CHANGELOG.md updated. No feature/API docs affected (internal behavior change).

Testing

  • Full unit suite passes (6920 passed, 0 failed; 2338 suites)
  • New replayGuard.test.ts — 10 unit tests covering the threshold boundary, 2020 floor, absent/non-finite/future rx_time
  • New mqttIngestion.test.ts regression tests — a ~20-day-old telemetry replay omits lastHeard; recent and rx_time-absent packets still stamp now
  • TypeScript: touched files compile cleanly (pre-existing baseline noImplicitAny errors in unrelated src/components/*.tsx are unaffected)
  • Reviewer: confirm the 6h threshold / 2020 floor tradeoff is acceptable for your deployments

🤖 Generated with Claude Code

A node powered off for weeks could keep showing recent activity and a
fresh Last Heard time. The cause is a replayed frame — a retained MQTT
telemetry message, or an MQTT→LoRa bridge re-injecting an offline node's
cached reading onto the mesh with a new packet id (identical payload, so
packet-id dedup never catches it). Every attributed packet otherwise
stamps lastHeard = now, resurrecting the dead node on each replay.

Add a replay guard (src/server/utils/replayGuard.ts) that detects a stale
frame by its own origin timestamp (rx_time >= 2020 but >6h in the past)
and omits the lastHeard refresh, so the upsertNode merge preserves the
node's existing value instead of advancing it. Wired into the device
packet/NodeInfo paths and every MQTT lastHeard-refresh site
(NODEINFO/POSITION/TELEMETRY/traceroute/neighbor/paxcounter/store-forward).

Conservative by design: absent or boot-relative clocks (rx_time < 2020)
and ordinary skew/jitter fall through to normal "stamp now" behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SVGxkuD4Fwa2JGVim8ZeVj
@github-actions

github-actions Bot commented Jun 19, 2026

Copy link
Copy Markdown

Claude finished @Yeraze's task —— View job


PR Review: fix(nodes): guard lastHeard against replayed/stale packets

Tasks:

  • Gather context and understand the PR
  • Read changed files
  • Analyze implementation for code quality, bugs, performance, security
  • Check test coverage
  • Post review feedback

Overall this is a well-scoped, conservative fix for a real field-reported issue. The helper is cleanly isolated, the documentation is excellent, and the "fail open" (stamp now when rxTime is absent/boot-relative) design avoids the worst false-positive class. A few things worth addressing:

Minor Bug — Mixed float/floor in resolveLastHeardSec

src/server/utils/replayGuard.ts:73-74:

const nowSec = Math.floor(nowMs / 1000);
return isStaleReplayRxTime(rxTimeSec, nowMs / 1000) ? undefined : nowSec;

nowSec is floored to an integer, but isStaleReplayRxTime is called with nowMs / 1000 (a float, e.g. 1781913600.123). This is a sub-second inconsistency and won't affect correctness in practice (the threshold is 6 hours), but the two expressions should use the same value for consistency and to avoid confusion. The fix is trivial:

const nowSec = Math.floor(nowMs / 1000);
return isStaleReplayRxTime(rxTimeSec, nowSec) ? undefined : nowSec;

Design Note — Text messages from replayed packets are still stored

In mqttIngestion.ts, the TEXT_MESSAGE_APP case (around line 341) inserts the message into the DB regardless of whether it's a stale replay — only the lastHeard stamp on the sender node is suppressed. This is likely intentional (you don't want to lose a message that might have been legitimately delayed), but it's worth a comment confirming this, since the PR description focuses entirely on lastHeard and the "don't resurrect a dead node" goal. The same applies to TELEMETRY_APP — data is still stored even for stale replays.

Missing integration test coverage for most packet types

mqttIngestion.test.ts only tests the replay guard for TELEMETRY_APP. The helpers ingestTraceroute, ingestNeighborInfo, ingestPaxcounter, and ingestStoreForward all call resolveLastHeardSec independently and are not covered by the new replay-guard describe block. Given that ingestTraceroute and ingestStoreForward call resolveLastHeardSec from packet.rxTime directly (rather than using the pre-computed lastHeardSec from the envelope level), a test verifying those paths also suppress lastHeard correctly would add confidence. This is non-blocking but worth noting.

Inconsistent guard call-site pattern (ingestTraceroute, ingestPaxcounter, ingestStoreForward)

The top-level ingestServiceEnvelope computes lastHeardSec once and passes it down to ingestNeighborInfo (good). But ingestTraceroute (mqttIngestion.ts:460), ingestPaxcounter (line 773), and ingestStoreForward (line 866) each call resolveLastHeardSec(packet.rxTime, nowMs) again internally. This is redundant but not wrong — it produces the same result since packet and nowMs are the same object references. Passing the pre-computed lastHeardSec as a parameter (as ingestNeighborInfo already does) would be more consistent. Not blocking.

Threshold / floor choice — looks reasonable

The 6-hour threshold is defensible: it comfortably covers multi-day MQTT broker retention scenarios while being generous to legitimate delivery delays and clock skew. The 2020 floor (MIN_PLAUSIBLE_UNIX_SEC = 1_600_000_000) correctly catches boot-relative timestamps (uptime counters that look like small integers).

One edge case not explicitly tested: if rxTimeSec is a very large future timestamp (e.g. year 2100 from a misconfigured clock), nowSec - rxTimeSec will be deeply negative, which correctly returns false from isStaleReplayRxTime. This is covered implicitly by the "never flags a future-dated rx_time" test but is worth keeping in mind.

Security / correctness

No security concerns. The guard is purely defensive — it suppresses a lastHeard update; it doesn't drop packets, alter stored telemetry values, or affect authentication paths.

Summary

Severity Finding
Minor resolveLastHeardSec passes nowMs / 1000 (float) to isStaleReplayRxTime instead of the already-computed nowSec (integer)
Non-blocking Integration tests only cover TELEMETRY_APP; helpers called with packet.rxTime directly are untested
Non-blocking ingestTraceroute / ingestPaxcounter / ingestStoreForward re-compute resolveLastHeardSec instead of receiving the pre-computed value
Design Q Confirm by intent that stale-replay text messages and telemetry are still stored (only lastHeard is suppressed)

The float/floor inconsistency in replayGuard.ts:74 is the only item I'd call out for a fix before merge — everything else is polish.

@Yeraze Yeraze merged commit b30a4f5 into main Jun 19, 2026
18 checks passed
@Yeraze Yeraze deleted the fix/replay-guard-lastheard branch June 19, 2026 22:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant