fix(nodes): guard lastHeard against replayed/stale packets #3569
A node powered off for weeks could keep showing recent activity and a fresh Last Heard time. The cause is a replayed frame: a retained MQTT telemetry message, or an MQTT→LoRa bridge re-injecting an offline node's cached reading onto the mesh with a new packet id (identical payload, so packet-id dedup never catches it). Every attributed packet otherwise stamps lastHeard = now, resurrecting the dead node on each replay.

Add a replay guard (src/server/utils/replayGuard.ts) that detects a stale frame by its own origin timestamp (rx_time >= 2020 but >6h in the past) and omits the lastHeard refresh, so the upsertNode merge preserves the node's existing value instead of advancing it. Wired into the device packet/NodeInfo paths and every MQTT lastHeard-refresh site (NODEINFO/POSITION/TELEMETRY/traceroute/neighbor/paxcounter/store-forward).

Conservative by design: absent or boot-relative clocks (rx_time < 2020) and ordinary skew/jitter fall through to normal "stamp now" behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01SVGxkuD4Fwa2JGVim8ZeVj
Claude finished @Yeraze's task

PR Review: fix(nodes): guard lastHeard against replayed/stale packets
Overall this is a well-scoped, conservative fix for a real field-reported issue. The helper is cleanly isolated, the documentation is excellent, and the "fail open" (stamp now when rxTime is absent/boot-relative) design avoids the worst false-positive class. A few things worth addressing:

Minor Bug: Mixed float/floor in
| Severity | Finding |
|---|---|
| Minor | resolveLastHeardSec passes nowMs / 1000 (float) to isStaleReplayRxTime instead of the already-computed nowSec (integer) |
| Non-blocking | Integration tests only cover TELEMETRY_APP; helpers called with packet.rxTime directly are untested |
| Non-blocking | ingestTraceroute / ingestPaxcounter / ingestStoreForward re-compute resolveLastHeardSec instead of receiving the pre-computed value |
| Design Q | Confirm it is intentional that stale-replay text messages and telemetry are still stored (only lastHeard is suppressed) |
The float/floor inconsistency in replayGuard.ts:74 is the only item I'd call out for a fix before merge — everything else is polish.
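To illustrate why the mixed float/floor can matter, here is a minimal sketch of the boundary case. It assumes the stale check is a strict "more than 6 hours" comparison, as the PR description states; the constant names and the function body are illustrative stand-ins, not the actual code at replayGuard.ts:74.

```typescript
// Hypothetical constants; the names are assumptions, not taken from the PR.
const MIN_PLAUSIBLE_UNIX_SEC = 1577836800; // 2020-01-01T00:00:00Z
const STALE_THRESHOLD_SEC = 6 * 60 * 60; // 6 hours

// Assumed shape of the check: a plausible absolute time that is more than 6h old.
function isStaleReplayRxTime(rxTimeSec: number, nowSec: number): boolean {
  return (
    rxTimeSec >= MIN_PLAUSIBLE_UNIX_SEC &&
    nowSec - rxTimeSec > STALE_THRESHOLD_SEC
  );
}

const rxTimeSec = 1700000000;
// The wall clock reads exactly 6h plus 500 ms after the packet's rx_time.
const nowMs = (rxTimeSec + STALE_THRESHOLD_SEC) * 1000 + 500;

// Float seconds: age is 21600.5 s, which exceeds the threshold.
const staleWithFloat = isStaleReplayRxTime(rxTimeSec, nowMs / 1000);
// Floored seconds: age is exactly 21600 s, which does not exceed it.
const staleWithFloor = isStaleReplayRxTime(rxTimeSec, Math.floor(nowMs / 1000));

console.log(staleWithFloat, staleWithFloor); // prints: true false
```

The two calls only disagree within the sub-second window at the threshold, which is consistent with the Minor severity: the fix is about using one definition of "now" throughout, not about a user-visible failure.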
Summary
A node powered off for weeks could keep appearing "recently heard" in MeshMonitor, with a continually-refreshed Last Heard time. The cause is a replayed frame: typically a retained MQTT telemetry message, or an MQTT→LoRa bridge re-injecting an offline node's cached reading onto the mesh with a new packet id (the payload is byte-identical, so packet-id dedup never catches it). Every packet attributed to a node otherwise stamps lastHeard = now, which resurrects the dead node on each replay. This PR adds a conservative defensive guard so replayed/stale frames can no longer advance lastHeard.

The replay itself originates upstream (a retained/looping broker message); this change stops MeshMonitor from amplifying it.
Changes
- src/server/utils/replayGuard.ts (+ 10 unit tests):
  - isStaleReplayRxTime(rxTimeSec, nowSec): true when a packet's own origin timestamp (rx_time) is a plausible absolute unix time (≥ 2020) but more than 6 hours old.
  - resolveLastHeardSec(rxTimeSec, nowMs): returns now for live packets, undefined for a stale replay.
- upsertNode paths already preserve the stored value when lastHeard is omitted: async (nodes.ts:382, ?? existingNode.lastHeard) and sync SQLite (nodes.ts:1761, setIfProvided). Callers simply pass lastHeard: undefined for a stale replay, so a replay can neither resurrect a dead node nor drag a live one backward, on every backend.
- Wired into every lastHeard-refresh site:
  - meshtasticManager.ts: generic packet handler (the reported path: transport=1 / decryptedBy=node) and the NodeInfo handler.
  - mqttIngestion.ts: resolved once per envelope and applied across NODEINFO / POSITION / TELEMETRY / generic, plus the traceroute, neighbor (threaded as a param), paxcounter, and store-forward helpers.

Known limitation (documented in the helper): if the locally connected node's own clock is wrong by >6h, its packets would be misread as stale. In practice that node is time-synced (MeshMonitor pushes time to it), and the 2020 floor catches the common "clock reads ~0" failure mode.
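The behaviour described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in replayGuard.ts or nodes.ts: the two function names and their parameters come from the PR description, while the constants, the NodeRecord type, and mergeNode are hypothetical. It also assumes "now" is returned as floored unix seconds.

```typescript
// Hypothetical constants; only the 2020 floor and 6h window come from the PR text.
const MIN_ABSOLUTE_UNIX_SEC = 1577836800; // 2020-01-01T00:00:00Z
const MAX_AGE_SEC = 6 * 60 * 60; // 6 hours

function isStaleReplayRxTime(rxTimeSec: number | undefined, nowSec: number): boolean {
  // Absent or non-finite timestamps fail open: treat the packet as live.
  if (rxTimeSec === undefined || !Number.isFinite(rxTimeSec)) return false;
  // Boot-relative or "clock reads ~0" values are not absolute times; also live.
  if (rxTimeSec < MIN_ABSOLUTE_UNIX_SEC) return false;
  // A future rx_time yields a negative age, so it is never stale.
  return nowSec - rxTimeSec > MAX_AGE_SEC;
}

function resolveLastHeardSec(rxTimeSec: number | undefined, nowMs: number): number | undefined {
  const nowSec = Math.floor(nowMs / 1000); // assumption: integer seconds throughout
  return isStaleReplayRxTime(rxTimeSec, nowSec) ? undefined : nowSec;
}

// Hypothetical stand-in for a stored node row.
interface NodeRecord {
  nodeNum: number;
  lastHeard: number | undefined;
}

// Hypothetical merge mirroring the "?? existingNode.lastHeard" behaviour:
// an omitted lastHeard keeps whatever value is already stored.
function mergeNode(existing: NodeRecord, update: NodeRecord): NodeRecord {
  return { ...existing, ...update, lastHeard: update.lastHeard ?? existing.lastHeard };
}

const nowMs = 1700000000000;
const stored: NodeRecord = { nodeNum: 0x1234, lastHeard: 1698000000 };

// A telemetry frame whose rx_time is 20 days old: lastHeard is left alone.
const replayRxTime = 1700000000 - 20 * 24 * 60 * 60;
const afterReplay = mergeNode(stored, {
  nodeNum: 0x1234,
  lastHeard: resolveLastHeardSec(replayRxTime, nowMs),
});

// A frame received five seconds ago: lastHeard advances to now.
const afterLive = mergeNode(stored, {
  nodeNum: 0x1234,
  lastHeard: resolveLastHeardSec(1700000000 - 5, nowMs),
});

console.log(afterReplay.lastHeard, afterLive.lastHeard);
```

Returning undefined rather than the stale rx_time is what keeps the guard one-directional: the stored value is never moved forward by a replay and never moved backward either.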
Issues Resolved
None (reported via user diagnostics; no GitHub issue filed).
Documentation Updates
CHANGELOG.md updated. No feature/API docs affected (internal behavior change).
Testing
- replayGuard.test.ts: 10 unit tests covering the threshold boundary, 2020 floor, absent/non-finite/future rx_time
- mqttIngestion.test.ts regression tests: a ~20-day-old telemetry replay omits lastHeard; recent and rx_time-absent packets still stamp now
- noImplicitAny errors in unrelated src/components/*.tsx are unaffected)

🤖 Generated with Claude Code