fix(dm): skip PKI flag when keyMismatchDetected to prevent silent drop after firmware purge#3140
Conversation
…ey after purge When key-repair detects a mismatch and (via processKeyRepairs or the immediate-purge path) calls sendRemoveNode, the firmware purges the destination's entry — including its public key — from its NodeDB. Our local nodes.publicKey column still holds the (now-stale) key, so sendTextMessage was still setting pkiEncrypted=true. The firmware, with no key to use, silently drops the outbound DM. Net effect: DMs to a node with an active keyMismatchDetected appear to "send" from MeshMonitor's side but never reach the destination. This was masked by — and the symptom surfaced through — #3133 (MQTT publicKey hex vs base64). MQTT-first-seen nodes had their key written as hex; every subsequent direct-radio NodeInfo string-compared unequal against the base64 form, triggering a false mismatch, triggering the auto-purge, leaving the firmware keyless for that destination. Once the hex/base64 normalization landed (2430152), the false positives stopped and DMs flowed again. The underlying "PKI=true with no key → silent drop" problem remains for genuine key changes, hence this guard. The fix consults targetNode.keyMismatchDetected before requesting PKI. When the flag is set, we skip pkiEncrypted=true and fall back to channel encryption — matching what the post-purge node-info-exchange code already does (lines 2174-2175, 5604-5605). Regression test: src/server/meshtasticManager.test.ts adds a "DM PKI guard" describe block covering the five publicKey/keyMismatchDetected combinations, mirroring the production decision logic so future refactors of that branch fail loudly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Claude finished @Yeraze's task —— View job PR Review CompleteThis is a well-crafted fix for a subtle but critical bug in PKI encryption handling. Here's my comprehensive review: 🎯 SummaryExcellent work! This PR addresses a genuine security/reliability issue where DMs were being silently dropped by firmware after key repair flows. The fix is minimal, targeted, and well-tested. ✅ StrengthsCode Quality & Best Practices
Security & Correctness
Test Coverage
🔍 Technical AnalysisRoot Cause UnderstandingThe fix correctly identifies the core issue: after Implementation Approach
📋 Minor Suggestions
🚫 No Significant Issues Found
🧪 Test Plan ValidationThe manual test steps in the PR description are appropriate:
🎯 Overall AssessmentLGTM with high confidence. This is a high-quality fix that:
The engineering approach demonstrates solid understanding of the Meshtastic protocol, firmware behavior, and the codebase architecture. This is exactly the kind of targeted fix that should be merged quickly. Recommendation: Approve and merge ✨ |
…ish (#3145) Version bump and CHANGELOG entry for 4.6.5. Highlights: - #3143 MQTT Bridge mirror dashboard (read-only Open button on bridge cards; suppresses every TX surface; MqttBridgeManager now exposes the Meshtastic-shaped methods the consolidated /api/poll + /api/connection endpoints expect). - #3142 MeshCore Telemetry dashboard tab via source-agnostic Dashboard. - #3141 MeshCore: preserve LPP channel byte in remote telemetry rows. - #3140 DM: skip PKI flag when keyMismatchDetected. - #3138 docs reorganization. - #3137 unified: merge nodes across sources so labels show in the unified Nodes view. - #3136 mqtt: allow standalone mqtt_bridge as client-proxy target. CLAUDE.md version line updated to 4.6.5. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
sendTextMessagewas settingpkiEncrypted=truewhenever ournodes.publicKeycolumn was non-null, but after the key-repair flow callssendRemoveNodethe firmware no longer holds a key for that destination — so the firmware silently drops the DM.keyMismatchDetectedguard so we fall back to channel encryption while a mismatch is active, matching what the post-purge node-info-exchange path already does (meshtasticManager.ts:2174-2175,meshtasticManager.ts:5604-5605).Why this matters / how it surfaced
This was masked by — and revealed through — #3133. MQTT-first-seen nodes had
publicKeystored as hex; every subsequent direct-radio NodeInfo string-compared unequal against the base64 form, tripping a false mismatch → auto-purge → firmware NodeDB no longer has the destination's key. With our DB still holding a (now-stale) hex key,sendTextMessagekept settingpkiEncrypted=true, the firmware kept dropping the DM, and users saw "messages send but never arrive."The hex/base64 normalization in #3133 stopped the false positives, which is why DMs flowed again. The underlying "PKI requested but firmware has no key → silent drop" hazard remains for genuine key changes (lost device, fresh keypair, etc.), hence this guard.
The bug chain
meshtasticManager.ts:5572does string-equality → false positive →keyMismatchDetected = true.processKeyRepairs(or the immediate-purge branch at line 5610) callssendRemoveNode(node.nodeNum)— firmware purges the entry, including its public key.publicKeyvalue →sendTextMessagesetspkiEncrypted=true.After this PR, step 5 checks
keyMismatchDetectedand skips PKI. The DM goes out over channel encryption until a fresh NodeInfo re-establishes the key.Test plan
npx vitest run src/server/meshtasticManager.test.ts— 111 passed, 0 failednpx tsc --noEmit— cleankeyMismatchDetected=truein the DB, send a DM and verify the log shows the "skipping PKI" message and the packet reaches the destination on channel encryption.pkiEncrypted=true.Related
fix(mqtt): store ingested publicKey as base64 (not hex)— masked this bug for the common casemeshtasticManager.ts:2113processKeyRepairs— the path that triggers the firmware purgemeshtasticManager.ts:5610immediate-purge branch — same hazard🤖 Generated with Claude Code