fix(legacy/netsync): auto-detect v2 txmeta wire format#954
Conversation
processTXmetaBatchMessage read the first 4 bytes as a v1 uint32 entry count unconditionally. If validator_txmeta_wireFormat=v2 is enabled the producer emits messages with a 0xFF magic byte, an 8-byte header, and an 8-byte xxhash before each entry's tx hash; the legacy consumer would either log "truncated message" and ack silently or misalign and announce garbled tx hashes to peers. Mirror the detection pattern from services/subtreevalidation/txmetaHandler.go (PR bsv-blockchain#912): branch on data[0] == 0xFF, validate the sub-version, read the entry count from bytes 4-8, and skip the per-entry xxhash. netsync discards the xxhash because it only announces ADDs to peers via txAnnounceBatcher and does no partition-aligned cache writes. Added v2 subtests covering the happy path, DELETE-skipping, unknown sub-version, and truncated header; the happy-path and DELETE tests failed against the unpatched code, confirming the regression.
|
🤖 Claude Code Review Status: Complete Current Review: No issues found. The PR successfully addresses the v2 wire format auto-detection bug and implements comprehensive safeguards. Analysis: This PR fixes a critical bug where Key Changes Verified:
Documentation Accuracy: All claims verified against implementation:
Previous Copilot Issues: All addressed in commit d0757af with test coverage and author responses. Recommendation: Approve for merge. |
There was a problem hiding this comment.
Pull request overview
Updates the legacy netsync txmeta Kafka consumer to auto-detect and parse the newer v2 txmeta wire format (magic/header + per-entry xxhash prefix), preventing mis-parsing or silent drops when producers switch to validator_txmeta_wireFormat=v2.
Changes:
- Add v2 header detection and per-entry xxhash skipping in
processTXmetaBatchMessage. - Introduce v2 message builder + new v2-focused unit subtests for coinbase filtering, DELETE skipping, unknown version, and truncated header handling.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| services/legacy/netsync/manager.go | Adds v2 wire-format detection + parsing adjustments (skip xxhash) for txmeta Kafka batches. |
| services/legacy/netsync/kafka_txmeta_test.go | Adds a v2 message builder helper and v2 parsing/behavior subtests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Benchmark Comparison ReportBaseline: Current: Summary
All benchmark results (sec/op)
Threshold: >10% with p < 0.05 | Generated: 2026-05-27 10:48 UTC |
oskarszoon
left a comment
There was a problem hiding this comment.
Approve. Mirrors the v2 detection in #912 cleanly.
Parse logic checks out end-to-end: 4 distinct truncation guards, no unchecked slice indexing, no panic possible on malformed input. Magic-byte collision is unreachable (v1 entry count would need to be ≥ 4,278,190,080). Sub-version validation rejects anything ≠ 0x02 with Errorf "unknown v2 wire version N" + ack-discard — that's the forward-compat seam working as designed.
One deliberate divergence from txmetaHandler.go worth noting: manager.go return nil on a truncated entry (drops the whole batch), txmetaHandler.go break (commits the parsed prefix). Both are correct for their respective consumers — netsync only announces ADDs to peers, so a truncated batch should drop entirely rather than leak garbled tx hashes.
Red-phase claim verified — v2 happy-path and DELETE-skip subtests fail against unpatched main (0xFF first byte misread as v1 entry count 767, loop exits truncated after 0 announcements). All 11 subtests pass -race.
The v2 magic-byte detection was unsafe: v1 messages with entry counts 255, 511, 767, ... begin with byte 0xFF (the LE-uint32 low byte), so the naive `data[0] == 0xFF` check misclassified them. Count 767 in particular aliases the entire 4-byte v2 header `0xFF 0x02 0x00 0x00` exactly. Both legacy/netsync (this PR) and subtreevalidation (PR bsv-blockchain#912) had the bug, silently dropping the netsync announce or garbling the txmeta cache write. Detection now requires the full 4-byte header signature plus a plausibility check on the resulting v2 entry count (entries * minEntrySize must fit in the remaining buffer); any failure falls through to v1 parsing instead of dropping the message. Move the wire-format constants (action bytes, magic, version, header length, min entry size) into a new stores/txmetacache/wire.go so the producer (validator) and all consumers (subtreevalidation, legacy/netsync) reference one source of truth. Drops three local copies and prevents future drift. Regression tests cover the count=255 and count=767 collisions in both the netsync consumer and the subtreevalidation handler, asserting that the v1 dispatch path runs and the v2 path does not. Addresses Copilot review feedback on PR bsv-blockchain#954.
Adds the same plausibility check on the v1 fallback path of the subtreevalidation txmeta handler that the v2 path already had: `entries * minEntrySize ≤ remainingBuffer`. Without this, a malformed or malicious v1 message could declare ~4B entries and trigger an oversized pool-buffer pre-allocation (~128GB) before the per-entry truncation check would kick in. The bound now uses the shared WireV1MinEntrySize constant added to stores/txmetacache/wire.go and mirrors the v2 detection-time check in shape. Renames TestServer_txmetaHandler_V2_UnknownVersion to reflect its new semantics — an unknown v2 sub-version now falls back to v1 parsing rather than being rejected outright. The test now also asserts that the v2 dispatch path is NOT taken, which is the actual invariant. Adds TestServer_txmetaHandler_V1_ImplausibleEntryCount_IsRejected to cover the new bound. Rewrites the Transaction Metadata Message Format section of docs/references/kafkaMessageFormat.md to describe the actual binary batch wire format (v1 and v2) with the multi-byte detection rules and plausibility checks. The previous protobuf-shaped description was stale from before the batched binary format landed; this PR widens the gap by introducing v2, so the doc is updated as part of the same change. Index entries adjusted to match.
|



Summary
processTXmetaBatchMessageinservices/legacy/netsync/manager.goread the first 4 bytes as a v1uint32entry count unconditionally. If a future deploy flipsvalidator_txmeta_wireFormat=v2, the producer emits messages with a0xFFmagic byte, an 8-byte header, and an 8-byte xxhash before each entry's tx hash. The legacy consumer would either log "truncated message" and ack silently, or misalign and announce garbled tx hashes to peers.This mirrors the v2 detection added to
services/subtreevalidation/txmetaHandler.goin #912:data[0] == 0xFF→ v2; otherwise v1.data[4:8], setoffset = 8.netsyncdiscards the xxhash because it only announces ADDs to peers viatxAnnounceBatcher— partition-aligned cache writes are asubtreevalidationconcern.Test plan
go test -race ./services/legacy/netsync/— all v1 subtests still pass; 4 new v2 subtests added (happy path, DELETE-skipping, unknown sub-version, truncated header)main(current code logstruncated message at entry Nand announces nothing), proving the regression the fix addressesgo vet ./services/legacy/...cleangolangci-lint run ./services/legacy/netsync/...— 0 issuesgo build ./...cleanNotes
txmetaWireV2Magic(0xFF) /txmetaWireV2Version(0x02) are now declared in three places: validator (producer),subtreevalidation/txmetaHandler.go, and nowlegacy/netsync/manager.go. Consolidating into a sharedmodel/helper is a small follow-up — left out here to keep the diff surgical.0xFFbecause that would require >4.27B entries) is the same one used by the subtreevalidation handler.