Summary
PR #772 (fix(subtreevalidation): bound HTTP body reads when fetching subtree data) introduces a hard regression for any node whose local maximum_merkle_items_per_subtree is smaller than the network's actual subtree size. Catchup aborts on every peer and the node stops advancing.
Observed on bsva-ovh-teranode-ttn-eu-3 running v0.15.1-beta-2 syncing teratestnet:
- node tip stuck at height 174
- network tip ~19k
- p2p tip-block announcements continue; catchup runs and fails immediately
Error
SERVICE_ERROR (59): [catchup:fetchAndStoreSubtreeAndSubtreeData] All peers failed to fetch subtree 03061e277dcb02638e9a7692bb4913dfce5ce3462162c8c369f7d3dc75ea3738
-> SERVICE_ERROR (59): [catchup:fetchSubtreeFromPeer] failed to fetch subtree from https://testnet.teranode.sv/api/v1/subtree/03061e27...
-> EXTERNAL (8): http request [...] response body exceeds 1048576 bytes
Root cause
services/blockvalidation/get_blocks.go:663:
maxSubtreeBytes := int64(u.settings.BlockAssembly.MaximumMerkleItemsPerSubtree) * int64(chainhash.HashSize)
subtreeBytes, err := util.DoHTTPRequestBounded(ctx, url, maxSubtreeBytes)
Cap derived from local MaximumMerkleItemsPerSubtree. On the docker profile (settings.conf:1089):
maximum_merkle_items_per_subtree.docker = 32768
→ cap = 32768 × 32 = 1,048,576 bytes (1 MiB). Real teratestnet subtrees exceed 1 MiB → bounded reader rejects → all peers fail → catchup aborts every cycle.
Same pattern in:
services/subtreevalidation/SubtreeValidation.go:960
services/subtreevalidation/check_block_subtrees.go:218
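To make the cap arithmetic concrete, here is a minimal standalone sketch. It assumes, as the formula above implies, that a subtree body is at least 32 bytes per item. hashSize stands in for chainhash.HashSize, and derivedCap and rejectedByCap are hypothetical helper names, not functions from the codebase.

```go
package main

// hashSize stands in for chainhash.HashSize (32 bytes), hard-coded here so
// the sketch has no external dependencies.
const hashSize = 32

// derivedCap mirrors the maxSubtreeBytes computation quoted above:
// local item limit multiplied by the hash size.
func derivedCap(localMaxItems int) int64 {
	return int64(localMaxItems) * int64(hashSize)
}

// rejectedByCap (hypothetical helper) reports whether a peer subtree holding
// peerItems hashes would exceed the cap derived from localMaxItems.
// Assumption: the response body is at least peerItems * hashSize bytes.
func rejectedByCap(peerItems, localMaxItems int) bool {
	return int64(peerItems)*int64(hashSize) > derivedCap(localMaxItems)
}
```

With the docker profile value, derivedCap(32768) is 1,048,576 bytes, so any peer subtree with more than 32,768 items is rejected. With 1048576 items the same formula gives 33,554,432 bytes (32 MiB).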
Why the bound is wrong
A peer's subtree size is set by whoever produced the subtree, not by this node's policy. Local MaximumMerkleItemsPerSubtree controls what we assemble, not what we accept. Bounding incoming subtrees by local assembly policy is a category error.
Suggested fix
Bound by a network/consensus max (largest legitimate subtree size), not by the receiving node's assembly policy. Or remove the bound and rely on the existing 5-minute streaming timeout + connection-level limits.
If the goal is DoS protection, the right knob is a separate max_incoming_subtree_bytes policy with a generous default (e.g. matching mainnet 32 MiB or higher), not a derived value from assembly config.
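A rough sketch of what an independent bound could look like. This is not the project's implementation: the internals of util.DoHTTPRequestBounded are not shown in this issue, so readBounded below is a standalone stand-in, and defaultMaxIncomingSubtreeBytes is a hypothetical default for the proposed max_incoming_subtree_bytes setting.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// defaultMaxIncomingSubtreeBytes is a hypothetical default for the proposed
// max_incoming_subtree_bytes setting (32 MiB). It is deliberately not derived
// from the local assembly configuration.
const defaultMaxIncomingSubtreeBytes int64 = 32 << 20

var errBodyTooLarge = errors.New("response body exceeds limit")

// readBounded reads at most limit bytes from r. It reads one byte past the
// limit so a body of exactly limit bytes is accepted and anything larger is
// rejected without buffering the whole stream.
func readBounded(r io.Reader, limit int64) ([]byte, error) {
	data, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > limit {
		return nil, fmt.Errorf("%w: %d bytes", errBodyTooLarge, limit)
	}
	return data, nil
}

// demoLimits feeds a 2 MiB stand-in body through both limits: the
// hypothetical 32 MiB policy and the 1 MiB cap derived on the docker profile.
func demoLimits() (okUnderPolicy, okUnderDerived bool) {
	body := make([]byte, 2<<20)
	_, errPolicy := readBounded(bytes.NewReader(body), defaultMaxIncomingSubtreeBytes)
	_, errDerived := readBounded(bytes.NewReader(body), 32768*32)
	return errPolicy == nil, errDerived == nil
}
```

The point of the design is only that the limit comes from a receive-side policy value. The same body that trips the derived 1 MiB cap passes under the separate setting, while oversized responses are still cut off.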
Unblock right now
- Set maximum_merkle_items_per_subtree.docker = 1048576 (matches default), or
Repro
- Bring up docker quickstart on teratestnet with default .env
- Wait until the node hits subtrees > 1 MiB
- catchup logs "response body exceeds 1048576 bytes", never advances