
consensus_encoding: match core's range check decoding compact sizes#5897

Open
nyonson wants to merge 4 commits into rust-bitcoin:master from nyonson:cores-range-check

Conversation

@nyonson
Contributor

@nyonson nyonson commented Mar 25, 2026

Updates the default max value limit of the compact size decoder to MAX_COMPACT_SIZE: usize = 0x0200_0000, which matches the Core implementation.

I went with a usize here to minimize the API change, and I think usize generally makes the most sense for compact sizes. However, this value is larger than a usize can hold on a 16-bit machine...not sure if that is a deal breaker. We already have one const defined which is too big, if it matters (decoders::MAX_VECTOR_ALLOCATE in the same module).

The existing MAX_VEC_SIZE limit is moved over to decoders and used with the vec decoder. Call sites are generally kept the same, with some new constructors exposed for possible overrides in the future (p2p/psbt).

Closes #5752
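For illustration, the check being changed can be sketched as a minimal standalone decode function. This is simplified and non-incremental, unlike the crate's actual decoder; apart from the MAX_COMPACT_SIZE value, all names here are assumptions:

```rust
// Simplified sketch of decoding a compact size with a max-value check.
// MAX_COMPACT_SIZE matches Core's MAX_SIZE; the rest is illustrative.
const MAX_COMPACT_SIZE: u64 = 0x0200_0000; // 32 MiB

fn decode_compact_size(bytes: &[u8], limit: u64) -> Result<u64, &'static str> {
    // Grab the N payload bytes that follow the one-byte discriminant.
    fn take<const N: usize>(bytes: &[u8]) -> Result<[u8; N], &'static str> {
        bytes.get(1..1 + N).and_then(|s| s.try_into().ok()).ok_or("truncated input")
    }
    let value = match *bytes.first().ok_or("empty input")? {
        n if n < 0xFD => n as u64,                          // single-byte encoding
        0xFD => u16::from_le_bytes(take(bytes)?) as u64,    // 2-byte LE payload
        0xFE => u32::from_le_bytes(take(bytes)?) as u64,    // 4-byte LE payload
        0xFF => u64::from_le_bytes(take(bytes)?),           // 8-byte LE payload
        _ => unreachable!(),
    };
    // The range check under discussion: reject anything over the limit.
    if value > limit {
        return Err("compact size exceeds limit");
    }
    Ok(value)
}
```
(The canonical-encoding check real decoders also perform is omitted to keep the sketch short.)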

@github-actions github-actions bot added the C-consensus_encoding PRs modifying the consensus-encoding crate label Mar 25, 2026
@nyonson nyonson force-pushed the cores-range-check branch from 585a9e0 to 1036ea2 Compare March 25, 2026 18:25
@nyonson nyonson force-pushed the cores-range-check branch from 1036ea2 to 207a771 Compare March 25, 2026 18:27
///
/// This is an anti-DoS limit which won't possibly reject any block,
/// or part of a block, on the network.
#[cfg(feature = "alloc")]
Contributor


I'm curious why this cfg feature for alloc is needed here. Where exactly is the allocation happening that requires it?

Contributor Author


The const is only used when alloc is enabled.

error: constant `MAX_VEC_SIZE` is never used
  --> consensus_encoding/src/decode/decoders.rs:27:7
   |
27 | const MAX_VEC_SIZE: usize = 4_000_000;
   |       ^^^^^^^^^^^^
   |
   = note: `-D dead-code` implied by `-D warnings`
   = help: to override `-D warnings` add `#[expect(dead_code)]` or `#[allow(dead_code)]`

Contributor


Oh right, I see now. Strange, my compiler doesn't seem to give me that error. Is this just from a cargo build with the feature gate removed from the const?

Contributor Author


Yep, remove the gate and then build with no features enabled. You can use cargo rbmt test -p consensus_encoding if you want the different feature flag sets to be run for you automatically.

@tcharding
Member

ByteVecDecoder should use the old limit, right?

Currently

impl ByteVecDecoder {
    /// Constructs a new byte decoder.
    pub const fn new() -> Self {
        Self {
            prefix_decoder: Some(CompactSizeDecoder::new()),
            buffer: Vec::new(),
            bytes_expected: 0,
            bytes_written: 0,
        }
    }
}

@nyonson
Contributor Author

nyonson commented Mar 25, 2026

ByteVecDecoder should use the old limit, right?

Should it? I was debating that one internally. It is allocating just a hunk of bytes, not a vector of things, so I thought it fit the looser MAX_COMPACT_SIZE.
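For context, one way the "constructors exposed for possible overrides" mentioned in the PR description might look, as a sketch: the struct shape and the with_limit name are assumptions, not the crate's actual API; only the two limit values come from the PR.

```rust
// Sketch of a default-plus-override constructor pair for a length-limited
// byte decoder. Field layout and `with_limit` are illustrative assumptions.
const MAX_VEC_SIZE: usize = 4_000_000;       // conservative vector limit
const MAX_COMPACT_SIZE: usize = 0x0200_0000; // Core's general 32MB cap

struct ByteVecDecoder {
    limit: usize,
    buffer: Vec<u8>,
}

impl ByteVecDecoder {
    /// Default keeps the looser general cap (the question under discussion).
    const fn new() -> Self { Self::with_limit(MAX_COMPACT_SIZE) }

    /// Caller-chosen limit, e.g. the conservative `MAX_VEC_SIZE`.
    const fn with_limit(limit: usize) -> Self {
        Self { limit, buffer: Vec::new() }
    }
}
```
A p2p or psbt call site that wants the tighter bound would then call with_limit(MAX_VEC_SIZE) instead of new().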

@tcharding
Member

Isn't the use case only ever to decode something that goes in a block, where the prefix is the length of that? So it is exactly what we introduced the 4MB limit for, right?

@nyonson
Contributor Author

nyonson commented Mar 26, 2026

Isn't the use case only ever decode something that goes in a block and the prefix is the length of that.

I don't think that's exactly right. Just looking at the spots where we already use ByteVecDecoder, we use it for non-in-a-block things like p2p compact block filters. ByteVecDecoder is for blob-of-bytes scenarios like compact block filters and witness scripts.

So it is exactly what we introduced the 4MB limit for, right?

I don't have a great grasp on this whole story yet. But looking at the Core code, they take a two-tiered approach, and this is what I am tacking towards in this change. Core has the 32MB max for compact sizes in general, but then enforces lower maximums in type-specific scenarios. For compact block filters, for example, it looks like they don't enforce any lower bound, so the default 32MB is used (although I think this is separately bounded by a max p2p message size, handled at a different layer). So our current approach is actually more conservative.

Do we want to try to match Core one for one on this? I am not sure the best way to go yet, but I do think this is a nudge in the right direction.
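The two-tiered shape described above can be sketched as follows; the function name and signature are illustrative, not Core's or the crate's actual code:

```rust
// Illustrative sketch of two-tiered limits: a general 32MB compact-size
// cap (Core's MAX_SIZE) plus an optional tighter type-specific bound.
const MAX_COMPACT_SIZE: u64 = 0x0200_0000;
const MAX_VEC_SIZE: u64 = 4_000_000;

fn check_len(len: u64, type_specific_limit: Option<u64>) -> Result<u64, &'static str> {
    // Tier one: the general anti-DoS cap always applies.
    if len > MAX_COMPACT_SIZE {
        return Err("exceeds general compact-size cap");
    }
    // Tier two: a tighter bound only where the type calls for one
    // (e.g. compact block filters would pass None and keep the default).
    if let Some(limit) = type_specific_limit {
        if len > limit {
            return Err("exceeds type-specific limit");
        }
    }
    Ok(len)
}
```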

@tcharding
Member

tcharding commented Mar 26, 2026

EDIT: Me and Nick talked out of band, expect a force push that breaks this up to make review easier.

Do we want to try to match Core one for one on this?

I don't know the answer to that. I thought we were explicitly trying to be more conservative. Correct me if I'm wrong, but I was under the impression that MAX_VEC_SIZE was our invention.

I am not sure the best way to go yet, but I do think this is a nudge in the right direction.

If you have dug into it this much I'm happy to go with whatever you come up with. We will have to burn a few of @apoelstra's clock cycles too though.

@nyonson nyonson force-pushed the cores-range-check branch from 207a771 to 138a477 Compare March 27, 2026 04:11
@nyonson
Contributor Author

nyonson commented Mar 27, 2026

138a477: took a more conservative approach and broke down any new defaults. Added a small API addition at the end so we can kick the can on changing any more defaults for now.

@github-actions github-actions bot added the C-p2p label Mar 27, 2026
@nyonson nyonson force-pushed the cores-range-check branch from 138a477 to 68b651a Compare March 27, 2026 14:09
@github-actions github-actions bot added the C-bitcoin PRs modifying the bitcoin crate label Mar 27, 2026
@nyonson
Contributor Author

nyonson commented Mar 27, 2026

68b651a: fix formatting and run the formatter for the workspace.

Member

@tcharding tcharding left a comment


ACK 68b651a

@tcharding
Member

tcharding commented Mar 29, 2026

Thanks for breaking this up, I've got way more confidence in my review now.

/// limit.
pub const fn new() -> Self { Self { buf: ArrayVec::new(), limit: MAX_VEC_SIZE } }
/// The decoded value must not exceed the default 32MB limit and must fit in
/// a `usize`, otherwise [`end`](Self::end) will return an error.
Member


In 989dec3:

This comment seems wrong. If MAX_COMPACT_SIZE doesn't fit in a usize then the code just won't compile.

At some point we can conditionally compile so that this constant is reduced to usize::MAX on 16-bit machines, but I'm fine ignoring this until we figure out how we can actually test on 16-bit machines. But this doc comment we should change.
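The conditional compilation suggested here might look like the following sketch (untested on an actual 16-bit target):

```rust
// On 16-bit targets 0x0200_0000 doesn't fit in a usize, so fall back
// to usize::MAX there. Sketch of the idea only.
#[cfg(not(target_pointer_width = "16"))]
pub const MAX_COMPACT_SIZE: usize = 0x0200_0000;

#[cfg(target_pointer_width = "16")]
pub const MAX_COMPACT_SIZE: usize = usize::MAX;
```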

Contributor Author


Yeah, you are right; it made sense when the default was smaller.

use encoding::{
ArrayDecoder, ArrayEncoder, ByteVecDecoder, BytesEncoder, CompactSizeEncoder, CompactSizeU64Decoder, Decoder2, Decoder4, Encoder2, Encoder4
ArrayDecoder, ArrayEncoder, ByteVecDecoder, BytesEncoder, CompactSizeEncoder,
CompactSizeU64Decoder, Decoder2, Decoder4, Encoder2, Encoder4,
Member


In 989dec3:

All the changes in address.rs in this commit appear to be formatting changes.

Contributor Author


Dang my bad, dropped.

@nyonson nyonson force-pushed the cores-range-check branch from 68b651a to 098b587 Compare March 30, 2026 19:39
@github-actions github-actions bot added C-hashes PRs modifying the hashes crate and removed C-bitcoin PRs modifying the bitcoin crate C-p2p labels Mar 30, 2026
@nyonson
Contributor Author

nyonson commented Mar 30, 2026

098b587: rebased and folded in doc and formatting fixes.

Member

@tcharding tcharding left a comment


ACK 098b587

@apoelstra
Member

Can you drop 098b587? It messes up the formatting.

@tcharding
Member

Formatting issues fixed in #5927

@tcharding
Member

ACK f05289b

@tcharding
Member

@nyonson if you don't rebase and just drop the last commit then my ack should be good.

@nyonson nyonson force-pushed the cores-range-check branch from 098b587 to f05289b Compare March 31, 2026 02:40
@github-actions github-actions bot removed the C-hashes PRs modifying the hashes crate label Mar 31, 2026
@nyonson
Contributor Author

nyonson commented Mar 31, 2026

Abandoned the last format commit

Member

@tcharding tcharding left a comment


ACK f05289b

@tcharding
Member

After all that, I accidentally acked this while trying to ack #5879 ...

@apoelstra
Member

f05289b needs rebase

nyonson added 4 commits March 31, 2026 11:30
Bump the max decoding limit on the standard compact size decoder from
4MB to 32MB. This matches the value used in Core as the general anti-DoS
setting.

All current call sites across the crates are updated to use the old
default explicitly. They can be changed to use the new default if it
makes sense on a case-by-case basis.

This new default breaks on 16-bit machines since it doesn't fit in a
usize. The alternative would be to change the type to a u64, which sucks
for everyone else.
These limit overrides are unnecessary because both types,
`PrefilledTransaction.idx` and `Offset`, are u16s under the hood.
Decoding will fail on too-large values when the type conversion
is attempted in `::end()`.
Allow setting a custom limit in case the default 4MiB is too conservative.
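The reasoning in the commit message above, that the u16 narrowing in ::end() already bounds the value, can be sketched as follows (illustrative, not the crate's actual code):

```rust
// A compact size decodes to a u64; for u16-backed fields the final
// narrowing conversion rejects anything over u16::MAX, so a separate
// decoder-side limit override is redundant.
fn end(decoded: u64) -> Result<u16, &'static str> {
    u16::try_from(decoded).map_err(|_| "value exceeds u16::MAX")
}
```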
@nyonson nyonson force-pushed the cores-range-check branch from f05289b to 02145aa Compare March 31, 2026 18:30
@nyonson
Contributor Author

nyonson commented Mar 31, 2026

@apoelstra @tcharding want me to fix up the API files here, or would that be kinda confusing?

@tcharding
Member

Confirming these API changes are unrelated. They exist on master right now, fixed in #5930. Caused by me in #5899

@tcharding
Member

@apoelstra @tcharding want me to fix up the API files here, or would that be kinda confusing?

Andrew's call really, but from my perspective a patch at the front of this (same as #5930) would be fine. But Andrew might be happy to merge as is; if my memory is correct, his CI doesn't consider the API text files.

@apoelstra
Member

Correct, my CI doesn't consider the API files. But this also needs a rebase.

I merged #5914 which I had thought would fix CI. But maybe it's actually #5930 (needs rebase)?

@apoelstra
Member

02145aa needs rebase


Labels

C-consensus_encoding PRs modifying the consensus-encoding crate C-primitives


Development

Successfully merging this pull request may close these issues.

consensus_encoding: Add range check to when decoding a compact size

4 participants