Skip to content

Delegate witness decoder#5224

Closed
nyonson wants to merge 4 commits intorust-bitcoin:masterfrom
nyonson:witness-vec-decoder
Closed

Delegate witness decoder#5224
nyonson wants to merge 4 commits intorust-bitcoin:masterfrom
nyonson:witness-vec-decoder

Conversation

@nyonson
Copy link
Copy Markdown
Contributor

@nyonson nyonson commented Oct 30, 2025

Delegate the witness decoder to the consensus_decoding VecDecoder and ByteVecDecoder drops a lot of duplicated logic. It also allows the DoS protection of consensus_encoding to be an implementation detail instead of part of the crate's exposed API.

This allows VecDecoder<Vec<u8>> to work, enabling natural composition
of nested vector types like Vec<Vec<u8>> (witness data). The
implementation delegates to ByteVecDecoder which provides optimized
bulk copying rather than decoding bytes one at a time.
@github-actions github-actions bot added C-consensus_encoding PRs modifying the consensus-encoding crate test C-primitives labels Oct 30, 2025
@nyonson nyonson force-pushed the witness-vec-decoder branch from 6d5f782 to 4b3e8e5 Compare October 30, 2025 17:05
Delegate to the consensus_encoding byte vector decoder instead of
duplicating the logic over in this crate. Also allows DoS protections
to live entirely in consensus_encoding.
DoS protection can now be an implementation detail of the
consensus_encoding crate.
@nyonson nyonson force-pushed the witness-vec-decoder branch from 4b3e8e5 to 9c0a138 Compare October 30, 2025 17:07
@nyonson
Copy link
Copy Markdown
Contributor Author

nyonson commented Oct 30, 2025

@jrakibi I think this would simplify #5177 with one less allocation spot (witness decoder now delegates to the consensus_encoding ones). Plus, the DoS protection could now be private to consensus_encoding.

/// This delegates to [`ByteVecDecoder`] which provides an optimized implementation
/// that uses bulk copying rather than decoding bytes one at a time.
#[cfg(feature = "alloc")]
impl Decodable for Vec<u8> {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the types checkout nicely, but assumes any vec of bytes will have a compact size at the front. I believe this is an ok assumption in the bitcoin domain?

@tcharding
Copy link
Copy Markdown
Member

This is a really nice PR. Hurts a bit that I did not think of it during development of the witness decoder. Thanks for the work man, big improvement.

Copy link
Copy Markdown
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 0eadbb7

@nyonson
Copy link
Copy Markdown
Contributor Author

nyonson commented Oct 30, 2025

Hurts a bit that I did not think of it during development of the witness decoder

Heh, I felt that with the error composites stuff.

@jrakibi
Copy link
Copy Markdown
Contributor

jrakibi commented Oct 30, 2025

This is a nice refactor

Copy link
Copy Markdown
Contributor

@jrakibi jrakibi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 0eadbb7

@apoelstra
Copy link
Copy Markdown
Member

In 86cd93d:

Whoah, this doubles the number of allocations required by WitnessDecoder. We decode a Vec<Vec<u8>> and then immediately need to construct a new vector to hold the inner of Witness.

Concept NACK.

@nyonson
Copy link
Copy Markdown
Contributor Author

nyonson commented Oct 31, 2025

Whoah, this doubles the number of allocations required by WitnessDecoder. We decode a Vec<Vec<u8>> and then immediately need to construct a new vector to hold the inner of Witness.

Dang too good to be true? I was focused on the types, totally overlooked allocations. I think I need some pointers on where the new allocations are though. Does "immediately need to construct a new vector to hold the inner of Witness" refer to the Witness::from_slice call in WitnessDecoder::end?

@apoelstra
Copy link
Copy Markdown
Member

Does "immediately need to construct a new vector to hold the inner of Witness" refer to the Witness::from_slice call in WitnessDecoder::end?

Yes. This is the whole reason we had Witness::from_parts_unstable; if we were willing to stomach allocating a vector then re-allocating the whole thing we could've just used Witness::from_slice.

@nyonson
Copy link
Copy Markdown
Contributor Author

nyonson commented Oct 31, 2025

Yes. This is the whole reason we had Witness::from_parts_unstable; if we were willing to stomach allocating a vector then re-allocating the whole thing we could've just used Witness::from_slice.

Got it, this is making sense as I sink into the Witness struct implementation. FTR I don't think my patches here introduce the double allocations since the current version is using Witness::from_slice too. But my goal of "get all DoS protection into consensus_encoding" probably isn't possible because the witness decoder needs to convert the wire encoding to the single allocation on the fly with custom logic.

@apoelstra
Copy link
Copy Markdown
Member

since the current version is using Witness::from_slice too

Oof. This is what I get for merging #4998 with multiple big issues. Apparently I missed a big issue.

I guess we could merge this as-is since it's a strict improvement, but we're gonna have to rewrite the witness decoder at some point. I'm a little concerned by this PR tightening the API though by changing the error types. It makes me worried that we're tying our hands in a way that will prevent us from writing a proper decoder.

@nyonson
Copy link
Copy Markdown
Contributor Author

nyonson commented Oct 31, 2025

I guess we could merge this as-is since it's a strict improvement, but we're gonna have to rewrite the witness decoder at some point. I'm a little concerned by this PR tightening the API though by changing the error types. It makes me worried that we're tying our hands in a way that will prevent us from writing a proper decoder.

I am taking a trip down memory lane here and learning a bunch on how the old decoder was written. It is convincing me to just close this, I don't think it will help in the long run like you are saying.

@apoelstra
Copy link
Copy Markdown
Member

Yeah, I think you might be right. Sorry for the wasted effort!

@nyonson
Copy link
Copy Markdown
Contributor Author

nyonson commented Oct 31, 2025

I'll attempt to re-write the WitnessDecoder to have a single allocation like the old style.

@nyonson nyonson closed this Oct 31, 2025
@tcharding
Copy link
Copy Markdown
Member

\me crawls slowly off into the corner ...

@tcharding
Copy link
Copy Markdown
Member

I'm kidding, we got this.

apoelstra added a commit that referenced this pull request Nov 7, 2025
df64705 primitives: single allocation witness decoder (Nick Johnson)

Pull request description:

  When attempting to simplify the `WitnessDecoder` in #5224, we noticed that we were double allocating which erased the performance gains of the old decoder from #672. This updates `WitnessDecoder` to use a single allocation like the old decoder.
  
  While the intention of this patch was purely for performance, three of the new tests fail against the current version of the `WitnessDecoder`: `decode_empty_element`, `decode_multiple_empty_elements`, and `decode_incomplete_witness_count`. 
  
  For the first two tests, there is a functional change in how empty witness elements are handled. This patch introduces a short-circuit where the old version might get stuck in a loop forever requesting zero bytes.
  
  The third test, `decode_incomplete_witness_count`, shows that the impl has a more explicit failure now if `end` is called before the witness element count is complete, instead of just returning and empty witness.


ACKs for top commit:
  tcharding:
    ACK df64705
  jrakibi:
    ACK df64705
  apoelstra:
    ACK df64705; successfully ran local tests


Tree-SHA512: 5177410da2d3ceb65a7bfc3e8444a913d23cad1c8e83f045501882b45bac2a90f403d0f7e72fe6b88c25774bbf636e007726d0dc93aeb77bed4ec94d67efa2bb
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C-consensus_encoding PRs modifying the consensus-encoding crate C-primitives test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants