primitives: single allocation witness decoder #5239

apoelstra merged 1 commit into rust-bitcoin:master
Conversation
```rust
/// Resizes the content buffer if needed, doubling the size each time.
fn resize_if_needed(&mut self, required_len: usize) {
```
This resize logic is copied from the old decoder, but should maybe delegate to some consensus_encoding defined thing like @jrakibi is working on?
```rust
// be large enough to cover most witnesses without reallocating.
let witness_index_space = witness_elements * 4;
self.cursor = witness_index_space;
self.content = alloc::vec![0u8; self.cursor + 128];
```
Why did we choose 128 here?
I copied this over from the old decoding logic, and if I had to guess, it probably covers the majority of witness sizes (e.g. P2WPKH witness is ~100 bytes). But I don't have the data and it might be worth making it a documented const.
Yeah, I was just swagging "a pubkey and a signature and some overhead". We can add a comment if you'd like. I don't think it's worth a named constant.
Beefed up the comment.
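For a rough sense of why 128 bytes covers the common case, here is a back-of-the-envelope check. The component sizes are assumed typical values for a P2WPKH witness, not numbers taken from the PR:

```rust
// Back-of-the-envelope sizing behind the 128-byte headroom; the component
// sizes below are assumed typical values, not measured data from the PR.
const DER_SIG_PLUS_SIGHASH: usize = 72; // DER signature (~71 bytes) + sighash flag
const COMPRESSED_PUBKEY: usize = 33;    // compressed secp256k1 public key
const LENGTH_PREFIXES: usize = 2;       // one length byte per witness element

fn typical_p2wpkh_witness_len() -> usize {
    DER_SIG_PLUS_SIGHASH + COMPRESSED_PUBKEY + LENGTH_PREFIXES
}

fn main() {
    // A typical two-element P2WPKH witness fits comfortably in 128 bytes.
    assert!(typical_p2wpkh_witness_len() <= 128);
}
```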
Force-pushed f513e42 to f4269c0.
I wanted to convince myself that this was an optimisation-only PR, so I moved the tests into a separate patch at the front. Three of them fail (at a minimum I would have expected

Looks good man!

Nope I did not... which makes me feel a bit uneasy! I can dig more into that tomorrow though.

$10 says my impl is buggy and yours is correct - you can pay me later. I wouldn't put too much thought into it, the tests you add look correct. I'd just mention in the PR description that these three tests don't pass before this change and that at least for two of them it's surprising.
This converts the new witness decoder to use the optimized single allocation internal memory structure which was shown back in commit 2fd0125 with the old decoder to be a lot more performant, justifying this increased complexity.
Force-pushed f4269c0 to df64705.
@tcharding I added reasoning for each test to the description.
Thanks man, good effort - as usual! |
```rust
fn resize_if_needed(&mut self, required_len: usize) {
    if required_len >= self.content.len() {
        let mut new_len = self.content.len().max(1);
        while new_len <= required_len {
            new_len *= 2;
        }
        self.content.resize(new_len, 0);
    }
}
```
Do you think we can use ~1 MB batch allocation here instead of doubling?
something like:
```rust
fn resize_if_needed(&mut self, required_len: usize) {
    if required_len > self.content.len() {
        let bytes_remaining = required_len - self.content.len();
        let batch_size = bytes_remaining.min(Self::MAX_VECTOR_ALLOCATE);
        self.content.resize(self.content.len() + batch_size, 0);
    }
}
```

If this makes sense I am happy to ACK the PR as it is, and I can address this in #5177.
Yea, I copied this allocation logic over from the old decoder. I think it would be better to address the DoS stuff more holistically in your work. One thing I am wondering is if the witness decoder should have its own strategy, which would allow the consensus_encoding stuff to be private to that crate, or if witness should delegate to some exposed logic.
I think with the new refactoring you did we can keep the consensus_encoding stuff private.
FWIW exponential back off is a pretty standard algorithm when allocating memory, so I found it easy to review and understand why it was chosen. If we choose a more domain specific algo it would need IMO more justification.
I think the current doubling strategy is still vulnerable to a DoS attack. Here's a test that demonstrates that:
```rust
#[test]
fn test_dos_attack() {
    let mut encoded = Vec::new();
    encoded.extend_from_slice(&[0xFE, 0x00, 0x09, 0x3D, 0x00]); // 4_000_000 (witness count)
    encoded.extend_from_slice(&[0xFE, 0x00, 0x09, 0x3D, 0x00]); // 4_000_000 (1st element length)
    let mut slice = encoded.as_slice();
    let mut dec = WitnessDecoder::new();
    assert!(dec.push_bytes(&mut slice).unwrap());
    assert!(dec.content.len() >= 32_000_000);
}
```

With only 10 bytes of input an attacker can force a 32 MB allocation (16 MB for the index space, which then doubles when we call `resize_if_needed`).
I guess if we were allocating 1 MB at a time this example would take 20 MB instead of 32?

It seems to me the way this works is:
- The attacker declares that there are 4 million witness elements (the maximum we allow, on the theory that maybe an entire block could be filled with empty witnesses, okay), so we allocate 16 MB to store these 4 million indices.
- Then the attacker declares that the first witness element has size 4 MB (though in this example he could say any positive value); we then need to store 16 MB + 1, so we allocate 32 MB.

But if we were allocating 1 MB at a time, we'd still be allocating 16 MB of index space and still be allocating 4 MB for the first witness.

And of course, because you aren't actually providing the claimed data, we return an error and deallocate everything. So I don't see any way to make this attack use more than 32 MB, nor any way to amplify the attack to allocate lots of 32 MBs.
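The arithmetic in this walkthrough can be checked directly. A minimal sketch, using the 4-byte index entries and 128-byte headroom from the decoder snippet quoted earlier (`doubled_alloc` is a hypothetical stand-in for the doubling loop in `resize_if_needed`):

```rust
// Model of the allocation math above: double the buffer length until it
// exceeds the required size, mirroring resize_if_needed's growth loop.
fn doubled_alloc(initial: usize, required: usize) -> usize {
    let mut new_len = initial.max(1);
    while new_len <= required {
        new_len *= 2;
    }
    new_len
}

fn main() {
    let index_space = 4_000_000usize * 4; // 16 MB of index space
    let initial = index_space + 128;      // initial single allocation
    let required = initial + 4_000_000;   // after the claimed 4 MB first element
    // Doubling from ~16 MB lands on ~32 MB, matching the 32 MB figure above.
    assert_eq!(doubled_alloc(initial, required), 32_000_256);
}
```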
I really like this unit test though. I wonder if there's a way we can turn it into a fuzz test so we can test my "you can't make it allocate more than 32Mb without providing at least 16Mb of input" belief.
BTW if we really wanted to reduce this memory (which I do not think we should do in this PR) we could be cleverer about how to store the index. At the very least we could use 3-byte lengths instead of 4-byte ones.
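A minimal sketch of what 3-byte index entries could look like, assuming element lengths stay below 2^24 (the helper names `write_len3`/`read_len3` are hypothetical, not from the PR):

```rust
// Hypothetical helpers for 3-byte little-endian length entries; witness
// element lengths are well below 2^24 bytes, so three bytes suffice and the
// index space shrinks by 25% compared to 4-byte entries.
fn write_len3(buf: &mut [u8], len: u32) {
    debug_assert!(len < 1 << 24);
    buf[..3].copy_from_slice(&len.to_le_bytes()[..3]);
}

fn read_len3(buf: &[u8]) -> u32 {
    u32::from_le_bytes([buf[0], buf[1], buf[2], 0])
}

fn main() {
    let mut entry = [0u8; 3];
    write_len3(&mut entry, 4_000_000);
    assert_eq!(read_len3(&entry), 4_000_000); // round-trips losslessly
}
```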
```rust
if required_len >= self.content.len() {
    let mut new_len = self.content.len().max(1);
    while new_len <= required_len {
        new_len *= 2;
```
In df64705:

This can overflow on a 16-bit machine. We should clamp `new_len` to `usize::MAX`. (Fine to address in a followup.)
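One way the clamping could look, as a sketch rather than the PR's actual code: keep the same growth policy but use `checked_mul` and saturate at `usize::MAX` instead of wrapping (`grown_len` is a hypothetical free-function version of the loop):

```rust
// Overflow-safe variant of the doubling loop: same growth policy as
// resize_if_needed, but saturating at usize::MAX instead of overflowing
// on targets with a small usize.
fn grown_len(current: usize, required_len: usize) -> usize {
    let mut new_len = current.max(1);
    while new_len <= required_len {
        new_len = match new_len.checked_mul(2) {
            Some(doubled) => doubled,
            None => return usize::MAX, // clamp instead of wrapping around
        };
    }
    new_len
}

fn main() {
    assert_eq!(grown_len(16, 20), 32);
    // Near the top of the address space the doubling saturates.
    assert_eq!(grown_len(usize::MAX / 2 + 1, usize::MAX), usize::MAX);
}
```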
In df64705: I also think that somewhere we should be capping … Again, this can wait for a followup, especially because I'm unsure of the exact number and I think we want to review this separately from the rest of the logic.
Follow-up issue for the allocation improvements: #5258
Add batched allocation for witness decoder. We also apply batched allocation for the index space to avoid allocating the full 16 MB upfront when an attacker claims 4 million witness_elements; instead, we allocate gradually in ~1 MB batches. This approach automatically addresses the concerns raised in rust-bitcoin#5258 and rust-bitcoin#5239 (comment).

Add batched allocation for witness decoder (data area). We allocate incrementally in ~1 MB batches. This approach automatically addresses the concerns raised in rust-bitcoin#5258 and rust-bitcoin#5239 (comment). We also add `reserve_batch` to the Mutants exclusion list: mutation testing in `WitnessDecoder::reserve_batch` changes the buffer allocation logic and causes an infinite loop during decoding.
…nessDecoder

9bf1522 consensus_encoding: implement batched allocation for WitnessDecoder (jrakibi)

Pull request description:

Currently an attacker can force a 32 MB memory allocation by providing only 10 bytes as input (see the test case that proves that in #5239 (comment)). With the current approach we reduce this attack surface to ~17 MB.

### Attack vector (test case inputs):
- `[0xFE, 0x00, 0x09, 0x3D, 0x00]` = 4_000_000 (witness count)
- `[0xFE, 0x00, 0x09, 0x3D, 0x00]` = 4_000_000 (1st element length)

10 bytes of input will trigger the allocation below.

Old approach with `resize_if_needed` (32 MB allocation):
* Index space = 16 MB (4_000_000 witness elements × 4 bytes)
* Initial buffer = 16 MB + 128 bytes
* When the 4 MB element length is decoded, `resize_if_needed` doubles the buffer
* Because `required_len = 16 MB + 128 + 4 MB = 20 MB + 128 bytes`, it doubles from 16 MB to 32 MB
* Total allocation = 32 MB

New approach with `reserve_batch` (~17 MB allocation):
* Index space = 16 MB
* Initial buffer = 16 MB + 128 bytes
* When a 4 MB element length is decoded, `reserve_batch` allocates in 1 MB batches
* Total allocation = 16 MB + 128 bytes + 1 MB = 17_000_128 bytes

This approach automatically addresses the concerns raised in #5258 and #5239 (comment).

ACKs for top commit:
  tcharding: ACK 9bf1522
  apoelstra: ACK 9bf1522; successfully ran local tests

Tree-SHA512: 81ad652f6158d333fc473846e0bb56d0726d4bf1b68a0ba82d4e673915224be306593a685b948d3e71bb114577033cb0d9f935ad69c39ba07b9ffe971a69a9d7
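The batched total stated in the description can be checked with a quick sum, using the constants given there (16 MB index space, 128-byte headroom, ~1 MB batches; `batched_total` is a hypothetical helper, not decoder code):

```rust
// Quick check of the ~17 MB batched-allocation total: index space plus
// headroom plus however many 1 MB batches the decoder has reserved so far.
fn batched_total(index_space: usize, headroom: usize, batches: usize, batch_size: usize) -> usize {
    index_space + headroom + batches * batch_size
}

fn main() {
    // One 1 MB batch is all that gets reserved before decoding notices the
    // claimed data never arrives, so the attack tops out near 17 MB.
    assert_eq!(batched_total(16_000_000, 128, 1, 1_000_000), 17_000_128);
}
```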
When attempting to simplify the `WitnessDecoder` in #5224, we noticed that we were double allocating, which erased the performance gains of the old decoder from #672. This updates `WitnessDecoder` to use a single allocation like the old decoder.

While the intention of this patch was purely performance, three of the new tests fail against the current version of the `WitnessDecoder`: `decode_empty_element`, `decode_multiple_empty_elements`, and `decode_incomplete_witness_count`.

For the first two tests, there is a functional change in how empty witness elements are handled. This patch introduces a short-circuit where the old version might get stuck in a loop forever requesting zero bytes.

The third test, `decode_incomplete_witness_count`, shows that the impl now fails more explicitly if `end` is called before the witness element count is complete, instead of just returning an empty witness.