Take Writer/Reader by &mut in consensus en/decoding #1035

apoelstra merged 3 commits into rust-bitcoin:master from dpc:issue-1020

In debug build ( old: new:
apoelstra
left a comment
ACK 0833cc037a31975ba83f107b9f77e0d48571fab4
tcharding
left a comment
Sorry to be a bore, but can you fix patch 3 please? The consensus_decode_global changes should be in patch 1, right? And the other change:
a) Could be improved (removing the multiple muts)
b) Could be in a patch of its own with a commit log
Thanks
src/internal_macros.rs (Outdated)
Just out of interest, is there any technical reason for using r.by_ref() vs &mut r? What made you choose to do so?
I kind of doubt there is, though I could imagine some readers theoretically implementing by_ref() in a "smarter" way. Maybe. Possibly. I kind of don't know what I'm talking about, maybe.
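For reference, `Read::by_ref` in std is just a provided method that returns `&mut self`, so the two spellings borrow the reader in exactly the same way. A minimal sketch (the `read_two_pairs` helper is hypothetical, for illustration only):

```rust
use std::io::Read;

// `r.by_ref()` and `&mut r` both produce a `&mut R`, which itself
// implements `Read`, so the reader is borrowed rather than consumed.
fn read_two_pairs(mut r: impl Read) -> ([u8; 2], [u8; 2]) {
    let mut a = [0u8; 2];
    let mut b = [0u8; 2];
    r.by_ref().read_exact(&mut a).unwrap(); // borrow via by_ref()
    (&mut r).read_exact(&mut b).unwrap(); // borrow via &mut -- equivalent
    (a, b)
}

fn main() {
    // `&[u8]` implements `Read`, consuming bytes from the front.
    let (a, b) = read_two_pairs(&[1u8, 2, 3, 4][..]);
    assert_eq!((a, b), ([1, 2], [3, 4]));
}
```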
src/network/address.rs (Outdated)
Damn! I thought I got rid of this :( Added to my todo list.
The first commit now fails to compile with

Done.
tcharding
left a comment
ACK 60a6c8337e0c57fa7eed0b92d61d888a9fbaacc1
apoelstra
left a comment
ACK 60a6c8337e0c57fa7eed0b92d61d888a9fbaacc1
@TheBlueMatt can you take a quick look at this and concept ACK it? It's a fairly aggressive API change (though not to APIs we expect users to be touching..) but it's got solid benchmarks behind it.

Doesn't hurt to get a second opinion: https://users.rust-lang.org/t/taking-r-mut-r-vs-mut-r-r-where-r-io-read/76780

Only one response, but positive.
Fix #1020 (see more relevant discussion there)

This definitely reduces the amount of generics the compiler has to generate, by avoiding generating the same functions for `R`, `&mut R`, `&mut &mut R` and so on.

old:

```
> ls -al target/release/deps/bitcoin-07a9dabf1f3e0266
-rwxrwxr-x 1 dpc dpc 9947832 Jun 2 22:42 target/release/deps/bitcoin-07a9dabf1f3e0266
> strip target/release/deps/bitcoin-07a9dabf1f3e0266
> ls -al target/release/deps/bitcoin-07a9dabf1f3e0266
-rwxrwxr-x 1 dpc dpc 4463024 Jun 2 22:46 target/release/deps/bitcoin-07a9dabf1f3e0266
```

new:

```
> ls -al target/release/deps/bitcoin-07a9dabf1f3e0266
-rwxrwxr-x 1 dpc dpc 9866800 Jun 2 22:44 target/release/deps/bitcoin-07a9dabf1f3e0266
> strip target/release/deps/bitcoin-07a9dabf1f3e0266
> ls -al target/release/deps/bitcoin-07a9dabf1f3e0266
-rwxrwxr-x 1 dpc dpc 4393392 Jun 2 22:45 target/release/deps/bitcoin-07a9dabf1f3e0266
```

In the unit-test binary itself, it saves ~100KB of data.

I did not expect much performance gain, but it turns out I was wrong(*):

old:

```
test blockdata::block::benches::bench_block_deserialize                 ... bench:   1,072,710 ns/iter (+/- 21,871)
test blockdata::block::benches::bench_block_serialize                   ... bench:     191,223 ns/iter (+/- 5,833)
test blockdata::block::benches::bench_block_serialize_logic             ... bench:      37,543 ns/iter (+/- 732)
test blockdata::block::benches::bench_stream_reader                     ... bench:   1,872,455 ns/iter (+/- 149,519)
test blockdata::transaction::benches::bench_transaction_deserialize     ... bench:         136 ns/iter (+/- 3)
test blockdata::transaction::benches::bench_transaction_serialize       ... bench:          51 ns/iter (+/- 8)
test blockdata::transaction::benches::bench_transaction_serialize_logic ... bench:           5 ns/iter (+/- 0)
test blockdata::transaction::benches::bench_transaction_size            ... bench:           3 ns/iter (+/- 0)
```

new:

```
test blockdata::block::benches::bench_block_deserialize                 ... bench:   1,028,574 ns/iter (+/- 10,910)
test blockdata::block::benches::bench_block_serialize                   ... bench:     162,143 ns/iter (+/- 3,363)
test blockdata::block::benches::bench_block_serialize_logic             ... bench:      30,725 ns/iter (+/- 695)
test blockdata::block::benches::bench_stream_reader                     ... bench:   1,437,071 ns/iter (+/- 53,694)
test blockdata::transaction::benches::bench_transaction_deserialize     ... bench:          92 ns/iter (+/- 2)
test blockdata::transaction::benches::bench_transaction_serialize       ... bench:          17 ns/iter (+/- 0)
test blockdata::transaction::benches::bench_transaction_serialize_logic ... bench:           5 ns/iter (+/- 0)
test blockdata::transaction::benches::bench_transaction_size            ... bench:           4 ns/iter (+/- 0)
```

(*) I'm benchmarking on a noisy laptop. Take this with a grain of salt. But I think at least it doesn't make anything slower.

While doing all this manual labor that will probably generate conflicts, I took the liberty of changing generic type names and variable names to `r` and `R` (reader) and `w` and `W` (writer).
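The shape of the change can be sketched as follows. This is a simplified stand-in, not the actual rust-bitcoin trait (`Encodable` here is stripped down for illustration): taking the writer as `&mut W` means the compiler monomorphizes one copy per concrete writer type, instead of extra copies for `&mut W`, `&mut &mut W`, and so on.

```rust
use std::io;

// Simplified sketch of the post-change signature: the writer is borrowed
// as `&mut W`, with `W: io::Write + ?Sized` so trait objects also work.
pub trait Encodable {
    fn consensus_encode<W: io::Write + ?Sized>(&self, w: &mut W) -> Result<usize, io::Error>;
}

struct Byte(u8);

impl Encodable for Byte {
    fn consensus_encode<W: io::Write + ?Sized>(&self, w: &mut W) -> Result<usize, io::Error> {
        w.write_all(&[self.0])?;
        Ok(1) // number of bytes written
    }
}

fn main() {
    let mut out = Vec::new();
    // The caller passes `&mut out`; nested calls can keep reborrowing the
    // same `&mut W` without producing new monomorphizations.
    let n = Byte(0xAB).consensus_encode(&mut out).unwrap();
    assert_eq!(n, 1);
    assert_eq!(out, vec![0xAB]);
}
```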
Rebased, no changes other than a tiny merge conflict.

I'd still like a third ACK here because it's a fairly big change to the encoding API

@Kixunil What do you think about it?
I tested downstream RCasatta/blocks_iterator#56. I don't fully grasp the change I had to make to this line:
where To solve: To fix the error I had to pass the

@RCasatta the issue is that Having said this, I'm a little surprised that this code ever worked ... does
https://doc.rust-lang.org/std/io/trait.Write.html#impl-Write-6
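The linked impl is `impl Write for &mut [u8]` in std: writing copies into the front of the slice and shrinks the slice to the unwritten tail. A small illustration (the `write_front` helper is hypothetical):

```rust
use std::io::Write;

// Write into the front of `buf` via the std `Write` impl for `&mut [u8]`.
// Returns how many bytes of the slice remain unwritten.
fn write_front(buf: &mut [u8], data: &[u8]) -> usize {
    let mut slice: &mut [u8] = buf;
    // The impl advances `slice` past the bytes it wrote.
    slice.write_all(data).unwrap();
    slice.len()
}

fn main() {
    let mut buf = [0u8; 4];
    let remaining = write_front(&mut buf, &[0xDE, 0xAD]);
    assert_eq!(remaining, 2);
    assert_eq!(buf, [0xDE, 0xAD, 0, 0]);
}
```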
On
Should I change to
@dpc I'm not sure.... I don't know why you'd ever not want Anyway it's a yes vote from me.
AFAIR, the only downside I see (other than me having to go and do more work) is the users having to do more work by adding this
ACK 1fea098 tested in downstream lib, space saving in compiled code confirmed. Thanks a lot for adding The only thing to mention is the slightly surprising syntax to write in a mutable slice, requiring now (IIUC) a double
AFAIU, ready to merge. @apoelstra
Merged commits:

1fea098 Support unsized `R` and `W` in consensus encode/decode (Dawid Ciężarkiewicz)
a24a3b0 Forward `consensus_decode` to `consensus_decode_from_finite_reader` (Dawid Ciężarkiewicz)
9c754ca Take Writer/Reader by `&mut` in consensus en/decoding (Dawid Ciężarkiewicz)

ACKs for top commit:
RCasatta: ACK 1fea098 tested in downstream lib, space saving in compiled code confirmed
apoelstra: ACK 1fea098

Tree-SHA512: bc11994791dc97cc468dc9d411b9abf52ad475f23adf5c43d563f323bae0da180c8f57f2f17c1bb7b9bdcf523584b0943763742b81362880206779872ad7489f