Deprecate StreamReader#680

Merged
RCasatta merged 4 commits into rust-bitcoin:master from RCasatta:stream_reader
Jan 6, 2022

Conversation

@RCasatta
Collaborator

@RCasatta RCasatta commented Oct 20, 2021

StreamReader performance is extremely poor when the decoded object is "big enough", for example a full Block.

In the common case the buffer is 64KB, so to successfully parse a 1MB block 16 decode attempts are made.
Even if a user increases the buffer size, read is not necessarily going to fill the buffer, as stated in the doc https://doc.rust-lang.org/stable/std/io/trait.Read.html#tymethod.read. In my tests the reads are 64KB even with a 1MB buffer.

I think this is the root cause of the performance issue found in electrs in romanz/electrs#547; they have now decided to decode the TCP stream with their own code in romanz/electrs@cd0531b and romanz/electrs@05e0221.

Using consensus_decode directly seems to make more sense (taking care to use BufRead if necessary), so the StreamReader is deprecated.
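The decode-directly-from-the-stream pattern this PR recommends can be sketched with a toy length-prefixed format standing in for the crate's real consensus decoding (`decode_msg` and the wire format here are illustrative only):

```rust
use std::io::{BufReader, Read};

// Toy stand-in for consensus decoding: a u32 little-endian length prefix
// followed by that many payload bytes. Conceptually `consensus_decode`
// behaves the same way: it reads exactly what it needs and leaves the
// reader positioned at the start of the next object.
fn decode_msg<R: Read>(r: &mut R) -> std::io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_le_bytes(len_buf) as usize;
    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(payload)
}

fn main() -> std::io::Result<()> {
    // Two back-to-back messages in one stream, as on a TCP socket.
    let mut stream = Vec::new();
    stream.extend_from_slice(&3u32.to_le_bytes());
    stream.extend_from_slice(b"abc");
    stream.extend_from_slice(&2u32.to_le_bytes());
    stream.extend_from_slice(b"hi");

    // Wrap the raw reader in a BufReader so small reads don't hit the OS,
    // then decode in a loop; each object is parsed exactly once.
    let mut reader = BufReader::new(&stream[..]);
    assert_eq!(decode_msg(&mut reader)?, b"abc".to_vec());
    assert_eq!(decode_msg(&mut reader)?, b"hi".to_vec());
    Ok(())
}
```

Because the decoder consumes only the bytes it needs, no re-parsing of partial data ever happens, regardless of object size.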

Member

@tcharding tcharding left a comment


For what it's worth, LGTM. Out of interest, did you verify that it fixes the linked electrs issue?

I added a docs suggestion, feel free to use it or ignore it as you see fit.

Thanks.

Member


In case you want to use it, I massaged the docs a bit while understanding them.

    /// Constructs a stream reader for a given input `stream`. Use the optional parameter
    /// `buffer_size` to control the read buffer size (default: 64K). Specify `read_until`
    /// bytes when expecting big objects from this reader. If not specified it defaults to 0,
    /// attempting to parse a `Decodable` on every read. Using the default (0) when parsing a
    /// big object like a full block can result in poor performance due to repeated parsing
    /// attempts on incomplete data.

@dpc
Contributor

dpc commented Oct 21, 2021

I was going to say "looks like the proper solution would be to rewrite deserialization to use streaming (std::io::Read)". But then I looked at the code and it looks like it is already written to use Read: https://docs.rs/bitcoin/0.27.1/bitcoin/consensus/encode/trait.Decodable.html#tymethod.consensus_decode

So why is StreamReader even using deserialize_partial instead of the Decodable::consensus_decode(&mut reader) straight inside it?

let rv = Decodable::consensus_decode(&mut decoder)?;

Seems to me that making StreamReader just a wrapper around R: Read, reading directly from the socket/whatever as needed (possibly wrapped in a BufReader), would lead to exactly one parsing pass with no overhead; the whole buffering into one array before parsing is just a waste.

@RCasatta
Collaborator Author

For what it's worth, LGTM. Out of interest, did you verify that it fixes the linked electrs issue?

No, I didn't test electrs, sorry. But I noticed the issue while working on blocks_iterator and did a draft implementation there in 4dfca188f0ef7bd5d81f73664622244d22d2e2b5; the time to parse the testnet blockchain was reduced by 50%.

So why is StreamReader even using deserialize_partial instead of the Decodable::consensus_decode(&mut reader) straight inside it?

The reason for this implementation is that if there is more than one Decodable in the buffer, we know where to start for the following one.

Seems to me that making StreamReader just a wrapper around R: Read, reading directly from the socket/whatever as needed (possibly wrapped in a BufReader), would lead to exactly one parsing pass with no overhead; the whole buffering into one array before parsing is just a waste.

Wrapping in a BufReader is mandatory to achieve similar performance, and I think it may be risky to just mention it in the docs and let the caller wrap it (from what I can see electrs is not using it after removing the StreamReader; I just grepped, didn't examine in depth).
Note that this PR is not more wasteful than what we have, but I'll try a version that wraps the BufReader internally and doesn't use deserialize_partial.

@RCasatta
Collaborator Author

I did a branch https://github.com/rust-bitcoin/rust-bitcoin/compare/master...RCasatta:stream_reader_2?expand=1 using a BufReader and consensus_decode directly, which basically makes the StreamReader useless, so if we go this way I think we should deprecate it and remove it in a following version...

This other branch uses less memory than this PR if read_until is set to 1MB, for example, but at the moment it doesn't handle the WouldBlock error. I don't know if that happens in real life (and, if it does and we want the StreamReader to handle it, giving it a reason to exist, whether to spinlock on that error).

I am also not sure whether some errors that were handled by the previous implementation are unhandled in stream_reader_2. @dr-orlovsky is the original author, maybe he can comment too.

@dpc
Contributor

dpc commented Oct 21, 2021

The reason for this implementation is that if there is more than one Decodable in the buffer, we know where to start for the following one.

I guess the expectation is that consensus_decode(&mut reader) would leave the reader pointing exactly at the next decodable data.

but at the moment it doesn't handle the WouldBlock error

AFAIR WouldBlock only happens with non-blocking IO (fds). Dealing with non-blocking IO makes more sense with async code, so until then I think it's fine not to deal with it; users that want to work with non-blocking IO can handle it themselves, either by reading into a larger buffer directly, or with a struct WaitAndRetryOnWouldBlock<R: Read> wrapper. But then again... if one is going to block, why bother with non-blocking IO?

I did a branch https://github.com/rust-bitcoin/rust-bitcoin/compare/master...RCasatta:stream_reader_2?expand=1 using a BufReader

Did you have any chance to measure how it performs?
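The WaitAndRetryOnWouldBlock wrapper mentioned above could be sketched like this (the type, its retry policy, and the Flaky mock are purely illustrative, not anything in rust-bitcoin or std):

```rust
use std::io::{Error, ErrorKind, Read};

// Hypothetical wrapper in the spirit of the suggestion: retry reads that
// fail with WouldBlock so blocking-style decoding code can run over a
// non-blocking source.
struct WaitAndRetryOnWouldBlock<R: Read>(R);

impl<R: Read> Read for WaitAndRetryOnWouldBlock<R> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        loop {
            match self.0.read(buf) {
                Err(e) if e.kind() == ErrorKind::WouldBlock => {
                    // Real code would park or poll here instead of spinning.
                    std::thread::yield_now();
                }
                other => return other,
            }
        }
    }
}

// Mock source that reports WouldBlock once before yielding data.
struct Flaky {
    fired: bool,
}

impl Read for Flaky {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        if !self.fired {
            self.fired = true;
            return Err(Error::new(ErrorKind::WouldBlock, "not ready"));
        }
        buf[..2].copy_from_slice(b"ok");
        Ok(2)
    }
}

fn main() {
    let mut r = WaitAndRetryOnWouldBlock(Flaky { fired: false });
    let mut buf = [0u8; 2];
    r.read(&mut buf).unwrap();
    assert_eq!(&buf, b"ok");
}
```

As noted, spinning on a would-block error defeats the purpose of non-blocking IO; this only shows where such a policy would plug in.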

@Kixunil
Collaborator

Kixunil commented Oct 22, 2021

Why not just replace R: Read with R: BufRead? One can't forget it, and if a byte slice is already available there's no need to force buffering.

Also it looks like most of the time in electrs is spent hashing, so while these optimizations may be useful, I think it'd be better to focus on that first.
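The BufRead suggestion in code form: a sketch with a made-up helper `first_byte` (the point is only the trait bound — a byte slice already satisfies BufRead, while a raw reader must be wrapped explicitly and so can't be forgotten):

```rust
use std::io::{BufRead, BufReader, Read};

// Illustrative function bounded on BufRead instead of Read. Callers with
// in-memory data pay no buffering cost; callers with raw streams are
// forced by the type system to add a BufReader.
fn first_byte<R: BufRead>(mut r: R) -> std::io::Result<Option<u8>> {
    let buf = r.fill_buf()?;
    if buf.is_empty() {
        return Ok(None);
    }
    let b = buf[0];
    r.consume(1);
    Ok(Some(b))
}

fn main() -> std::io::Result<()> {
    // A byte slice is already BufRead: no BufReader needed.
    assert_eq!(first_byte(&b"abc"[..])?, Some(b'a'));
    // A plain Read (here simulated by a trait object) must be wrapped.
    let raw: &mut dyn Read = &mut &b"xyz"[..];
    assert_eq!(first_byte(BufReader::new(raw))?, Some(b'x'));
    Ok(())
}
```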

@RCasatta
Collaborator Author

Why not just replace R: Read with R: BufRead? One can't forget it, and if a byte slice is already available there's no need to force buffering.

Makes sense, I didn't know there was a trait for that, very nice.

Also it looks like most of the time in electrs is spent hashing, so while these optimizations may be useful, I think it'd be better to focus on that first.

I brought up electrs as one example, but other projects, like bdk, are wasting resources for nothing by using the StreamReader.

On a side note, in a separate project I did a bench on hashing a few bytes, and the hardware-accelerated sha2 is not doing much better in comparison to bitcoin_hashes:

test bench::bench_bitcoin_hashes_sha          ... bench:         380 ns/iter (+/- 162)
test bench::bench_sha2_crate                  ... bench:         324 ns/iter (+/- 60)

@Kixunil
Collaborator

Kixunil commented Oct 22, 2021

Thanks for that benchmark! So it seems it'd be better to avoid hashing instead of trying to optimize it. (It would still be nice to be able to use the HW-accelerated version, but it's not a priority.)

@RCasatta
Collaborator Author

RCasatta commented Oct 22, 2021

Did you have any chance to measure how it performs?

I added a bench with a 1MB block in 76f9fd2239919ba1c602511dd34d2eee61df6876

current

test blockdata::block::benches::bench_stream_reader                     ... bench:  33,482,849 ns/iter (+/- 1,829,374)

with 9dae6fa8e7caf5bbd4eb21682fd078092c2faffb

test blockdata::block::benches::bench_stream_reader                     ... bench:   3,654,142 ns/iter (+/- 186,499)

@dr-orlovsky
Collaborator

Sorry for the late response; these days I am focusing on completing Taproot support in rust-bitcoin and haven't had time for a careful review here. StreamReader was my first contribution to rust-bitcoin nearly three years ago, so it may not be an optimal one. Nevertheless, I recall there were some reasons to create this struct, like the fact that consensus_deserialize was failing when not consuming the whole buffer, which did not work with a stream of TCP messages.

romanz
romanz previously approved these changes Nov 13, 2021
Kixunil
Kixunil previously approved these changes Dec 2, 2021
Collaborator

@Kixunil Kixunil left a comment


ACK 9dae6fa8e7caf5bbd4eb21682fd078092c2faffb

@dr-orlovsky
Collaborator

@RCasatta sorry, I tried to review the PR and just don't get how it will deal with the situation where the TCP stream has more than one message. I see that the test case for that is removed, and it was in fact the main reason to introduce StreamReader (since otherwise we can't talk to remote Bitcoin Core nodes in practice).

@RCasatta
Collaborator Author

@RCasatta sorry, I tried to review the PR and just don't get how it will deal with the situation where the TCP stream has more than one message

By calling consensus_decode instead of deserialize_partial you don't fail if there is something more in the buffer.

I see that the test case for that is removed,

I removed that test because now you can't access the unparsed buffer. Do you want me to re-add something like this:

    #[test]
    fn parse_multipartmsg_test() {
        let mut multi = MSG_ALERT.to_vec();
        multi.extend(&MSG_ALERT[..]);
        let mut reader = StreamReader::new(&multi[..], None);
        let message: Result<RawNetworkMessage, _> = reader.read_next();
        assert!(message.is_ok());
        check_alert_msg(&message.unwrap());
        let message: Result<RawNetworkMessage, _> = reader.read_next();
        assert!(message.is_ok());
        check_alert_msg(&message.unwrap());
    }

@dr-orlovsky
Collaborator

Oh, I see. Yeah, that test would be great to have!

@RCasatta RCasatta dismissed stale reviews from Kixunil and romanz via 017d239 December 12, 2021 16:35
@RCasatta
Collaborator Author

test parse_multipartmsg_test re-added

rebased

@apoelstra
Member

Sorry for the late response; these days I am focusing on completing Taproot support in rust-bitcoin and haven't had time for a careful review here. StreamReader was my first contribution to rust-bitcoin nearly three years ago, so it may not be an optimal one. Nevertheless, I recall there were some reasons to create this struct, like the fact that consensus_deserialize was failing when not consuming the whole buffer, which did not work with a stream of TCP messages.

There is no such thing as consensus_deserialize. If you mean consensus_decode, it does not error when it doesn't consume the whole buffer. If you mean consensus::deserialize, it errors because it is a convenience method specifically for use with vectors; if you have a general reader you should just use consensus_decode.

I had no idea about this StreamReader stuff. I have never seen #229 or #231, which git blame attributes it to, and this is the first I'm hearing about it. Unless somebody has a clear reason why it exists, and why the existing deserialization API (which is generic over any Read) is insufficient, we should deprecate it.

Why not just replace R: Read with R: BufRead? One can't forget it and if a byte slice is already available no need to force buffering.

Why would we reduce flexibility in this way? We don't need BufRead AFAIK, so we should not require it.

If people are having performance issues because they are using raw TcpStreams or whatever without wrapping them in a buffered reader, they should perhaps read the docs for TcpStream, which explicitly recommend its use.
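A toy contrast of the distinction drawn above, with `decode_next` and `deserialize_exact` as stand-ins for consensus_decode and consensus::deserialize respectively (names and the u32 format are illustrative, not the crate's API):

```rust
use std::io::{Cursor, Read};

// `decode_next` reads one u32 and leaves the reader at the next object,
// so it composes over a stream. `deserialize_exact` additionally rejects
// trailing bytes, which only makes sense for a self-contained buffer.
fn decode_next<R: Read>(r: &mut R) -> std::io::Result<u32> {
    let mut b = [0u8; 4];
    r.read_exact(&mut b)?;
    Ok(u32::from_le_bytes(b))
}

fn deserialize_exact(data: &[u8]) -> Result<u32, String> {
    let mut cur = Cursor::new(data);
    let v = decode_next(&mut cur).map_err(|e| e.to_string())?;
    if (cur.position() as usize) != data.len() {
        return Err("trailing bytes".to_string());
    }
    Ok(v)
}

fn main() {
    let data = [1u8, 0, 0, 0, 2, 0, 0, 0]; // two u32s back to back
    // Streaming: fine, the cursor simply advances past each object.
    let mut cur = Cursor::new(&data[..]);
    assert_eq!(decode_next(&mut cur).unwrap(), 1);
    assert_eq!(decode_next(&mut cur).unwrap(), 2);
    // Whole-buffer: errors because more than one object is present.
    assert!(deserialize_exact(&data).is_err());
}
```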

StreamReader before this commit repeatedly tries to parse big objects like
blocks at every read, causing useless overhead.
consensus_decode deals with partial data by simply blocking.

After this change it doesn't look like what remains of the StreamReader really
provides value, so it is deprecated.
@RCasatta
Collaborator Author

I changed BufRead to Read

@RCasatta RCasatta changed the title Improve performance in the StreamReader Deprecate StreamReader Jan 3, 2022
/// Struct used to configure stream reader function
pub struct StreamReader<R: Read> {
    /// Stream to read from
    pub stream: R,
Collaborator


Shouldn't we use BufReader to avoid performance issues in old code?

Collaborator Author


Andrew thinks (#680 (comment)) it's the caller's responsibility to wrap if needed.

I am not strongly for or against using it, but for sure the main issue this PR is solving, and which is our responsibility, is not attempting to decode the same 1MB block 16 times.

Collaborator


I mean StreamReader was originally intended to be buffered, so by not using BufReader internally we create a (I think significant) performance regression for those who update the library without changing the deprecated code. It might be less bad to just remove it if we don't use BufReader internally.

Collaborator Author


I mean StreamReader was originally intended to be buffered

ah that's true, let's see what @apoelstra thinks

Member


I mean, if we deprecate/remove it we might as well also change it to use BufReader internally :P

apoelstra
apoelstra previously approved these changes Jan 3, 2022
Member

@apoelstra apoelstra left a comment


ACK 5dfb93d

But if we want to change this to use a BufReader internally that's fine by me too. It would probably simulate the old behaviour more accurately.

@RCasatta
Collaborator Author

RCasatta commented Jan 4, 2022

But if we want to change this to use a BufReader internally that's fine by me too. It would probably simulate the old behaviour more accurately.

I added BufReader internally in 9189539

Kixunil
Kixunil previously approved these changes Jan 4, 2022
Collaborator

@Kixunil Kixunil left a comment


ACK 9189539

/// optional parameter `buffer_size` determining reading buffer size
pub fn new(stream: R, buffer_size: Option<usize>) -> StreamReader<R> {
/// Constructs new stream reader for a given input stream `stream`
#[deprecated(since="0.28.0", note="wrap you stream into a buffered reader if necessary and use consensus_encode directly")]
Collaborator


your, please leave this for followup PR but

/// Reads stream and parses next message from its current input,
/// also taking into account previously unparsed partial message (if there was such).
/// Reads stream and parses next message from its current input
#[deprecated(since="0.28.0", note="wrap you stream into a buffered reader if necessary and use consensus_encode directly")]
Collaborator


your

apoelstra
apoelstra previously approved these changes Jan 4, 2022
Member

@apoelstra apoelstra left a comment


ACK 9189539

I don't mind re-acking if you want to fix the comment typo :)

@RCasatta RCasatta dismissed stale reviews from apoelstra and Kixunil via e860333 January 5, 2022 08:40
Collaborator

@Kixunil Kixunil left a comment


ACK e860333

@elichai
Member

elichai commented Jan 5, 2022

When using consensus_decode, is it possible to continue parsing if you receive an unexpected EOF (this can happen with system interrupts) in the middle of parsing? Or would you lose the data that was partially parsed?

@apoelstra
Member

@elichai you lose the data that was partially parsed.

Is there an obvious way to improve on this? Does serde handle this somehow? I suppose the user could wrap their Read object in something that intercepts certain kinds of errors and retries rather than bubbling them up?

Member

@apoelstra apoelstra left a comment


ACK e860333

@Kixunil
Collaborator

Kixunil commented Jan 5, 2022

@elichai system interrupts (I believe you meant signals) can never cause unexpected EOF.

I'd love to eventually support streaming parsing but that's a different topic.

@elichai
Member

elichai commented Jan 6, 2022

@elichai system interrupts (I believe you meant signals) can never cause unexpected EOF.

I'd love to eventually support streaming parsing but that's a different topic.

You're right, I was thinking about EINTR (ErrorKind::Interrupted), which this doesn't currently solve.

I feel that I don't have enough knowledge about how to read from sockets robustly so ignore my comment :)

@Kixunil
Collaborator

Kixunil commented Jan 6, 2022

EINTR is usually handled by std (I believe BufReader does it too). EWOULDBLOCK is more serious, but out of scope for this library, at least today.
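For the record, "handled by std" here means Read::read_exact already retries ErrorKind::Interrupted internally; a hand-rolled equivalent (`read_full` is an illustrative name, not std or rust-bitcoin API) looks like:

```rust
use std::io::{ErrorKind, Read};

// Illustrative re-implementation of the retry loop std's `read_exact`
// performs: an Interrupted (EINTR) error is transparently retried rather
// than surfaced to the caller.
fn read_full<R: Read>(r: &mut R, buf: &mut [u8]) -> std::io::Result<()> {
    let mut filled = 0;
    while filled < buf.len() {
        match r.read(&mut buf[filled..]) {
            Ok(0) => return Err(ErrorKind::UnexpectedEof.into()),
            Ok(n) => filled += n,
            // EINTR: the syscall was interrupted by a signal, just retry.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

fn main() {
    let mut buf = [0u8; 3];
    read_full(&mut &b"abc"[..], &mut buf).unwrap();
    assert_eq!(&buf, b"abc");
}
```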
