Skip to content

Make Read and BufRead dyn compatible#3917

Closed
tcharding wants to merge 1 commit intorust-bitcoin:masterfrom
tcharding:01-17-io-object-safety
Closed

Make Read and BufRead dyn compatible#3917
tcharding wants to merge 1 commit intorust-bitcoin:masterfrom
tcharding:01-17-io-object-safety

Conversation

@tcharding
Copy link
Copy Markdown
Member

In order to do this we need to remove the ?Sized trait bound from consensus_encode - done in patch 1. Please note I do not know why we added it in the first place (added in #1035).

Patch 2 fixes Take and adds a unit test to verify all io traits are dyn compatible (object safe).

Close: #3833

@github-actions github-actions bot added C-bitcoin PRs modifying the bitcoin crate C-io PRs modifying the io crate test labels Jan 17, 2025
@tcharding
Copy link
Copy Markdown
Member Author

tcharding commented Jan 17, 2025

cc @dpc as the author of #1035, do you happen to recall why the ?Sized bound was used? I see that you guys don't use it in fedimint.

@apoelstra
Copy link
Copy Markdown
Member

Might've been a MSRV thing.

@apoelstra
Copy link
Copy Markdown
Member

Oh, wait, this is about adding a ?Sized bound, not a Sized bound. The reason is that it makes the API more generally useful. We should be hesitant to remove this.

Doesn't removing this bound break reading from &[u8] or writing to &mut [u8]? If not, why not?

@tcharding
Copy link
Copy Markdown
Member Author

Doesn't removing this bound break reading from &[u8] or writing to &mut [u8]? If not, why not?

hmmm, bother. I had wondered the same thing but was just lazy and cargo cult programmed based or Rust stdlib types. I'll put some more thought into this.

@tcharding tcharding marked this pull request as draft January 18, 2025 23:39
@tcharding
Copy link
Copy Markdown
Member Author

I still don't know why &[u8] can be passed into a function that is trait bound on Sized - but it looks like it can.

use std::io::Read;

fn main() -> anyhow::Result<()> {
    let data = [0x1_u8; 8];

    let mut reader = &data[..4];

    // Use the reader without mutating it, second read reads same data again.
    use_sized_reader(reader)?;
    use_sized_reader(reader)?;

    // Use the reader while mutating it, second read reads no data.
    use_sized_reader(&mut reader)?;
    use_sized_reader(&mut reader)?;

    let mut reader = &data[..4];

    // Use the reader without mutating it, second read reads same data again.
    // This does not build.
    // use_unsized_reader(reader)?;
    // use_unsized_reader(reader)?;

    // Use the reader while mutating it, second read reads no data.
    use_unsized_reader(&mut reader)?;
    use_unsized_reader(&mut reader)?;

    Ok(())
}

// From Rust docs: https://rust-lang.github.io/api-guidelines/interoperability.html#c-rw-value
//
// > That means any function that accepts R: Read or W: Write generic parameters by value can be called with a mut reference if necessary.
fn use_sized_reader<R: Read>(mut r: R) -> anyhow::Result<()> {
    let mut buf = [0_u8; 16];

    r.read(&mut buf)?;
    println!("{:?}", buf);
    Ok(())
}

fn use_unsized_reader<R: Read + ?Sized>(r: &mut R) -> anyhow::Result<()> {
    let mut buf = [0_u8; 16];

    r.read(&mut buf)?;
    println!("{:?}", buf);
    Ok(())
}

Outputs

[1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

This also begs the question is the Sized function strictly better because it is more general?

@apoelstra
Copy link
Copy Markdown
Member

&[u8] is sized. Only [u8] is unsized.

The problem occurs because these methods take &mut R, so if you pass &mut [u8] directly then compilation will fail. Passing &mut &[u8], as you're doing, is fine.

@tcharding
Copy link
Copy Markdown
Member Author

tcharding commented Jan 19, 2025

So we have to make a trade off then. Either we cannot make our Read trait object safe (because we cannot add where Self: Sized (tracked in #3833) or we cannot consensus decode from [u8] because we call take().

I clearly don't understand all the subtlety between &[u8] and [u8] but I would of thought we want to mimic std::io unless we have a pretty strong reason not to.

@tcharding
Copy link
Copy Markdown
Member Author

I'm not understanding why one would need to be able to do the first two calls here when we can do either of the second two? I.e. when is the more general ?Sized useful?

    // First two don't build.
    //    use_sized_reader(data[..4])?;
    //    use_sized_reader(&mut data[..4])?;
    use_sized_reader(&mut &data[..4])?;
    use_sized_reader(&data[..4])?;

@tcharding
Copy link
Copy Markdown
Member Author

tcharding commented Jan 19, 2025

EDIT: I deleted this post

FYI std::io::take is an explicitly non-dispatchable function. That is why it has Self: Sized and Take is still dyn-compatible.

@apoelstra
Copy link
Copy Markdown
Member

So we have to make a trade off then. Either we cannot make our Read trait object safe (because we cannot add where Self: Sized (tracked in #3833) or we cannot consensus decode from [u8] because we call take().

We may be able to put the bound only on the take method or something, which would make the trait object safe but the method non-callable from trait objects. Unsure.

I clearly don't understand all the subtlety between &[u8] and [u8] but I would of thought we want to mimic std::io unless we have a pretty strong reason not to.

std::io does not have any Encodable or Decodeable traits. It may be helpful for you to split this into two PRs -- one that modifies the io traits to be similar to std (BTW, this is not mentioned anywhere in the PR description or commit message; these say you want to "make Read and BufRead dyn-compatible, don't say why you want this, don't mention the costs, and also the PR changes non-io traits) and one that makes changes to the consensus traits.

@tcharding
Copy link
Copy Markdown
Member Author

Hah! commit log fail - thanks for pulling me up.

It may be helpful for you to split this into two PRs -- one that modifies the io traits to be similar to std

We can't do that without changing bitcoin. We can however change bitcoin differently to this current PR if we want. Its the call to take() that prevents use changing the Read trait (take method). I've thought a bit and I think the usage is correct - we explicitly want DOS protection so we want to use take(). Hence why we have to resolve the ?Sized thing.

What we could do is have a PR that discusses and tackles the ?Sized vs Sized thing on its own first though.

@apoelstra
Copy link
Copy Markdown
Member

We should PR to change the bitcoin methods first, and provide some justification there that we're not breaking stuff. (It sounds like the extra ?Sized bound would not even work with std::io and only provides some extra benefit for direct users of bitcoin::io who have created unsized readers?)

Copy link
Copy Markdown
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You must be confused somehow. Only adding the bound to take should be required. The code that needs to call take on trait objects can just use .by_ref().take(). But LMK if I'm missing something.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NACK, this is a needless breaking change.

io/src/lib.rs Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to call take() on &mut R, so this is not needed.

Copy link
Copy Markdown
Member Author

@tcharding tcharding Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had two goes and couldn't work it out.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's because of the missing impl. We need to have type ByRef<'a, R> = &'a mut R in some future release. Anyway, ByRef should work for now.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whaaat? You reintroduced these... :(

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You weren't around to write the tooling required to run cargo public-api in CI so we could use the output without committing the files. And I couldn't be bothered to ...

@apoelstra
Copy link
Copy Markdown
Member

can just use .by_ref().take().

Ah, derp, I should've thought of this.

@tcharding
Copy link
Copy Markdown
Member Author

lol, I literally said above "why would anyone need Sized when they can use a reference" and then did not realise we could use as_ref() before calling take(). Epic comeback @Kixunil

Currently our `Read` and `BufRead` traits are not dyn compatible (object
safe) because of `Take` having an unsized trait bound. We just removed
the requirement for this by removing the bound from `consensus_encode`.

Remove the `?Sized` trait bound from `Take` and add a test that shows
that `Read` and `BufRead` are now dyn compatible.

Fix: rust-bitcoin#3833
@tcharding tcharding force-pushed the 01-17-io-object-safety branch from e587be8 to c588053 Compare January 22, 2025 04:33
@github-actions github-actions bot removed the C-bitcoin PRs modifying the bitcoin crate label Jan 22, 2025
@tcharding
Copy link
Copy Markdown
Member Author

tcharding commented Jan 22, 2025

We don't have by_ref but std::io::Read has it but its is Sized as well

ref: https://doc.rust-lang.org/std/io/trait.Read.html

@tcharding tcharding force-pushed the 01-17-io-object-safety branch from c588053 to 940d947 Compare January 22, 2025 04:37
@Kixunil
Copy link
Copy Markdown
Collaborator

Kixunil commented Jan 22, 2025

We don't have by_ref

Oh, I thought my PR got merged. I strongly suspect it also addressed the same issue you're trying to address here as I remember doing stuff around Sized.

but its is Sized as well

Whaaat?! Looks like an oversight in std but it doesn't matter there since you can just use the reference there.

@apoelstra
Copy link
Copy Markdown
Member

apoelstra commented Jan 22, 2025

I'm ok with deviating from std to drop the Sized requirement on by_ref (unless we really do need it for reasons I'm not seeing). We should document it though.

Edit actually, is the Sized bound needed so that you can call by_ref on a Box<dyn std::io::Read>? We may want to keep that property. We can add an alternate method that doesn't have the sized ref maybe. Or maybe we can just use &/&mut rather than calling the method?

@tcharding
Copy link
Copy Markdown
Member Author

tcharding commented Jan 23, 2025

I'm struggling to see why we should provide an io::Read trait that is not dyn compatible which implies its different to std::io::Read. Am I missing something?

@tcharding
Copy link
Copy Markdown
Member Author

No need to answer me if #3945 is possible.

@Kixunil
Copy link
Copy Markdown
Collaborator

Kixunil commented Jan 23, 2025

@apoelstra by_ref is literally no-op, so there's no reason why it would need Sized. I'm pretty sure that if it wasn't a breaking change they would've removed the restriction already. So we should just remove it. My PR had a newtype instead of ByRef to make transition between versions easier with a clear plan to cleanly remove it.

I'm struggling to see why we should provide an io::Read trait that is not dyn compatible

We shouldn't and nobody is proposing that.

@apoelstra
Copy link
Copy Markdown
Member

I'm struggling to see why we should provide an io::Read trait that is not dyn compatible which implies its different to std::io::Read. Am I missing something?

We're not -- we're proposing that the Take impl retain a Sized bound, like std, and enabling users to get that bound even for unsized types by calling by_ref, unlike std.

Then our consensus encoding code won't need a sized bound.

@tcharding
Copy link
Copy Markdown
Member Author

We're not -- we're proposing that the Take impl retain a Sized bound,

And we construct in manually instead of using Read::take in bitcoin? Its the take() function being called in bitcoin that is causing the scope leakage in this PR (AFAIU).

@apoelstra
Copy link
Copy Markdown
Member

We have a couple choices:

  • If we are using bitcoin::io then we can just call w.by_ref().take()
  • We can "semi-manually" do (&mut w).take() which I believe will work even with std::io
  • Other, even more manual things, which I don't think are necessary to consider.

@Kixunil
Copy link
Copy Markdown
Collaborator

Kixunil commented Jan 24, 2025

I think this order makes most sense:

  • merge the by_ref PR
  • use it in bitcoin in places where ?Sized argument is accepted
  • add the where Self: Sized bound to Read::take with the dyn compatibility check

This should be it and nothing else should be needed.

@tcharding
Copy link
Copy Markdown
Member Author

No sweat, I'm on it.

@tcharding
Copy link
Copy Markdown
Member Author

Closing, io is out of vogue now.

@tcharding tcharding closed this Jan 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C-io PRs modifying the io crate test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

I/O traits should be object safe

3 participants