Skip to content

Alloc-free (AKA zero-copy) control block#4248

Merged
apoelstra merged 6 commits intorust-bitcoin:masterfrom
Kixunil:alloc-free-control-block
Mar 20, 2025
Merged

Alloc-free (AKA zero-copy) control block#4248
apoelstra merged 6 commits intorust-bitcoin:masterfrom
Kixunil:alloc-free-control-block

Conversation

@Kixunil
Copy link
Copy Markdown
Collaborator

@Kixunil Kixunil commented Mar 16, 2025

This implements a bunch of changes needed to make ControlBlock alloc-free. In particular, this allows constructing Witness without the intermediate allocation. It is also a step towards having P2TrSpend public.

Closes #1614

This also intentionally does not address decoding of ControlBlock from Witness since I'm not sure about the API.

Rationale for doing the Buf rename: while doing it with Script was very painful it shouldn't be here since it's not used that often and also we can just backport the first commit with deprecated type alias. I was thinking of having TaprootMerkleBr but it'd be inconsistent and the name is silly.

(Also if anyone is wondering why I did this: I was too exhausted to do more important stuff but felt like doing something nice and easy like this.)

@github-actions github-actions bot added the C-bitcoin PRs modifying the bitcoin crate label Mar 16, 2025
@coveralls
Copy link
Copy Markdown

coveralls commented Mar 16, 2025

Pull Request Test Coverage Report for Build 13927232201

Details

  • 69 of 162 (42.59%) changed or added relevant lines in 5 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.3%) to 82.816%

Changes Missing Coverage Covered Lines Changed/Added Lines %
bitcoin/src/taproot/merkle_branch/buf.rs 9 11 81.82%
bitcoin/src/blockdata/witness.rs 0 6 0.0%
bitcoin/src/taproot/mod.rs 12 20 60.0%
bitcoin/src/taproot/merkle_branch/mod.rs 0 20 0.0%
bitcoin/src/taproot/merkle_branch/borrowed.rs 48 105 45.71%
Totals Coverage Status
Change from base Build 13923187550: -0.3%
Covered Lines: 21827
Relevant Lines: 26356

💛 - Coveralls

@Kixunil Kixunil force-pushed the alloc-free-control-block branch from 1102a65 to d164ea6 Compare March 16, 2025 20:02
@github-actions

This comment was marked as outdated.

@github-actions github-actions bot added the API break This PR requires a version bump for the next release label Mar 16, 2025
@Kixunil Kixunil force-pushed the alloc-free-control-block branch from d164ea6 to bf5de4a Compare March 16, 2025 20:09
@github-actions

This comment was marked as outdated.

@Kixunil Kixunil force-pushed the alloc-free-control-block branch from bf5de4a to ef8c9fb Compare March 16, 2025 20:16
@github-actions
Copy link
Copy Markdown

🚨 API BREAKING CHANGE DETECTED

To see the changes click details on "Check semver breaks / PR Semver - stable toolchain" job then expand "Run semver checker script" and scroll to the end of the section.

tcharding
tcharding previously approved these changes Mar 17, 2025
Copy link
Copy Markdown
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK ef8c9fb

@tcharding
Copy link
Copy Markdown
Member

I have a few questions/nits but they can be addressed as follow ups if needed.

  • buf module may be better named owned to mirror the script modules
  • Same as script I would have expected all the AsRef etc to be in mod.rs
  • There are a few differences in the owned vs borrow API that I don't see the reason for eg iter(), serialize()
  • So that it is easier to read side by side buffers and see the difference/similarities I rekon it would be preferable if the two modules layed out the functions in the same order.

Copy link
Copy Markdown
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The module thing is more than a nit please unless I"m totally missing something.

Comment on lines +8 to +9
/// Makes sure only the allowed conversions are accessible to external code.
mod privacy_boundary {
Copy link
Copy Markdown
Member

@tcharding tcharding Mar 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the point of this module?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It prevents direct access to the .0 field. Though the comment is somewhat misleading because any "disallowed" conversion would have to use unsafe anyway.


/// Returns a reference to the mutable slice of hashes.
#[inline]
pub fn as_mut_slice(&mut self) -> &mut [TapNodeHash] { &mut self.0 }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does having this in a submodule enforce anything?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only methods in submodule can access .0, the outer ones can't.

In this specific code it's mainly a lint against doing .0 since that makes future changes annoying. (I linted against myself. :))

}
}

pub(super) fn from_mut_hashes_unchecked(hashes: &mut [TapNodeHash]) -> &mut Self {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this different from just having a private function outside of the module?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, true, it doesn't need to be inside since it uses unsafe anyway. But I like to having it there anyway because it better shows that this function directly manipulates the internals of the type but also there was a proposal to allow as to avoid these unsafe blocks and if it ever gets into the language then we will definitely want to use it and have the function right there, so we may as well put it there right now.

(The proposed rule was that if the type is #[repr(transparent)] then whichever code sees all its fields as accessible can use as to cast the inner slice to the outer type. This is perfectly sensible and what we want, with the implications above.)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair arguments but none of that convinces me that this is better than not having the module, its unnecessary complexity that is not at all apparent what its there for. IMO simple trumps complex unless there is a good reason for the complexity.

there was a proposal to allow as to avoid these unsafe blocks and if it ever gets into the language then we will definitely want to use it and have the function right there, so we may as well put it there right now.

I'm not a fan of just-in-case code at the best of times, definitely not for a proposal that is not even in the language yet.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you talking about module or that specific function? I was talking about that specific function. It strictly doesn't need to be inside the module but it's better to have it there if we have the module anyway. And I want the module because the other function accesses .0 which is enough reason to have it. If this was the only function that belongs to the module then I would've agreed that adding the module is additional complexity.

@Kixunil
Copy link
Copy Markdown
Collaborator Author

Kixunil commented Mar 17, 2025

  • buf module may be better named owned to mirror the script modules

The reason I went with buf is because I intend to add another owned type backed by ArrayVec. I just didn't want to make the PR too big. For a while I also wondered whether ToOwned::Owned should be the current buf type or the new one until I realized how it interacts with Cow.

  • Same as script I would have expected all the AsRef etc to be in mod.rs

I didn't want to move too much code around at first, to not make the review too boring. We can do it later.

  • There are a few differences in the owned vs borrow API that I don't see the reason for eg iter(), serialize()

Frankly, I mostly focused on getting some reasonable things working. I'm pretty sure the new type can have richer API. But I also want to make some changes after #4200 merges (IIRC these PRs shouldn't conflict). However, I can already answer iter - the method was available on the Buf type via Deref coercion already, so the APIs are in fact more similar than you think (also funny thing is I had to add the method for the commit changing Target to be non-disruptive).

  • So that it is easier to read side by side buffers and see the difference/similarities I rekon it would be preferable if the two modules layed out the functions in the same order.

Because of the above, I think it's more pragmatic to compare docs, not the code.

@tcharding
Copy link
Copy Markdown
Member

the method was available on the Buf type via Deref coercion already, so the APIs are in fact more similar than you think

Oh interesting, I had a play and this means that there is no reason to have any of the methods on TaprootMerkleBranchBuf that exist on TaprootMerkleBranch that take &self. For example, there is no need for len (I removed it and ran this test

#[cfg(test)]
mod tests {
    use super::*;
    use crate::hashes::sha256;

    #[test]
    fn deref_coercion() {
        let h = sha256::hash(b"some arbitrary bytes");
        let t = vec![TapNodeHash::from_byte_array(h.to_byte_array())];

        let b = TaprootMerkleBranchBuf::from_collection(t).unwrap();

        println!("len: {}", b.len())
    }
} 

The exact same argument applies to ScriptBuf and Script and it looks like we did indeed do this over there (both is bitcoin and primitives).

@tcharding
Copy link
Copy Markdown
Member

tcharding commented Mar 17, 2025

@apoelstra you can make the call whether on not to have the privacy_boundry module. Its a trivial matter, not a hill worth dying on. My ACK stands either way you call it.

@Kixunil
Copy link
Copy Markdown
Collaborator Author

Kixunil commented Mar 18, 2025

Regarding methods, for ScriptBuf IIRC we did len because Vec also has it despite it being on the slice. I don't know why but since consistency here has trivial cost I just added it and didn't try to figure it out.

Other than that, I think we should remove the other excess methods. I will do that in a followup PR.

Your question about privacy_boundary module suggests you're still unconvinced by my argument. Is there anything particular you have in mind? It's trivial matter but I am interested in knowing if there is a good counterargument or what I messed up in my explanation. :)

@tcharding
Copy link
Copy Markdown
Member

Oh I see the misunderstanding. The module comment makes no sense to me and I spent the whole time fixating on it.

/// Makes sure only the allowed conversions are accessible to external code.

Perhaps just use the same name we used in units::amount::unsigned - encapsulate. And if you really want a comment you say 'Ensures no other code can access the private inner slice'. Or 'encapsulates the inner slice'.

Or if you really like privacy_boundary then change the one in unsigned but I rekon encapsulate well describes what the module does.

@Kixunil
Copy link
Copy Markdown
Collaborator Author

Kixunil commented Mar 18, 2025

I couldn't remember what name we were using and forgot to look it up later. :D

Do you want this in the same PR? (I'll need to leave in a minute for hopefully less than an hour if it helps with deciding.)

@tcharding
Copy link
Copy Markdown
Member

Follow up PR is fine bro - thanks.

@apoelstra
Copy link
Copy Markdown
Member

Done reviewing ef8c9fb.

Needs rebase because I merged the weekly format PR before reviewing this one (we skipped last weeks' format PR and at some point we just gotta do it).

I agree with Tobin's comments that this should be more uniform with script. I'm fine doing that later. At some point either script or this module will be pushed into 1.0, and then we should audit the other one to make sure it looks the same as the stable one, and we can rearrange code then.

As for the privacy_boundary, I think it's fine. I agree that in this case it provides zero value because we have the _unchecked accessor, but it does work as a lint against using .0 which is one of our code guidelines; it lets us try to remove accessors and immediately see what breaks; and it lets us add unsafe markers and have the compiler highlight what parts of the code need to be carefully checked.

And even if none of those things apply here, in general I'd like to have privacy boundaries like this and I'd prefer we be uniform about it.

@Kixunil
Copy link
Copy Markdown
Collaborator Author

Kixunil commented Mar 18, 2025

but it does work as a lint against using .0 which is one of our code guidelines

I think that already has non-zero value. :) My main reason to have this guideline is to allow changing the fields (adding PhantomData etc) without changing ton of lines and as a result we get fewer merge conflicts.

Kixunil added 6 commits March 18, 2025 16:20
This type actually contains a `Vec` but we would prefer to have an
unsized type. Rename it first so that we can reuse the name later.
This moves the content of the module into `buf` submodule making future
changes clearer.
`TaprootMerkleBranchBuf` already had `as_slice` method and `DerefMut`
but was missing `as_slice_mut`, so this change adds it.
`TaprootMerkleBranchBuf` being a vec introduced intermediate allocation
when creating or decoding `Witness`. However the representation on the
wire is the same as in-memory (aside from `#[repr(transparent)]`) so
this allocation wasn't really needed.

This commit introduces `TaprootMerkleBranch` type which is unsized and
can be used in place of `TaprootMerkleBranchBuf` within `ControlBlock`.
Aside from removing the intermediate allocation, this improves the API a
bit: the conversion from array to other type is no longer needed because
it's performed by `ControlBlock` in its methods. Thus, consumers who
have an array can simply set it as `merkle_branch` field and then encode
the `ControlBlock` into witness. A convenience method is also provided
to push the `ControlBlock` along with other parts at the end of the
`Witness`.
`TaprootMerkleBranchBuf` previously derefed to a slice which lost the
information about length being valid. This commit changes the type
which, while API-breaking, is not disruptive because the type has API
very similar to slice.
The new unsized type is more flexible and so are the references to it.
Just like we pass around `&str` instead of `&String` we should be
passing `&TaprootMerkleBranch` instead of `&TaprootMerkleBranchBuf`.
@Kixunil Kixunil force-pushed the alloc-free-control-block branch from ef8c9fb to 9ea2e92 Compare March 18, 2025 15:33
@github-actions
Copy link
Copy Markdown

🚨 API BREAKING CHANGE DETECTED

To see the changes click details on "Check semver breaks / PR Semver - stable toolchain" job then expand "Run semver checker script" and scroll to the end of the section.

@Kixunil
Copy link
Copy Markdown
Collaborator Author

Kixunil commented Mar 18, 2025

I was thinking of moving the trait impls in another commit given that I had to rebase anyway but I'll definitely want to clean up the macro for multiple sizes which isn't as trivial and I have more urgent things to do today.

@apoelstra
Copy link
Copy Markdown
Member

Looks good -- will run local CI on 9ea2e92.

I'd like to have privacy boundaries like this and I'd prefer we be uniform about it.

If you need to rebase again though can you rename the module though to be called encapsulate to match what's done in amount/signed.rs and amount/unsigned.rs.

@Kixunil
Copy link
Copy Markdown
Collaborator Author

Kixunil commented Mar 18, 2025

IIRC we have such modules elsewhere and they are called differently. I'll open an issue about it to decide on the module name.

Copy link
Copy Markdown
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 9ea2e92

Copy link
Copy Markdown
Member

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 9ea2e92; successfully ran local tests


/// Appends elements to proof.
pub(super) fn push(&mut self, h: TapNodeHash) -> Result<(), InvalidMerkleTreeDepthError> {
pub(in super::super) fn push(&mut self, h: TapNodeHash) -> Result<(), InvalidMerkleTreeDepthError> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, very super!

@apoelstra apoelstra merged commit 20c50e3 into rust-bitcoin:master Mar 20, 2025
24 checks passed
@Kixunil Kixunil deleted the alloc-free-control-block branch March 20, 2025 19:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

API break This PR requires a version bump for the next release C-bitcoin PRs modifying the bitcoin crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unnecessary allocation in ControlBlock

5 participants