Skip to content

Add taproot data structures#677

Merged
dr-orlovsky merged 7 commits intorust-bitcoin:masterfrom
sanket1729:taptree_utils
Nov 12, 2021
Merged

Add taproot data structures#677
dr-orlovsky merged 7 commits intorust-bitcoin:masterfrom
sanket1729:taptree_utils

Conversation

@sanket1729
Copy link
Copy Markdown
Member

@sanket1729 sanket1729 commented Oct 8, 2021

I am still working on adding tests and testing these changes against bitcoin core. Feedback on code and API design is welcome.

Supercedes #666

  • Builder API to construct trees
  • Huffman encoding API that outputs optimal taproot tree given odds.

TODO:

  • Add control block APIs. Verify merkle proofs, create a control block from spend data
  • Add tests and verify them against bitcoin consensus

@sanket1729 sanket1729 force-pushed the taptree_utils branch 4 times, most recently from 637ee89 to 032894b Compare October 11, 2021 01:50
@sanket1729 sanket1729 marked this pull request as ready for review October 11, 2021 01:53
@sanket1729 sanket1729 added this to the 0.28.0 milestone Oct 11, 2021
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe more compact as: self.0.iter().map(|e| e.as_inner()).flatten() ?
Not sure if it's not as efficient...

Also, isn't maybe better to have the serialization logic inside a non-allocating method using a writer and calling it here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, will add another method that has serialization logic inside a writer.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iterators should be more efficient because of TrustedLen specialization. (My understanding is that it pre-allocates correct size and then fills it without additional bounds check.) However I'm not sure if it can see that all those slices are equal-sized.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe more rusty to have first let var_len =sl.len().checked_sub(TAPROOT_CONTROL_BASE_SIZE).unwrap_or_else(...) and then this if simplified?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC there is a naming convention in rust for acronyms to be like this NodeNotInDfsOrder

@GeneFerneau
Copy link
Copy Markdown

Concept ACK 032894b

Some small suggestions from an initial review, will spend more time reviewing against the spec.

Will also take some time to experiment with building Taproot contracts with this API, and report back.

Comment on lines 585 to 622
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: check for error condition before mutating internal state

Suggested change
self.0.push(h);
if self.0.len() > TAPROOT_CONTROL_MAX_NODE_COUNT {
return Err(TaprootBuilderError::InvalidMerkleTreeDepth(self.0.len()));
} else {
Ok(())
}
if self.0.len() + 1 > TAPROOT_CONTROL_MAX_NODE_COUNT {
Err(TaprootBuilderError::InvalidMerkleTreeDepth(self.0.len() + 1))
} else {
self.0.push(h);
Ok(())
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewriting it to self.0.len() >= TAPROOT_CONTROL_MAX_NODE_COUNT would avoid overflow if the invariant that self.0.len() <= TAPROOT_CONTROL_MAX_NODE_COUNT is ever accidentally broken.

Copy link
Copy Markdown
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few little comments, I didn't get all the way through the review.

Copy link
Copy Markdown
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more minor comments, thanks.

Comment on lines 659 to 660
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it meant to be 'This serialization does not include the VarInt prefix'? Also: s/why/when

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turning the slice into iterator would make it easier to reason about this code IMO, but we really need the extension trait I mentioned somewhere already.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you suggest approaching this? I don't see a way to process an iterator u8 and ask it to yield slices without explicating allocating expected size arrays.

@sanket1729
Copy link
Copy Markdown
Member Author

Sorry for the force push, addressed the feedback from @GeneFerneau and @tcharding, did not address comments from @Kixunil. I will add fixup commits from now on for easier review and later squash them when it is ready.

@sanket1729
Copy link
Copy Markdown
Member Author

@dr-orlovsky addressed all of your comments. I think if we get another review for this. We should just merge this and proceed with other taproot stuff. I can work on fixing up and making sure changes in #691 are reflected here.

Finally, I think about the API guidelines, I think we should have a broader discussion. My rule is to always follow the current convention in the codebase. Using new_* and from_* over with_*. If we both(with and new) of these it would be even worse. If we want to switch to with_ (assuming we have consensus), we should do it for all APIs in a non-breaking way in a future release.

@Kixunil
Copy link
Copy Markdown
Collaborator

Kixunil commented Nov 10, 2021

My understanding of naming:

  • new - obvious constructor, may be empty (String::new(), Box::new(T))
  • from_ - takes same information as new just in different representation (e.g. String::from_utf8(Vec<u8>))
  • with_ - takes additional information along with what new takes String::with_capacity(usize)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this be better as Option? Or I suppose ScriptPath always only in Tap context (in which case, small nit, but maybe TapScriptPath would be clearer name)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a part of the current PR under discussion. But I like the name TapScriptPath over ScriptPath

@tcharding
Copy link
Copy Markdown
Member

My understanding of naming:

* `new` - obvious constructor, may be empty (`String::new()`, `Box::new(T)`)
* `from_` - takes _same_ information as `new` just in different representation (e.g. `String::from_utf8(Vec<u8>)`)
* `with_` - takes _additional_ information along with what `new` takes `String::with_capacity(usize)`

I thought String was a bit of a special case but a quick look at the std lib shows its not. Perhaps conventions have changed for more recent code? I was under the impression that

  • Constructors with zero args should use Default
  • Constructors with one arg should use From<Foo> (unless the foo needs further restriction like the String;:from_utf8 case)
  • Constructors with multiple args should use new.
  • Constructors using the builder pattern should use with(self, foo: Foo)

I've never managed to work out the pattern for times when we need multiple constructors with multiple args?

Amusingly I was looking at all the instances of single arg constructors for Script yesterday wondering why they used the pattern new_.

I don't mean to bike shed here, I'm genuinely interested in these sorts of conventions.

Copy link
Copy Markdown
Contributor

@JeremyRubin JeremyRubin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did some light review

Comment on lines 107 to 111
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and_then would also work, but i'm indifferent (fine as is, altho superfluos else)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: or the most fee sensitive one? e.g. contested closing?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a BtreeSet internally in case there are >1 instances of a given script?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is to allow > 1 instances of a given script. In practice, this should never be used, but since descriptors spec does not disallow it, we have to support it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm i think you generally shouldn't construct the vec tuple from a btreemap/hashmap as it can introduce subtle bugs (e.g., when you might want a script to appear > 1 time)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this shouldn't be an error condition IMO, should fallback to creating a binary tree of all failing (>128) scripts and attaching at the largest possible depth?

but I think it's OK to have it this way for now...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an improvement. I can create a good first issue for this. If someone really wants this, they are welcome to implement this and I am happy to review it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the optimal huffman tree has depth exceeding 128 it means the user has set insane weights. We should error rather than silently doing the wrong thing.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given this return type, this should either be a free function or bound to the TapBranchHash impl.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type is TaprootSpendInfo

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i misread as TapBranchHash

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recommend doing a checked_add here because users could be dumb with what values they put in.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: avoid usize in API, prefer u64 or u32

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking: otherwise the serialization/deserialization of types for pluggin in to this function would be platform dependent.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probability is usually expressed as float in range 0..=1, so it may be unintuitive/confusing to use integer. Perhaps worth a newtype with some helper methods? Or maybe more details explaining how would one compute the numbers?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack this seeming safe, as PK is fed into the hash.

Would prefer the comment clarifying the hash cycle, but a 'Safe' result API might be future proof if this would ever have any other reason to fail in the future?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can be Self

Add tests for taproot Builder
Add tests for taproot huffman tree encoding
Add tests for merkle proof verification
@apoelstra
Copy link
Copy Markdown
Member

This PR has exceeded Github's ability to reasonably display comments. I would prefer future reviews to be in the form of followup PRs.

Suggestions for non-github platforms are also welcome but that's a long term thing.

@sanket1729
Copy link
Copy Markdown
Member Author

sanket1729 commented Nov 12, 2021

So far, I think the only things I see remaining are:

Anything else that I am missing? If there is nothing critical I would request to proceed on this and I can address all of the nits in a followup PR.

Copy link
Copy Markdown
Member

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fa8c3f6

@sanket1729
Copy link
Copy Markdown
Member Author

@GeneFerneau @RCasatta, @dr-orlovsky, sorry for dismissing your Acks. if you are interested in taking another look.

Copy link
Copy Markdown
Collaborator

@dr-orlovsky dr-orlovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fa8c3f6

@dr-orlovsky dr-orlovsky merged commit 425c616 into rust-bitcoin:master Nov 12, 2021
This was referenced Nov 12, 2021
@sanket1729 sanket1729 mentioned this pull request Nov 12, 2021
dr-orlovsky added a commit that referenced this pull request Dec 2, 2021
…g tweak for UntweakedPublicKey

5b21a9c Use TapTweakHash method for computing tweak (Noah)

Pull request description:

  Quick follow up PR to #691 using a method from #677.

  ### Changes
  - Updated `UntweakedPublicKey::tap_tweak(...)` to use `TapTweakHash::from_key_and_tweak(...)`

ACKs for top commit:
  Kixunil:
    ACK 5b21a9c
  dr-orlovsky:
    utACK 5b21a9c

Tree-SHA512: d00455bba51981e9ec942a6cf69672666e227850d073b1fdcd92d2eb6ad553659fb2967aec2ce12d3ed109cee5fa125cdda649cddb25404f08adae2bfd3e19bb
RCasatta added a commit that referenced this pull request Dec 23, 2021
7aacc37 Add tests from BIP341 (sanket1729)
61629cc Make taproot hashes forward display (sanket1729)

Pull request description:

  Add tests for taproot.
  - ~Also fixes one bug in #677, namely, I was returning `LeafVersion::default()` instead of given version~
  - ~ Fixes a bug in #691 about taking secp context as a reference instead of consuming it. This should have not passed my review, but this is easy to miss. ~
  - Makes the display on taproot hashes forward instead of the reverse (because the BIP prints in a forward way, I think we should too and it is more natural. )

ACKs for top commit:
  RCasatta:
    ACK 7aacc37
  apoelstra:
    ACK 7aacc37

Tree-SHA512: 2e0442131fc036ffa10f88c91c8fc02d9b67ff6c16c592aa6f4e6a220c26a00fc6ca95a288f14aa40667a289fb0446219fd6c76c0196ead766252356592b9941
dr-orlovsky added a commit that referenced this pull request Dec 30, 2021
7d982fa Add all tests from BIP 371 (sanket1729)
d22e014 Taproot psbt impl BIP 371 (sanket1729)
108fc3d Impl encodable traits for TapLeafhash (sanket1729)
c7478d8 Derive serde for taproot stuctures (sanket1729)

Pull request description:

  Built on top of #677 . Will rebase and mark ready for review after #677 is merged.

ACKs for top commit:
  apoelstra:
    ACK 7d982fa
  dr-orlovsky:
    re-tACK 7d982fa basing on `git range-diff`. The original PR before last re-base was tested commit-by-commit.

Tree-SHA512: feb30e4b38d13110a9c0fabf6466d8f0fb7df09a82f4e01d70b8371b34ab0187004a6c63f9796c6585ee30841e8ee765ae9becae139d2e1e3d839553d64c3d1e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

API break This PR requires a version bump for the next release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants