Skip to content

[blockdata::Witness] New Features: Index<usize>, nth(index), and get_tapscript#1323

Merged
apoelstra merged 2 commits intorust-bitcoin:masterfrom
junderw:add/p2tr-script-method
Nov 7, 2022
Merged

[blockdata::Witness] New Features: Index<usize>, nth(index), and get_tapscript#1323
apoelstra merged 2 commits intorust-bitcoin:masterfrom
junderw:add/p2tr-script-method

Conversation

@junderw
Copy link
Copy Markdown
Contributor

@junderw junderw commented Oct 18, 2022

Ref: #672 (comment)

Add Index and nth(index) to Witness
4226d60
Arbitrary indexing into Witness fixes the API of last and second_to_last to be more flexible.
This patch started off as an addition of third_to_last, but ended up evolving
into arbitrary indexing to allow for future use cases.

A list of the indices of the start byte for each witness element is stored as an ordered
contiguous group of u32s represented as 4 bytes each in the Vec contents.
The bytes are stored using to_ne_bytes for performance reasons. A helper function is added
to the tests to allow for easier contruction of the contents Vec in test vectors. u32 was
chosen because 22 bits are needed to store 4,000,000 which is the maximum weight limit for
a block. This might need to be reworked in the event of consensus limits increasing, but
u32 can hold 1000x the current limit, so it should be fine for the forseeable future.

The push and consensus_deserialize functions utilize rotate_left and rotate_right to move
the indices to the end of the new allocation. Depending on the size of the data, this
might be more of a performance hit than just allocating a new temporary Vec to store the
indices and append them after parsing is completed. However, for a majority of cases
rotating the indices should be faster. Suggestions to use VecDeque instead of Vec for
contents necessitate other considerations, since it is not a public facing change,
those optimizations can be dealt with in future patches.

The Index trait is implemented using the new nth method with expect.

The Iter struct is reworked to make use of the new data representation. This new data
structure makes it trivial to implement DoubleEndedIterator and other such traits, but
I have decided to leave this as out of scope for this patch.


Add get_tapscript to Witness
a7501d9
This new method will check the last witness element to see if it starts with 0x50, and
depending on the result it will return the second to last or third to last witness
element according to BIP341.

In its current state, Witness can not know what type of script it is fulfilling,
so it is up to the caller to verify if the previous output is a taproot output or not.


Edit: This is the previous PR body:

In a taproot script payment with annex, quick access to the 3rd to last element (which is the actual script in this case) is convenient.

This feels like kicking the can down the road again, but I think it's a nice to have method.

RCasatta dr-orlovsky were discussing this issue. I would like to ask if they have any thoughts on the addition of this.

@apoelstra
Copy link
Copy Markdown
Member

lol. concept ACK.

I think if we ever hit four, then we should just give in and have a Vec of indices, which would at least be much cheaper than having a Vec<Vec<u8>>. But given the slowness of bitcoin and the general nature of the annex, I'll bet that we don't hit four for many years.

apoelstra
apoelstra previously approved these changes Oct 18, 2022
Copy link
Copy Markdown
Member

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 38060fb4bf51fa421166deaa9126febc1eb4fad1

@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Oct 18, 2022

I was also debating adding a small helper method for getting the byte slice of the tapscript as an Option.

It would basically be:

if w.len() >= 2 && w.last().get(0)? == &0x50 { w.third_to_last() } else { w.second_to_last() }

Whether or not the Witness stack is actually p2tr or not, we don't know, so just blindly follow the BIP. The caller will have to ensure they only call it on a p2tr Witness stack.

Thoughts?

@Kixunil
Copy link
Copy Markdown
Collaborator

Kixunil commented Oct 18, 2022

Not opposed to caching another index but the API is starting to look ridiculous. You know, there is this thing called an array that allows you to reference multiple items by a number. :)

I suggest having a single field indices_cache storing the indices. Then instead of having whatever_to_last specialize nth(), last() on the iterator to use the cache and document this property.

If witness.iter().last() looks too scary because of apparent iteration we can also have nth method directly but I'm not sure how much it'll help.

Also agree fully with looking into having Vec if there are many elements - I'd still like to cache common sizes without allocation though.

Concept ACK adding helper method for tapscript.

The caller will have to ensure they only call it on a p2tr Witness stack.

We don't know because we would have to know the script_pubkey of previous transaction so validating this is impossible right?

@sanket1729
Copy link
Copy Markdown
Member

I suggest having a single field indices_cache storing the indices.

This would be the correct way to do it. I would like to see what @RCasatta thinks as he originally authored the new Witness struct.

@RCasatta
Copy link
Copy Markdown
Collaborator

RCasatta commented Oct 18, 2022

I didn't know what to say because I didn't like the API and neither adding indices_cache because it will require other allocations impacting deserializing performance.

But I think we can have a nicer API without impacting performance!

We can add the indexes at the beginning of the content field as fixed-size integers (I think u32 since u32::MAX>MAX_VEC_SIZE).

When deserializing, the number of elements in the witness is known at the beginning so we can skip the first size(u32)*witness_elements bytes of content and fill it while deserializing the elements. The current initial size of content (128) should still be enough to accommodate more than 75% of witnesses in mainnet but this should be double-checked.

When serializing, after encoding the witness_elements varint we can simply skip the first size(u32)*witness_elements bytes of content.

At this point, we can implement an efficient Index (and also first() last() and nth()) by doing a lookup in the indexes at the beginning of content like index = u32::from_xx_bytes( content[i * size(u32)] ) as usize
The index points at the ith element before the varint storing the length, once the length is read we have the start and end of the slice of the requested witness element.

@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Oct 18, 2022

We don't know because we would have to know the script_pubkey of previous transaction so validating this is impossible right?

What I meant was, we assume the caller to have all the external info available to discern that this is p2tr. This includes the script_pubkey of the prev_out.

@apoelstra
Copy link
Copy Markdown
Member

Neat. It took me a moment to understand what you're saying @RCasatta but I like it. Basically we widen the allocation during deserialization and copy each witness element's length prefix into the extra area where it can be efficiently indexed into.

@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Oct 18, 2022

I have tried implementing @RCasatta idea and one issue that is a bit hairy I found is the push() method.

If we have the indices at the beginning of the content, we'll need to slide everything to the right to make room for the new u32 at the beginning, and all the indices already encoded previously will need to be updated by +4.

If we have indices at the end of the content, it'll be harder to deterministically find the index without knowing the total size of the actual content.

Perhaps having the indices at the end, and include a new usize in the struct that points to the beginning of the indices?

@apoelstra
Copy link
Copy Markdown
Member

I don't think we should have a push method on this type. If you want to modify it you should have to convert to a Vec<Vec<u8>> and back.

@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Oct 19, 2022

Perhaps the indexing fix with push() removed can land in a future breaking version update.

Any objections to merging this as-is for now? Or does anyone else want to take a crack at adding the indexing with push() present?

@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Oct 19, 2022

I'll also add the helper method for getting the tapscript.

@Kixunil
Copy link
Copy Markdown
Collaborator

Kixunil commented Oct 19, 2022

How about adding the indices at the end? It'll still require moving things around during push but not updating the indices and there will be generally less bytes to move. It still needs to be documented as somewhat slow but people should already expect at least allocations from push. We could optimize it by providing some reserve API.

Any objections to merging this as-is for now?

Adding APIs that will be removed in the future is not nice. I suggest you add nth(n: usize) method and then conditionally iterate or not, hiding the cache. If we implement @RCasatta suggestion we will keep it and just change the implementation.

@apoelstra
Copy link
Copy Markdown
Member

I wouldn't mind having a push method, especially since we already have one, but it does seem like extra work and complexity for a niche feature. (Usually Witnesses are constructed in one shot from a Vec<Vec<u8>> and converted, or parsed from a transaction and never changed.)

OTOH it would be remarkably easy to fuzz this against the push-to-Vec<Vec<u8>> strategy so I'm not too worried about bugs.

@junderw junderw force-pushed the add/p2tr-script-method branch from 38060fb to d8cff11 Compare October 19, 2022 15:55
@junderw junderw changed the title Add third_to_last to Witness (for p2tr script spend with annex) Add Index<usize>, nth(index), and get_tapscript to Witness Oct 19, 2022
@junderw junderw force-pushed the add/p2tr-script-method branch from d8cff11 to 7fec66f Compare October 19, 2022 15:59
@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Oct 19, 2022

TODO:

  • Fix fuzzer errors (le bytes vs ne bytes?)
  • Write tests for get_tapscript

@Kixunil
Copy link
Copy Markdown
Collaborator

Kixunil commented Oct 19, 2022

Usually Witnesses are constructed in one shot from a Vec<Vec<u8>>

I never implemented it myself but I'd expect at least some people would like to collect an iterator or something else that makes fewer allocations.

@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Oct 20, 2022

I'm assuming the fuzzer errors mean my logic is causing a panic instead of handling garbage input gracefully...

I'll have to look over the changes again. If anyone has time to review I'd appreciate it.

Thanks.

Copy link
Copy Markdown
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No comment on the concept but I did code review and the logic looks good to me. Will leave acking to others more knowledgeable of taproot than myself.

@junderw junderw changed the title Add Index<usize>, nth(index), and get_tapscript to Witness [blockdata::Witness] New Features: Index<usize>, nth(index), and get_tapscript Oct 26, 2022
@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Oct 26, 2022

Not sure why the fuzzer is failing... I have no clue how to read the output.

I am assuming there's some sort of panic when the deserializer is fed garbage.

@Kixunil
Copy link
Copy Markdown
Collaborator

Kixunil commented Oct 26, 2022

CI shows compile errors though.

@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Oct 26, 2022

These are unrelated to the PR. Probably need to rebase for some reason...?

@junderw junderw force-pushed the add/p2tr-script-method branch from 17fd3ca to 60a28c0 Compare October 26, 2022 10:55
@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Oct 26, 2022

Looks like master merged some changes to the test suite...................... not sure how/why that would affect this branch though...?

Looks like incorrect imports in bitcoin/examples/taproot-psbt.rs

@apoelstra
Copy link
Copy Markdown
Member

See #1345

Copy link
Copy Markdown
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't check the entire thing thoroughly.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could avoid rotation if we used VecDequeue. The problem then though is indices will be added in reverse order. Rotating just them is still better than moving the entire vec. We could also store VecDequeue directly in Witness but we need the left part to be contiguous which I don't think VecDequeue can gurantee.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I was wondering why the hell are we resizing the Vec and thus writing a bunch of zeroes every time and making the code crazy complicated. The reason is we accept Read instead of BufRead which I previously suggested. So we're wasting performance. I will think about improving this.

@tcharding
Copy link
Copy Markdown
Member

FYI, the nightly CI fail (due to benches failing) is unrelated to this PR - I'm investigating it.

@tcharding
Copy link
Copy Markdown
Member

Nightly bug is fixed now.

@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Nov 2, 2022

All tests are passing.

Please review when you have the time. Thanks!

@tcharding
Copy link
Copy Markdown
Member

Thanks for your contribution @junderw. This PR is hard to review because there are patches later in the series that change code written earlier in the series, worse still, one patch changes ne -> le then a later patch changes le -> ne and neither patches has any justification as to why the change is being made. Can you please squash this PR down in to a set of patches that each do a single thing with description in each patch of whats being done and why - otherwise it is very hard to work out if I should ack or not.

Sorry to be a pain.

@junderw junderw force-pushed the add/p2tr-script-method branch from 68abc66 to 0a221a7 Compare November 3, 2022 02:08
Arbitrary indexing into Witness fixes the API of last and second_to_last to be more flexible.
This patch started off as an addition of third_to_last, but ended up evolving
into arbitrary indexing to allow for future use cases.

A list of the indices of the start byte for each witness element is stored as an ordered
contiguous group of u32s represented as 4 bytes each in the Vec<u8> contents.
The bytes are stored using to_ne_bytes for performance reasons. A helper function is added
to the tests to allow for easier contruction of the contents Vec in test vectors. u32 was
chosen because 22 bits are needed to store 4,000,000 which is the maximum weight limit for
a block. This might need to be reworked in the event of consensus limits increasing, but
u32 can hold 1000x the current limit, so it should be fine for the forseeable future.

The push and consensus_deserialize functions utilize rotate_left and rotate_right to move
the indices to the end of the new allocation. Depending on the size of the data, this
might be more of a performance hit than just allocating a new temporary Vec to store the
indices and append them after parsing is completed. However, for a majority of cases
rotating the indices should be faster. Suggestions to use VecDeque instead of Vec for
contents necessitate other considerations, since it is not a public facing change,
those optimizations can be dealt with in future patches.

The Index<usize> trait is implemented using the new nth method with expect.

The Iter struct is reworked to make use of the new data representation. This new data
structure makes it trivial to implement DoubleEndedIterator and other such traits, but
I have decided to leave this as out of scope for this patch.
@junderw junderw force-pushed the add/p2tr-script-method branch from 0a221a7 to a7501d9 Compare November 3, 2022 02:41
This new method will check the last witness element to see if it starts with 0x50, and
depending on the result it will return the second to last or third to last witness
element according to BIP341.

In its current state, Witness can not know what type of script it is fulfilling,
so it is up to the caller to verify if the previous output is a taproot output or not.
@junderw junderw force-pushed the add/p2tr-script-method branch from a7501d9 to 3c0d5ae Compare November 3, 2022 02:48
@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Nov 3, 2022

Thanks @tcharding I have separated it into two different patches. The index patch was fairly intertwined. Perhaps I could have teased out Index<usize>, but it was a small enough diff and is basically a simple wrapper around nth(usize), so I left it in.

@tcharding
Copy link
Copy Markdown
Member

You legend, thanks a lot man you went above and beyond, that was super nice to review!

Copy link
Copy Markdown
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 3c0d5ae

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we don't need the [..] (I checked it against 1.41.1)

Copy link
Copy Markdown
Member

@tcharding tcharding Nov 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this comment was left from my review this morning (was still pending I guess) but the issue is still there. No need to fix it though unless you rebase for some other reason.

use crate::prelude::*;
use crate::VarInt;

const U32_SIZE: usize = core::mem::size_of::<u32>();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 4226d60:

I think this constant hurts readability. Just use 4.

}


/// Safety Requirements: value must always fit within u32
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 4226d60:

This is not a safety requirement, but a correctness requirement.

fn encode_cursor(bytes: &mut [u8], start_of_indices: usize, index: usize, value: usize) {
let start = start_of_indices + index * U32_SIZE;
let end = start + U32_SIZE;
bytes[start..end].copy_from_slice(&(value as u32).to_ne_bytes()[..]);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 4226d60:

I think we should use .try_into().unwrap() (or expect) rather than a cast.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe we should instead work with u32s and convert to usize when indexing using the proposed method assuming at least 32-bit platform. Not sure if it affects perf.

@apoelstra
Copy link
Copy Markdown
Member

utACK (will ACK for real once my tests finish). This looks great! My comments are only nits.

Some of these new functions are great candidates for property testing, a la #1370.

Copy link
Copy Markdown
Member

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 3c0d5ae

@junderw
Copy link
Copy Markdown
Contributor Author

junderw commented Nov 6, 2022

I will make another PR fixing the nits later.

Thanks!

Copy link
Copy Markdown
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments are mainly fancy optimizations, nits and unrelated observations. The code looks correct.

ACK 3c0d5ae

That being said, I'd really appreciate simplification of the push method and API cleanup in subsequent PR(s).

@@ -80,13 +89,17 @@ impl Decodable for Witness {
max: MAX_VEC_SIZE,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated but I noticed the code above would be simpler by using saturating_add and relying on the check below.

fn encode_cursor(bytes: &mut [u8], start_of_indices: usize, index: usize, value: usize) {
let start = start_of_indices + index * U32_SIZE;
let end = start + U32_SIZE;
bytes[start..end].copy_from_slice(&(value as u32).to_ne_bytes()[..]);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe we should instead work with u32s and convert to usize when indexing using the proposed method assuming at least 32-bit platform. Not sure if it affects perf.

.map(|el| el.len() + VarInt(el.len() as u64).len())
.sum();
let mut content = vec![0u8; content_size];
let mut content = vec![0u8; content_size + index_size];
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this was wasteful before and "still is". We know the exact content length and not using std::io::Read so the reasons for zeroing are not valid here. However because adding indices at the end requires unsafe zeroing became kinda justified. Maybe VecDequeue would work here better than in the other places but still not sure.

@@ -133,18 +170,17 @@ impl Witness {
/// Creates [`Witness`] object from an array of byte-arrays
pub fn from_vec(vec: Vec<Vec<u8>>) -> Self {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, why are we even requiring Vec? pub fn from_slice<B: AsRef<[u8]>>(slice: &[B]) would be more useful. (Unrelated ofc)


/// Push a new element on the witness, requires an allocation
pub fn push<T: AsRef<[u8]>>(&mut self, new_element: T) {
let new_element = new_element.as_ref();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated: we could de-monomorphise this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if clippy could lint about this?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.consensus_encode(&mut &mut self.content[previous_content_end..end_varint])
.expect("writers on vec don't error, space granted through previous resize");
self.content[end_varint..].copy_from_slice(new_element);
self.content[end_varint..end_varint + new_element.len()].copy_from_slice(new_element);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code seem to be more complicated than needed. I'd write:

let element_len_varint = VarInt(new_element.len() as u64);
 let new_item_total_len = element_len_varint.len() + new_element.len();
self.content.reserve(new_item_total_len + 4);
let mut varint_buf = [u8; 9];
element_len_varint.consensus_encode(&mut &mut varint_buf)
                .expect("varint is at most 9B long");
self.content.splice(self.indices_start..self.indices_start, varint_buf[..element_len_varint.len()].iter().chain(new_element).copied());
self.content.extend_from_slice(&u32::try_from(self.indices_start).unwrap().to_ne_bytes());
self.indices_start += new_item_total_len;
self.witness_elements += 1;

Which also should be faster (no zeroing of vec).

@@ -246,29 +289,70 @@ impl Witness {
if self.witness_elements <= 1 {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should deprecate second_to_last now. If getting arbitrary elements from the back is useful we should have nth_rev method for that.

@apoelstra
Copy link
Copy Markdown
Member

I'll merge this. Nits can be addressed in a later PR.

@apoelstra apoelstra merged commit 1d0b0e6 into rust-bitcoin:master Nov 7, 2022
Copy link
Copy Markdown
Member

@sanket1729 sanket1729 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

post-merge 3c0d5ae. Thanks for the excellent contribution. Left some minor comments that I work upon

// If there are at least two witness elements, and the first byte of
// the last element is 0x50, this last element is called annex a
// and is removed from the witness stack.
if len >= 2 && last_elem.first().filter(|&&v| v == 0x50).is_some() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use const TAPROOT_ANNEX_PREFIX from taproot.rs

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stylistic nit(feel free to ignore): Can we directly written as last_elem.first() == &Some(TAPROOT_ANNEX_PREFIX)

/// This does not guarantee that this represents a P2TR [`Witness`].
/// It merely gets the second to last or third to last element depending
/// on the first byte of the last element being equal to 0x50.
pub fn get_tapscript(&self) -> Option<&[u8]> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typically, we don't use get_ prefix. But I don't hold it too strong.

sanket1729 added a commit that referenced this pull request Nov 12, 2022
00c7b6e Witness: Fix nits from PR 1323 (junderw)

Pull request description:

  Ref: #1323

  This is just to quickly fix some of the smaller nits. Larger changes (deprecations, adding / refactoring of methods) should be in a separate PR.

ACKs for top commit:
  Kixunil:
    ACK 00c7b6e
  tcharding:
    ACK 00c7b6e
  sanket1729:
    ACK 00c7b6e

Tree-SHA512: 5f661187a7003060669d15d873e323c017c905a00b62eb56ca3afc2fc27084b245ad62dfcf6d2fd14eac361430be954e7636f6b9ff668aefaad0424789a2f826
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants