[blockdata::Witness] New Features: Index<usize>, nth(index), and get_tapscript#1323
Conversation
|
lol. concept ACK. I think if we ever hit four, then we should just give in and have a |
apoelstra
left a comment
There was a problem hiding this comment.
ACK 38060fb4bf51fa421166deaa9126febc1eb4fad1
|
I was also debating adding a small helper method for getting the byte slice of the tapscript as an Option. It would basically be: if w.len() >= 2 && w.last().get(0)? == &0x50 { w.third_to_last() } else { w.second_to_last() } Whether or not the Witness stack is actually p2tr or not, we don't know, so just blindly follow the BIP. The caller will have to ensure they only call it on a p2tr Witness stack. Thoughts? |
|
Not opposed to caching another index but the API is starting to look ridiculous. You know, there is this thing called an array that allows you to reference multiple items by a number. :) I suggest having a single field If Also agree fully with looking into having Concept ACK adding helper method for tapscript.
We don't know because we would have to know the |
This would be the correct way to do it. I would like to see what @RCasatta thinks as he originally authored the new Witness struct. |
|
I didn't know what to say because I didn't like the API and neither adding But I think we can have a nicer API without impacting performance! We can add the indexes at the beginning of the When deserializing, the number of elements in the witness is known at the beginning so we can skip the first When serializing, after encoding the At this point, we can implement an efficient |
What I meant was, we assume the caller to have all the external info available to discern that this is p2tr. This includes the script_pubkey of the prev_out. |
|
Neat. It took me a moment to understand what you're saying @RCasatta but I like it. Basically we widen the allocation during deserialization and copy each witness element's length prefix into the extra area where it can be efficiently indexed into. |
|
I have tried implementing @RCasatta idea and one issue that is a bit hairy I found is the push() method. If we have the indices at the beginning of the content, we'll need to slide everything to the right to make room for the new u32 at the beginning, and all the indices already encoded previously will need to be updated by +4. If we have indices at the end of the content, it'll be harder to deterministically find the index without knowing the total size of the actual content. Perhaps having the indices at the end, and include a new usize in the struct that points to the beginning of the indices? |
|
I don't think we should have a |
|
Perhaps the indexing fix with push() removed can land in a future breaking version update. Any objections to merging this as-is for now? Or does anyone else want to take a crack at adding the indexing with push() present? |
|
I'll also add the helper method for getting the tapscript. |
|
How about adding the indices at the end? It'll still require moving things around during
Adding APIs that will be removed in the future is not nice. I suggest you add |
|
I wouldn't mind having a OTOH it would be remarkably easy to fuzz this against the push-to- |
38060fb to
d8cff11
Compare
d8cff11 to
7fec66f
Compare
|
TODO:
|
I never implemented it myself but I'd expect at least some people would like to |
|
I'm assuming the fuzzer errors mean my logic is causing a panic instead of handling garbage input gracefully... I'll have to look over the changes again. If anyone has time to review I'd appreciate it. Thanks. |
tcharding
left a comment
There was a problem hiding this comment.
No comment on the concept but I did code review and the logic looks good to me. Will leave acking to others more knowledgeable of taproot than myself.
|
Not sure why the fuzzer is failing... I have no clue how to read the output. I am assuming there's some sort of panic when the deserializer is fed garbage. |
|
CI shows compile errors though. |
|
These are unrelated to the PR. Probably need to rebase for some reason...? |
17fd3ca to
60a28c0
Compare
|
Looks like master merged some changes to the test suite...................... not sure how/why that would affect this branch though...? Looks like incorrect imports in |
|
See #1345 |
Kixunil
left a comment
There was a problem hiding this comment.
Didn't check the entire thing thoroughly.
bitcoin/src/blockdata/witness.rs
Outdated
There was a problem hiding this comment.
I think we could avoid rotation if we used VecDequeue. The problem then though is indices will be added in reverse order. Rotating just them is still better than moving the entire vec. We could also store VecDequeue directly in Witness but we need the left part to be contiguous which I don't think VecDequeue can gurantee.
bitcoin/src/blockdata/witness.rs
Outdated
There was a problem hiding this comment.
Oh, I was wondering why the hell are we resizing the Vec and thus writing a bunch of zeroes every time and making the code crazy complicated. The reason is we accept Read instead of BufRead which I previously suggested. So we're wasting performance. I will think about improving this.
|
FYI, the nightly CI fail (due to benches failing) is unrelated to this PR - I'm investigating it. |
|
Nightly bug is fixed now. |
|
All tests are passing. Please review when you have the time. Thanks! |
|
Thanks for your contribution @junderw. This PR is hard to review because there are patches later in the series that change code written earlier in the series, worse still, one patch changes ne -> le then a later patch changes le -> ne and neither patches has any justification as to why the change is being made. Can you please squash this PR down in to a set of patches that each do a single thing with description in each patch of whats being done and why - otherwise it is very hard to work out if I should ack or not. Sorry to be a pain. |
68abc66 to
0a221a7
Compare
Arbitrary indexing into Witness fixes the API of last and second_to_last to be more flexible. This patch started off as an addition of third_to_last, but ended up evolving into arbitrary indexing to allow for future use cases. A list of the indices of the start byte for each witness element is stored as an ordered contiguous group of u32s represented as 4 bytes each in the Vec<u8> contents. The bytes are stored using to_ne_bytes for performance reasons. A helper function is added to the tests to allow for easier contruction of the contents Vec in test vectors. u32 was chosen because 22 bits are needed to store 4,000,000 which is the maximum weight limit for a block. This might need to be reworked in the event of consensus limits increasing, but u32 can hold 1000x the current limit, so it should be fine for the forseeable future. The push and consensus_deserialize functions utilize rotate_left and rotate_right to move the indices to the end of the new allocation. Depending on the size of the data, this might be more of a performance hit than just allocating a new temporary Vec to store the indices and append them after parsing is completed. However, for a majority of cases rotating the indices should be faster. Suggestions to use VecDeque instead of Vec for contents necessitate other considerations, since it is not a public facing change, those optimizations can be dealt with in future patches. The Index<usize> trait is implemented using the new nth method with expect. The Iter struct is reworked to make use of the new data representation. This new data structure makes it trivial to implement DoubleEndedIterator and other such traits, but I have decided to leave this as out of scope for this patch.
0a221a7 to
a7501d9
Compare
This new method will check the last witness element to see if it starts with 0x50, and depending on the result it will return the second to last or third to last witness element according to BIP341. In its current state, Witness can not know what type of script it is fulfilling, so it is up to the caller to verify if the previous output is a taproot output or not.
a7501d9 to
3c0d5ae
Compare
|
Thanks @tcharding I have separated it into two different patches. The index patch was fairly intertwined. Perhaps I could have teased out |
|
You legend, thanks a lot man you went above and beyond, that was super nice to review! |
bitcoin/src/blockdata/witness.rs
Outdated
There was a problem hiding this comment.
nit: we don't need the [..] (I checked it against 1.41.1)
There was a problem hiding this comment.
Oh, this comment was left from my review this morning (was still pending I guess) but the issue is still there. No need to fix it though unless you rebase for some other reason.
| use crate::prelude::*; | ||
| use crate::VarInt; | ||
|
|
||
| const U32_SIZE: usize = core::mem::size_of::<u32>(); |
There was a problem hiding this comment.
In 4226d60:
I think this constant hurts readability. Just use 4.
| } | ||
|
|
||
|
|
||
| /// Safety Requirements: value must always fit within u32 |
There was a problem hiding this comment.
In 4226d60:
This is not a safety requirement, but a correctness requirement.
| fn encode_cursor(bytes: &mut [u8], start_of_indices: usize, index: usize, value: usize) { | ||
| let start = start_of_indices + index * U32_SIZE; | ||
| let end = start + U32_SIZE; | ||
| bytes[start..end].copy_from_slice(&(value as u32).to_ne_bytes()[..]); |
There was a problem hiding this comment.
In 4226d60:
I think we should use .try_into().unwrap() (or expect) rather than a cast.
There was a problem hiding this comment.
Or maybe we should instead work with u32s and convert to usize when indexing using the proposed method assuming at least 32-bit platform. Not sure if it affects perf.
|
utACK (will ACK for real once my tests finish). This looks great! My comments are only nits. Some of these new functions are great candidates for property testing, a la #1370. |
|
I will make another PR fixing the nits later. Thanks! |
| @@ -80,13 +89,17 @@ impl Decodable for Witness { | |||
| max: MAX_VEC_SIZE, | |||
There was a problem hiding this comment.
Unrelated but I noticed the code above would be simpler by using saturating_add and relying on the check below.
| fn encode_cursor(bytes: &mut [u8], start_of_indices: usize, index: usize, value: usize) { | ||
| let start = start_of_indices + index * U32_SIZE; | ||
| let end = start + U32_SIZE; | ||
| bytes[start..end].copy_from_slice(&(value as u32).to_ne_bytes()[..]); |
There was a problem hiding this comment.
Or maybe we should instead work with u32s and convert to usize when indexing using the proposed method assuming at least 32-bit platform. Not sure if it affects perf.
| .map(|el| el.len() + VarInt(el.len() as u64).len()) | ||
| .sum(); | ||
| let mut content = vec![0u8; content_size]; | ||
| let mut content = vec![0u8; content_size + index_size]; |
There was a problem hiding this comment.
Oh, this was wasteful before and "still is". We know the exact content length and not using std::io::Read so the reasons for zeroing are not valid here. However because adding indices at the end requires unsafe zeroing became kinda justified. Maybe VecDequeue would work here better than in the other places but still not sure.
| @@ -133,18 +170,17 @@ impl Witness { | |||
| /// Creates [`Witness`] object from an array of byte-arrays | |||
| pub fn from_vec(vec: Vec<Vec<u8>>) -> Self { | |||
There was a problem hiding this comment.
Oh, why are we even requiring Vec? pub fn from_slice<B: AsRef<[u8]>>(slice: &[B]) would be more useful. (Unrelated ofc)
|
|
||
| /// Push a new element on the witness, requires an allocation | ||
| pub fn push<T: AsRef<[u8]>>(&mut self, new_element: T) { | ||
| let new_element = new_element.as_ref(); |
There was a problem hiding this comment.
Unrelated: we could de-monomorphise this.
There was a problem hiding this comment.
I wonder if clippy could lint about this?
There was a problem hiding this comment.
| .consensus_encode(&mut &mut self.content[previous_content_end..end_varint]) | ||
| .expect("writers on vec don't error, space granted through previous resize"); | ||
| self.content[end_varint..].copy_from_slice(new_element); | ||
| self.content[end_varint..end_varint + new_element.len()].copy_from_slice(new_element); |
There was a problem hiding this comment.
The code seem to be more complicated than needed. I'd write:
let element_len_varint = VarInt(new_element.len() as u64);
let new_item_total_len = element_len_varint.len() + new_element.len();
self.content.reserve(new_item_total_len + 4);
let mut varint_buf = [u8; 9];
element_len_varint.consensus_encode(&mut &mut varint_buf)
.expect("varint is at most 9B long");
self.content.splice(self.indices_start..self.indices_start, varint_buf[..element_len_varint.len()].iter().chain(new_element).copied());
self.content.extend_from_slice(&u32::try_from(self.indices_start).unwrap().to_ne_bytes());
self.indices_start += new_item_total_len;
self.witness_elements += 1;Which also should be faster (no zeroing of vec).
| @@ -246,29 +289,70 @@ impl Witness { | |||
| if self.witness_elements <= 1 { | |||
There was a problem hiding this comment.
We should deprecate second_to_last now. If getting arbitrary elements from the back is useful we should have nth_rev method for that.
|
I'll merge this. Nits can be addressed in a later PR. |
sanket1729
left a comment
There was a problem hiding this comment.
post-merge 3c0d5ae. Thanks for the excellent contribution. Left some minor comments that I work upon
| // If there are at least two witness elements, and the first byte of | ||
| // the last element is 0x50, this last element is called annex a | ||
| // and is removed from the witness stack. | ||
| if len >= 2 && last_elem.first().filter(|&&v| v == 0x50).is_some() { |
There was a problem hiding this comment.
nit: use const TAPROOT_ANNEX_PREFIX from taproot.rs
There was a problem hiding this comment.
stylistic nit(feel free to ignore): Can we directly written as last_elem.first() == &Some(TAPROOT_ANNEX_PREFIX)
| /// This does not guarantee that this represents a P2TR [`Witness`]. | ||
| /// It merely gets the second to last or third to last element depending | ||
| /// on the first byte of the last element being equal to 0x50. | ||
| pub fn get_tapscript(&self) -> Option<&[u8]> { |
There was a problem hiding this comment.
nit: typically, we don't use get_ prefix. But I don't hold it too strong.
00c7b6e Witness: Fix nits from PR 1323 (junderw) Pull request description: Ref: #1323 This is just to quickly fix some of the smaller nits. Larger changes (deprecations, adding / refactoring of methods) should be in a separate PR. ACKs for top commit: Kixunil: ACK 00c7b6e tcharding: ACK 00c7b6e sanket1729: ACK 00c7b6e Tree-SHA512: 5f661187a7003060669d15d873e323c017c905a00b62eb56ca3afc2fc27084b245ad62dfcf6d2fd14eac361430be954e7636f6b9ff668aefaad0424789a2f826
Ref: #672 (comment)
Add Index and nth(index) to Witness
4226d60
Arbitrary indexing into Witness fixes the API of last and second_to_last to be more flexible.
This patch started off as an addition of third_to_last, but ended up evolving
into arbitrary indexing to allow for future use cases.
A list of the indices of the start byte for each witness element is stored as an ordered
contiguous group of u32s represented as 4 bytes each in the Vec contents.
The bytes are stored using to_ne_bytes for performance reasons. A helper function is added
to the tests to allow for easier contruction of the contents Vec in test vectors. u32 was
chosen because 22 bits are needed to store 4,000,000 which is the maximum weight limit for
a block. This might need to be reworked in the event of consensus limits increasing, but
u32 can hold 1000x the current limit, so it should be fine for the forseeable future.
The push and consensus_deserialize functions utilize rotate_left and rotate_right to move
the indices to the end of the new allocation. Depending on the size of the data, this
might be more of a performance hit than just allocating a new temporary Vec to store the
indices and append them after parsing is completed. However, for a majority of cases
rotating the indices should be faster. Suggestions to use VecDeque instead of Vec for
contents necessitate other considerations, since it is not a public facing change,
those optimizations can be dealt with in future patches.
The Index trait is implemented using the new nth method with expect.
The Iter struct is reworked to make use of the new data representation. This new data
structure makes it trivial to implement DoubleEndedIterator and other such traits, but
I have decided to leave this as out of scope for this patch.
Add get_tapscript to Witness
a7501d9
This new method will check the last witness element to see if it starts with 0x50, and
depending on the result it will return the second to last or third to last witness
element according to BIP341.
In its current state, Witness can not know what type of script it is fulfilling,
so it is up to the caller to verify if the previous output is a taproot output or not.
Edit: This is the previous PR body: