descriptors: Be able to specify change and receiving in a single descriptor string#22838
descriptors: Be able to specify change and receiving in a single descriptor string#22838glozow merged 14 commits intobitcoin:masterfrom
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
Strong concept ACK! Glad this is getting a fix. As discussed in 17190 and OP, there's more improvements that can come after this. |
|
Concept and Approach ACK. I plan on updating the test and docs introduced in #22067 to take advantage of this in a follow up PR. So I will try to do a more thorough review and testing of this after that is merged and I start on the followup based on this branch. (or if it looks like this will get merged first I will do that earlier) |
As an example Allowing a mix of hardened and unhardened like |
The conflict is actually with the |
|
I'm doing some more review/testing of this PR, but when I checkout this PR and rebase to master, I get build error: wallet/rpcwallet.cpp:3387:45: error: no viable conversion from 'std::pair<std::unique_ptr<Descriptor>, std::unique_ptr<Descriptor> >' to 'std::unique_ptr<Descriptor>'
std::unique_ptr<Descriptor> desc = Parse(desc_str, desc_out, error, true); |
|
Fixed the silent merge conflict. |
e8a6785 to
f4a9ed5
Compare
|
So in my tests, from this multipath descriptor string: I have a descriptor that looks like this: {
'descriptor': 'wsh(sortedmulti(2,tpubDCcZnBitiPCBqsdLWx8Lkc2BrsvVTdZU2Gcioi8yY76HrGSvp4oWyxZ5GUKGVcjdv4wzWbfntfLdcEEfVRamhSgwi1ZxgfNtayQ2QEEjfTv/0/*,tpubDCMtV5vMS9Zn2paRcw7o83ytLY7WHMCGYRSDxhz7HWnZ4ix19EQpi7g3wvYxEvuoXRfLy5XpXJP5VEr7qPzS92TWnVrBMPJYzfupdACC6vP/0/*,tpubDDBi5pEGAmG7k8t2udh8vbYb1Zpwnt5E9bxd86z6xR66r6tYH1aXy5J3kxR2NDBWhVzgcyiAjFw3kCbkf8YcxDx4wjsnvKjKu3SfPVi4dzf/0/*))#shfp6kvs',
'descriptor_change': 'wsh(sortedmulti(2,tpubDCcZnBitiPCBqsdLWx8Lkc2BrsvVTdZU2Gcioi8yY76HrGSvp4oWyxZ5GUKGVcjdv4wzWbfntfLdcEEfVRamhSgwi1ZxgfNtayQ2QEEjfTv/1/*,tpubDCMtV5vMS9Zn2paRcw7o83ytLY7WHMCGYRSDxhz7HWnZ4ix19EQpi7g3wvYxEvuoXRfLy5XpXJP5VEr7qPzS92TWnVrBMPJYzfupdACC6vP/1/*,tpubDDBi5pEGAmG7k8t2udh8vbYb1Zpwnt5E9bxd86z6xR66r6tYH1aXy5J3kxR2NDBWhVzgcyiAjFw3kCbkf8YcxDx4wjsnvKjKu3SfPVi4dzf/1/*))#ghxep260',
'checksum': 'e82r9xv3',
'isrange': True,
'issolvable': True,
'hasprivatekeys': False
}But now, 2021-10-18T22:26:50.595000Z TestFramework (INFO): Check that every participant's multisig generates the same addresses...
2021-10-18T22:26:50.602000Z TestFramework (ERROR): JSONRPC error
Traceback (most recent call last):
File "/Users/michaeldietz/Documents/bitcoin/test/functional/test_framework/test_framework.py", line 132, in main
self.run_test()
File "/Users/michaeldietz/Documents/bitcoin/test/functional/wallet_multisig_descriptor_psbt.py", line 95, in run_test
change_addresses = [multisig.getrawchangeaddress() for multisig in participants["multisigs"]]
File "/Users/michaeldietz/Documents/bitcoin/test/functional/wallet_multisig_descriptor_psbt.py", line 95, in <listcomp>
change_addresses = [multisig.getrawchangeaddress() for multisig in participants["multisigs"]]
File "/Users/michaeldietz/Documents/bitcoin/test/functional/test_framework/coverage.py", line 49, in __call__
return_val = self.auth_service_proxy_instance.__call__(*args, **kwargs)
File "/Users/michaeldietz/Documents/bitcoin/test/functional/test_framework/authproxy.py", line 144, in __call__
raise JSONRPCException(response['error'], status)
test_framework.authproxy.JSONRPCException: Error: This wallet has no available keys (-4)Any ideas? I am updating #22067 to use multipath descriptors and using it as a chance to test this PR more |
|
@mjdietzx Is your branch pushed so I can have a look? |
|
In my code I actually have: result = multisig.importdescriptors([
{
"desc": descriptor["descriptor"],
"active": True,
"timestamp": "now",
},
])
assert all(r["success"] for r in result)And that doesn't catch any errors. I will push my branch shortly and reply w/ it |
|
I had an error in my tests - it seems everything in this PR is working properly. This works: result = multisig.importdescriptors([
{
"desc": descsum_create(f"wsh(sortedmulti({self.M},{f'/<0;1>/*,'.join(xpubs)}/<0;1>/*))"),
"active": True,
"timestamp": "now",
},
])re the error I posted eariler, I was trying to do this (using descriptor = multisig.getdescriptorinfo(f"wsh(sortedmulti({self.M},{f'/<0;1>/*,'.join(xpubs)}/<0;1>/*))")
result = multisig.importdescriptors([
{
"desc": descriptor["descriptor"],
"active": True,
"timestamp": "now",
},
])But I was not importing or doing anything with We used {
descriptor // still the multipath descriptor, similar to output of descsum_create
// now the broken out descriptors:
descriptor_receiving
descriptor_change
// ... and the other fields are unchanged
} |
The |
There was a problem hiding this comment.
I'm still not convinced this is the best interface for getdescriptorinfo
I'd expect descriptor to always be the full descriptor (whether multipath or not). In the case of multipath, I'd then expect two optional attributes descriptor_receiving and descriptor_change to be present in the response.
There was a problem hiding this comment.
nit: descriptor_change should be optional
There was a problem hiding this comment.
descriptor cannot currently be the multipath descriptor as generating the string output for one is not yet implemented. I'm not sure on a good approach for doing that yet.
There was a problem hiding this comment.
Are you saying something like:
- The multipath
descriptorstring input togetdescriptorinfois parsed/expanded to the two descriptors - Now each of these descriptors is converted back
ToStringand returned asdescriptor(first) anddescriptor_change(second)
And that we currently don't have a way to go from the two parsed/expanded descriptors back to the single multipath descriptor string?
That does sound tricky, just trying to make sure I understand
There was a problem hiding this comment.
Ok, so what's the downside of something like:
- Each of these parsed descriptors is converted back
ToStringand returned asdescriptor_receiving(first) and descriptor_change (second) - The same multipath
descriptorstring input togetdescriptorinfois just returned for thedescriptoroutput (with a checksum added)
There was a problem hiding this comment.
@achow101 could you entertain me on this, if nothing else it'll help me understand this better as I finish up my review
There was a problem hiding this comment.
#15986 provides a bit more context for this.
There are a few reasons I would prefer to not change the behavior of the descriptor result. The first is for backwards compatibility. This RPC may already be used in a way that expects the canonicalized descriptor produced by getdescriptorinfo in the descriptor field. Additionally, if a descriptor with a private key were provided, if we were to just output that same string again, then we would be echoing private keys which is something that we want to avoid doing.
There was a problem hiding this comment.
descriptorcannot currently be the multipath descriptor as generating the string output for one is not yet implemented. I'm not sure on a good approach for doing that yet.
Mmm, that has me a bit worried. If a user imports a multipath descriptor, I think we should remember it in that form. And if we embrace multipath descriptors, we should probably also use them for new wallets.
Perhaps a better design would be a MultipathDescriptor subclass that only has a receive and change method, or more neutral: a pair method that returns a pair of Descriptors that then work as usual.
In the future std::pair<Descriptor> can be generalized to std::map<uint32, Descriptor> to allow <NUM,NUM,...,NUM>, etc. That may even be appropriate now: we could treat <0;1> as a typical receive/change setup, and refuse to import any other combination (though that's an RPC level thing, and we might as well use std::pair<Descriptor> given such an import restriction).
f4a9ed5 to
796c439
Compare
mjdietzx
left a comment
There was a problem hiding this comment.
Getting very close to signing off on this PR, just a few minor things/questions
There was a problem hiding this comment.
I've been thinking through the response returned here. It doesn't seem ideal that the shape of this response differs between a single descriptor and a multipath descriptor. Ie this returns an array of addresses if we pass in a descriptor, but it returns an object if we pass in a multipath descriptor.
Have you thought through an api where the response is more uniform and "feels" the same for both types? I don't necessarily know a better way to do it, wondering if you've thought this through though
There was a problem hiding this comment.
The primary reason for having the two different return results is to maintain backwards compatibility with anyone who may be using the RPC now.
There was a problem hiding this comment.
Yeah, I figured there'd need to be a deprecation process if we wanted to do this. Anyways, if you think it's enough of an improvement to warrant that I'd be happy to do give it a go as a followup PR. Just lmk
There was a problem hiding this comment.
Related to my above comment, if the response is uniform we could potentially simplify this a bit and have a single return at the end of the function
|
Review ACK 796c439. I also lightly used/tested this by running #23308 on top of this as additional test coverage. |
|
I think there may be another silent merge conflict w/ master: wallet/test/util.cpp:33:37: error: no viable conversion from 'std::pair<std::unique_ptr<Descriptor>, std::unique_ptr<Descriptor> >' to 'std::unique_ptr<Descriptor>'
std::unique_ptr<Descriptor> desc = Parse("combo(" + EncodeSecret(key) + ")", provider, error, /* require_checksum=*/ false);Can you try rebasing again? |
darosior
left a comment
There was a problem hiding this comment.
re-ACK 8208454eae7484d34a162ffa7701157e56e3cb80
Although i agree it's more readable now it's unfortunate to ask to rewrite part of the parsing logic only for style purpose after it had 3 ACKs and underwent significant fuzzing.
That said i didn't find any issue and neither did my fuzzer (for now). I'll keep fuzzing this PR rebased on master for a day.
There was a problem hiding this comment.
Dropping another out loud thinking:
The multipath specifier can be used in a way it degrades wallet performance for no reason if it is used in the last segment of single key constructions. For instance, a pkh(xpub../0/<0;1;2;3;4;5;6;7;8;9;10,...>) imports 10 different descriptors into the wallet while the same can be expressed using a single ranged descriptor pkh(xpub../0/*) with range 0-10.
Topic here is whether we should disable it for top-level single key functions, alert users about the degraded performance or do nothing. I am not suggesting disabling it for all functions because the functionality is useful for multi-key constructions. Thoughts?
There was a problem hiding this comment.
still reviewing, only nits so far.
Although i agree it's more readable now it's unfortunate to ask to rewrite part of the parsing logic only for style purpose after it had 3 ACKs and underwent significant fuzzing.
I think it depends on the proposed changes. If readability is improved substantially (bolding substantially), it helps maintenance and might also uncover logical issues that the fuzzer might not easily encounter. I don't think we should settle anything in stone because of a good fuzzing coverage. We should have enough time for fuzzing this properly during the freeze period.
Yet, thats my general opinion. I understand that it is annoying to re-review code you've checked dozens of times over the past years and want to move on once for all. So.. will just continue reviewing 🚜.
There was a problem hiding this comment.
Almost finish, left a question #22838 (comment). The rest are nits.
There was a problem hiding this comment.
In 7ca8b5b5c:
This could be the default behavior but when the "internal" argument is provided, I think all descriptors should follow that argument (all internal or all external if the flag is set). Then, on a follow-up, we could make "internal" accept an array, allowing users to precisely determine which descriptor in the multipath is internal and which is not.
There was a problem hiding this comment.
The intended and expected usage is for 2 multipath elements for receiving and change. I think diverging from that for default behavior will be confusing, especially since multipath descriptors are in use by other software and use this pattern. I'm fine with a followup that allows the user to override that and specify which should be internal.
To prepare for returning multipath descriptors which will be a shorthand for specifying multiple descriptors, change ParseScript's signature to return a vector.
Multipath specifiers are derivation path indexes of the form `<i;j;k;...>` used for specifying multiple derivation paths for a descriptor. Only one multipath specifier is allowed per PubkeyProvider. This is syntactic sugar which is parsed into multiple distinct descriptors. One descriptor will have all of the `i` paths, the second all of the `j` paths, the third all of the `k` paths, and so on. ParseKeypath will always return a vector of keypaths with the same size as the multipath specifier. The callers of this function are updated to deal with this case and return multiple PubkeyProviders. Their callers have also been updated to handle vectors of PubkeyProviders. Co-Authored-By: furszy <matiasfurszyfer@protonmail.com>
When given a descriptor which contins a multipath derivation specifier, a vector of descriptors will be returned.
When given a multipath descriptor, derive all of the descriptors. The derived addresses will be returned in an object consisting of multiple arrays. For compatibility, when given a single path descriptor, the addresses are provided in a single array as before.
Instead of applying internal-ness to all keys being imported at the same time, apply it on a per key basis. So each key that is imported will carry with it whether it is for the change keypool.
Multipath descriptors will be imported as multiple separate descriptors. When there are exactly 2 multipath items, the first descriptor will be for receiving addreses, and the second for change addresses. When importing a multipath descriptor, 'internal' cannot be specified.
Multipath descriptors will be imported as multiple separate descriptors. When there are 2 multipath items, the first descriptor will be for receiving addresses and the second for change. This mirrors importmulti.
Test that both importmulti and importdescriptors behave as expected when importing a multipath descriptor.
glozow
left a comment
There was a problem hiding this comment.
light code review ACK a0abcbd
I'm unfamiliar with the descriptor code, but the refactoring commits look like pure refactoring, behavior change commits look correct, RPC changes seem intuitive, and tests seem to cover things when I introduce mutations.
|
|
||
| multi(2,xpub.../<0;1;2>/0/*,xpub.../<2;3;4>/*) | ||
|
|
||
| will expand to the two descriptors |
| const auto& parsed_desc = parsed_descs.at(j); | ||
| bool desc_internal = internal.has_value() && internal.value(); | ||
| if (parsed_descs.size() == 2) { | ||
| desc_internal = j == 1; |
There was a problem hiding this comment.
Could also add a comment here on the j==1, "first is receiving, second is change" or something
| { | ||
| return RPCHelpMan{"importdescriptors", | ||
| "\nImport descriptors. This will trigger a rescan of the blockchain based on the earliest timestamp of all descriptors being imported. Requires a new wallet backup.\n" | ||
| "When importing descriptors with multipath key expressions, if the multipath specifier contains exactly two elements, the descriptor produced from the second elements will be imported as an internal descriptor.\n" |
There was a problem hiding this comment.
nit "second elements". Also could be helpful that apart from this special case, all multipath are non-internal
| std::optional<size_t> multipath_segment_index; | ||
| std::vector<uint32_t> multipath_values; | ||
| std::unordered_set<uint32_t> seen_multipath; |
There was a problem hiding this comment.
Could add some comments for future wallet reviewers, maybe mentioning that they should be consistent with each other (multipath_segment_index.has_value() == !multipath_values.empty() == !seen_multipath.empty())
| if (!multipath_segment_index) { | ||
| out.emplace_back(std::move(path)); | ||
| } else { | ||
| // Replace the multipath placeholder with each value while generating paths |
There was a problem hiding this comment.
Can Assume(!multipath_values.empty()) here
|
@pythcoiner @mjdietzx Can you please re-review? |
|
reACK a0abcbd |
|
reACK a0abcbd |
There was a problem hiding this comment.
With this merge, would it be a good idea to update relevant BIPs (possibly 88, 388, 389)?
e.g. https://github.com/bitcoin/bips/blob/master/bip-0088.mediawiki#compatibility
"There is a discussion on path templating for bitcoin script descriptors at #17190, which proposes the format xpub...{0,1}/, of which the {0,1}/ part would correspond to the partial path template in the format of this BIP."
| apostrophe = last == '\''; | ||
| const Span<const char>& elem = split[i]; | ||
|
|
||
| // Check if element contain multipath specifier |
There was a problem hiding this comment.
nit s/contain/contains (if anyone updates for the other review suggestions)
| out.emplace_back(std::move(path)); | ||
| } else { | ||
| // Replace the multipath placeholder with each value while generating paths | ||
| for (size_t i = 0; i < multipath_values.size(); i++) { |
There was a problem hiding this comment.
nit, compilers may auto-optimize here but if anyone handles the other feedback suggestions: s/i++/++i/
It is convenient to have a descriptor which specifies both receiving and change addresses in a single string. However, as discussed in #17190 (comment), it is not feasible to use a generic multipath specification like BIP 88 due to combinatorial blow up and that it would result in unexpected descriptors.
To resolve that problem, this PR proposes a targeted solution which allows only a single pair of 2 derivation indexes to be inserted in the place of a single derivation index. So instead of two descriptor
wpkh(xpub.../0/0/*)andwpkh(xpub.../0/1/*)to represent receive and change addresses, this could be written aswpkh(xpub.../0/<0;1>/*). The multipath specifier is of the form<NUM;NUM>. EachNUMcan have its own hardened specifier, e.g.<0;1h>is valid. The multipath specifier can also only appear in one path index in the derivation path.This results in the parser returning two descriptors. The first descriptor uses the first
NUMin all pairs present, and the second uses the secondNUM. In our implementation, if a multipath descriptor is not provided, a pair is still returned, but the second element is justnullptr.The wallet will not output the multipath descriptors (yet). Furthermore, when a multipath descriptor is imported, it is expanded to the two descriptors and each imported on its own, with the second descriptor being implicitly for internal (change) addresses. There is no change to how the wallet stores or outputs descriptors (yet).
Note that the path specifier is different from what was proposed. It uses angle brackets and the semicolon because these are unused characters available in the character set and I wanted to avoid conflicts with characters already in use in descriptors.
Closes #17190