-
Notifications
You must be signed in to change notification settings - Fork 38.7k
descriptors: improve descriptor cache and cache xpubs #18204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Sjors
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK.
I'm seeing errors on Travis and locally:
AssertionError: Unexpected stderr Assertion failed: (fValid), function GetPubKey, file key.cpp, line 185.
instagibbs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is getting importmulti test failures
a10bb99 to
6e9fc0c
Compare
|
Fixed the assertion. Travis seems to be seeing a memory leak but I'm having trouble figuring out where that is. |
6e9fc0c to
a544f0e
Compare
|
I've reworked how the caching works to get rid of the Instead of having This consolidates cache reads and writes and encapsulates them in the classes that actually use the cache. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a544f0e7459f346a1bf129c9dc978fe46c6abf73 looks correct. Much clearer than before, but still some confusing bits.
I find the separation of 4f24907f4007b2786ee39dcb8d6f86e52494db10 a6feb5d1aee6ae08b87029561873471cdd62a36a confusing, with the ExpandFromCache signature changing in two steps.
Switching from pubkeys to xpubs plus origin info increases the cache size by ~40 bytes per key, or 240 KB for a typical descriptor wallet with change & receive for three output types, with the first 1000 keys expanded. That's probably acceptable, though maybe the serialization could be optimized to leave out the chain code for "leaf" keys (meh, complexity).
This could be a good time to rename m_derive to m_range_derive to make it clear that this refers to whether a given (final?) path element is (hardened) ranged or not.
src/script/descriptor.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth skipping the parent write once it's done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lookup is just as expensive as an insertion, so I don't think so.
a544f0e to
08186b1
Compare
|
I've squashed the middle three commits together (the ones that just changed to using DescriptorCache) so that the signature of
We could do what I did originally and have a pubkey cache, but I felt that to be unnecessary. We would still be storing
Why? I find it to be clear. |
08186b1 to
d90075e
Compare
src/script/descriptor.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to assert(m_xpubs.count(origin_info) == 0)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not at the moment: #18204 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance penalty aside, makes sense to assert it's the same xpub if already cached?
Sjors
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do what I did originally and have a pubkey cache, but I felt that to be unnecessary.
That's fine. Unless future benchmarks show that DescriptorScriptPubKeyMan::TopUp() is particularly slow (76a355e), compared to the other work that happens when loading a wallet.
Can you move 568a90f908313466f5a57c46dc3a16cc01117f70 and 23aef1932c7f8f580928eac595ad3928f306ae35 from #15528 to here? 568a90f908313466f5a57c46dc3a16cc01117f70 could use a test to show that nothing is added to cache.
src/script/descriptor.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This CExtPubKey juggling still confuses me :-)
I added some comments for my own sanity. Will probably need to read this again later.
CExtPubKey extkey_out = m_extkey; // to hold extended key for *key (m/.../k)
CExtPubKey extkey_parent = m_extkey; // to hold extended key for the parent of *key (m/.../k/pos)
src/script/descriptor.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Check if the parent at m/.../k is cached
src/script/descriptor.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Get the extended private key at m/.../k
src/script/descriptor.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Get the extended public key at m/.../k/i or m/.../k/i'
src/script/descriptor.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(one line up)
// Set parent to the extended public key at m/.../k
src/script/descriptor.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(above)
// If this is not a ranged descriptor, we return the public key at `m/.../k`, otherwise at `m/.../k/i
|
Unable to "review" at the moment due to github issues, but d90075e looks logically correct (will swing back here when back up and drop comments) |
src/script/keyorigin.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
future work: Noticed there are no unit tests for KeyOriginInfo, would be nice to have
src/script/descriptor.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any? The whole path has to be unhardened right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just the last range step. The range derivation type has to be unhardened or none. This means it either ends with * or with an absolute path. There can be hardened steps in between, and even have the last step be hardened if it is not a range.
This thing about a non-ranged path is kind of a hack to make sure that we cache the pubkey for descriptors that have keys like xprv../1'. We have some examples of this in the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, there's no test for this AFAIK, but xpriv/../*/... isn't allowed, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, so we don't do any intermediate caching if we can always generate from "root" anyways with pubkey.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, there's no test for this AFAIK, but
xpriv/../*/...isn't allowed, right?
Yes. Maybe we should add a CheckUnparsable for that.
I see, so we don't do any intermediate caching if we can always generate from "root" anyways with pubkey.
Actually no. We always cache if there are xpubs. If the descriptor has just an xpub, e.g. wpkh(xpub...), we actually would cache that xpub. If it were wpkh(xpub.../0), we would cache the derived xpub. Same if it is hardened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm agreeing with you here.
2be7987 to
31cd940
Compare
Done. Also squashed a bit and added tests. |
|
Travis and AppVeyor didn't run. Probably because of the Github problems yesterday. It might help to force push some trivial change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keeping you busy :-)
I like the progression in the commits now: b434a8769956573b2ee8eaa2b4826f0e82d39f1f maintains the existing behaviour of caching individual pub keys, except that it switches to xpubs. The next commits then reduce the amount of stuff cached.
31cd940 to
a492402
Compare
|
I've added another commit to internally cache that parent xpub inside of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That last commit may not be necessary if (nvm, there are plenty of other places where Expand() is used, e.g. Topup() tries ExpandFromCache first: #16528 (comment) (but if I'm wrong about that, then it's fine to keep it).importdescriptors, deriveaddresses)
instagibbs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code review ACK 196c976
196c976 to
d7b2411
Compare
|
code review re-ACK d7b2411 |
Sjors
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK d7b241109f74b3c1c407929930e7d035cdcf60c9 modulo intermediate commit compile issue.
src/script/descriptor.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_base_extkey could be a nice alternative
d7b2411 to
31304fc
Compare
|
re-ACK 31304fc4f8f0d7870794d8c675831d8c73c959cb |
Renaming clarifies that m_extkey is actually the root extkey that keys are derived from.
…xpand and GetPubKey Have Expand, ExpandFromCache, and ExpandHelper take additional DescriptorCache parameters. These are then passed into PubkeyProvider::GetPubKey which also takes them as arguments. Reading and writing to the cache is pushed down into GetPubKey. The old cache where pubkeys are serialized to a vector is completely removed and instead xpubs are being cached in DescriptorCache.
If unhardened derivation is used, cache the immediate derivation parent xpub and use it for unhardened derivation
Also adds tests for this: For ranged descriptors with unhardened derivation, we expect to find parent keys in the cache but no child keys. For descriptors containing an xpub but do not have unhardened derivation (i.e. hardened derivation or single xpub with or without derivation), we expect to find all of the keys in the cache, and the same number of keys in the cache as in the SigningProvider. For everything else (no xpub), nothing should be cached at all.
Optimize Expand by having BIP32PubkeyProvider also cache the parent (or only) xpub within itself. Since Expand does not provide a read cache, it is useful to internally cache this xpub to avoid re-deriving the same xpub.
31304fc to
09e2507
Compare
|
code review re-re-ACK 09e2507 |
Sjors
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK 09e2507
…xpubs 09e2507 Cache parent xpub inside of BIP32PubkeyProvider (Andrew Chow) deb791c Only cache xpubs that have a hardened last step (Andrew Chow) f76733e Cache the immediate derivation parent xpub (Andrew Chow) 58f54b6 Add DescriptorCache* read_cache and DescriptorCache* write_cache to Expand and GetPubKey (Andrew Chow) 66c2cad Rename BIP32PubkeyProvider.m_extkey to m_root_extkey (Andrew Chow) df55d44 Track the index of the key expression in PubkeyProvider (Andrew Chow) 474ea3b Introduce DescriptorCache struct which caches xpubs (Andrew Chow) Pull request description: Improves the descriptor cache by changing it from a `std::vector<unsigned char>` to a newly introduced `DescriptorCache` class. Instead of serializing pubkeys and whatever else we would want to cache in a way that may not be backwards compatible, we instead create a `DescriptorCache` object and populate it. This object contains only an xpub cache. Since the only `PubkeyProvider` that used the cache is the `BIP32PubkeyProvider` we just have it store the xpubs instead of the pubkeys. This allows us to have both the parent xpub and the child xpubs in the same container. The map is keyed by `KeyOriginInfo`. Sine we are caching `CExtPubKey`s in `DescriptorCache`, `BIP32PubKeyProviders` can use the cached parent xpubs to derive the children if unhardened derivation is used in the last step. This also means that we can still derive the keys for a `BIP32PubkeyProvider` that has hardened derivation steps. When combined with descriptor wallets, this should allow us to be able to import a descriptor with an `xprv` and hardened steps and still be able to derive from it. In that sense, this is an alternative to bitcoin#18163 To test that this works, the tests have been updated to do an additional `Expand` at the `i + 1` position. This expansion is not cached. We then do an `ExpandFromCache` at `i + 1` and use the cache that was produced by the expansion at `i`. This way, we won't have the child xpubs for `i + 1` but we will have the parent xpubs. So this checks whether the parent xpubs are being stored and can be used to derive the child keys. Descriptors that have a hardened last step are skipped for this part of the test because that will always require private keys. ACKs for top commit: instagibbs: code review re-re-ACK bitcoin@09e2507 Sjors: re-ACK 09e2507 Tree-SHA512: 95c8d0092274cdf115ce39f6d49dec767679abf3758d5b9e418afc308deca9dc6f67167980195bcc036cd9c09890bbbb39ec1dacffbfacdc03efd72a7e23b276
…xpubs 09e2507 Cache parent xpub inside of BIP32PubkeyProvider (Andrew Chow) deb791c Only cache xpubs that have a hardened last step (Andrew Chow) f76733e Cache the immediate derivation parent xpub (Andrew Chow) 58f54b6 Add DescriptorCache* read_cache and DescriptorCache* write_cache to Expand and GetPubKey (Andrew Chow) 66c2cad Rename BIP32PubkeyProvider.m_extkey to m_root_extkey (Andrew Chow) df55d44 Track the index of the key expression in PubkeyProvider (Andrew Chow) 474ea3b Introduce DescriptorCache struct which caches xpubs (Andrew Chow) Pull request description: Improves the descriptor cache by changing it from a `std::vector<unsigned char>` to a newly introduced `DescriptorCache` class. Instead of serializing pubkeys and whatever else we would want to cache in a way that may not be backwards compatible, we instead create a `DescriptorCache` object and populate it. This object contains only an xpub cache. Since the only `PubkeyProvider` that used the cache is the `BIP32PubkeyProvider` we just have it store the xpubs instead of the pubkeys. This allows us to have both the parent xpub and the child xpubs in the same container. The map is keyed by `KeyOriginInfo`. Sine we are caching `CExtPubKey`s in `DescriptorCache`, `BIP32PubKeyProviders` can use the cached parent xpubs to derive the children if unhardened derivation is used in the last step. This also means that we can still derive the keys for a `BIP32PubkeyProvider` that has hardened derivation steps. When combined with descriptor wallets, this should allow us to be able to import a descriptor with an `xprv` and hardened steps and still be able to derive from it. In that sense, this is an alternative to bitcoin#18163 To test that this works, the tests have been updated to do an additional `Expand` at the `i + 1` position. This expansion is not cached. We then do an `ExpandFromCache` at `i + 1` and use the cache that was produced by the expansion at `i`. This way, we won't have the child xpubs for `i + 1` but we will have the parent xpubs. So this checks whether the parent xpubs are being stored and can be used to derive the child keys. Descriptors that have a hardened last step are skipped for this part of the test because that will always require private keys. ACKs for top commit: instagibbs: code review re-re-ACK bitcoin@09e2507 Sjors: re-ACK 09e2507 Tree-SHA512: 95c8d0092274cdf115ce39f6d49dec767679abf3758d5b9e418afc308deca9dc6f67167980195bcc036cd9c09890bbbb39ec1dacffbfacdc03efd72a7e23b276
Summary: Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [1/7] : bitcoin/bitcoin@474ea3b Depends on D8380 Test Plan: ninja all check Reviewers: #bitcoin_abc, majcosta Reviewed By: #bitcoin_abc, majcosta Differential Revision: https://reviews.bitcoinabc.org/D8381
Summary: Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [2/7] : bitcoin/bitcoin@df55d44 Depends on D8381 Test Plan: ninja all check Reviewers: #bitcoin_abc, majcosta Reviewed By: #bitcoin_abc, majcosta Differential Revision: https://reviews.bitcoinabc.org/D8382
Summary: Renaming clarifies that m_extkey is actually the root extkey that keys are derived from. Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [3/7] : bitcoin/bitcoin@66c2cad Depends on D8382 Test Plan: ninja all check Reviewers: #bitcoin_abc, majcosta Reviewed By: #bitcoin_abc, majcosta Differential Revision: https://reviews.bitcoinabc.org/D8383
…xpand and GetPubKey Summary: Have Expand, ExpandFromCache, and ExpandHelper take additional DescriptorCache parameters. These are then passed into PubkeyProvider::GetPubKey which also takes them as arguments. Reading and writing to the cache is pushed down into GetPubKey. The old cache where pubkeys are serialized to a vector is completely removed and instead xpubs are being cached in DescriptorCache. Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [4/7] : bitcoin/bitcoin@58f54b6 Depends on D8383 Test Plan: ninja all check Reviewers: #bitcoin_abc, majcosta Reviewed By: #bitcoin_abc, majcosta Differential Revision: https://reviews.bitcoinabc.org/D8384
Summary: If unhardened derivation is used, cache the immediate derivation parent xpub and use it for unhardened derivation Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [5/7] : bitcoin/bitcoin@f76733e Depends on D8384 and D8385 Test Plan: ninja all check Reviewers: #bitcoin_abc, majcosta Reviewed By: #bitcoin_abc, majcosta Differential Revision: https://reviews.bitcoinabc.org/D8386
Summary: Also adds tests for this: For ranged descriptors with unhardened derivation, we expect to find parent keys in the cache but no child keys. For descriptors containing an xpub but do not have unhardened derivation (i.e. hardened derivation or single xpub with or without derivation), we expect to find all of the keys in the cache, and the same number of keys in the cache as in the SigningProvider. For everything else (no xpub), nothing should be cached at all. Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [6/7] : bitcoin/bitcoin@deb791c Depends on D8386 Test Plan: ninja all check Reviewers: #bitcoin_abc, majcosta Reviewed By: #bitcoin_abc, majcosta Differential Revision: https://reviews.bitcoinabc.org/D8387
Summary: Optimize Expand by having BIP32PubkeyProvider also cache the parent (or only) xpub within itself. Since Expand does not provide a read cache, it is useful to internally cache this xpub to avoid re-deriving the same xpub. Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [7/7] : bitcoin/bitcoin@09e2507 Depends on D8387 Test Plan: ninja all check-all Reviewers: #bitcoin_abc, majcosta Reviewed By: #bitcoin_abc, majcosta Differential Revision: https://reviews.bitcoinabc.org/D8388
Improves the descriptor cache by changing it from a
std::vector<unsigned char>to a newly introducedDescriptorCacheclass. Instead of serializing pubkeys and whatever else we would want to cache in a way that may not be backwards compatible, we instead create aDescriptorCacheobject and populate it. This object contains only an xpub cache. Since the onlyPubkeyProviderthat used the cache is theBIP32PubkeyProviderwe just have it store the xpubs instead of the pubkeys. This allows us to have both the parent xpub and the child xpubs in the same container. The map is keyed byKeyOriginInfo.Sine we are caching
CExtPubKeys inDescriptorCache,BIP32PubKeyProviderscan use the cached parent xpubs to derive the children if unhardened derivation is used in the last step. This also means that we can still derive the keys for aBIP32PubkeyProviderthat has hardened derivation steps. When combined with descriptor wallets, this should allow us to be able to import a descriptor with anxprvand hardened steps and still be able to derive from it. In that sense, this is an alternative to #18163To test that this works, the tests have been updated to do an additional
Expandat thei + 1position. This expansion is not cached. We then do anExpandFromCacheati + 1and use the cache that was produced by the expansion ati. This way, we won't have the child xpubs fori + 1but we will have the parent xpubs. So this checks whether the parent xpubs are being stored and can be used to derive the child keys. Descriptors that have a hardened last step are skipped for this part of the test because that will always require private keys.