Skip to content

Conversation

@achow101
Copy link
Member

@achow101 achow101 commented Feb 25, 2020

Improves the descriptor cache by changing it from a std::vector<unsigned char> to a newly introduced DescriptorCache class. Instead of serializing pubkeys and whatever else we would want to cache in a way that may not be backwards compatible, we instead create a DescriptorCache object and populate it. This object contains only an xpub cache. Since the only PubkeyProvider that used the cache is the BIP32PubkeyProvider we just have it store the xpubs instead of the pubkeys. This allows us to have both the parent xpub and the child xpubs in the same container. The map is keyed by KeyOriginInfo.

Sine we are caching CExtPubKeys in DescriptorCache, BIP32PubKeyProviders can use the cached parent xpubs to derive the children if unhardened derivation is used in the last step. This also means that we can still derive the keys for a BIP32PubkeyProvider that has hardened derivation steps. When combined with descriptor wallets, this should allow us to be able to import a descriptor with an xprv and hardened steps and still be able to derive from it. In that sense, this is an alternative to #18163

To test that this works, the tests have been updated to do an additional Expand at the i + 1 position. This expansion is not cached. We then do an ExpandFromCache at i + 1 and use the cache that was produced by the expansion at i. This way, we won't have the child xpubs for i + 1 but we will have the parent xpubs. So this checks whether the parent xpubs are being stored and can be used to derive the child keys. Descriptors that have a hardened last step are skipped for this part of the test because that will always require private keys.

@DrahtBot
Copy link
Contributor

DrahtBot commented Feb 25, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK.

I'm seeing errors on Travis and locally:

AssertionError: Unexpected stderr Assertion failed: (fValid), function GetPubKey, file key.cpp, line 185.

Copy link
Member

@instagibbs instagibbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is getting importmulti test failures

@achow101
Copy link
Member Author

Fixed the assertion. Travis seems to be seeing a memory leak but I'm having trouble figuring out where that is.

@achow101
Copy link
Member Author

I've reworked how the caching works to get rid of the global_pos and internal_pos stuff.

Instead of having ExpandHelper go through the cache and find the pubkeys, we pass the cache into the PubkeyProviders and let them do the cache lookups. Since really only BIP32PubkeyProvider uses the cache, we can get rid of the position stuff by having it store only xpubs in a map. So for all derived keys, we store their KeyOriginInfo and xpub so that later the BIP32PubkeyProvider can look them up. The last parent is still being stored so that additional derivation can occur.

This consolidates cache reads and writes and encapsulates them in the classes that actually use the cache.

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a544f0e7459f346a1bf129c9dc978fe46c6abf73 looks correct. Much clearer than before, but still some confusing bits.

I find the separation of 4f24907f4007b2786ee39dcb8d6f86e52494db10 a6feb5d1aee6ae08b87029561873471cdd62a36a confusing, with the ExpandFromCache signature changing in two steps.

Switching from pubkeys to xpubs plus origin info increases the cache size by ~40 bytes per key, or 240 KB for a typical descriptor wallet with change & receive for three output types, with the first 1000 keys expanded. That's probably acceptable, though maybe the serialization could be optimized to leave out the chain code for "leaf" keys (meh, complexity).

This could be a good time to rename m_derive to m_range_derive to make it clear that this refers to whether a given (final?) path element is (hardened) ranged or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth skipping the parent write once it's done?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lookup is just as expensive as an insertion, so I don't think so.

@achow101
Copy link
Member Author

I've squashed the middle three commits together (the ones that just changed to using DescriptorCache) so that the signature of Expand and ExpandHelper don't change so many times.

Switching from pubkeys to xpubs plus origin info increases the cache size by ~40 bytes per key, or 240 KB for a typical descriptor wallet with change & receive for three output types, with the first 1000 keys expanded. That's probably acceptable, though maybe the serialization could be optimized to leave out the chain code for "leaf" keys (meh, complexity).

We could do what I did originally and have a pubkey cache, but I felt that to be unnecessary. We would still be storing KeyOriginInfo.

This could be a good time to rename m_derive to m_range_derive to make it clear that this refers to whether a given (final?) path element is (hardened) ranged or not.

Why? I find it to be clear.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to assert(m_xpubs.count(origin_info) == 0)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at the moment: #18204 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance penalty aside, makes sense to assert it's the same xpub if already cached?

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do what I did originally and have a pubkey cache, but I felt that to be unnecessary.

That's fine. Unless future benchmarks show that DescriptorScriptPubKeyMan::TopUp() is particularly slow (76a355e), compared to the other work that happens when loading a wallet.

Can you move 568a90f908313466f5a57c46dc3a16cc01117f70 and 23aef1932c7f8f580928eac595ad3928f306ae35 from #15528 to here? 568a90f908313466f5a57c46dc3a16cc01117f70 could use a test to show that nothing is added to cache.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This CExtPubKey juggling still confuses me :-)

I added some comments for my own sanity. Will probably need to read this again later.

CExtPubKey extkey_out = m_extkey;    // to hold extended key for *key (m/.../k)
CExtPubKey extkey_parent = m_extkey; // to hold extended key for the parent of *key (m/.../k/pos)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Check if the parent at m/.../k is cached

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Get the extended private key at m/.../k

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Get the extended public key at m/.../k/i or m/.../k/i'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(one line up)

// Set parent to the extended public key at m/.../k

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(above)

// If this is not a ranged descriptor, we return the public key at `m/.../k`, otherwise at `m/.../k/i

@instagibbs
Copy link
Member

Unable to "review" at the moment due to github issues, but d90075e looks logically correct (will swing back here when back up and drop comments)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

future work: Noticed there are no unit tests for KeyOriginInfo, would be nice to have

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any? The whole path has to be unhardened right?

Copy link
Member Author

@achow101 achow101 Feb 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the last range step. The range derivation type has to be unhardened or none. This means it either ends with * or with an absolute path. There can be hardened steps in between, and even have the last step be hardened if it is not a range.

This thing about a non-ranged path is kind of a hack to make sure that we cache the pubkey for descriptors that have keys like xprv../1'. We have some examples of this in the tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, there's no test for this AFAIK, but xpriv/../*/... isn't allowed, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so we don't do any intermediate caching if we can always generate from "root" anyways with pubkey.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify, there's no test for this AFAIK, but xpriv/../*/... isn't allowed, right?

Yes. Maybe we should add a CheckUnparsable for that.

I see, so we don't do any intermediate caching if we can always generate from "root" anyways with pubkey.

Actually no. We always cache if there are xpubs. If the descriptor has just an xpub, e.g. wpkh(xpub...), we actually would cache that xpub. If it were wpkh(xpub.../0), we would cache the derived xpub. Same if it is hardened.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm agreeing with you here.

@achow101 achow101 force-pushed the desc-xpub-cache branch 2 times, most recently from 2be7987 to 31cd940 Compare February 27, 2020 18:25
@achow101
Copy link
Member Author

Can you move 568a90f and 23aef19 from #16528 to here? 568a90f could use a test to show that nothing is added to cache.

Done. Also squashed a bit and added tests.

@Sjors
Copy link
Member

Sjors commented Feb 28, 2020

Travis and AppVeyor didn't run. Probably because of the Github problems yesterday. It might help to force push some trivial change.

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping you busy :-)

I like the progression in the commits now: b434a8769956573b2ee8eaa2b4826f0e82d39f1f maintains the existing behaviour of caching individual pub keys, except that it switches to xpubs. The next commits then reduce the amount of stuff cached.

@achow101
Copy link
Member Author

I've added another commit to internally cache that parent xpub inside of BIP32PubkeyProvider so the Expand doesn't constantly re-derive the same xpub.

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That last commit may not be necessary if Topup() tries ExpandFromCache first: #16528 (comment) (but if I'm wrong about that, then it's fine to keep it). (nvm, there are plenty of other places where Expand() is used, e.g. importdescriptors, deriveaddresses)

Copy link
Member

@instagibbs instagibbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code review ACK 196c976

@instagibbs
Copy link
Member

code review re-ACK d7b2411

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK d7b241109f74b3c1c407929930e7d035cdcf60c9 modulo intermediate commit compile issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m_base_extkey could be a nice alternative

@Sjors
Copy link
Member

Sjors commented Mar 7, 2020

re-ACK 31304fc4f8f0d7870794d8c675831d8c73c959cb

achow101 added 7 commits March 7, 2020 10:13
Renaming clarifies that m_extkey is actually the root
extkey that keys are derived from.
…xpand and GetPubKey

Have Expand, ExpandFromCache, and ExpandHelper take additional DescriptorCache
parameters. These are then passed into PubkeyProvider::GetPubKey which
also takes them as arguments.

Reading and writing to the cache is pushed down into GetPubKey. The old cache where
pubkeys are serialized to a vector is completely removed and instead xpubs are being
cached in DescriptorCache.
If unhardened derivation is used, cache the immediate derivation
parent xpub and use it for unhardened derivation
Also adds tests for this:
For ranged descriptors with unhardened derivation, we expect to
find parent keys in the cache but no child keys.

For descriptors containing an xpub but do not have unhardened derivation
(i.e. hardened derivation or single xpub with or without derivation),
we expect to find all of the keys in the cache, and the same
number of keys in the cache as in the SigningProvider.

For everything else (no xpub), nothing should be cached at all.
Optimize Expand by having BIP32PubkeyProvider also cache the parent
(or only) xpub within itself. Since Expand does not provide a read
cache, it is useful to internally cache this xpub to avoid re-deriving
the same xpub.
@instagibbs
Copy link
Member

code review re-re-ACK 09e2507

@fanquake fanquake requested a review from Sjors March 10, 2020 01:47
Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 09e2507

@fanquake fanquake requested a review from sipa March 10, 2020 23:49
@laanwj laanwj merged commit 7f8176a into bitcoin:master Mar 13, 2020
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Mar 14, 2020
…xpubs

09e2507 Cache parent xpub inside of BIP32PubkeyProvider (Andrew Chow)
deb791c Only cache xpubs that have a hardened last step (Andrew Chow)
f76733e Cache the immediate derivation parent xpub (Andrew Chow)
58f54b6 Add DescriptorCache* read_cache and DescriptorCache* write_cache to Expand and GetPubKey (Andrew Chow)
66c2cad Rename BIP32PubkeyProvider.m_extkey to m_root_extkey (Andrew Chow)
df55d44 Track the index of the key expression in PubkeyProvider (Andrew Chow)
474ea3b Introduce DescriptorCache struct which caches xpubs (Andrew Chow)

Pull request description:

  Improves the descriptor cache by changing it from a `std::vector<unsigned char>` to a newly introduced `DescriptorCache` class. Instead of serializing pubkeys and whatever else we would want to cache in a way that may not be backwards compatible, we instead create a `DescriptorCache` object and populate it. This object contains only an xpub cache. Since the only `PubkeyProvider` that used the cache is the `BIP32PubkeyProvider` we just have it store the xpubs instead of the pubkeys. This allows us to have both the parent xpub and the child xpubs in the same container. The map is keyed by `KeyOriginInfo`.

  Sine we are caching `CExtPubKey`s in `DescriptorCache`, `BIP32PubKeyProviders` can use the cached parent xpubs to derive the children if unhardened derivation is used in the last step. This also means that we can still derive the keys for a `BIP32PubkeyProvider` that has hardened derivation steps. When combined with descriptor wallets, this should allow us to be able to import a descriptor with an `xprv` and hardened steps and still be able to derive from it. In that sense, this is an alternative to bitcoin#18163

  To test that this works, the tests have been updated to do an additional `Expand` at the `i + 1` position. This expansion is not cached. We then do an `ExpandFromCache` at `i + 1` and use the cache that was produced by the expansion at `i`. This way, we won't have the child xpubs for `i + 1` but we will have the parent xpubs. So this checks whether the parent xpubs are being stored and can be used to derive the child keys. Descriptors that have a hardened last step are skipped for this part of the test because that will always require private keys.

ACKs for top commit:
  instagibbs:
    code review re-re-ACK bitcoin@09e2507
  Sjors:
    re-ACK 09e2507

Tree-SHA512: 95c8d0092274cdf115ce39f6d49dec767679abf3758d5b9e418afc308deca9dc6f67167980195bcc036cd9c09890bbbb39ec1dacffbfacdc03efd72a7e23b276
sidhujag pushed a commit to syscoin-core/syscoin that referenced this pull request Nov 10, 2020
…xpubs

09e2507 Cache parent xpub inside of BIP32PubkeyProvider (Andrew Chow)
deb791c Only cache xpubs that have a hardened last step (Andrew Chow)
f76733e Cache the immediate derivation parent xpub (Andrew Chow)
58f54b6 Add DescriptorCache* read_cache and DescriptorCache* write_cache to Expand and GetPubKey (Andrew Chow)
66c2cad Rename BIP32PubkeyProvider.m_extkey to m_root_extkey (Andrew Chow)
df55d44 Track the index of the key expression in PubkeyProvider (Andrew Chow)
474ea3b Introduce DescriptorCache struct which caches xpubs (Andrew Chow)

Pull request description:

  Improves the descriptor cache by changing it from a `std::vector<unsigned char>` to a newly introduced `DescriptorCache` class. Instead of serializing pubkeys and whatever else we would want to cache in a way that may not be backwards compatible, we instead create a `DescriptorCache` object and populate it. This object contains only an xpub cache. Since the only `PubkeyProvider` that used the cache is the `BIP32PubkeyProvider` we just have it store the xpubs instead of the pubkeys. This allows us to have both the parent xpub and the child xpubs in the same container. The map is keyed by `KeyOriginInfo`.

  Sine we are caching `CExtPubKey`s in `DescriptorCache`, `BIP32PubKeyProviders` can use the cached parent xpubs to derive the children if unhardened derivation is used in the last step. This also means that we can still derive the keys for a `BIP32PubkeyProvider` that has hardened derivation steps. When combined with descriptor wallets, this should allow us to be able to import a descriptor with an `xprv` and hardened steps and still be able to derive from it. In that sense, this is an alternative to bitcoin#18163

  To test that this works, the tests have been updated to do an additional `Expand` at the `i + 1` position. This expansion is not cached. We then do an `ExpandFromCache` at `i + 1` and use the cache that was produced by the expansion at `i`. This way, we won't have the child xpubs for `i + 1` but we will have the parent xpubs. So this checks whether the parent xpubs are being stored and can be used to derive the child keys. Descriptors that have a hardened last step are skipped for this part of the test because that will always require private keys.

ACKs for top commit:
  instagibbs:
    code review re-re-ACK bitcoin@09e2507
  Sjors:
    re-ACK 09e2507

Tree-SHA512: 95c8d0092274cdf115ce39f6d49dec767679abf3758d5b9e418afc308deca9dc6f67167980195bcc036cd9c09890bbbb39ec1dacffbfacdc03efd72a7e23b276
jasonbcox pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 13, 2020
Summary:
Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [1/7] : bitcoin/bitcoin@474ea3b

Depends on D8380

Test Plan:
  ninja all check

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D8381
jasonbcox pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 13, 2020
Summary:
Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [2/7] : bitcoin/bitcoin@df55d44

Depends on D8381

Test Plan:
  ninja all check

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D8382
jasonbcox pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 13, 2020
Summary:
Renaming clarifies that m_extkey is actually the root
extkey that keys are derived from.

Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [3/7] : bitcoin/bitcoin@66c2cad

Depends on D8382

Test Plan:
  ninja all check

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D8383
jasonbcox pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 13, 2020
…xpand and GetPubKey

Summary:
Have Expand, ExpandFromCache, and ExpandHelper take additional DescriptorCache
parameters. These are then passed into PubkeyProvider::GetPubKey which
also takes them as arguments.

Reading and writing to the cache is pushed down into GetPubKey. The old cache where
pubkeys are serialized to a vector is completely removed and instead xpubs are being
cached in DescriptorCache.

Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [4/7] : bitcoin/bitcoin@58f54b6

Depends on D8383

Test Plan:
  ninja all check

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D8384
jasonbcox pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 13, 2020
Summary:
If unhardened derivation is used, cache the immediate derivation
parent xpub and use it for unhardened derivation

Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [5/7] : bitcoin/bitcoin@f76733e

Depends on D8384 and D8385

Test Plan:
  ninja all check

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D8386
deadalnix pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 13, 2020
Summary:
Also adds tests for this:
For ranged descriptors with unhardened derivation, we expect to
find parent keys in the cache but no child keys.

For descriptors containing an xpub but do not have unhardened derivation
(i.e. hardened derivation or single xpub with or without derivation),
we expect to find all of the keys in the cache, and the same
number of keys in the cache as in the SigningProvider.

For everything else (no xpub), nothing should be cached at all.

Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [6/7] : bitcoin/bitcoin@deb791c

Depends on D8386

Test Plan:
  ninja all check

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D8387
deadalnix pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Nov 13, 2020
Summary:
Optimize Expand by having BIP32PubkeyProvider also cache the parent
(or only) xpub within itself. Since Expand does not provide a read
cache, it is useful to internally cache this xpub to avoid re-deriving
the same xpub.

Backport of Core [[bitcoin/bitcoin#18204 | PR18204]] part [7/7] : bitcoin/bitcoin@09e2507

Depends on D8387

Test Plan:
  ninja all check-all

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D8388
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants