-
Notifications
You must be signed in to change notification settings - Fork 38.7k
wallet: reduce loading time by using unordered maps #16910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
e963dd5 to
4ed9dd6
Compare
|
Shuffled some things around to get rid of the circular dependency. Also fixed a typo |
|
Concept ACK, especially improvements targeting big wallets (either lots of keys or lots of transactions). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, will review and run valgrind+massif visualizer and flamegraphs to compare. Anywhere we can get a big wallet file? EDIT: Saw the script you posted in the PR body above... running.
|
To benchmark this, I applied this commit: achow101@16ebe13. It adds logging of the load time and also stops bitcoind after the wallet has been loaded in order to get more consistent results. Here are a couple graphs that may be interesting to people. Flame graph (svg hosted on my website, couldn't inline in this comment) of master: https://achow101.com/bitcoin-master-flame.svg Flame graph of this PR: https://achow101.com/bitcoin-reduce-wallet-load-flame.svg |
src/wallet/wallet.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a bug here. Previously it would return true if any output was spent, now it will only return true if the first output was spent. Could fix pretty easily by passing const CTransaction& instead of txid, and looping over the outputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a bug here. Previously it would return true if any output was spent, now it will only return true if the first output was spent. Could fix pretty easily by passing const CTransaction& instead of txid, and looping over the outputs.
A suggestion for a followup: it could be useful to add a test for it to prevent regressions in the future.
f97e249 to
45a3ec5
Compare
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
I did 1000 runs of my wallet loading benchmark. The mean loading time on master was 18408.14423 ms and the mean loading time on with this PR was 17498.71984. This is ~1 second faster, which is a ~5% performance increase with this PR. I also did a 2 sample t-test to check that these results are statistically significant, and it appears that they are. |
45a3ec5 to
73bb122
Compare
73bb122 to
de0f388
Compare
3324a42 to
c26b49f
Compare
c26b49f to
fe0b951
Compare
f5428d4 to
05afffb
Compare
|
Concept ACK on using hashed containers. |
hebasto
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not using auto for iterators type? It could improve readability a bit :)
hebasto
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
05afffb4396b3a542d9a001642cad5b1afbc7978 changes look correct.
In commit 05afffb4396b3a542d9a001642cad5b1afbc7978 message s/mapScriptMetadata/m_script_metadata/
src/saltedhash.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
42bc524e82e627909ef4d4825728cd0f30ac740f
This comment could be dropped as all available reference docs refer to the size_t return type for a hasher operator().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is just moved code, I'm going to leave as is.
05afffb to
f9304b6
Compare
Done |
f9304b6 to
4b3439f
Compare
|
Had to rebase onto master to pick up the uint160 changes to CKeyID and CScriptID |
hebasto
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 4b3439ffa93e7de1ce2fefcb16e989e5a1805569, tested on Linux Mint 20 (x86_64).
Did not make benchmarks, though.
e89c19a157d504492e0665fee81cac496d5fc7f6
Why are operator() definitions not placed in the header, that could imply inlining?
I'm not sure that it matters since we never call |
SaltedKeyIDHasher uses CSipHasher for hashing CKeyIDs SaltedScriptIDHasher uses CSipHasher for hashing CScriptIDs SaltedPubkeyasher uses CSipHasher for hashing CPubKeys SaltedScriptHasher uses CSipHasher for hashing CScripts
…e wallet Change mapKeys, mapScripts, mapCryptedKeys, mapKeyMetadata, m_script_metadata, mapWatchKeys to use std::unordered_map. Change setWatchOnly to use std::unordered_set
4b3439f to
8e3d1e7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
This has been open for a while without much review. Closing for now, I'll try to integrate some of these changes the next time I touch the relevant places (e.g. |


For large wallets with many transactions and keys, inserting and fetching from a
std::mapcan have a performance impact. Since we largely do not rely on the sorting properties ofstd::map, we can usestd::unordered_mapinstead which is a hash table and insertion and retrieval are constant time. We also usestd::unordered_multimapfor some things that werestd::multimap.The changed maps are:
mapWallet: Map of all transactionsmapTxSpends: Map of outpoints to the txids of the txs that spent them*
mapKeyMetadata: Map of key metadata*
mapScriptMetadata: Map of script metadata*
mapKeys: Map of private keys*
mapCryptedKeys: Map of encrypted private keys*
mapWatchKeys: Set of watch only keys.*
setWatchOnly: Set of watch only scripts (std::unordered_set, not a map)Change
mapWalletandmapTxSpendsdid require a few other changes and thus they are in their own commits. Additionally, the GUI relied on getting transactions from the wallet in a sorted order which unordered_map does not provide, so the commit "Change getWalletTxs to return a set instead of a vector" is needed in order to put all of the txs into a std::set (which is ordered) instead of a vector in order to retain the same behavior.mapTxSpendsalso relied on the sorted order to have some quick lookups, but these were changed to just do normal lookups over the same thing. It should be just as fast, or even faster, since std::unordered_map is a hash table.The hash function used for these unordered maps and sets is SipHash, using the SipHash module that we already have.
SaltedTxidHasherandSaltedOutPointHasherwere moved from their original locations in utxo set and validation code in order to also be used from the wallet. AdditionallySaltedIDHasherwas added to hash uint160s (i.e.CKeyIDandCScriptID) andSaltedScriptHasherwas added to hashCScripts.I did some becnhmarks with a large wallet I created on regtest using this script. This wallet is 169 MB in size with 117035 transactions and 103264 keys. It took ~20 secs to load on master, and ~18 secs with this change. So this change reduces wallet loading time by ~10%.