-
Notifications
You must be signed in to change notification settings - Fork 38.7k
Change UpdateForDescendants to use Epochs #18120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change UpdateForDescendants to use Epochs #18120
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsNo conflicts as of last run. |
sdaftuar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK, this looks correct to me.
I tried to benchmark some sort of "worst-case" scenario, where all of a block's transactions have all of the mempool as descendants, making the number of in-mempool transactions and number of in-block transactions as large as possible. I came up with a scenario of a block that has 250 transactions and a mempool of 77,500 transactions, and measured how long UpdateTransactionsFromBlock takes on my machine after disconnecting that block. In the 0.19 branch, this took about 54 seconds. In master, this takes about 48 seconds, and after this commit it takes about 25 seconds.
(25 seconds is still terrible, of course, and we should continue to find ways to make structural improvements so that we don't have long delays in the critical path of block acceptance, but this change makes sense to me as a significant algorithmic improvement.)
| vecEntries update_cache; | ||
| update_cache.reserve(direct_children.size()); | ||
| // mark every direct_child as visited so that we don't accidentally re-add them | ||
| // to the cache in the grandchild is child case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: This comment could be improved, for code readers not as familiar with the issues here. Perhaps something like:
mark every direct child as visited, so that if we encounter one of these transactions again
(as the descendant of some other direct child), we don't re-add them to the cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe adding an example helps?
mark every direct child as visited, so that if we encounter one of these transactions again
(as the descendant of some other direct child), we don't re-add them to the cache. e.g.,
Tx A: L1 -> {O1, O2}
Tx B: {O1} -> {P1}
Tx C: {P1, O2} -> {Q1}
In this case transaction C is both a direct child of A and a grandchild via B.
I can improve the documentation with this in the follow up PR where we clean this up more (because maybe we'd just end up removing the caching anyways).
|
Dismaying but I noticed another flaw here; I don't think we ever check if the direct children of a transaction are in the cache. I'll fix this in the follow up PR for review there. |
Refactors UpdateForDescendants to use Epochs instead of sets to de-duplicate traversal, and replaces the cache entries with a vector instead of a set. This is a straightforward win. The algorithm is a bit clever so as not to require two separate vectors for txiters that need expanding and txiters that have already been processed, but I think it's relatively easy to reason about.
extracted from #18063 as a standalone because it turns out the DoS issues with this code perhaps merit more extensive review http://gnusha.org/bitcoin-core-dev/2020-02-11.log. That PR will stay open for review, but this PR is designed as an "obviously better" win that can be merged now to keep progress up on the mempool project.