-
Notifications
You must be signed in to change notification settings - Fork 38.7k
validation: Add missing mempool locks #14193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
Is the idea that you're trying to prevent RPC users from having an inconsistent state, like if they call getbestblockhash() and then call getrawmempool(), the latter should be consistent with the former? That may be reasonable, but I think it'd be good to have some documentation (whether a project document or even a gist explaining our thinking) explaining the consistency guarantees for RPC calls with respect to the mempool, validation, and the wallet. I feel like this kind of thing has come up before in a haphazard way and caused issues when we didn't think things through or missed an implication of a change. (Also, I think it would aid RPC users if they knew what our consistency guarantees were.) |
From the description, I assumed the intent of the PR was to prevent RPCs like But actually it seems like in all the places where the mempool is updated, the code already has both So maybe this PR is not doing what it intended to do, and instead it would be better just to add comments and fix thread safety annotations somehow. Maybe it would be possible to annotate |
|
@ryanofsky With inconsistent I don't mean invalid reads, but rather transactions that are invalid (and still in the process of removal) as of our current block tip. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code utACK fa3e0471a81252e333886b97d8d33bb275061d42. Code change seems perfectly fine, and it could perhaps lead to more consistency in some cases, but like the previous comment #14193 (comment) says, it's unclear which cases are better off after this change or what case motivated this PR. More comments and documentation would be helpful.
faf10f4 to
732a25a
Compare
|
@sdaftuar, @ryanofsky: Added a test with comments to illustrate my motivation a bit more. |
faecda3 to
fa27735
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Started review:
- dea5dbf2d5ca0177d3a4be665790288bef40e27e validation: Add missing mempool locks (1/2)
- fa27735e0f1fd52f03a17d7e9b94a66f310ddfa1 [test] Add test to check mempool consistency in case of reorgs (2/2)
I didn't look closely at the test yet, since it seems pretty complex. If you could a write a one or two sentence comment before the test describing how it works, that might be helpful.
I also think it would be useful to have a comment on the mempool::cs variable. It's definitely being used in a way I wouldn't have expected. From first appearance, it looks like it's just being used to protect mapTx and allow maybe reads & writes from multiple threads:
Lines 488 to 489 in 5d605b2
| mutable CCriticalSection cs; | |
| indexed_transaction_set mapTx GUARDED_BY(cs); |
But apparently it's supposed to not just provide thread safety, but also guarantee consistency between mempool and chainActive? Would suggest maybe:
/**
* Lock to guarantee consistency between mempool and external chain state.
*
* Lock must be held whenever mempool is read from or written to and
* whenever a block is connected. Lock can be released when the mempool
* contains no transactions that conflict internally or externally. It
* should be acquired after `cs_main` if both locks need to be held at the
* same time.
*/
mutable CCriticalSection cs;re: #14193 (comment)
I think it'd be good to have some documentation (whether a project document or even a gist explaining our thinking) explaining the consistency guarantees for RPC calls with respect to the mempool, validation, and the wallet
Documentation was added in #14592.
|
The test is trivial, but looks scary because of the boilerplate overhead of the cpp unit tests. Pretty much the only thing it is doing is to check that during a reorg a (simulated) rpc thread (e.g. getrawmempool) would either return a consistent mempool as of before the reorg or a consistent mempool as of after the reorg and never something inconsistent in the middle of a reorg. |
fa0515e to
fad0712
Compare
|
Added a commit with further comments and documentation. |
jamesob
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change in general seems fine to me (mempool state should update atomically after a contiguous batch of chain maintenance and not during) and the test is great, but it might be nice to:
- explain what the motivations are. There's an argument for RPC client consistency, but IIRC this may also be a prereq for Dandelion?
- add comments for new lock acquisitions explaining why we're grabbing a seemingly unrelated lock at a coarse grain (in line with what @ryanofsky is saying).
In practice, ActivateBestChain calls are very short and so I don't think there are any real-world downsides to locking ::mempool.cs for their duration - although I'm curious if this affects mempool access during IBD/reindex.
fad0712 to
fab7d54
Compare
I think I already did that. Let me know if any of them are still unclear. |
fab7d54 to
fae71ff
Compare
jamesob
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK fae71ff
Just a few clarifying questions.
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We require the pool.cs lock for the duration of this function (instead of just relying on the lock in pool.Expire) because we want to keep the mempool from changing while pcoinsTip modifications happen, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, calling LimitMempoolSize only makes sense after you have written to the mempool, in which case you already have acquired the lock and can keep it for LimitMempoolSize.
fae7ec9 to
7d68982
Compare
7d68982 to
fa2b083
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK fa2b083 [EDIT: was e284e422e75189794e24fe482819d8b1407857c3, from bad copy and paste]. Changes since last review: rebase after #15976, adding vTxHashes lock annotation, adding new commit dropping mempool lock for nTransactionsUpdated and making it atomic to avoid deadlock between mempool lock and g_best_block_mutex
promag
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, some comments.
| CTxMemPool::CTxMemPool(CBlockPolicyEstimator* estimator) : | ||
| nTransactionsUpdated(0), minerPolicyEstimator(estimator) | ||
| CTxMemPool::CTxMemPool(CBlockPolicyEstimator* estimator) | ||
| : nTransactionsUpdated(0), minerPolicyEstimator(estimator) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👀
| // is consistent. That is, it has all txs that were there | ||
| // before the reorg. | ||
| assert(::mempool.mapTx.size() == txs.size()); | ||
| continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why continue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the thread should continue and not exit (for now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would it exit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would exit when it breaks. I think this is well documented in the test case and I am not sure what you are asking exactly.
If you are asking about cpp syntax, I find cppreference really helpful. E.g. https://en.cppreference.com/w/cpp/language/break
If you are asking about this test case logic, please write a more verbose question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this either. If you have a loop, and an unconditional continue is the last statement in the loop, the continue isn't doing anything and could be removed.
bitcoin/src/test/validation_block_tests.cpp
Line 316 in fa2b083
| continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I now I understand that you were asking why I put a redundant continue there.
That was intentional, because I felt it was more clear back when I wrote the test. Happy to remove, but I think we shouldn't spend too much time on style of tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't mean to discuss style! I honestly though something was wrong because of the pointless continue, like missing code or something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good point. Sorry for the confusion.
| // This thread is checking that the mempool either contains all of | ||
| // the transactions invalidated by the reorg, or none of them, and | ||
| // not some intermediate amount. | ||
| while (true) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is flawless, what guarantees that there's a non-atomic change once mempool.cs is released?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code changes in this pull request should guarantee that this doesn't happen here in the test (you can check it by running the test before the code changes and after), as well as for "real" rpc polling threads. Though, I didn't write a functional test for this.
|
code review ACK fa2b083 @ryanofsky didn't you ack the wrong commit above ? e284e422e75189794e24fe482819d8b1407857c3 ("Remove getBlockDepth method from Chain::interface") isn't part of this PR |
Sorry, bad copy and paste from the command line. Edited #14193 (review) to fix. |
|
Thanks! |
fa2b083 [test] Add test to check mempool consistency in case of reorgs (MarcoFalke) fabeb1f validation: Add missing mempool locks (MarcoFalke) fa0c9db txpool: Make nTransactionsUpdated atomic (MarcoFalke) Pull request description: Take the mempool read lock during reorgs, so that we don't accidentally read an inconsistent mempool. ACKs for top commit: laanwj: code review ACK fa2b083 ryanofsky: utACK fa2b083 [EDIT: was ~e284e422e75189794e24fe482819d8b1407857c3~, from bad copy and paste]. Changes since last review: rebase after #15976, adding vTxHashes lock annotation, adding new commit dropping mempool lock for nTransactionsUpdated and making it atomic to avoid deadlock between mempool lock and g_best_block_mutex Tree-SHA512: cfe7777993589087753e000e3736d79d320dca412383fb77b56bef8946a04049722bf888c11b6f722adf677165185c7e58b4a269f7c5fa25e84dda375f6c8a7d
fa2b083 [test] Add test to check mempool consistency in case of reorgs (MarcoFalke) fabeb1f validation: Add missing mempool locks (MarcoFalke) fa0c9db txpool: Make nTransactionsUpdated atomic (MarcoFalke) Pull request description: Take the mempool read lock during reorgs, so that we don't accidentally read an inconsistent mempool. ACKs for top commit: laanwj: code review ACK fa2b083 ryanofsky: utACK fa2b083 [EDIT: was ~e284e422e75189794e24fe482819d8b1407857c3~, from bad copy and paste]. Changes since last review: rebase after bitcoin#15976, adding vTxHashes lock annotation, adding new commit dropping mempool lock for nTransactionsUpdated and making it atomic to avoid deadlock between mempool lock and g_best_block_mutex Tree-SHA512: cfe7777993589087753e000e3736d79d320dca412383fb77b56bef8946a04049722bf888c11b6f722adf677165185c7e58b4a269f7c5fa25e84dda375f6c8a7d
6b6be41 gui: Make polling in ClientModel asynchronous (João Barbosa) Pull request description: After #14193 `ClientModel::updateTimer` can take some time, as such the GUI hangs, like #17112. Fixes this by polling in a background thread and updating the GUI asynchronously. ACKs for top commit: laanwj: ACK 6b6be41 Sjors: Code review re-ACK 6b6be41; only replaced the scary cast with `{ timer->start(); }` Tree-SHA512: fd98b0c6535441aee3ee03c48b58b4b1f9bdd172ec6b8150da883022f719df34cabfd4c133412bf410e7f709f7bf1e9ef16dca05ef1f3689d526ceaeee51de38
Summary: This was originally feedback given to me on D6539. Since then I've decided to pass CTxMemPool to the addForBlock method too. This avoids having to declare an `extern CTxMemPool g_mempool` symbol on txmempool.h, and is nicer in general. Alternatively, I've considered adding a `CTxMemPool& m_pool` to DisconnectedBlockTransactions plus an explicit constructor on the assumption that such objects can only ever have one mempool they work with, but went with this instead for simplicity. The above changes are unrelated to [[bitcoin/bitcoin#14193 | PR14193]] however, so I'm putting them on a separate diff not to overcomplicate review of the latter. Test Plan: ninja check check-functional Reviewers: #bitcoin_abc, deadalnix Reviewed By: #bitcoin_abc, deadalnix Differential Revision: https://reviews.bitcoinabc.org/D7439
Summary: bitcoin/bitcoin@fa0c9db --- Depends on D7439 Partial backport of Core [[bitcoin/bitcoin#14193 | PR14193]] Test Plan: ninja check-all Reviewers: #bitcoin_abc, deadalnix Reviewed By: #bitcoin_abc, deadalnix Differential Revision: https://reviews.bitcoinabc.org/D6537
Summary: bitcoin/bitcoin@fabeb1f --- Depends on D6537 Partial backport of Core [[bitcoin/bitcoin#14193 | PR14193]] Test Plan: cmake .. -DCMAKE_BUILD_TYPE=Debug ninja check-all Reviewers: #bitcoin_abc, Fabien, deadalnix Reviewed By: #bitcoin_abc, Fabien, deadalnix Subscribers: deadalnix, Fabien Differential Revision: https://reviews.bitcoinabc.org/D6539
…of reorgs Summary: bitcoin/bitcoin@fa2b083 Depends on D6539 Concludes backport of Core [[bitcoin/bitcoin#14193 | PR14193]] Test Plan: cmake -DCMAKE_BUILD_TYPE=Debug ninja check check-functional Reviewers: #bitcoin_abc, deadalnix Reviewed By: #bitcoin_abc, deadalnix Differential Revision: https://reviews.bitcoinabc.org/D7440
zcash: cherry picked from commit fa0c9db zcash: bitcoin/bitcoin#14193
zcash: cherry picked from commit fabeb1f zcash: bitcoin/bitcoin#14193
Take the mempool read lock during reorgs, so that we don't accidentally read an inconsistent mempool.