fix(mempool)!: stop accepting TXs in the mempool if we can't keep up with reCheckTX#3314
Merged
fix(mempool)!: stop accepting TXs in the mempool if we can't keep up with reCheckTX#3314
Conversation
Closed
4 tasks
sergio-mena
commented
Jun 21, 2024
sergio-mena
commented
Jun 21, 2024
Co-authored-by: Sergio Mena <sergio@informal.systems>
… into sergio/bound-rechecktx
hvanz
reviewed
Jun 21, 2024
hvanz
reviewed
Jun 21, 2024
greg-szabo
reviewed
Jun 21, 2024
greg-szabo
reviewed
Jun 21, 2024
greg-szabo
approved these changes
Jun 21, 2024
Collaborator
greg-szabo
left a comment
There was a problem hiding this comment.
I'd rather have Debug level for the newly introduced log message, because it's a bit cryptic for validators. For n end-user, I could consider a "mempool is soft full" message useful, although there's not much they can do about it.
melekes
reviewed
Jun 24, 2024
Collaborator
melekes
left a comment
There was a problem hiding this comment.
Shouldn't we respect the Recheck config setting?
mergify bot
pushed a commit
that referenced
this pull request
Jun 24, 2024
…with reCheckTX (#3314) This PR is a combination of ideas from @ValarDragon, @hvanz and @sergio-mena to alleviate nodes that, while not having their mempool full "officially", they have too many TXs lingering in the mempool which causes them to fall behind. The mechanism works as follows: * We mark when we start and end reChecking * If, by the time a new block is decided we are still running the previous `reCheckTx`, we declare the mempool as rechecktx-full * Otherwise, we declare the mempool as not rechecktx-full We have tested this and it fixes the failing nightlies that are blocking us from cutting `v1.0.0-rc1`. Some UTs need to be fixed, hence posting as draft for the moment. --- #### PR checklist - [ ] Tests written/updated - [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [x] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec --------- Co-authored-by: Andy Nogueira <me@andynogueira.dev> Co-authored-by: Dev Ojha <dojha@berkeley.edu> Co-authored-by: hvanz <hernan.vanzetto@gmail.com> Co-authored-by: Hernán Vanzetto <15466498+hvanz@users.noreply.github.com> (cherry picked from commit 0cd2907)
mergify bot
pushed a commit
that referenced
this pull request
Jun 24, 2024
…with reCheckTX (#3314) This PR is a combination of ideas from @ValarDragon, @hvanz and @sergio-mena to alleviate nodes that, while not having their mempool full "officially", they have too many TXs lingering in the mempool which causes them to fall behind. The mechanism works as follows: * We mark when we start and end reChecking * If, by the time a new block is decided we are still running the previous `reCheckTx`, we declare the mempool as rechecktx-full * Otherwise, we declare the mempool as not rechecktx-full We have tested this and it fixes the failing nightlies that are blocking us from cutting `v1.0.0-rc1`. Some UTs need to be fixed, hence posting as draft for the moment. --- #### PR checklist - [ ] Tests written/updated - [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [x] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec --------- Co-authored-by: Andy Nogueira <me@andynogueira.dev> Co-authored-by: Dev Ojha <dojha@berkeley.edu> Co-authored-by: hvanz <hernan.vanzetto@gmail.com> Co-authored-by: Hernán Vanzetto <15466498+hvanz@users.noreply.github.com> (cherry picked from commit 0cd2907) # Conflicts: # mempool/clist_mempool.go # mempool/clist_mempool_test.go
mergify bot
pushed a commit
that referenced
this pull request
Jun 24, 2024
…with reCheckTX (#3314) This PR is a combination of ideas from @ValarDragon, @hvanz and @sergio-mena to alleviate nodes that, while not having their mempool full "officially", they have too many TXs lingering in the mempool which causes them to fall behind. The mechanism works as follows: * We mark when we start and end reChecking * If, by the time a new block is decided we are still running the previous `reCheckTx`, we declare the mempool as rechecktx-full * Otherwise, we declare the mempool as not rechecktx-full We have tested this and it fixes the failing nightlies that are blocking us from cutting `v1.0.0-rc1`. Some UTs need to be fixed, hence posting as draft for the moment. --- #### PR checklist - [ ] Tests written/updated - [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [x] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec --------- Co-authored-by: Andy Nogueira <me@andynogueira.dev> Co-authored-by: Dev Ojha <dojha@berkeley.edu> Co-authored-by: hvanz <hernan.vanzetto@gmail.com> Co-authored-by: Hernán Vanzetto <15466498+hvanz@users.noreply.github.com> (cherry picked from commit 0cd2907) # Conflicts: # internal/consensus/replay_stubs.go # mempool/clist_mempool.go # mempool/clist_mempool_test.go # mempool/errors.go # mempool/reactor_test.go # state/execution.go # state/execution_test.go
This was referenced Jun 24, 2024
hvanz
pushed a commit
that referenced
this pull request
Jun 24, 2024
…with reCheckTX (backport #3314) (#3336) This PR is a combination of ideas from @ValarDragon, @hvanz and @sergio-mena to alleviate nodes that, while not having their mempool full "officially", they have too many TXs lingering in the mempool which causes them to fall behind. The mechanism works as follows: * We mark when we start and end reChecking * If, by the time a new block is decided we are still running the previous `reCheckTx`, we declare the mempool as rechecktx-full * Otherwise, we declare the mempool as not rechecktx-full We have tested this and it fixes the failing nightlies that are blocking us from cutting `v1.0.0-rc1`. Some UTs need to be fixed, hence posting as draft for the moment. --- #### PR checklist - [ ] Tests written/updated - [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [x] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec <hr>This is an automatic backport of pull request #3314 done by [Mergify](https://mergify.com). Co-authored-by: Sergio Mena <sergio@informal.systems>
hvanz
added a commit
that referenced
this pull request
Jun 24, 2024
…with reCheckTX (#3314) This PR is a combination of ideas from @ValarDragon, @hvanz and @sergio-mena to alleviate nodes that, while not having their mempool full "officially", they have too many TXs lingering in the mempool which causes them to fall behind. The mechanism works as follows: * We mark when we start and end reChecking * If, by the time a new block is decided we are still running the previous `reCheckTx`, we declare the mempool as rechecktx-full * Otherwise, we declare the mempool as not rechecktx-full We have tested this and it fixes the failing nightlies that are blocking us from cutting `v1.0.0-rc1`. Some UTs need to be fixed, hence posting as draft for the moment. --- - [ ] Tests written/updated - [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [x] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec --------- Co-authored-by: Andy Nogueira <me@andynogueira.dev> Co-authored-by: Dev Ojha <dojha@berkeley.edu> Co-authored-by: hvanz <hernan.vanzetto@gmail.com> Co-authored-by: Hernán Vanzetto <15466498+hvanz@users.noreply.github.com>
hvanz
added a commit
that referenced
this pull request
Jun 24, 2024
…with reCheckTX (backport #3314) (#3337) This PR is a combination of ideas from @ValarDragon, @hvanz and @sergio-mena to alleviate nodes that, while not having their mempool full "officially", they have too many TXs lingering in the mempool which causes them to fall behind. The mechanism works as follows: * We mark when we start and end reChecking * If, by the time a new block is decided we are still running the previous `reCheckTx`, we declare the mempool as rechecktx-full * Otherwise, we declare the mempool as not rechecktx-full We have tested this and it fixes the failing nightlies that are blocking us from cutting `v1.0.0-rc1`. Some UTs need to be fixed, hence posting as draft for the moment. --- #### PR checklist - [ ] Tests written/updated - [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [x] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec <hr>This is an automatic backport of pull request #3314 done by [Mergify](https://mergify.com). --------- Co-authored-by: Sergio Mena <sergio@informal.systems> Co-authored-by: hvanz <hernan.vanzetto@gmail.com>
hvanz
added a commit
that referenced
this pull request
Jun 24, 2024
…with reCheckTX (backport #3314) (#3338) This PR is a combination of ideas from @ValarDragon, @hvanz and @sergio-mena to alleviate nodes that, while not having their mempool full "officially", they have too many TXs lingering in the mempool which causes them to fall behind. The mechanism works as follows: * We mark when we start and end reChecking * If, by the time a new block is decided we are still running the previous `reCheckTx`, we declare the mempool as rechecktx-full * Otherwise, we declare the mempool as not rechecktx-full We have tested this and it fixes the failing nightlies that are blocking us from cutting `v1.0.0-rc1`. Some UTs need to be fixed, hence posting as draft for the moment. --- #### PR checklist - [ ] Tests written/updated - [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [x] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec <hr>This is an automatic backport of pull request #3314 done by [Mergify](https://mergify.com). Co-authored-by: Sergio Mena <sergio@informal.systems> Co-authored-by: Andy Nogueira <me@andynogueira.dev> Co-authored-by: Dev Ojha <dojha@berkeley.edu> Co-authored-by: hvanz <hernan.vanzetto@gmail.com> Co-authored-by: Hernán Vanzetto <15466498+hvanz@users.noreply.github.com>
ValarDragon
added a commit
to osmosis-labs/cometbft
that referenced
this pull request
Jun 27, 2024
…with reCheckTX (backport cometbft#3314) (cometbft#3338) This PR is a combination of ideas from @ValarDragon, @hvanz and @sergio-mena to alleviate nodes that, while not having their mempool full "officially", they have too many TXs lingering in the mempool which causes them to fall behind. The mechanism works as follows: * We mark when we start and end reChecking * If, by the time a new block is decided we are still running the previous `reCheckTx`, we declare the mempool as rechecktx-full * Otherwise, we declare the mempool as not rechecktx-full We have tested this and it fixes the failing nightlies that are blocking us from cutting `v1.0.0-rc1`. Some UTs need to be fixed, hence posting as draft for the moment. --- #### PR checklist - [ ] Tests written/updated - [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [x] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec <hr>This is an automatic backport of pull request cometbft#3314 done by [Mergify](https://mergify.com). Co-authored-by: Sergio Mena <sergio@informal.systems> Co-authored-by: Andy Nogueira <me@andynogueira.dev> Co-authored-by: Dev Ojha <dojha@berkeley.edu> Co-authored-by: hvanz <hernan.vanzetto@gmail.com> Co-authored-by: Hernán Vanzetto <15466498+hvanz@users.noreply.github.com>
ValarDragon
added a commit
to osmosis-labs/cometbft
that referenced
this pull request
Jun 27, 2024
… with reCheckTX (backport cometbft#3314) cometbft#3338 (#117) * Revert "Add ResetUpdate, so non-block proposers no longer fail to sync (#113)" This reverts commit 158bcbe. * fix(mempool)!: stop accepting TXs in the mempool if we can't keep up with reCheckTX (backport cometbft#3314) (cometbft#3338) This PR is a combination of ideas from @ValarDragon, @hvanz and @sergio-mena to alleviate nodes that, while not having their mempool full "officially", they have too many TXs lingering in the mempool which causes them to fall behind. The mechanism works as follows: * We mark when we start and end reChecking * If, by the time a new block is decided we are still running the previous `reCheckTx`, we declare the mempool as rechecktx-full * Otherwise, we declare the mempool as not rechecktx-full We have tested this and it fixes the failing nightlies that are blocking us from cutting `v1.0.0-rc1`. Some UTs need to be fixed, hence posting as draft for the moment. --- #### PR checklist - [ ] Tests written/updated - [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [x] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec <hr>This is an automatic backport of pull request cometbft#3314 done by [Mergify](https://mergify.com). Co-authored-by: Sergio Mena <sergio@informal.systems> Co-authored-by: Andy Nogueira <me@andynogueira.dev> Co-authored-by: Dev Ojha <dojha@berkeley.edu> Co-authored-by: hvanz <hernan.vanzetto@gmail.com> Co-authored-by: Hernán Vanzetto <15466498+hvanz@users.noreply.github.com> --------- Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> Co-authored-by: Sergio Mena <sergio@informal.systems> Co-authored-by: Andy Nogueira <me@andynogueira.dev> Co-authored-by: hvanz <hernan.vanzetto@gmail.com> Co-authored-by: Hernán Vanzetto <15466498+hvanz@users.noreply.github.com>
4 tasks
hvanz
added a commit
that referenced
this pull request
Jun 28, 2024
…its logic to Lock (#3361) This PR partially reverts the backport of #3314 into the recently released `v0.38.8` (and `v0.37.7`). With this change the `Mempool` interface is the same as in previous versions. The reason is that we do not want to break the public API. We still keep in the code the feature that #3314 introduced by moving it inside the existing `Lock` method. We also keep the `RecheckFull bool` field that we added to `ErrMempoolIsFull`. <!-- Please add a reference to the issue that this PR addresses and indicate which files are most critical to review. If it fully addresses a particular issue, please include "Closes #XXX" (where "XXX" is the issue number). If this PR is non-trivial/large/complex, please ensure that you have either created an issue that the team's had a chance to respond to, or had some discussion with the team prior to submitting substantial pull requests. The team can be reached via GitHub Discussions or the Cosmos Network Discord server in the #cometbft channel. GitHub Discussions is preferred over Discord as it allows us to keep track of conversations topically. https://github.com/cometbft/cometbft/discussions If the work in this PR is not aligned with the team's current priorities, please be advised that it may take some time before it is merged - especially if it has not yet been discussed with the team. See the project board for the team's current priorities: https://github.com/orgs/cometbft/projects/1 --> --- #### PR checklist - [ ] Tests written/updated - [ ] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [ ] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
4 tasks
hvanz
added a commit
that referenced
this pull request
Jun 28, 2024
…its logic to Lock (#3363) Complements #3361 This PR partially reverts the backport of #3314 into the recently released v0.37.7. With this change the Mempool interface is the same as in previous versions. The reason is that we do not want to break the public API. We still keep in the code the feature that #3314 introduced by moving it inside the existing Lock method. We also keep the RecheckFull bool field that we added to ErrMempoolIsFull. <!-- Please add a reference to the issue that this PR addresses and indicate which files are most critical to review. If it fully addresses a particular issue, please include "Closes #XXX" (where "XXX" is the issue number). If this PR is non-trivial/large/complex, please ensure that you have either created an issue that the team's had a chance to respond to, or had some discussion with the team prior to submitting substantial pull requests. The team can be reached via GitHub Discussions or the Cosmos Network Discord server in the #cometbft channel. GitHub Discussions is preferred over Discord as it allows us to keep track of conversations topically. https://github.com/cometbft/cometbft/discussions If the work in this PR is not aligned with the team's current priorities, please be advised that it may take some time before it is merged - especially if it has not yet been discussed with the team. See the project board for the team's current priorities: https://github.com/orgs/cometbft/projects/1 --> --- #### PR checklist - [ ] Tests written/updated - [ ] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [ ] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
4 tasks
github-merge-queue bot
pushed a commit
that referenced
this pull request
Aug 9, 2024
Complements #3314 This PR adds a new error with a more clear error message when checking if the mempool is full while rechecking is in progress after a new block was committed. --- #### PR checklist - [ ] Tests written/updated - [ ] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [ ] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
mergify bot
pushed a commit
that referenced
this pull request
Aug 9, 2024
Complements #3314 This PR adds a new error with a more clear error message when checking if the mempool is full while rechecking is in progress after a new block was committed. --- #### PR checklist - [ ] Tests written/updated - [ ] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [ ] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec (cherry picked from commit 987fa83)
mergify bot
pushed a commit
that referenced
this pull request
Aug 9, 2024
Complements #3314 This PR adds a new error with a more clear error message when checking if the mempool is full while rechecking is in progress after a new block was committed. --- #### PR checklist - [ ] Tests written/updated - [ ] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [ ] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec (cherry picked from commit 987fa83) # Conflicts: # mempool/clist_mempool.go
This was referenced Aug 9, 2024
hvanz
added a commit
that referenced
this pull request
Aug 9, 2024
Complements #3314 This PR adds a new error with a more clear error message when checking if the mempool is full while rechecking is in progress after a new block was committed. --- #### PR checklist - [ ] Tests written/updated - [ ] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [ ] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec <hr>This is an automatic backport of pull request #3654 done by [Mergify](https://mergify.com). Co-authored-by: Hernán Vanzetto <15466498+hvanz@users.noreply.github.com>
hvanz
added a commit
that referenced
this pull request
Aug 9, 2024
Complements #3314 This PR adds a new error with a more clear error message when checking if the mempool is full while rechecking is in progress after a new block was committed. --- #### PR checklist - [ ] Tests written/updated - [ ] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments - [ ] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec <hr>This is an automatic backport of pull request #3654 done by [Mergify](https://mergify.com). --------- Co-authored-by: Hernán Vanzetto <15466498+hvanz@users.noreply.github.com> Co-authored-by: hvanz <hernan.vanzetto@gmail.com>
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR is a combination of ideas from @ValarDragon, @hvanz and @sergio-mena to alleviate nodes that, while not having their mempool full "officially", they have too many TXs lingering in the mempool which causes them to fall behind.
The mechanism works as follows:
reCheckTx, we declare the mempool as rechecktx-fullWe have tested this and it fixes the failing nightlies that are blocking us from cutting
v1.0.0-rc1.Some UTs need to be fixed, hence posting as draft for the moment.
PR checklist
.changelog(we use unclog to manage our changelog)docs/orspec/) and code comments