perf(consensus): Run broadcast routines out of process (backport #3180) #3477
Merged
Run broadcast routines out of process. Right now each broadcast routine blocks the consensus mutex for roughly `num_peers * process_creation_time`, which is genuinely notable! This PR reduces the consensus blocking overhead to just `process_creation_time`.

On the latest osmosis branch with improvements, that's 20s of blocking time out of 140s (over the course of 1 hour; this 140s includes block execution!).

Note that WAL write time should go down significantly with open PRs. For `HasVote`, this is a meaningful increase to consensus mutex lock time, so it's worth reducing.

---

#### PR checklist

- [x] Tests written/updated - I can't think of any test to add
- [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code comments - I don't know of any related docs here
- [x] Title follows the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit 110817b)

# Conflicts:
# .changelog/unreleased/improvements/3180-lower-broadcasts-consensus-overhead.md
# internal/consensus/reactor.go
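The difference between the two lock-hold times can be illustrated with a minimal Go sketch. It is not the code from `internal/consensus/reactor.go`: the `peer` and `state` types, the `broadcastInline` and `broadcastAsync` methods, and the `perPeerCost` constant are all hypothetical, and it assumes that "out of process" means handing the fan-out to a separate goroutine. A short sleep stands in for the per-peer cost.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// perPeerCost is a made-up stand-in for the per-peer cost of one broadcast.
const perPeerCost = 2 * time.Millisecond

// peer is a hypothetical peer that records every message it is sent.
type peer struct {
	mu  sync.Mutex
	got []string
}

// send simulates the per-peer cost, then records the message.
func (p *peer) send(msg string) {
	time.Sleep(perPeerCost)
	p.mu.Lock()
	p.got = append(p.got, msg)
	p.mu.Unlock()
}

// received returns a copy of the messages delivered so far.
func (p *peer) received() []string {
	p.mu.Lock()
	defer p.mu.Unlock()
	return append([]string(nil), p.got...)
}

// state is a hypothetical stand-in for consensus state guarded by one mutex.
type state struct {
	mu    sync.Mutex
	peers []*peer
}

func newState(numPeers int) *state {
	s := &state{}
	for i := 0; i < numPeers; i++ {
		s.peers = append(s.peers, &peer{})
	}
	return s
}

// broadcastInline fans out while holding the mutex, so the lock is held
// for roughly num_peers * perPeerCost. It returns the time spent under the lock.
func (s *state) broadcastInline(msg string) time.Duration {
	s.mu.Lock()
	defer s.mu.Unlock()
	start := time.Now()
	for _, p := range s.peers {
		p.send(msg)
	}
	return time.Since(start)
}

// broadcastAsync only snapshots the peers and starts one goroutine while
// holding the mutex; the fan-out itself runs after the lock is released.
// The returned channel is closed once every peer has been sent the message.
func (s *state) broadcastAsync(msg string) (time.Duration, <-chan struct{}) {
	s.mu.Lock()
	defer s.mu.Unlock()
	start := time.Now()
	peers := append([]*peer(nil), s.peers...) // snapshot taken under the lock
	done := make(chan struct{})
	go func() {
		defer close(done)
		for _, p := range peers {
			p.send(msg)
		}
	}()
	return time.Since(start), done
}

func main() {
	s := newState(10)
	inline := s.broadcastInline("HasVote")
	async, done := s.broadcastAsync("HasVote")
	<-done // wait for delivery before exiting
	fmt.Printf("lock held (inline): %v\n", inline)
	fmt.Printf("lock held (async):  %v\n", async)
}
```

In this sketch the inline variant holds the lock for at least `num_peers * perPeerCost`, while the async variant holds it only long enough to copy the peer slice and start a goroutine. The trade-off is that delivery is no longer complete when the call returns, so a caller that needs that guarantee has to wait on the returned channel.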
Collaborator
NOTE: the changelog entry is already in
melekes approved these changes Jul 10, 2024
ValarDragon added a commit to osmosis-labs/cometbft that referenced this pull request Aug 19, 2024
…tbft#3180) (cometbft#3477)

This is an automatic backport of pull request cometbft#3180 done by [Mergify](https://mergify.com).

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
ValarDragon added a commit to osmosis-labs/cometbft that referenced this pull request Aug 19, 2024
…tbft#318… (#135)

* perf(consensus): Run broadcast routines out of process (backport cometbft#3180) (cometbft#3477)
* add back has vote message broadcast

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>