This repository was archived by the owner on Oct 28, 2021. It is now read-only.
Fix race condition which permanently pauses sync #5865
Merged
halfalicious merged 3 commits into master on Dec 12, 2019
Codecov Report
@@            Coverage Diff             @@
##           master    #5865      +/-   ##
==========================================
- Coverage   64.05%   63.97%   -0.08%
==========================================
  Files         364      364
  Lines       30994    30992       -2
  Branches     3435     3435
==========================================
- Hits        19852    19826      -26
- Misses       9917     9933      +16
- Partials     1225     1233       +8
The problem is that the knownFull -> !knownFull check in BlockQueue::drain() doesn't catch cases where the block queue is modified on a verifier thread - these cases are when a bad block is detected and the block and all of its children are removed from m_verifying, or when a block (with bytes) is moved from m_unverified to m_verifying (without bytes). The fix is to always fire the onRoomAvailable handler (renamed to onBlocksDrained) at the end of doneDrain, and in the handler to only resume syncing if it's paused and the block queue isn't full.
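The approach described above can be sketched as a small, single-threaded toy model. This is only an illustration under stated assumptions, not the code from this PR: `LevelCheckedQueue`, `SyncState`, the capacity of 4, and the helper `resumesDespiteVerifierInterference` are hypothetical names chosen for the example. Only the handler name `onBlocksDrained` and the "resume only if paused and not full" rule come from the description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical toy model; not aleth's BlockQueue.
struct LevelCheckedQueue
{
    std::size_t capacity = 4;  // assumed limit, for illustration only
    std::size_t queued = 4;    // starts full
    std::function<void()> onBlocksDrained;

    bool knownFull() const { return queued >= capacity; }

    // Always fires the handler after draining, regardless of whether the
    // queue was full when the drain started.
    void drain(std::size_t count)
    {
        queued -= std::min(count, queued);
        if (onBlocksDrained)
            onBlocksDrained();
    }
};

// Hypothetical stand-in for the sync side of the handler.
struct SyncState
{
    bool paused = true;
    int resumeCount = 0;

    // Resume only if sync is paused and the queue currently has room.
    void onBlocksDrained(LevelCheckedQueue const& queue)
    {
        if (paused && !queue.knownFull())
        {
            paused = false;
            ++resumeCount;
        }
    }
};

// Replays the verifier interference in a single thread.
inline bool resumesDespiteVerifierInterference()
{
    LevelCheckedQueue queue;
    SyncState sync;
    assert(queue.knownFull());  // precondition: sync paused on a full queue
    queue.onBlocksDrained = [&] { sync.onBlocksDrained(queue); };
    queue.queued -= 1;  // verifier temporarily holds a block
    queue.drain(2);     // handler fires even though the queue did not look full
    queue.queued += 1;  // verifier finishes and the block is counted again
    return !sync.paused && sync.resumeCount == 1;
}
```

Because the handler checks the current state rather than a transition observed inside `drain`, it does not matter which thread made room in the queue. The `paused` guard keeps repeated calls from resuming sync more than once.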
669810e to
5e0f7be
gumb0 (Member) approved these changes on Dec 11, 2019:
Looks good, did you try running the sync?
Author (Contributor):
@gumb0: Yes, kicked it off last night, will let it run for a bit before merging.
Author (Contributor):
Ran sync for ~12 hours and sync'd from genesis to block 1.4M without issues.
Fix #5312
The race condition occurs when the block queue is full, so syncing is paused, and the function which detects when the block queue has room (`BlockQueue::drain`) does the first of the two-part check when a block queue verifier thread has temporarily removed block data from the verification queues. As such, the "room available" detection fails and `drain` removes more blocks, which means that the block queue enters a permanently not-full state and syncing is paused indefinitely.

This is fixed by also performing the "room available" detection check in the body of the block queue verifier threads.
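The missed transition can be replayed deterministically in a single thread. The sketch below is a hypothetical toy model, not aleth's `BlockQueue`: `EdgeTriggeredQueue`, the capacity of 4, and `staysPausedAfterRace` are assumptions made for illustration. It only shows why a check that fires on an observed full -> not-full edge can leave sync paused.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical toy model of the failure mode; not aleth's BlockQueue.
struct EdgeTriggeredQueue
{
    std::size_t capacity = 4;  // assumed limit, for illustration only
    std::size_t queued = 4;    // starts full
    bool syncPaused = true;    // sync paused because the queue filled up

    bool knownFull() const { return queued >= capacity; }

    // Resumes sync only if this call itself observes full -> not full.
    void drain(std::size_t count)
    {
        bool const wasFull = knownFull();  // first part of the check
        queued -= std::min(count, queued);
        if (wasFull && !knownFull())       // second part of the check
            syncPaused = false;
    }
};

// Replays the problematic interleaving in a single thread.
inline bool staysPausedAfterRace()
{
    EdgeTriggeredQueue queue;
    assert(queue.knownFull());  // precondition: sync paused on a full queue
    queue.queued -= 1;  // verifier temporarily holds a block outside the queues
    queue.drain(2);     // wasFull is false here, so the transition is missed
    queue.queued += 1;  // verifier finishes and the block is counted again
    return queue.syncPaused && !queue.knownFull();
}
```

In this model the queue ends up with room while `syncPaused` is still true, and no later `drain` call can observe a full queue again, so nothing resumes sync.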