Conversation
|
No CHANGELOG entry is required. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #148 +/- ##
==========================================
+ Coverage 57.66% 57.71% +0.04%
==========================================
Files 33 33
Lines 2277 2282 +5
==========================================
+ Hits 1313 1317 +4
- Misses 875 876 +1
Partials 89 89 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
OK, something is not clear with this upgrade, consensus service is hanging (even if dBFT master is used). |
|
OK, it looks like we still need to drain the timer's channel, otherwise consensus doesn't work properly. |
|
Just do nspcc-dev/neo-go#3553, this is not needed. |
|
I still think that timer draining related commit should be reverted because there's a possibility that timer is requested to be reset with Lines 157 to 159 in 5281a9a Consider the situation when CV happens: the timer will be reset in ChangeView message handler: Line 179 in 5281a9a Which means that dBFT eventloop is still busy with CV handling and can't read from the timer's channel (given that Lines 60 to 65 in 5281a9a At the same time, as it is said in the timer's documentation, this channel isn't buffered anymore:
So we end up in a deadlock. Take a look at the test failing in nspcc-dev/neo-go@d60f35c, or, as an example, run privnet from the same commit, kill one node and see the result: nodes won't be able to process CV properly: |
This reverts commit f7a23ed. Draining is needed because there's a possibility that timer is requested to be reset with `0` delta: https://github.com/nspcc-dev/dbft/blob/5281a9ae584e866e6dfa2a99313f7db6818c8e9d/dbft.go#L157-L159 Consider the situation when CV happens: the timer will be reset in ChangeView message handler: https://github.com/nspcc-dev/dbft/blob/5281a9ae584e866e6dfa2a99313f7db6818c8e9d/check.go#L179 Which means that dBFT eventloop is still busy with CV handling and can't read from the timer's channel (given that `d` is `0`): https://github.com/nspcc-dev/dbft/blob/5281a9ae584e866e6dfa2a99313f7db6818c8e9d/timer/timer.go#L60-L65 And this channel has a buffer of `1` in case if it's internal Timer-bound channel, which is not enough to finish CV processing. So we end up in a deadlock. Signed-off-by: Anna Shaleva <shaleva.ann@nspcc.ru>
This reverts commit 8dfda6b and f7a23ed.
This change is needed because external users of the library should be able to stop the dBFT timer:
https://github.com/nspcc-dev/neo-go/blob/3e54c46281237124da7fd89a90501a9d58a3aa60/pkg/consensus/consensus.go#L342