Conversation
|
At this moment, the unique different line is this, the lock: https://github.com/neo-project/neo/pull/422/files#diff-0285c1c12d1d492897a99ffe07f9fed9R77 I am working on recover the consensus state |
There was a problem hiding this comment.
@shargon brodas,
I saw that change view is now being blocked with if (context.State.HasFlag(ConsensusState.CommitSent)) return at CheckExpectedView method.
If the comments I mentioned are applied, then, we would send a partial signature and we would need Two Counters now (context.SignaturesPartial and context.Signatures).
After veryfing that context.SignaturesPartial is fulfilled at line
neo/neo/Consensus/ConsensusService.cs
Line 87 in 064ebc9
Then, the CN would enter in a loop waiting for the real signatures context.Signatures.
For me, the point is: The same problem of the "fork" can happen here, some CN entering in the BlockChangeView (commit phase) and others not.
However, there should be a mechnism in which Good Nodes would be able to also enter commit phase, for this purpose, the nodes in the Commit Phase (Blocking the view) should Re-Send their Partial Signatures in order that good nodes (that possible entered in a loop due to connections looses) will occasionally join the commit phase waiting for the Real Signatures.
|
|
||
| private void CheckSignatures() | ||
| { | ||
| if (!context.State.HasFlag(ConsensusState.CommitSent) && |
There was a problem hiding this comment.
@shargon,
I believe that here we should check the partial signatures instead if the complete and, then, at lines 353 (
neo/neo/Consensus/ConsensusService.cs
Line 353 in 064ebc9
neo/neo/Consensus/ConsensusService.cs
Line 52 in 064ebc9
|
I think that the priority should be fix the network errors. Obiously we need to find the right way to fix the malicious CN who can produce a fork, but i think ... if we send the complete signatures on the commit, we are doing the same as without this patch, i need to think about it 🤔 |
|
I also agree that the priority is to fix the network errors. |
|
good job @shargon, it's working on private net. Now, we have to decide if is better to lock change view on commit phase.
Also, as @vncoelho said, partial signature is important here. |
|
Hey, @belane, Jaime, I prefer 2. instead of 1.
aheuaheuahuea Let's evaluate and think about this with careful, maybe there also can be a problem...Crazy and Smart minds everywhere, we never know where they gonna find a possible entrance 📦 |
|
But another discussion is: merge the critical part and later think about this or try to add this additional important change now? |
vncoelho
left a comment
There was a problem hiding this comment.
@shargon, fast as a bolt!
The first commit looks precise to me.
The second one (f0c2ada) I do not know, because I thought that the CheckExpectedView was correctly implemented for blocking the ChangeView of those nodes with commit already sent. I think that you do not need to re-send here, just keep that same return for blocking changeview.
Now, the challenges are:
- Create a mechanism for accepting the incoming re-send (at OnPrepareResponse). Perhaps, including an exception here:
neo/neo/Consensus/ConsensusService.cs
Line 248 in f0c2ada
- Change the complete signature at line . Maybe we do not need to change the one of the Primary. Perhaps, for the Primary we could send both together:
neo/neo/Consensus/ConsensusService.cs
Line 52 in f0c2ada
context.Signatures[context.MyIndex] = context.MakeHeader().Sign(context.KeyPair);
context.SignaturesPartial[context.MyIndex] = context.SOMETHINGTOTHINGABOUT.Sign(context.KeyPair);
neo/Consensus/ConsensusService.cs
Outdated
| // If signature was sent, we send again | ||
|
|
||
| SignAndRelay(context.MakePrepareResponse(context.Signatures[context.MyIndex])); | ||
| CheckSignatures(); |
There was a problem hiding this comment.
Maybe remove this CheckSignatures here because the node is just re-sending. It should keep Checking at OnPrepareResponseReceived which also filters messages from nodes with wrong view and etc...
|
Still some repeated logs.. we must see that: |
| SignatureSent = 0x10, | ||
| BlockSent = 0x20, | ||
| ViewChanging = 0x40, | ||
| CommitSent = 0x80, |
|
|
||
| private void OnCommitAgreement(ConsensusPayload payload, CommitAgreement message) | ||
| { | ||
| Log($"{nameof(OnCommitAgreement)}: height={payload.BlockIndex} hash={message.BlockHash.ToString()} view={message.ViewNumber} index={payload.ValidatorIndex}"); |
There was a problem hiding this comment.
Can you put this message Log($"{nameof(OnCommitAgreement)}: ... after the if? So it won't be repeated.
Please, do that to Log($"{nameof(OnPrepareResponseReceived)}: ... too, because it's also repeated.
There was a problem hiding this comment.
@shargon This is why messages are repeated, easy to fix.
|
@igormcoelho try again with 14a31e6 please :) @vncoelho i removed the CheckSignatures like you said on 6b23061 |
|
|
||
| private void OnPrepareRequestReceived(ConsensusPayload payload, PrepareRequest message) | ||
| { | ||
| if (context.State.HasFlag(ConsensusState.RequestReceived)) |
There was a problem hiding this comment.
Very good move! Avoiding this repeated message.
| if (context.State.HasFlag(ConsensusState.BlockSent)) return; | ||
| if (context.Signatures[payload.ValidatorIndex] != null) return; | ||
|
|
||
| Log($"{nameof(OnPrepareResponseReceived)}: height={payload.BlockIndex} view={message.ViewNumber} index={payload.ValidatorIndex}"); |
vncoelho
left a comment
There was a problem hiding this comment.
Try to filter before RequestView
|
|
||
| private void RequestChangeView() | ||
| { | ||
| if (context.State.HasFlag(ConsensusState.CommitSent)) |
There was a problem hiding this comment.
let's remove this and filter before, @shargon.
else if ((context.State.HasFlag(ConsensusState.Primary) && context.State.HasFlag(ConsensusState.RequestSent)) || (context.State.HasFlag(ConsensusState.Backup) && !context.State.HasFlag(ConsensusState.CommitSent)) )at line
neo/neo/Consensus/ConsensusService.cs
Line 384 in 6b23061
neo/Consensus/ConsensusService.cs
Outdated
| { | ||
| // If signature was sent, we send again | ||
|
|
||
| SignAndRelay(context.MakePrepareResponse(context.Signatures[context.MyIndex])); |
There was a problem hiding this comment.
Perhaps you need to Relay your CommitAgreement again too. Not sure. Or is it implicit by PrepareResponse message?
|
Shargon, it's looking good for me! I'll try to simulate forking scenarios again, to make sure it's 100%... but congratulations in advance ;) |
neo/Consensus/ConsensusService.cs
Outdated
| { | ||
| // If signature was sent, we send again | ||
|
|
||
| SignAndRelay(context.MakePrepareResponse(context.Signatures[context.MyIndex])); |
There was a problem hiding this comment.
@shargon ,if a node (in commit phase) receives change view request, and re-send prepareResponse. From current code, the node (who have sent the changeView request ) will not accept prepareResponse because the viewNumber does not match. We may need to find a way to accept the re-send prepareResponse.
There was a problem hiding this comment.
Is just for dropped packets (network errors), maybe there are a disconnection, and this packet never arrive to your CN.
There was a problem hiding this comment.
yes. Imagine one CN(A) sent ChangeView request to other CN if it did not collect enough PrepareResponse on time because the network error. Then A increases its view number. But when other CN(B) receives the ChangeView message, it resends prepareResponse because B is in commit phase now.
In this situation, even if A re-connect with B, A will not accept B's resent prepareResponse message because view numbers do not match anymore. But we need to let A accepts this message and move on to commit phase as B, right ?
neo/Consensus/ConsensusService.cs
Outdated
| if (context.State.HasFlag(ConsensusState.BlockSent) || | ||
| !context.TryToCommit(payload, message)) return; | ||
|
|
||
| Log($"{nameof(OnCommitAgreement)}: height={payload.BlockIndex} hash={message.BlockHash.ToString()} view={message.ViewNumber} index={payload.ValidatorIndex}"); |
|
I close this for focus on #426 |
|
I would prefer a smaller pr instead of a big pr with many changes. |
|
Hey @erikzhang, I think that NGD is already testing the other PR and we also have been testing it for a couple of days.
The PR is a sum of several contributions that all of us have been discussing in the past months. |


Is the same implementation as #320
TODO:
Fixes #193
Fixes neo-project/neo-node#219