Conversation
consensus/state.go
Outdated
// TODO: track evidence for inclusion in a block
cs.Logger.Error("Found conflicting vote. Recording evidence in the RoundState", "height", vote.Height, "round", vote.Round, "type", vote.Type, "valAddr", vote.ValidatorAddress, "valIndex", vote.ValidatorIndex)

// TODO: ensure we haven't seen this evidence already !
Wouldn't that be the case already, since we've already reported a conflicting vote and below we add the voteErr.DuplicateVoteEvidence? Or are you saying that before adding new evidence, we should check for the presence of DuplicateVoteEvidence?
It's possible we already saw this particular evidence (i.e. from the same validator) and added it to the cs.Evidence.
if !bytes.Equal(dve.VoteA.ValidatorAddress, dve.VoteB.ValidatorAddress) {
	return fmt.Errorf("DuplicateVoteEvidence Error: Validator addresses do not match. Got %X and %X", dve.VoteA.ValidatorAddress, dve.VoteB.ValidatorAddress)
}
// XXX: Should we enforce index is the same ?
Should we? Checking the address should be enough. Although, since the validator set is sorted, it does not matter.
I think you're right, and we don't need the index.
Looking good so far :)
types/evidence.go
Outdated
valIdx, val := valset.GetByAddress(addr)
if val == nil {
	return fmt.Errorf("Address %X was not a validator at height %d", addr, height)
} else if idx != valIdx {
Maybe we still want to punish even if the indexes are wrong?!
Only thing left to do is add the Evidence to BeginBlock and add some simple tests. But this is ready for more serious review.
evidence/pool.go
Outdated
}

// EvidenceChan returns an unbuffered channel on which new evidence can be received.
func (evpool *EvidencePool) EvidenceChan() chan types.Evidence {
How about encoding the direction of the channel here:

func (evpool *EvidencePool) EvidenceChan() <-chan types.Evidence
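Returning a receive-only channel makes the compiler enforce the direction at every call site. A minimal sketch of the suggestion, with a stand-in Evidence type (the real pool carries more state than this):

```go
package main

import "fmt"

// Evidence stands in for types.Evidence (a simplification for this sketch).
type Evidence string

// EvidencePool here is a stripped-down stand-in, not the real pool.
type EvidencePool struct {
	evidenceChan chan Evidence
}

// EvidenceChan returns a receive-only view of the internal channel:
// callers can consume evidence but can never send on or close the channel.
func (evpool *EvidencePool) EvidenceChan() <-chan Evidence {
	return evpool.evidenceChan
}

func main() {
	pool := &EvidencePool{evidenceChan: make(chan Evidence, 1)}
	pool.evidenceChan <- "duplicate-vote"
	fmt.Println(<-pool.EvidenceChan()) // duplicate-vote
}
```

Inside the package the pool still sends on the bidirectional field; only external consumers see the restricted type.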
evidence/store.go
Outdated
// reverse the order so highest priority is first
l := store.ListEvidence(baseKeyOutqueue)
l2 := make([]types.Evidence, len(l))
for i, _ := range l {
The underscore can be omitted here.
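For reference, "for i, _ := range l" and "for i := range l" are equivalent when the value is unused. A sketch of the reversal with the underscore omitted (element type simplified to string):

```go
package main

import "fmt"

// reverse returns a copy of l in reverse order, so the highest-priority
// entry comes first; "for i := range l" needs no underscore since only
// the index is used.
func reverse(l []string) []string {
	l2 := make([]string, len(l))
	for i := range l {
		l2[i] = l[len(l)-1-i]
	}
	return l2
}

func main() {
	fmt.Println(reverse([]string{"low", "mid", "high"})) // [high mid low]
}
```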
	return ei
}

// AddNewEvidence adds the given evidence to the database.
Suggest an extra doc comment: Returns false if the evidence is already in the database.
evidence/store.go
Outdated
if len(val) == 0 {
	return nil
}
ei := new(EvidenceInfo)
There are at least two cases of var ei EvidenceInfo with &ei elsewhere; however, this is the only case of ei := new(EvidenceInfo) in this PR. Since the variable is named the same, it may be easier reading to convert this last one to the var form as well, for parallel consistency.
The difference here is we return the pointer. But sure, I'll be consistent and just return &ei.
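The two forms are interchangeable even when returning a pointer; a sketch showing that &ei from the var form matches new (EvidenceInfo fields simplified for this example):

```go
package main

import "fmt"

// EvidenceInfo is simplified for this sketch; the real struct holds
// the evidence itself plus priority and committed status.
type EvidenceInfo struct {
	Committed bool
	Priority  int
}

// varForm uses the style suggested in the review: declare with var,
// return the address. The local escapes to the heap, which is fine.
func varForm() *EvidenceInfo {
	var ei EvidenceInfo
	return &ei
}

// newForm is the form being replaced; both return a pointer to a
// zero-valued EvidenceInfo.
func newForm() *EvidenceInfo {
	return new(EvidenceInfo)
}

func main() {
	fmt.Println(*varForm() == *newForm()) // true
}
```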
// add it to the store
key := keyOutqueue(evidence, priority)
store.db.Set(key, eiBytes)
I wonder if the same problem applies here: if we crash after committing but before removing/updating (i.e. if one of the Set ops crashes).
These operations seem idempotent... see my other comment on idempotency
Also needs a
Codecov Report
@@ Coverage Diff @@
## develop #592 +/- ##
===========================================
- Coverage 60.12% 60.05% -0.08%
===========================================
Files 110 114 +4
Lines 10273 10596 +323
===========================================
+ Hits 6177 6363 +186
- Misses 3535 3658 +123
- Partials 561 575 +14
// XXX: is this thread safe ?
priority, err := evpool.state.VerifyEvidence(evidence)
if err != nil {
	return err
// TODO if err is that we can't check it before we pruned, then ignore.
// TODO if err is that evidence is bad, return (and upstream mark peer as bad)
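One way to realize those two TODOs is to distinguish the error kinds with sentinel values. The sentinel names and the handler below are hypothetical, not part of the PR:

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors for the two cases in the TODO comments.
var (
	ErrPruned      = errors.New("state pruned before evidence height")
	ErrBadEvidence = errors.New("evidence failed verification")
)

// handleVerifyErr sketches the proposed policy: ignore evidence we can
// no longer check because state was pruned; for bad evidence, return
// the error so the caller can mark the peer as bad upstream.
func handleVerifyErr(err error) (markPeerBad bool, retErr error) {
	switch {
	case err == nil:
		return false, nil
	case errors.Is(err, ErrPruned):
		return false, nil // can't check it anymore: ignore
	default:
		return true, err // bad evidence: propagate and punish upstream
	}
}

func main() {
	bad, err := handleVerifyErr(ErrPruned)
	fmt.Println(bad, err) // false <nil>
	bad, err = handleVerifyErr(ErrBadEvidence)
	fmt.Println(bad, err != nil) // true true
}
```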
evidence/pool.go
Outdated
// Blocks on the EvidenceChan.
func (evpool *EvidencePool) AddEvidence(evidence types.Evidence) (err error) {

// XXX: is this thread safe ?
State is not thread safe... it's actually a bit of a mess.
(e.g. we should probably move state.LoadValidators out into a global function and keep state a singleton snapshot). Is AddEvidence thread safe?
Ok. This is actually broken because this state never gets updated. The goal with the evidence pool was to be thread safe thanks to the thread-safety of the underlying db. But with the need to update the state, we probably need something more.
evpool.logger.Info("Verified new evidence of byzantine behaviour", "evidence", evidence)

evpool.evidenceChan <- evidence
NOTE: how do you know that evidenceChan isn't closed? (since it would panic)...
answer: because we don't close it. But that should be OK because channels are garbage collected as long as there are no hanging goroutines waiting on them.
evidence/reactor.go
Outdated
func (evR *EvidenceReactor) AddPeer(peer p2p.Peer) {
	// send the peer our high-priority evidence.
	// the rest will be sent by the broadcastRoutine
	evidence := evR.evpool.PriorityEvidence()
Oh, it's a list. That wasn't clear.
evidence/reactor.go
Outdated
}

// broadcast new evidence to all peers.
// broadcasts must be non-blocking so routine is always available to read off EvidenceChan.

- Once committed, atomically remove from pending and update lookup.
- TODO: If we crash after committed but before removing/updating,
  we'll be stuck broadcasting evidence we never know we committed.
  so either share the state db and atomically MarkCommitted
Doesn't have to be atomic, but if we make updating this evidence store happen first, before (or during) the state db commit, and make sure the operation on the evidence store is idempotent, then that works too.

(We would have to comment CONTRACT: needs to be idempotent in that case.)
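The idempotency contract can be illustrated with a toy store: marking the same evidence committed twice leaves the store unchanged, so the call can safely be replayed after a crash. The types below are stand-ins, not the PR's EvidenceStore:

```go
package main

import "fmt"

// toyStore stands in for the EvidenceStore; a real implementation
// would write through to the underlying db with Set.
type toyStore struct {
	committed map[string]bool
}

// MarkCommitted is idempotent: a Set-style overwrite means replaying
// the call after a crash produces the same stored state.
func (s *toyStore) MarkCommitted(hash string) {
	s.committed[hash] = true
}

func main() {
	s := &toyStore{committed: map[string]bool{}}
	s.MarkCommitted("ev1")
	s.MarkCommitted("ev1") // replay after a simulated crash: harmless
	fmt.Println(len(s.committed)) // 1
}
```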
evidence/store.go
Outdated
}

// big endian padded hex
func be(h int) string {
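A plausible implementation of that helper: zero-pad the hex so lexicographic ordering of keys matches numeric ordering of heights (the 16-digit width is an assumption, enough to cover an int64):

```go
package main

import "fmt"

// be renders an int as zero-padded hex so that keys built from it sort
// lexicographically in numeric order; 16 hex digits is an assumed width.
func be(h int) string {
	return fmt.Sprintf("%016x", h)
}

func main() {
	fmt.Println(be(9))           // 0000000000000009
	fmt.Println(be(255))         // 00000000000000ff
	fmt.Println(be(9) < be(255)) // true: string compare agrees with 9 < 255
}
```

Without the padding, "255" would sort before "9" as a string, which would break prefix iteration over height-ordered keys.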
// It is wrapped by PriorityEvidence and PendingEvidence for convenience.
func (store *EvidenceStore) ListEvidence(prefixKey string) (evidence []types.Evidence) {
	iter := store.db.IteratorPrefix([]byte(prefixKey))
	for iter.Next() {
I think you're skipping the first item by calling Next().
See documentation for tmlibs/db/Iterator:
Usage:
for itr.Seek(mykey); itr.Valid(); itr.Next() {
k, v := itr.Key(); itr.Value()
....
}
In this case: for ; itr.Valid(); itr.Next() {
But how does itr.Valid() change the fact that calling Next() skips the first item? If calling Next() advances itr, then we must do something like:

iter := store.db.IteratorPrefix([]byte(prefixKey))
// handle first item
iter.Key()
iter.Value()
for iter.Next() {

I am confused.
And there are no docs for the iterator in tmlibs (https://github.com/tendermint/tmlibs/tree/develop/db), none that I can find.
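To make the contract concrete, here is a toy iterator following the convention the thread settles on: the iterator starts positioned on the first element, Valid reports whether the current position holds data, and Next advances. With the "for ; it.Valid(); it.Next()" shape, nothing is skipped. (The real tmlibs/db iterator may differ in details.)

```go
package main

import "fmt"

// toyIterator mimics the iterator contract assumed in the thread:
// it begins positioned on the first element.
type toyIterator struct {
	keys []string
	pos  int
}

func (it *toyIterator) Valid() bool  { return it.pos < len(it.keys) }
func (it *toyIterator) Next()       { it.pos++ }
func (it *toyIterator) Key() string { return it.keys[it.pos] }

// collect drains the iterator using the idiom from the thread: test
// Valid first and advance with Next last, so the first item is visited.
func collect(it *toyIterator) []string {
	var out []string
	for ; it.Valid(); it.Next() {
		out = append(out, it.Key())
	}
	return out
}

func main() {
	it := &toyIterator{keys: []string{"a", "b", "c"}}
	fmt.Println(collect(it)) // [a b c]
}
```

Under this contract, the original "for iter.Next()" loop really would skip the first element, which is the bug the reviewer flagged.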
// It returns false if the evidence is already stored.
func (store *EvidenceStore) AddNewEvidence(evidence types.Evidence, priority int) bool {
	// check if we already have seen it
	ei_ := store.GetEvidence(evidence.Height(), evidence.Hash())
What if a byzantine validator signed a million votes for the same height/round?
I think for simple vote evidence, we could search for whether we have evidence for a validator at a given height, and not worry about rounds or blockhashes etc. So if we already have evidence to slash a validator at a height, no need to keep collecting evidence. What do you think?
That sounds good. So we should change the lookup scheme to include the validator address in the key, so we can quickly check if there's already evidence for the validator at that height?
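A sketch of that lookup scheme: put height and validator address in the key, so a single lookup answers whether we already hold evidence for that validator at that height, regardless of round or block hash. The key layout and the map-backed store are hypothetical:

```go
package main

import "fmt"

// lookupKey builds a hypothetical key of the form
// "evidence-lookup/<padded height>/<validator address>"; zero-padding
// the height keeps keys sorted numerically under prefix iteration.
func lookupKey(height int64, valAddr []byte) string {
	return fmt.Sprintf("evidence-lookup/%016x/%X", height, valAddr)
}

func main() {
	seen := map[string]bool{} // stand-in for the store's db
	addr := []byte{0xde, 0xad}

	k := lookupKey(10, addr)
	fmt.Println(seen[k]) // false: no evidence yet for this validator at height 10
	seen[k] = true

	// a second piece of evidence from the same validator at the same
	// height hits the same key, however many votes it signed
	fmt.Println(seen[lookupKey(10, addr)]) // true
}
```

This caps storage per byzantine validator per height at one entry, addressing the million-votes concern.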
}
eiBytes := wire.BinaryBytes(ei)

// add it to the store
Seems like this function could be refactored out right underneath.
evidence/store.go
Outdated
ei.Committed = true

// TODO: we should use the state db and db.Sync in state.Save instead.
// Else, if we call this before state.Save, we may never mark committed evidence as committed.
We should just mark it as committed before state.Save. It'll get committed again if we restart.
There's a case during replay, if the app is multiple blocks behind, where we don't construct historical state and call state.ApplyBlock; we use sm.ExecCommitBlock instead, so e.g. MarkEvidenceAsCommitted would never get called.
But now, with historical validators etc., we should be able to construct the historical state and use ApplyBlock after all.
	return s.LastValidators, s.Validators
}

// VerifyEvidence verifies the evidence fully by checking it is internally
See my other comment about State... I think we should move these kinds of functions out of State and keep State a simple snapshot. LMK what you think.
Ok - I think that's a nice idea. All of the LoadXxx just need the db, but nothing else from the state.
This VerifyEvidence does require a bit more. Do we just pass in a state as an arg here?
Let's deal with it in a separate PR: #1010
Further issues moved to #1012
#569