[wal] small fixes in SearchEndHeight & replay logic #1607
Codecov Report
@@ Coverage Diff @@
## develop #1607 +/- ##
===========================================
- Coverage 60.84% 60.48% -0.36%
===========================================
Files 115 115
Lines 10013 10010 -3
===========================================
- Hits 6092 6055 -37
- Misses 3353 3384 +31
- Partials 568 571 +3
consensus/wal.go
msg, err = dec.Decode()
if err == io.EOF {
	// OPTIMISATION: no need to look for height in older files if we've seen height - 1
	if lastHeightFound == height-1 {
what if there are multiple EndHeight messages in a single file? Shouldn't we just check lastHeightFound > 0 && lastHeightFound < height?
if m, ok := msg.Msg.(EndHeightMessage); ok {
	lastHeightFound = m.Height
	if m.Height == height { // found
better yet why not just exit on the else condition here ?
hm.. but we need to read until the end to ensure there is no height
if we just check else if m.Height < height and return, we might return false when in fact height exists...
but we start from the top of the file, right? and the WAL already guarantees the height is increasing. so if we find a height, and it's less than what we're looking for, we should be done. no?
1 -> 2 -> 3
we're looking for 3. if we return on 1 or 2, it would be incorrect since there is 3
yeh duh. larger heights are at the bottom of the file.
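The 1 -> 2 -> 3 case from the thread can be sketched in a few lines. This is a simplified illustration only: a slice of `int64` stands in for the EndHeightMessage records of one WAL file, and `searchEarlyExit` / `searchFullScan` are hypothetical names, not functions from consensus/wal.go.

```go
package main

import "fmt"

// searchEarlyExit returns as soon as it sees a height below the target.
// endHeights is a hypothetical stand-in for the EndHeightMessage heights
// of one file, oldest first (so heights increase towards the end).
func searchEarlyExit(endHeights []int64, height int64) bool {
	for _, h := range endHeights {
		if h == height {
			return true
		} else if h < height {
			return false // wrong: larger heights may still follow
		}
	}
	return false
}

// searchFullScan keeps reading until the end of the file and remembers
// the last height it saw, so the caller can decide whether older files
// are still worth checking.
func searchFullScan(endHeights []int64, height int64) (found bool, lastHeightFound int64) {
	lastHeightFound = -1
	for _, h := range endHeights {
		lastHeightFound = h
		if h == height {
			return true, lastHeightFound
		}
	}
	return false, lastHeightFound
}

func main() {
	file := []int64{1, 2, 3}

	fmt.Println(searchEarlyExit(file, 3)) // false, although 3 is in the file

	found, last := searchFullScan(file, 3)
	fmt.Println(found, last) // true 3
}
```

Because larger heights sit at the bottom of the file, only the full scan can conclude that a height is absent.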
test fail ?!
does anybody know what this is about?
No. It's been bugging me too. I think it's because the proto files are from another repo ... I'd really like to eliminate that and move the KVPair into the ABCI proto spec. Not sure if we really need them in the tmlibs/cmn ...
Ok, so
2 outbound, 1 inbound ???????????
if I disable duplicate peer IP check, test passes
msg is nil and if we continue executing, we'll get nil exception at `msg.Msg.(....)`
i.e., can't be skipped and we should only return DataCorruptionError if we can skip a msg safely
since we never write msg partially, if we've encountered io.EOF in the middle of the msg, we must abort
BEFORE:
```
E[05-24|11:55:37.229] Dialing failed pex=0 addr=022ec801d79025caab3afbbf816d92ff8450d040@127.0.0.2:6593 err="Connect to self: <nil>" attempts=0
```
AFTER:
```
E[05-24|11:55:37.229] Dialing failed pex=0 addr=022ec801d79025caab3afbbf816d92ff8450d040@127.0.0.2:6593 err="Connect to self: 022ec801d79025caab3afbbf816d92ff8450d040@127.0.0.2:6593" attempts=0
```
* ADR 109: Internalize specific packages to reduce surface area (tendermint#1485)
  * consensus: Move into internal
  * evidence: Move into internal
  * inspect: Move into internal
  * state: Move into internal
  * blocksync: Move into internal
  * statesync: Move into internal
  * store: Move into internal
  * libs/async: Move into internal
  * libs/autofile: Move into internal
  * libs/bits: Move into internal
  * libs/clist: Move into internal
  * libs/cmap: Move into internal
  * libs/events: Move into internal
  * libs/fail: Move into internal
  * libs/flowrate: Move into internal
  * libs/os: Move into internal
  * libs/progressbar: Move into internal
  * libs/protoio: Move into internal
  * libs/pubsub: Move into internal
  * libs/net: Move into internal
  * libs/rand: Move into internal
  * libs/service: Move into internal
  * libs/strings: Move into internal
  * libs/sync: Move into internal
  * libs/tempfile: Move into internal
  * libs/timer: Move into internal
  * Add changelog entries
* ADR 109: Modularize test infra (tendermint#1488)
  * test/e2e: Split out as separate module
  * test/loadtime: Split out as separate module
  * test/e2e: Remove optimization from Docker image construction
  * Ensure that linter covers E2E framework and app
  * Update CI linting to cover submodules
  * Add changelog entry
  * Expand linter coverage to loadtime tool
  * Add missing phony entries
  * test/e2e: Sync debug Dockerfile with primary Dockerfile
* chore: ADR 109: go mod tidy (tendermint#1606)
  * go mod tidy
  * go.mod: Remove patch version
  * go.mod: Remove new toolchain directives
* ADR 109: Fix mock generation (tendermint#1607)
  * internal: Fix mockery code generation script paths
  * make mockery

Signed-off-by: Thane Thomson <connect@thanethomson.com>
I was not able to reproduce #1600, but I did fix one possible panic and improved
SearchEndHeight behaviour (see commit messages for details).