Description
Tendermint version
develop branch
What happened:
When the remote signer drops the connection:
CONSENSUS FAILURE!!! module=consensus err=EOF
or
CONSENSUS FAILURE!!! module=consensus err=remote signer timed out
Additionally, ConsensusState/receiveRoutine shuts down.
What you expected to happen:
A clearer error message; tendermint should probably also retry and continue receiving instead of shutting down.
Probably, these methods should not panic but return an error instead:
tendermint/privval/remote_signer.go
Lines 36 to 53 in b771798
func (sc *RemoteSignerClient) GetAddress() types.Address {
	pubKey, err := sc.getPubKey()
	if err != nil {
		panic(err)
	}
	return pubKey.Address()
}

// GetPubKey implements PrivValidator.
func (sc *RemoteSignerClient) GetPubKey() crypto.PubKey {
	pubKey, err := sc.getPubKey()
	if err != nil {
		panic(err)
	}
	return pubKey
}
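To make the proposal concrete, here is a minimal sketch of what error-returning variants could look like. It does not use the real tendermint packages: `Address`, `PubKey`, `fakeKey`, the `fetch` hook, and the two simulated signers are hypothetical stand-ins for illustration, and changing the signatures like this would also require updating the `PrivValidator` interface and its callers.

```go
package main

import (
	"fmt"
	"io"
)

// Address and PubKey are simplified stand-ins for types.Address and crypto.PubKey.
type Address []byte

type PubKey interface {
	Address() Address
}

// fakeKey is a hypothetical key type used only in this sketch.
type fakeKey []byte

func (k fakeKey) Address() Address { return Address(k) }

// RemoteSignerClient mirrors the shape of the client quoted above. The fetch
// field is a hypothetical hook standing in for the private getPubKey round trip.
type RemoteSignerClient struct {
	fetch func() (PubKey, error)
}

// GetPubKey returns the error to the caller instead of panicking.
func (sc *RemoteSignerClient) GetPubKey() (PubKey, error) {
	pubKey, err := sc.fetch()
	if err != nil {
		return nil, fmt.Errorf("remote signer: get pubkey: %w", err)
	}
	return pubKey, nil
}

// GetAddress returns the error to the caller instead of panicking.
func (sc *RemoteSignerClient) GetAddress() (Address, error) {
	pubKey, err := sc.GetPubKey()
	if err != nil {
		return nil, err
	}
	return pubKey.Address(), nil
}

// droppedSigner simulates a remote signer whose connection was closed.
func droppedSigner() (PubKey, error) { return nil, io.EOF }

// healthySigner simulates a reachable remote signer.
func healthySigner() (PubKey, error) { return fakeKey("abc"), nil }

func main() {
	sc := &RemoteSignerClient{fetch: droppedSigner}
	if _, err := sc.GetAddress(); err != nil {
		// The caller now decides what to do; nothing panics.
		fmt.Println("signer unavailable:", err)
	}
}
```

With this shape, a caller such as enterPropose could log the failure and skip the round rather than taking down receiveRoutine.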
Depending on the error (timeout/EOF vs. any other, unknown error), we should either continue or exit.
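The continue-or-exit decision could be sketched roughly as follows. This is an illustration under assumptions, not tendermint code: `errSignerTimeout` is a hypothetical sentinel that only mirrors the "remote signer timed out" message from the logs, and `action`, `classify`, `errDropped`, and `errUnknown` are names made up for this example.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
)

// errSignerTimeout is a hypothetical sentinel mirroring the message in the logs.
var errSignerTimeout = errors.New("remote signer timed out")

// Sample errors used to exercise classify; both are made up for this sketch.
var (
	errDropped = fmt.Errorf("get pubkey: %w", io.EOF)
	errUnknown = errors.New("bad signature")
)

type action int

const (
	actionRetry action = iota // transient failure: keep running and retry
	actionExit                // unknown failure: stop rather than guess
)

// classify decides how to react to a non-nil error from the remote signer.
func classify(err error) action {
	var netErr net.Error
	switch {
	case errors.Is(err, io.EOF), errors.Is(err, errSignerTimeout):
		// The signer dropped the connection or did not answer in time.
		return actionRetry
	case errors.As(err, &netErr) && netErr.Timeout():
		// Any other network-level timeout is treated as transient too.
		return actionRetry
	default:
		return actionExit
	}
}

func main() {
	for _, err := range []error{errDropped, errSignerTimeout, errUnknown} {
		if classify(err) == actionRetry {
			fmt.Println("retry:", err)
		} else {
			fmt.Println("exit:", err)
		}
	}
}
```

Matching with `errors.Is` and `errors.As` rather than comparing error strings keeps the check working when the error is wrapped with extra context on its way up to the consensus code.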
Have you tried the latest version: yes
How to reproduce it (as minimally and precisely as possible):
./tendermint node --priv_validator_laddr=tcp://127.0.0.1:26659 --proxy_app=kvstore
Then start, e.g., priv_val_server (or the KMS) as a remote signer and shut it down after a few rounds of signing.
Logs (paste a small part showing an error (< 10 lines) or link a pastebin, gist, etc. containing more of the log file):
I[27116-11-27|19:20:13.952] Executed block module=state height=492 validTxs=0 invalidTxs=0
I[27116-11-27|19:20:13.952] Committed state module=state height=492 txs=0 appHash=0000000000000000
I[27116-11-27|19:20:13.953] Indexed block module=txindex height=492
I[27116-11-27|19:20:14.954] Timed out module=consensus dur=998.016427ms height=493 round=0 step=RoundStepNewHeight
I[27116-11-27|19:20:14.954] enterNewRound(493/0). Current: 493/0/RoundStepNewHeight module=consensus height=493 round=0
I[27116-11-27|19:20:14.954] enterPropose(493/0). Current: 493/0/RoundStepNewRound module=consensus height=493 round=0c
E[27116-11-27|19:20:14.955] CONSENSUS FAILURE!!! module=consensus err=EOF stack="goroutine 93 [running]:\nruntime/debug.Stack(0xc420ca5550, 0x1, 0x1)\n\t/usr/local/Cellar/go/1.10.3/libexec/src/runtime/debug/stack.go:24 +0xa7\ngithub.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine.func2(0xc4200b6a80, 0x1b74350)\n\t/Users/ismail/go/src/github.com/tendermint/tendermint/consensus/state.go:583 +0xf9\npanic(0x18dc060, 0xc420074040)\n\t/usr/local/Cellar/go/1.10.3/libexec/src/runtime/panic.go:502 +0x229\ngithub.com/tendermint/tendermint/privval.(*RemoteSignerClient).GetAddress(0xc42019e560, 0x0, 0x0, 0x0)\n\t/Users/ismail/go/src/github.com/tendermint/tendermint/privval/remote_signer.go:39 +0x91\ngithub.com/tendermint/tendermint/consensus.(*ConsensusState).enterPropose(0xc4200b6a80, 0x1ed, 0x0)\n\t/Users/ismail/go/src/github.com/tendermint/tendermint/consensus/state.go:872 +0x613\ngithub.com/tendermint/tendermint/consensus.(*ConsensusState).enterNewRound(0xc4200b6a80, 0x1ed, 0x0)\n\t/Users/ismail/go/src/github.com/tendermint/tendermint/consensus/state.go:790 +0x81a\ngithub.com/tendermint/tendermint/consensus.(*ConsensusState).handleTimeout(0xc4200b6a80, 0x3b7c85ab, 0x1ed, 0x0, 0x1, 0x1ed, 0x0, 0x1, 0x38b16ab2, 0xed38f81de, ...)\n\t/Users/ismail/go/src/github.com/tendermint/tendermint/consensus/state.go:701 +0x551\ngithub.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine(0xc4200b6a80, 0x0)\n\t/Users/ismail/go/src/github.com/tendermint/tendermint/consensus/state.go:623 +0x430\ncreated by github.com/tendermint/tendermint/consensus.(*ConsensusState).OnStart\n\t/Users/ismail/go/src/github.com/tendermint/tendermint/consensus/state.go:307 +0x140\n"
I[27116-11-27|19:20:14.955] Stopping baseWAL module=consensus wal=/Users/ismail/.tendermint/data/cs.wal/wal impl=baseWAL
I[27116-11-27|19:20:14.955] Stopping Group module=consensus wal=/Users/ismail/.tendermint/data/cs.wal/wal impl=Group
E[27116-11-27|19:20:15.853] Ping module=privval err="remote signer timed out"
ref tendermint/tmkms#116
ref #2923