-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Description
Tendermint version (use tendermint version or git rev-parse --verify HEAD if installed from source): 0.30.1 (as reported by localhost:26657/status
ABCI app (name for built-in, URL for self-written if it's publicly available): cosmos-sdk:0.33.0
Environment:
- OS (e.g. from /etc/os-release): Ubuntu 18.04.2
- Install tools:
- Others: tendermint/KMS: 0.4.0
What happened:
After signing for a period of time (first time, about 12 hours; second time about a week), the KMS sends a message gaiad does not expect, and gaiad logs the following message:
ESC[32mgaia_1 |ESC[0m E[2019-03-22|18:36:54.561] Error signing vote module=consensus height=112215 round=0 vote="Vote{15:31920F9BC3A3 112215/00/1(Prevote) 89AA573A77A7 00000
0000000 @ 2019-03-22T18:36:51.561061017Z}" err="remote signer timed out"
ESC[32mgaia_1 |ESC[0m E[2019-03-22|18:36:54.704] Ping module=privval err="received unexpected response"
ESC[32mgaia_1 |ESC[0m E[2019-03-22|18:36:54.705] Error signing vote module=consensus height=112215 round=0 vote="Vote{15:31920F9BC3A3 112215/00/2(Precommit) 89AA573A77A7 000000000000 @ 2019-03-22T18:36:54.705527937Z}" err="received unexpected response"
Logs past this point appear normal, and the KMS logs show that gaiad was continuing to make requests to the KMS after these errors, but the prevote/precommits after height 112214 are missing from blocks until gaiad was restarted.
To summarise:
Gaiad (< 112215): OK, 100% blocks signed
KMS (< 112215): OK, 100% blocks signed
Gaiad (112215): Error messages as above
KMS (112215): Received sign requests from gaiad, no errors
Gaiad (>112215): No errors, no prevotes/precommits/proposals included in blocks
KMS (>112215): Received sign requests from gaiad, no errors
What you expected to happen:
No errors, blocks signed as expected.
Have you tried the latest version: yes/no
Unable to test current version in this environment
How to reproduce it (as minimally and precisely as possible):
Not reliably reproducible; but repeats after random period of time.
Having looked into code, this error only displays when the message returned by the external signer does not match the type it expected to deal with (e.g. Ping received instead of SignedVote, etc.) (privval/remote_signer.go; search ErrUnexpectedResponse).
This could potentially be an invalid message being emitted by the signer, or a race condition.