This repository was archived by the owner on Jun 3, 2020. It is now read-only.
Restarting tmkms leads to Tendermint "remote signer timed out" #116
Closed
Description
Hi guys,
Sorry, I'm not sure whether the right place to report this is the Tendermint repo or the KMS repo. It seems very KMS-specific, so I'm trying here first.
I'm doing some testing and have tmkms running as a systemd service alongside gaia. When I restart tmkms, Tendermint does not recover gracefully and has to be restarted as well:
Nov 24 22:50:04 val2 gaiad[86004]: I[24116-11-24|21:50:04.406] Starting BlockPool module=blockchain impl=BlockPool
Nov 24 22:50:04 val2 gaiad[86004]: I[24116-11-24|21:50:04.406] Starting IndexerService module=txindex impl=IndexerService
Nov 24 22:52:08 val2 systemd[1]: Stopping Tendermint KMS Service...
Nov 24 22:52:08 val2 systemd[1]: Stopped Tendermint KMS Service.
Nov 24 22:52:08 val2 systemd[1]: Started Tendermint KMS Service.
Nov 24 22:52:08 val2 gaiad[86004]: E[24116-11-24|21:52:08.402] Ping module=privval err=EOF
Nov 24 22:52:08 val2 kernel: usb 1-3: reset full-speed USB device number 23 using xhci_hcd
Nov 24 22:52:08 val2 kernel: usb 1-10: reset full-speed USB device number 5 using xhci_hcd
Nov 24 22:52:09 val2 kernel: usb 1-10: reset full-speed USB device number 5 using xhci_hcd
Nov 24 22:52:10 val2 gaiad[86004]: E[24116-11-24|21:52:10.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:12 val2 gaiad[86004]: E[24116-11-24|21:52:12.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:14 val2 gaiad[86004]: E[24116-11-24|21:52:14.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:16 val2 gaiad[86004]: E[24116-11-24|21:52:16.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:18 val2 gaiad[86004]: E[24116-11-24|21:52:18.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:20 val2 gaiad[86004]: E[24116-11-24|21:52:20.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:22 val2 gaiad[86004]: E[24116-11-24|21:52:22.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:24 val2 gaiad[86004]: E[24116-11-24|21:52:24.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:26 val2 gaiad[86004]: E[24116-11-24|21:52:26.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:28 val2 gaiad[86004]: E[24116-11-24|21:52:28.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:30 val2 gaiad[86004]: E[24116-11-24|21:52:30.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:32 val2 gaiad[86004]: E[24116-11-24|21:52:32.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:34 val2 gaiad[86004]: E[24116-11-24|21:52:34.401] Ping module=privval err="remote signer timed out"
Nov 24 22:52:36 val2 gaiad[86004]: E[24116-11-24|21:52:36.401] Ping module=privval err="remote signer timed out"
When I restart gaiad after the above errors, the following also shows up:
Nov 24 22:57:26 val2 systemd[1]: Started Gaia Service.
Nov 24 22:57:26 val2 gaiad[86067]: I[24116-11-24|21:57:26.468] Starting ABCI with Tendermint module=main
Nov 24 22:57:26 val2 gaiad[86067]: I[24116-11-24|21:57:26.480] Starting multiAppConn module=proxy impl=multiAppConn
Nov 24 22:57:26 val2 gaiad[86067]: I[24116-11-24|21:57:26.480] Starting localClient module=abci-client connection=query impl=localClient
Nov 24 22:57:26 val2 gaiad[86067]: I[24116-11-24|21:57:26.480] Starting localClient module=abci-client connection=mempool impl=localClient
Nov 24 22:57:26 val2 gaiad[86067]: I[24116-11-24|21:57:26.480] Starting localClient module=abci-client connection=consensus impl=localClient
Nov 24 22:57:26 val2 gaiad[86067]: I[24116-11-24|21:57:26.480] ABCI Handshake App Info module=consensus height=0 hash= software-version= protocol-version=0
Nov 24 22:57:26 val2 gaiad[86067]: I[24116-11-24|21:57:26.480] ABCI Replay Blocks module=consensus appHeight=0 storeHeight=0 stateHeight=0
Nov 24 22:57:26 val2 gaiad[86067]: I[24116-11-24|21:57:26.544] Completed ABCI Handshake - Tendermint and App are synced module=consensus appHeight=0 appHash=
Nov 24 22:57:26 val2 gaiad[86067]: I[24116-11-24|21:57:26.544] Starting TCPVal module=privval impl=TCPVal
Nov 24 22:57:29 val2 gaiad[86067]: E[24116-11-24|21:57:29.544] OnStart module=privval err="accept tcp 127.0.0.1:26658: i/o timeout"
Nov 24 22:57:29 val2 gaiad[86067]: ERROR: Error starting private validator client: accept tcp 127.0.0.1:26658: i/o timeout
Nov 24 22:57:29 val2 systemd[1]: gaia.service: Main process exited, code=exited, status=1/FAILURE
Nov 24 22:57:29 val2 systemd[1]: gaia.service: Unit entered failed state.
Nov 24 22:57:29 val2 systemd[1]: gaia.service: Failed with result 'exit-code'.
Nov 24 22:57:32 val2 systemd[1]: gaia.service: Service hold-off time over, scheduling restart.
Nov 24 22:57:32 val2 systemd[1]: Stopped Gaia Service.
Nov 24 22:57:32 val2 systemd[1]: Started Gaia Service.
Nov 24 22:57:33 val2 gaiad[86096]: I[24116-11-24|21:57:33.087] Starting ABCI with Tendermint module=main
Nov 24 22:57:33 val2 gaiad[86096]: I[24116-11-24|21:57:33.100] Starting multiAppConn module=proxy impl=multiAppConn
Nov 24 22:57:33 val2 gaiad[86096]: I[24116-11-24|21:57:33.100] Starting localClient module=abci-client connection=query impl=localClient
Nov 24 22:57:33 val2 gaiad[86096]: I[24116-11-24|21:57:33.100] Starting localClient module=abci-client connection=mempool impl=localClient
Nov 24 22:57:33 val2 gaiad[86096]: I[24116-11-24|21:57:33.100] Starting localClient module=abci-client connection=consensus impl=localClient
Nov 24 22:57:33 val2 gaiad[86096]: I[24116-11-24|21:57:33.100] ABCI Handshake App Info module=consensus height=0 hash= software-version= protocol-version=0
Nov 24 22:57:33 val2 gaiad[86096]: I[24116-11-24|21:57:33.100] ABCI Replay Blocks module=consensus appHeight=0 storeHeight=0 stateHeight=0
Nov 24 22:57:33 val2 gaiad[86096]: I[24116-11-24|21:57:33.163] Completed ABCI Handshake - Tendermint and App are synced module=consensus appHeight=0 appHash=
Nov 24 22:57:33 val2 gaiad[86096]: I[24116-11-24|21:57:33.163] Starting TCPVal module=privval impl=TCPVal
Nov 24 22:57:36 val2 gaiad[86096]: I[24116-11-24|21:57:36.124] This node is not a validator module=consensus addr=2369786F94AECAABEE11A1242A395EC9C6303BF9 pubKey=PubKeyEd25519{6C0B225542087B267B312F09424CF9E58C23519F9EC7B85181E036BB8E20E720}
Nov 24 22:57:36 val2 gaiad[86096]: I[24116-11-24|21:57:36.127] P2P Node ID module=p2p ID=06430257c53430df262d5010a26175db590b4154 file=/config/node_key.json
Nov 24 22:57:36 val2 gaiad[86096]: I[24116-11-24|21:57:36.127] Starting Node module=node impl=Node
Nov 24 22:57:36 val2 gaiad[86096]: I[24116-11-24|21:57:36.127] Starting EventBus module=events impl=EventBus
Nov 24 22:57:36 val2 gaiad[86096]: I[24116-11-24|21:57:36.127] Starting PubSub module=pubsub impl=PubSub
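I also notice that Tendermint often times out rather quickly while waiting for tmkms to connect, as the log above shows: the first start gives up on accepting a connection on 127.0.0.1:26658 after about 3 seconds and exits.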
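To illustrate the kind of behavior that would avoid the exit above, here is a minimal Python sketch of a listener that keeps waiting for the signer across several timeout windows instead of failing on the first one. It is not Tendermint's or tmkms's actual code. The function name `accept_with_retry` and its parameters are hypothetical, and the 3-second default is only taken from the gap between "Starting TCPVal" and the "i/o timeout" error in the log.

```python
import socket
import time


def accept_with_retry(listener, accept_timeout=3.0, max_attempts=5, pause=0.0):
    """Wait for a remote signer to dial in, retrying after each timeout.

    Hypothetical helper for illustration only. Returns the accepted
    connection, or raises TimeoutError if nothing connects within
    max_attempts windows of accept_timeout seconds each.
    """
    listener.settimeout(accept_timeout)
    for _attempt in range(max_attempts):
        try:
            conn, _addr = listener.accept()
            conn.settimeout(None)  # hand back a blocking socket
            return conn
        except socket.timeout:
            # Nobody connected in this window: keep listening rather than
            # exiting the whole process on the first timeout.
            if pause:
                time.sleep(pause)
    raise TimeoutError(
        "no signer connected after %d attempts of %.1fs" % (max_attempts, accept_timeout)
    )
```

With a listener bound to the address from the log (127.0.0.1:26658), calling `accept_with_retry(listener)` would wait up to roughly 15 seconds in total with these assumed defaults, which leaves a restarting signer more time to reconnect. Whether retrying inside the node or relying on systemd to restart it is the better design is a separate question.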
Thanks for the great work so far.