PrivValidatorAddr -> PrivValidatorListenAddr. Update ADR008 #1256
I think maybe we can split the |
xla
left a comment
@ebuchman It would be helpful to specify the behaviour of the node prior to the signer establishing a connection:
- Is it halting any other operations?
- Does it expect the signer to connect in a certain timeframe?
- Does it accept at most one connection?
- What is the behaviour in case the signer disconnects?
  - Retry/back-off?
  - When do we escalate that the signer is gone?
c8af366 to 5fb1d76
Really great questions, thank you!
1. The simplest thing to do is probably to halt everything else until we've received a connection. In an ideal world, we would still boot everything up except maybe the Consensus. I'm not sure exactly how this should be handled, since it might be hard to distinguish between not having received the first connection and the connection failing later. So it's probably best to just keep it simple at first :). Interested in your thoughts on how we might do it right though.
2. I could go either way on this one. Let's flame out for now after a minute or something. It would be valuable to get feedback from validators/users on how they want this to work. It's really a UX question.
3. I think we should accept more than 1. It can be a config option. We want folks to be able to set up multiple validators that Tendermint talks to in case one fails. That said, it will be critical that those validators stay synced, since if they don't, they might double sign. So maybe we default to 1 and let folks change it if they want more, but warn them that they then have to take precautions to keep them synced by running a consensus protocol between the signers (e.g. #1185).
4. I think we start logging errors and wait a few minutes for it to reconnect. If the signer doesn't come back after a few minutes, perhaps we should panic. Maybe @zmanian @tarcieri @nickray want to weigh in on some of this behaviour.
5fb1d76 to 87195a4
Codecov Report
@@ Coverage Diff @@
## develop #1256 +/- ##
==========================================
Coverage ? 59.82%
==========================================
Files ? 127
Lines ? 11623
Branches ? 0
==========================================
Hits ? 6954
Misses ? 4001
Partials ? 668
My personal inclination would be for it to continue to run until it receives a connection, and perhaps log a warning that there are no signers connected. If it's offering the service and panics, that's just one more place for things to go wrong.
87195a4 to d4e4055
Rebased on latest develop.
PrivValidator follow up re #1255