Skip to content

Close and retry a RemoteSigner on err#2876

Closed
zmanian wants to merge 2 commits intotendermint:developfrom
zmanian:zaki/retry_tcp_validator
Closed

Close and retry a RemoteSigner on err#2876
zmanian wants to merge 2 commits intotendermint:developfrom
zmanian:zaki/retry_tcp_validator

Conversation

@zmanian
Copy link
Contributor

@zmanian zmanian commented Nov 18, 2018

This allows recovery if the RemoteSigner service is restarted etc without restarting the process.

Copy link
Contributor

@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥑


### IMPROVEMENTS:

Retry RemoteSigner connections on Error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Retry RemoteSigner connections on Error
- [privval] retry RemoteSigner connections on Error

"err", err,
)
sc.conn.Close()
sc.RemoteSignerClient = NewRemoteSignerClient(sc.conn)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're closing the sc.conn, why do we create a new remote signer client with it? Shouldn't we call OnStop(), then OnStart() to wait for conn and recreate all necessary structures and goroutines?

Shouldn't we assert the err's type? We may want to shut everything down in case of ErrUnexpectedResponse, no?

Copy link
Contributor

@liamsi liamsi Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're closing the sc.conn, why do we create a new remote signer client with it? Shouldn't we call OnStop(), then OnStart() to wait for conn and recreate all necessary structures and goroutines?

I think that makes sense! Do we need to limit the recursion depth here though (as we are already in OnStart)?

Shouldn't we assert the err's type? We may want to shut everything down in case of ErrUnexpectedResponse, no?

Probably yes. ErrUnexpectedResponse means the other side responded but with another message than expected (not PingResponse).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll continue here: #2923

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW calling OnStart won't be enough as this might time out (again) if the remote signer isn't up yet. We might need sth like retrying with exponential backoff.
Also, the whole ConsensusState shuts down if the remote signer drops (restarting the remote signer won't help with this): #2926

@liamsi
Copy link
Contributor

liamsi commented Nov 27, 2018

closing in favour of: #2923

@liamsi liamsi closed this Nov 27, 2018
cboh4 pushed a commit to scrtlabs/tendermint that referenced this pull request Apr 7, 2025
…rmint#2846) (tendermint#2876)

---

Many RPC methods require JSON marshalled responses. We saw this taking a
notable amount of heap allocation in query serving full nodes. This PR
removes some extra heap allocations that were being done. We avoided
using the more efficient encoder.Encode before, because it added a
newline. This PR changes the function signature for these private
methods to be using *bytes.Buffer, and then uses the in-buffer methods
(rather than a second copy). We then just truncate the final byte after
each such call, which does not waste any allocations.

I added a benchmark for the most complex test case. 

OLD:
```
BenchmarkJsonMarshalStruct-12              78992             15542 ns/op            4487 B/op        191 allocs/op
```
New:
```
BenchmarkJsonMarshalStruct-12              93346             11132 ns/op            3245 B/op         58 allocs/op
```

Roughly a 3-4x reduction in the number of allocations, and 20% speedup.

#### PR checklist

- [x] Tests written/updated - Existing tests cover this
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request tendermint#2846 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Andy Nogueira <me@andynogueira.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants