Skip to content

Refactoring Remote signers#3370

Merged
melekes merged 76 commits intotendermint:masterfrom
Zondax:jleni/privval
Aug 5, 2019
Merged

Refactoring Remote signers#3370
melekes merged 76 commits intotendermint:masterfrom
Zondax:jleni/privval

Conversation

@jleni
Copy link
Contributor

@jleni jleni commented Mar 1, 2019

This PR is related to #3107 and a continuation of #3351

It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction.

Given two hosts A and B:

  • Host A is listener/client
  • Host B is dialer/server (contains the secret key)
    When A requires a signature, it needs to wait for B to dial in before it can issue a request.
    A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect.

The original rationale behind this design was based on security.

  • Host B only allows outbound connections to a list of whitelisted hosts.
  • It is not possible to reach B unless B dials in. There are no listening/open ports in B.

This PR results in the following changes:

  • Refactors ping/heartbeat to avoid previously existing race conditions.
  • Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow.
  • Unifies and abstracts away the differences between unix and tcp sockets.
  • A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj)
  • The signer request handler (server side) is customizable to increase testability.
  • Updates and extends unit tests

A high level overview of the classes is as follows:

Transport (endpoints): The following classes take care of establishing a connection

  • SignerDialerEndpoint
  • SignerListeningEndpoint
  • SignerEndpoint groups common functionality (read/write/timeouts/etc.)

Signing (client/server): The following classes take care of exchanging request/responses

  • SignerClient
  • SignerServer

This PR also closes #3601

  • Updated all relevant documentation in docs
  • Updated all code comments where relevant
  • Wrote tests
  • Updated CHANGELOG_PENDING.md

@codecov-io
Copy link

codecov-io commented Mar 19, 2019

Codecov Report

Merging #3370 into master will increase coverage by 0.09%.
The diff coverage is 77.28%.

@@            Coverage Diff             @@
##           master    #3370      +/-   ##
==========================================
+ Coverage   65.97%   66.07%   +0.09%     
==========================================
  Files         234      237       +3     
  Lines       20244    20343      +99     
==========================================
+ Hits        13356    13441      +85     
- Misses       5816     5832      +16     
+ Partials     1072     1070       -2
Impacted Files Coverage Δ
types/priv_validator.go 54.54% <ø> (ø) ⬆️
privval/socket_listeners.go 86.2% <ø> (ø) ⬆️
privval/errors.go 0% <0%> (ø) ⬆️
privval/signer_server.go 100% <100%> (ø)
privval/messages.go 100% <100%> (ø) ⬆️
privval/signer_dialer_endpoint.go 100% <100%> (ø)
privval/utils.go 33.33% <16.66%> (-66.67%) ⬇️
privval/signer_client.go 48.33% <48.33%> (ø)
tools/tm-signer-harness/internal/test_harness.go 62.96% <68.42%> (-0.25%) ⬇️
node/node.go 62.29% <77.77%> (-0.46%) ⬇️
... and 18 more

@jleni
Copy link
Contributor Author

jleni commented Mar 19, 2019

This fail is unrelated but probably interesting. Maybe we should open another issue for this:
https://circleci.com/gh/tendermint/tendermint/50723#tests/containers/3

=== RUN   TestTransportMultiplexValidateNodeInfo
panic: send on closed channel

goroutine 2011 [running]:
github.com/tendermint/tendermint/p2p.TestTransportMultiplexAcceptNonBlocking.func1(0xc0002d2b40, 0xc000a3c000, 0xc000a3c0c0, 0xc000a3c060)
	/go/src/github.com/tendermint/tendermint/p2p/transport_test.go:256 +0x614
created by github.com/tendermint/tendermint/p2p.TestTransportMultiplexAcceptNonBlocking
	/go/src/github.com/tendermint/tendermint/p2p/transport_test.go:231 +0x273
FAIL	github.com/tendermint/tendermint/p2p	2.775s
Exited with code 1


@jleni
Copy link
Contributor Author

jleni commented Apr 1, 2019

@melekes @liamsi any news about this?
maybe we can discuss if we merge this or include other elements as what Tony was proposing.

@jleni
Copy link
Contributor Author

jleni commented Apr 1, 2019

I think it makes sense that we define a way to beta test changes like this. Is there a testnet planned for this purpose?

@xla xla changed the title WIP: Refactoring Remote signers [WIP] Refactoring Remote signers Apr 2, 2019
@liamsi
Copy link
Contributor

liamsi commented Apr 2, 2019

ref: #3486

@liamsi
Copy link
Contributor

liamsi commented Apr 2, 2019

@melekes @liamsi any news about this?
maybe we can discuss if we merge this or include other elements as what Tony was proposing.

Let us review it another time. I think we should merge this. But I'm not sure I do remember which changes Tony was proposing. Let's jump on call?

@jleni
Copy link
Contributor Author

jleni commented Apr 2, 2019

Here is the link to Tony's proposal

allow full node startup without a connection for improved failover
#3455

@liamsi
Copy link
Contributor

liamsi commented Apr 2, 2019

Ah Ok this one. Hmm. I'd prefer to keep this separate and implement Tony's proposal in a follow-up PR. Let's chat about this tomorrow.


func TestSignerClose(t *testing.T) {
for _, tc := range getSignerTestCases(t) {
func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's drop func() {}() wrappers if they don't serve any specific purpose (async code is generally harder to support / test / etc.)

pingTimer *time.Ticker

pingTimer *time.Ticker
instanceMtx sync.Mutex // Ensures instance public methods access, i.e. SendRequest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just say

not concurrency safe

in the godoc comment and remove the instanceMtx

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mutex is actually very important so external calls to SendRequest never mix with internal SendRequest(&PingRequest)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

@tac0turtle tac0turtle added ready-for-review and removed WIP T:breaking Type: Breaking Change labels Jul 29, 2019
@melekes
Copy link
Contributor

melekes commented Aug 5, 2019

panic: Fail in goroutine after TestTransportHandshake has completed

goroutine 2777 [running]:
testing.(*common).Fail(0xc00054c200)
	/usr/local/go/src/testing/testing.go:565 +0x1a4
testing.(*common).Error(0xc00054c200, 0xc0001bcfa8, 0x1, 0x1)
	/usr/local/go/src/testing/testing.go:654 +0x8b
github.com/tendermint/tendermint/p2p.TestTransportHandshake.func1.2(0xc00054c200, 0xfaa060, 0xc0004ae110)
	/go/src/github.com/tendermint/tendermint/p2p/transport_test.go:540 +0x17f
created by github.com/tendermint/tendermint/p2p.TestTransportHandshake.func1
	/go/src/github.com/tendermint/tendermint/p2p/transport_test.go:531 +0x296
FAIL	github.com/tendermint/tendermint/p2p	7.962s

not sure if it's related


### IMPROVEMENTS:

- [privval] \#3370 Refactors and simplifies validator/kms connection handling. Please refer to thttps://github.com/tendermint/tendermint/pull/3370#issue-257360971
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- [privval] \#3370 Refactors and simplifies validator/kms connection handling. Please refer to thttps://github.com/tendermint/tendermint/pull/3370#issue-257360971
- [privval] \#3370 Refactors and simplifies validator/kms connection handling. Please refer to [this comment](https://github.com/tendermint/tendermint/pull/3370#issue-257360971) for details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unify read/write in privval

9 participants