Skip to content

privval: Add IPCPV and fix SocketPV#2568

Merged
xla merged 6 commits intotendermint:developfrom
certusone:develop
Oct 17, 2018
Merged

privval: Add IPCPV and fix SocketPV#2568
xla merged 6 commits intotendermint:developfrom
certusone:develop

Conversation

@hendrikhofstadt
Copy link
Contributor

@hendrikhofstadt hendrikhofstadt commented Oct 7, 2018

Ref: #2563

I added IPC as an unencrypted alternative to SocketPV.

Besides I fixed the following aspects of SocketPV:

  • Added locking since we are operating on a single socket
  • The connection deadline is extended every time a successful packet exchange happens; otherwise the connection would always die permanently x seconds after the connection was established.
  • Added a ping/heartbeat mechanism to keep the connection alive; native TCP keepalives do not work in this use-case
  • Updated all relevant documentation in docs
  • Updated all code comments where relevant
  • Wrote tests
  • Updated CHANGELOG_PENDING.md

@hendrikhofstadt hendrikhofstadt changed the title [WIP] Add IPCPV and fix SocketPV WIP: Add IPCPV and fix SocketPV Oct 7, 2018
@hendrikhofstadt hendrikhofstadt changed the title WIP: Add IPCPV and fix SocketPV [WIP] Add IPCPV and fix SocketPV Oct 7, 2018
@hendrikhofstadt hendrikhofstadt force-pushed the develop branch 3 times, most recently from 9a954bb to 3e95047 Compare October 7, 2018 18:05
@codecov-io
Copy link

codecov-io commented Oct 7, 2018

Codecov Report

Merging #2568 into develop will increase coverage by 0.07%.
The diff coverage is 77.07%.

@@            Coverage Diff             @@
##           develop   #2568      +/-   ##
==========================================
+ Coverage    61.33%   61.4%   +0.07%     
==========================================
  Files          203     207       +4     
  Lines        16781   16846      +65     
==========================================
+ Hits         10292   10344      +52     
- Misses        5621    5627       +6     
- Partials       868     875       +7
Impacted Files Coverage Δ
node/node.go 66.05% <0%> (ø) ⬆️
privval/wire.go 100% <100%> (ø) ⬆️
privval/tcp_socket.go 92% <100%> (ø)
privval/ipc_server.go 64.15% <64.15%> (ø)
privval/tcp.go 71.26% <71.26%> (ø)
privval/tcp_server.go 78.57% <78.57%> (ø)
privval/ipc.go 80% <80%> (ø)
privval/remote_signer.go 82.85% <82.85%> (ø)
libs/events/events.go 93.2% <0%> (-4.86%) ⬇️
libs/db/remotedb/remotedb.go 36.84% <0%> (-4.69%) ⬇️
... and 8 more

@hendrikhofstadt hendrikhofstadt changed the title [WIP] Add IPCPV and fix SocketPV [R4R] Add IPCPV and fix SocketPV Oct 7, 2018
@xla xla self-assigned this Oct 8, 2018
@xla xla changed the title [R4R] Add IPCPV and fix SocketPV privval: Add IPCPV and fix SocketPV Oct 8, 2018
Copy link
Contributor

@xla xla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hej @SLAMPER, thanks a lot for taking the time and putting together this changeset. Most of it should be good as is.

The RemoterSignerClient in its current form doesn't seem to be fit in entirely. I understand that it is the shared abstraction so the actual message handling doesn't need to be duplciated. What strikes me as odd that it is another piece that implements BaseService, besides that I firmly believe we need to drop the service idea as means for lifecycle management, we should not nest those services further. Ideally the RemoteSignerClient is stateless in nature, not shared and concerns like heartbeating (calling Ping) is handled by the wrapping implementation which already has to take care of listener setup and teardown and would be the perfect place to also have a ticker running which gets cleaned up.

Another issue we should address is to carefully craft and return errors so callers gain more control. Let me know if you need references.

connHeartbeat time.Duration

conn net.Conn
lock sync.Mutex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for the lock? net.Conn is threadsafe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was that if the KMS does asynchronous receive the following could happen:

TM sends a sign request.
TM send a getpubkey request.

The KMS handles both but for whichever reason it finishes the getpubkey request first.

Then the sign method would receive the packet for the getpubkey then since we have no package IDs etc.

As long as the KMS side does proper synchronizing this is not an issue but we should take care of this here.

Or do I have misconceptions about how this works ?

if resp.Error != nil {
return fmt.Errorf("remote error occurred: code: %v, description: %s",
resp.Error.Code,
resp.Error.Description)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs proper error type.

Odd formatting, should be in the form of:

	return fmt.Errorf(
		"remote error occurred: code: %v, description: %s",
		resp.Error.Code,
		resp.Error.Description,
	)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will create a dedicated error type.

@xla xla requested a review from liamsi October 8, 2018 08:59
@xla
Copy link
Contributor

xla commented Oct 8, 2018

@liamsi Could you review the proposed heartbeat messages in this PR?

@liamsi
Copy link
Contributor

liamsi commented Oct 8, 2018

@liamsi Could you review the proposed heartbeat messages in this PR?

They are sane in the sense that they do the job: every connHeartbeat period a PingRequest is sent to keep the connection alive. What I found confusing is that I would have expected the SignHeartbeatRequest to be used exactly for this purpose though. Now we have two "hearbeat messages", one which is signed (and holds info about the current round/height etc) and one which is a simple (empty) registered amino message. Am I the only one who finds this confusing?

@hendrikhofstadt
Copy link
Contributor Author

hendrikhofstadt commented Oct 8, 2018

@liamsi
While the pings I implemented are there to maintain and check the health of the connection, the Heartbeat you mention is an part of consensus and imho should not be used for connection maintenance. Not only because it is a part of the consensusreactor and not the connection but also because it might not be suited for every connection type.

Also the pings are sent regularly at fixed intervals opposed to the Heartbeat (at least from what I have read).

@liamsi
Copy link
Contributor

liamsi commented Oct 9, 2018

Thanks for the prompt reply! Although you are right and we do not have a go-routine or alike which periodically sends Heartbeat messages, I thought the name alone was already an indicator that it is meant to be send periodically (fixed intervals). I might be wrong though (cc @ebuchman, @milosevic). If the orig. heartbeat message is meant to be sent periodically, would we still need an additional message to keep the connection alive? For which connection types would it be unsuited?

Although the reactor deals with a ProposalHeartbeatMessage, the core consensus doesn't know about the heartbeat either (at least the spec of the core consensus doesn't explicitly mention a heartbeat message).

@hendrikhofstadt
Copy link
Contributor Author

@liamsi
Thanks for explaining more in-depth what you meant.

What I saw was that the heartbeats are only sent if waitForTxs is true.

if waitForTxs {
if cs.config.CreateEmptyBlocksInterval > 0 {
cs.scheduleTimeout(cs.config.EmptyBlocksInterval(), height, round, cstypes.RoundStepNewRound)
}
go cs.proposalHeartbeat(height, round)

A connection relevant heartbeat/keepalive should not rely on a consensus reactor setting but have to sole purpose to maintain the connection also the period should be deterministic/fixed which the proposeHeartbeat is not from what I could see.

@liamsi
Copy link
Contributor

liamsi commented Oct 9, 2018

Thanks again. Yeah, I was expecting that any remote signer implementation would periodically send (and sign) the following messages instead (not the ProposalHeartbeatMessages used in the reactor / consensus state):

// SignHeartbeatRequest is a PrivValidatorSocket message containing a Heartbeat.
type SignHeartbeatRequest struct {
Heartbeat *types.Heartbeat
}
type SignedHeartbeatResponse struct {
Heartbeat *types.Heartbeat
Error *RemoteSignerError
}

If this isn't the case and those do not work for all connection types (?), you are absolutely right and we'd probably need an actual heartbeat message (as the one you've introduced).

@ebuchman
Copy link
Contributor

@SLAMPER do you mind if we merge #2548 first to get the small fix out of the way?

@hendrikhofstadt
Copy link
Contributor Author

@ebuchman
No. I think it makes sense to merge your fix first 👍 I did not seen it when I created this PR.

@ebuchman
Copy link
Contributor

Cool thanks - merged

@hendrikhofstadt
Copy link
Contributor Author

Btw this is still WIP since we decided to implement a simple MAC auth in IPCVal #2563

@ValarDragon
Copy link
Contributor

You could add the MAC in a separate PR :)

@xla xla changed the title privval: Add IPCPV and fix SocketPV [WIP] privval: Add IPCPV and fix SocketPV Oct 15, 2018
@melekes
Copy link
Contributor

melekes commented Oct 16, 2018

@SLAMPER will you have time to address my comments? After these are fixed, we can merge this PR.

@hendrikhofstadt
Copy link
Contributor Author

@melekes I will do so today and then we can merge 👍

Copy link
Contributor

@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍰 🌮 🍉

@melekes melekes changed the title [WIP] privval: Add IPCPV and fix SocketPV privval: Add IPCPV and fix SocketPV Oct 17, 2018
@xla xla mentioned this pull request Oct 17, 2018
3 tasks
Copy link
Contributor

@xla xla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 :octocat: :shipit: 🍡

Thanks so much @SLAMPER for the great change-set and proper improvement of the package. The sustained collaboration on this PR over the last weeks was inspiring. To move forward I captured a smalls et of follow-ups in #2651 and will probably pick up myself shortly.

@xla xla merged commit f60713b into tendermint:develop Oct 17, 2018
@xla
Copy link
Contributor

xla commented Oct 17, 2018

Follow-up in #2651

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants