server,gossip: Store generated join tokens in gossip#61512
server,gossip: Store generated join tokens in gossip#61512itsbilal wants to merge 1 commit intocockroachdb:masterfrom
Conversation
aaron-crl
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @knz)
pkg/server/join_token.go, line 102 at r1 (raw file):
var b bytes.Buffer encoder := base64.NewEncoder(base64.URLEncoding, &b)
Did we check that base64 was a friendly encoding?
pkg/server/status.go, line 2563 at r1 (raw file):
certLocator := security.MakeCertsLocator(s.cfg.SSLCertsDir) if ok, err := certLocator.HasNodeCert(); err != nil || !ok {
Should be able to just try to read the cert using something like this:
cl := security.MakeCertsLocator(s.server.cfg.SSLCertsDir)
caCert, err := loadCertificateFile(cl.CACertPath())
if err != nil {
return nil, errors.Wrapf(
err,
"failed to read inter-node cert from disk at %q ",
cl.CACertPath(),
)
}
aaron-crl
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @knz)
pkg/server/join_token.go, line 125 at r2 (raw file):
return false, err } return bytes.Equal(token, token2), nil
This should use a ConstantTimeCompare to avoid timing attacks. Feel free to drop a TODO(aaron-crl) for me to pick up here if you don't have time for it today.
7d4665c to
fde2058
Compare
|
pkg/server/join_token.go, line 125 at r2 (raw file): Previously, aaron-crl wrote…
TODO added, thanks! |
* removed stubs and dummy functions * cleaned up TODOs Release justification: low risk, feature flagged functional enhancement Release note: None
fde2058 to
68c673b
Compare
|
Rebased on master, this is ready for a look by someone more familiar with gossip |
68c673b to
6bb6309
Compare
* removed stubs and dummy functions * cleaned up TODOs Release justification: low risk, feature flagged functional enhancement Release note: None
tbg
left a comment
There was a problem hiding this comment.
The use of Gossip is fine, though I can't yet see how the use looks end-to-end because it seems like this PR is just partial work. Gossiping a join token with a 10 minute TTL seems fine though from the Gossip perspective.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @itsbilal, and @knz)
pkg/server/join_token.go, line 143 at r3 (raw file):
// Checks the join token in the gossip store matches the marshalled form of // the passed-in join token. func (j *joinToken) isValid(g *gossip.Gossip) (bool, error) {
It seems a bit weird to pass *Gossip in here. Why not take a []byte to avoid the tight coupling?
No testing?
6bb6309 to
55e3031
Compare
|
pkg/server/join_token.go, line 143 at r3 (raw file): Previously, tbg (Tobias Grieger) wrote…
Thanks for the review! I added the test right after you wrote this. Also the only reason to throw the gossip client in here is to get it done in this PR (or at least commit) and so @aaron-crl has it ready to work off of. I can see |
aaron-crl
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @itsbilal, @knz, and @tbg)
pkg/server/join_token.go, line 143 at r3 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Thanks for the review! I added the test right after you wrote this.
Also the only reason to throw the gossip client in here is to get it done in this PR (or at least commit) and so @aaron-crl has it ready to work off of. I can see
[]bytemaking more sense. Maybe we shouldn't be merging this until Aaron's RPC method that uses this function is ready.
Now that I've seen this pattern I think I can manage the gossip in my PR where it's aligned with the functionality. I'll get someone to review that next week along with the rest of the rest of those changes. We can probably drop this function altogether since without gossip here, it just becomes a Marshal+Equal.
itsbilal
left a comment
There was a problem hiding this comment.
Yep, the additional pieces are yet to come. We've done a pretty fine-grained chunking up of work to increase parallelism of work.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @bdarnell, @knz, and @tbg)
pkg/server/join_token.go, line 102 at r1 (raw file):
Previously, aaron-crl wrote…
Did we check that base64 was a friendly encoding?
Docs say this is URL-safe so I assume yes
pkg/server/join_token.go, line 143 at r3 (raw file):
Previously, aaron-crl wrote…
Now that I've seen this pattern I think I can manage the gossip in my PR where it's aligned with the functionality. I'll get someone to review that next week along with the rest of the rest of those changes. We can probably drop this function altogether since without gossip here, it just becomes a Marshal+Equal.
Alright, I'll drop this function and keep this PR to just the change to the function in status.go.
pkg/server/status.go, line 2563 at r1 (raw file):
Previously, aaron-crl wrote…
Should be able to just try to read the cert using something like this:
cl := security.MakeCertsLocator(s.server.cfg.SSLCertsDir) caCert, err := loadCertificateFile(cl.CACertPath()) if err != nil { return nil, errors.Wrapf( err, "failed to read inter-node cert from disk at %q ", cl.CACertPath(), ) }
Done.
When a join token is generated, gossip it among other connected nodes with a short TTL. This allows that join token to be used when connecting to any node in the cluster, and not necessarily to the same node where the join token was generated. Part of cockroachdb#60632. Release justification: Changes to new code paths, gated behind experimental feature flag. Release note: None.
55e3031 to
a57cdfb
Compare
aaron-crl
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @bdarnell, @itsbilal, @knz, and @tbg)
pkg/server/join_token.go, line 102 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Docs say this is URL-safe so I assume yes
Testing locally "foobar!" => "Zm9vYmFyIQ=="
Double-clicking the encoded string gives me all but the padding characters which is not cut/paste friendly. Since we control both ends of this we can probably just trim the padding and then anticipate decoding without it in the unmarshaling function.
|
Based on a Slack discussion, we're going to can this PR as we need stricter at-most-once semantics than were expected at first. An eventually consistent best-effort once wouldn't do. |
When a join token is generated, gossip it among other connected
nodes with a short TTL. This allows that join token to be used
when connecting to any node in the cluster, and not necessarily
to the same node where the join token was generated.
Also add an isValid function to the join token to confirm (on
the receiver end) that the join token matches what was stored
in gossip.
First commit is #61505. Part of #60632.
Release justification: Changes to new code paths, gated behind
experimental feature flag.
Release note: None.