server: added initial cert utilities for automatic cert generation#60705
server: added initial cert utilities for automatic cert generation#60705craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
stevendanna
left a comment
There was a problem hiding this comment.
Neat! I left some comments, but many are for my own edification rather than blockers.
pkg/security/auto_tls_init.go
Outdated
| OrganizationalUnit: []string{service}, | ||
| Country: []string{"US"}, | ||
| }, | ||
| DNSNames: []string{hostname}, |
There was a problem hiding this comment.
Are we guaranteed to have a hostname here? Looking at the callsite that isn't clear to me.
Perhaps we could do something like:
ip := net.ParseIP(hostname)
if ip != nil {
serviceCert.IPAddresses = []net.IP{ip}
} else {
serviceCert.DNSNames = []string{hostname}
}
pkg/security/auto_tls_init.go
Outdated
| max.Exp(big.NewInt(2), big.NewInt(130), nil).Sub(max, big.NewInt(1)) | ||
| serialNumber, err := rand.Int(rand.Reader, max) | ||
| if nil != err { | ||
| log.Fatal("Failed to create random serial number") |
There was a problem hiding this comment.
Are we sure we want to use fatal here rather than returning an error? Fatal will call os.Exit(1).
pkg/security/auto_tls_init.go
Outdated
| max.Exp(big.NewInt(2), big.NewInt(130), nil).Sub(max, big.NewInt(1)) | ||
| serialNumber, err := rand.Int(rand.Reader, max) | ||
| if nil != err { | ||
| log.Fatal("Failed to create random serial number") |
There was a problem hiding this comment.
Same question about log.Fatal here.
pkg/server/auto_tls_init.go
Outdated
| } | ||
|
|
||
| // Start by loading or creating the InterNode certificates | ||
| b.InterNode.loadOrCreateServiceCertificates( |
There was a problem hiding this comment.
I believe this and the below function calls are missing error checks.
pkg/security/auto_tls_init.go
Outdated
| func CreateCACertAndKey(lifespan time.Duration, service string) (certPEM []byte, keyPEM []byte, err error) { | ||
| // This function creates a short lived initial CA for node initialization | ||
|
|
||
| notBefore := time.Now() |
There was a problem hiding this comment.
I haven't had a chance to read through the RFC yet, but if we are using this cert quickly after creating in a cluster setting where we might have clock drift, we might consider setting this a few seconds in the past to avoid confusing errors.
pkg/security/auto_tls_init.go
Outdated
| NotBefore: notBefore, | ||
| NotAfter: notAfter, | ||
| ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth, x509.ExtKeyUsageServerAuth}, | ||
| KeyUsage: x509.KeyUsageDigitalSignature, |
There was a problem hiding this comment.
Do we need x509.KeyUsageKeyEncipherment as well? I ask because I see we support plain RSA key exchange here
Lines 185 to 188 in 8187c25
(aside: we might consider an issue for making that list configurable or removing the CBC-mode ciphers from the list)
Either way, perhaps a few comments in these template creations to document why or why not might be useful to us in the future?
knz
left a comment
There was a problem hiding this comment.
Reviewed 3 of 3 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @itsbilal, and @stevendanna)
pkg/security/auto_tls_init.go, line 30 at r1 (raw file):
// with cluster auto certificate generation. func CreateCACertAndKey(lifespan time.Duration, service string) (certPEM []byte, keyPEM []byte, err error) { // This function creates a short lived initial CA for node initialization
nit: here and throughout the PR: comments are to contain only fully formed sentences, with capitals at the beginning and a period at the end.
pkg/security/auto_tls_init.go, line 32 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
I haven't had a chance to read through the RFC yet, but if we are using this cert quickly after creating in a cluster setting where we might have clock drift, we might consider setting this a few seconds in the past to avoid confusing errors.
Agreed with Steven here.
(Also you'll want timeutil.Now() not time.Now() - we have clock overrides)
pkg/security/auto_tls_init.go, line 37 at r1 (raw file):
// create random serial number for CA max := new(big.Int) max.Exp(big.NewInt(2), big.NewInt(130), nil).Sub(max, big.NewInt(1))
please extract the generation of these bignums into a helper function and reuse it here and below
pkg/security/auto_tls_init.go, line 40 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Are we sure we want to use fatal here rather than returning an error? Fatal will call
os.Exit(1).
agreed this deserves an error
also, nit: all the logging and error stuff returns messages that are not sentences, so do not start with a capital and do not end with a period
pkg/security/auto_tls_init.go, line 59 at r1 (raw file):
// create private and public key caPrivKey, err := rsa.GenerateKey(rand.Reader, 4096)
we don't use magic constants in the code, so this 4096 needs to be extracted into a named constant and documented.
pkg/security/auto_tls_init.go, line 92 at r1 (raw file):
// This is a utility function to help with cluster auto certificate generation func CreateServiceCertAndKey(lifespan time.Duration, service, hostname string, caCertPEM []byte, caKeyPEM []byte) (certPEM []byte, keyPEM []byte, err error) {
nit: no empty space at the start of functions
pkg/security/auto_tls_init.go, line 99 at r1 (raw file):
// create random serial number for CA max := new(big.Int) max.Exp(big.NewInt(2), big.NewInt(130), nil).Sub(max, big.NewInt(1))
reuse helper function
pkg/security/auto_tls_init.go, line 137 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Are we guaranteed to have a hostname here? Looking at the callsite that isn't clear to me.
Perhaps we could do something like:
ip := net.ParseIP(hostname) if ip != nil { serviceCert.IPAddresses = []net.IP{ip} } else { serviceCert.DNSNames = []string{hostname} }
I would recommend passing this via configuration so it can be overridden from the command line. However, for the very first version of the PR, let's just leave a TODO in a comment here, and add a follow up item to #60632.
pkg/security/auto_tls_init.go, line 144 at r1 (raw file):
} servicePrivKey, err := rsa.GenerateKey(rand.Reader, 4096)
see comment above about constant
pkg/server/auto_tls_init.go, line 12 at r1 (raw file):
// TODO(aaron-crl): This uses the CertsLocator from the security package // Getting about half way to integration with the certificate manager
nit: missing period
also throughout the rest of this file
pkg/server/auto_tls_init.go, line 80 at r1 (raw file):
if _, err = os.Stat(caKeyPath); !os.IsNotExist(err) { // key exist, this is an error err = errors.New("found key but not certificate for user auth at " + caKeyPath)
errors.Newf("found ... %s", caKeyPath)
Might want to use errors.Wrap / Wrapf too.
pkg/server/auto_tls_init.go, line 134 at r1 (raw file):
if _, err = os.Stat(serviceKeyPath); os.IsNotExist(err) { // cert exists but key doesn't, this is an error err = errors.New("failed to load service certificate key for " + serviceCertPath + "expected key at " + serviceKeyPath)
errors.Newf, here and below or Wrapf
aaron-crl
left a comment
There was a problem hiding this comment.
Thanks for the feedback @stevendanna and @knz !
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @knz, and @stevendanna)
pkg/security/auto_tls_init.go, line 30 at r1 (raw file):
Previously, knz (kena) wrote…
nit: here and throughout the PR: comments are to contain only fully formed sentences, with capitals at the beginning and a period at the end.
Thanks! I think I got all of these.
pkg/security/auto_tls_init.go, line 32 at r1 (raw file):
Previously, knz (kena) wrote…
Agreed with Steven here.
(Also you'll wanttimeutil.Now()nottime.Now()- we have clock overrides)
Thanks! Addressed both.
pkg/security/auto_tls_init.go, line 37 at r1 (raw file):
Previously, knz (kena) wrote…
please extract the generation of these bignums into a helper function and reuse it here and below
Done.
pkg/security/auto_tls_init.go, line 40 at r1 (raw file):
Previously, knz (kena) wrote…
agreed this deserves an error
also, nit: all the logging and error stuff returns messages that are not sentences, so do not start with a capital and do not end with a period
Thanks, done.
pkg/security/auto_tls_init.go, line 55 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Do we want
MaxPathLenZero: trueas well? I believe we do if we don't intend to have any intermediate certificates issued.
Fixed, thanks!
pkg/security/auto_tls_init.go, line 59 at r1 (raw file):
Previously, knz (kena) wrote…
we don't use magic constants in the code, so this 4096 needs to be extracted into a named constant and documented.
Done as an unexported const. Let me know if we have a different preference for this.
pkg/security/auto_tls_init.go, line 92 at r1 (raw file):
Previously, knz (kena) wrote…
nit: no empty space at the start of functions
Thanks, fixed.
pkg/security/auto_tls_init.go, line 99 at r1 (raw file):
Previously, knz (kena) wrote…
reuse helper function
Done.
pkg/security/auto_tls_init.go, line 102 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Same question about log.Fatal here.
Fixed.
pkg/security/auto_tls_init.go, line 137 at r1 (raw file):
Previously, knz (kena) wrote…
I would recommend passing this via configuration so it can be overridden from the command line. However, for the very first version of the PR, let's just leave a TODO in a comment here, and add a follow up item to #60632.
Added this logic but also placed a comment for follow up as I think that will probably give us the best mileage.
pkg/security/auto_tls_init.go, line 141 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Do we need
x509.KeyUsageKeyEnciphermentas well? I ask because I see we support plain RSA key exchange hereLines 185 to 188 in 8187c25
(aside: we might consider an issue for making that list configurable or removing the CBC-mode ciphers from the list)
Either way, perhaps a few comments in these template creations to document why or why not might be useful to us in the future?
Good catch, thank you. Updated the KeyUsage for now and left comment regarding consolidation of this with the existing infrastructure.
pkg/security/auto_tls_init.go, line 144 at r1 (raw file):
Previously, knz (kena) wrote…
see comment above about constant
Done.
pkg/server/auto_tls_init.go, line 12 at r1 (raw file):
Previously, knz (kena) wrote…
nit: missing period
also throughout the rest of this file
Think I got all of these.
pkg/server/auto_tls_init.go, line 80 at r1 (raw file):
Previously, knz (kena) wrote…
errors.Newf("found ... %s", caKeyPath)Might want to use
errors.Wrap/Wrapftoo.
Thanks! Done.
pkg/server/auto_tls_init.go, line 134 at r1 (raw file):
Previously, knz (kena) wrote…
errors.Newf, here and below orWrapf
Thanks, done.
pkg/server/auto_tls_init.go, line 258 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
I believe this and the below function calls are missing error checks.
Yes they were! Thanks, fixed.
aaron-crl
left a comment
There was a problem hiding this comment.
I've added the bundler utility function. I'll see if I can get the symmetric unbundler done this evening but that can wait for the next PR.
@knz PTAL
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @knz, and @stevendanna)
itsbilal
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @knz, and @stevendanna)
pkg/security/auto_tls_init.go, line 117 at r6 (raw file):
serialNumber, err := createCertificateSerialNumber() if err != nil { return
It's best not to do unnamed returns as bugs creep in that way; just do return nil, nil, err here to make it explicit to the reader.
pkg/security/auto_tls_init.go, line 138 at r6 (raw file):
} caKey, err := x509.ParsePKCS8PrivateKey(caKeyBlock.Bytes)
This should be PKCS1 to match the the encode step, right?
pkg/security/auto_tls_init.go, line 193 at r6 (raw file):
}) return
return serviceCertBlock.Bytes(), certPrivKeyPEM.Bytes(), nil maybe?
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @knz, and @stevendanna)
pkg/security/auto_tls_init.go, line 117 at r6 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
It's best not to do unnamed returns as bugs creep in that way; just do
return nil, nil, errhere to make it explicit to the reader.
Thanks, fixed an error behavior made more explicit throughout.
pkg/security/auto_tls_init.go, line 138 at r6 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
This should be PKCS1 to match the the encode step, right?
Thanks, good catch, it should have been PKCS8 everywhere. I've updated accordingly.
pkg/security/auto_tls_init.go, line 193 at r6 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
return serviceCertBlock.Bytes(), certPrivKeyPEM.Bytes(), nilmaybe?
Thank you and apologies. Fixed and updated errors to be explicit throughout this an related function.
knz
left a comment
There was a problem hiding this comment.
This is mostly good to go. Thanks for all the polish.
Don't forget to squash everything before this merges though.
Reviewed 2 of 2 files at r6, 1 of 1 files at r7.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @itsbilal, and @stevendanna)
pkg/security/auto_tls_init.go, line 39 at r7 (raw file):
func createCertificateSerialNumber() (serialNumber *big.Int, err error) { max := new(big.Int) max.Exp(big.NewInt(2), big.NewInt(130), nil).Sub(max, big.NewInt(1))
If my reading is right, this generates a random value between 1 and 2^130-1, i.e. between 1 and 1361129467683753853853498429727072845823.
Is that intended? If so, please fix the comment to indicate the actual range.
If you want a 31-bit random number, you can use rand.Int31().
pkg/server/auto_tls_init.go, line 81 at r7 (raw file):
// Key exists but cert does not, this is an error. err = errors.Wrapf(err, "found key but not certificate for user auth at: %s"+caKeyPath)
remove the + and replace by a comma.
pkg/server/auto_tls_init.go, line 247 at r7 (raw file):
// cluster. It uses or generates an InterNode CA to produce any missing // unmanaged certificates. It does this base on the logic in: // https://github.com/aaron-crl/cockroach/blob/rfc20200722_cert_free_secure_setup/docs/RFCS/20200722_certificate_free_secure_setup.md#initial-configuration-establishing-trust
Please refer to the RFC by PR URL, not this blob path (this will be auto-deleted when the RFC PR is merged).
pkg/server/auto_tls_init.go, line 253 at r7 (raw file):
cl := security.MakeCertsLocator(c.SSLCertsDir) // TODO(aaron-crl): Put this in the config map. initLifespan, err := time.ParseDuration("366d")
No magic constants: extract in a global constant at the top and document. Possibly add a TODO to make it configurable.
|
You can solve the linter failure with |
knz
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @itsbilal, and @stevendanna)
pkg/security/auto_tls_init.go, line 139 at r7 (raw file):
caCert, err := x509.ParseCertificate(caCertBlock.Bytes) if err != nil { err = errors.New("failed to parse valid Certificate from PEM blob")
errors.Wrap here
pkg/security/auto_tls_init.go, line 151 at r7 (raw file):
caKey, err := x509.ParsePKCS8PrivateKey(caKeyBlock.Bytes) if err != nil { err = errors.New("failed to parse valid Certificate from PEM blob")
errors.Wrap here too
63704e5 to
e212b6d
Compare
aaron-crl
left a comment
There was a problem hiding this comment.
TFTR!
Don't forget to squash everything before this merges though.
Done.
You can solve the linter failure with make bazel-generate and committing the resulting changes to the BUILD.bzl files.
Done.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @knz, and @stevendanna)
pkg/security/auto_tls_init.go, line 39 at r7 (raw file):
Previously, knz (kena) wrote…
If my reading is right, this generates a random value between 1 and 2^130-1, i.e. between 1 and 1361129467683753853853498429727072845823.
Is that intended? If so, please fix the comment to indicate the actual range.
If you want a 31-bit random number, you can use
rand.Int31().
Updated comment. The intention was to have a crypto random serial with greater than 128 bits of entropy in case we decided to rely on the serial for something at some point in the future.
pkg/security/auto_tls_init.go, line 139 at r7 (raw file):
Previously, knz (kena) wrote…
errors.Wrap here
Thanks, done.
pkg/security/auto_tls_init.go, line 151 at r7 (raw file):
Previously, knz (kena) wrote…
errors.Wraphere too
Thanks, done.
pkg/server/auto_tls_init.go, line 81 at r7 (raw file):
Previously, knz (kena) wrote…
remove the + and replace by a comma.
Good catch, thanks, done.
pkg/server/auto_tls_init.go, line 247 at r7 (raw file):
Previously, knz (kena) wrote…
Please refer to the RFC by PR URL, not this blob path (this will be auto-deleted when the RFC PR is merged).
Done.
pkg/server/auto_tls_init.go, line 253 at r7 (raw file):
Previously, knz (kena) wrote…
No magic constants: extract in a global constant at the top and document. Possibly add a TODO to make it configurable.
Done.
knz
left a comment
There was a problem hiding this comment.
LGTM modulo getting the serial number right
Reviewed 4 of 4 files at r8.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @itsbilal, @knz, and @stevendanna)
pkg/security/auto_tls_init.go, line 39 at r7 (raw file):
Previously, aaron-crl wrote…
Updated comment. The intention was to have a crypto random serial with greater than 128 bits of entropy in case we decided to rely on the serial for something at some point in the future.
rand.Int generates a value between 0 included and max excluded
I think you want to call rand.Int like you do today, this will generate a value between [0, 2^130-1)
Then add one to the result to get a value between [1, 2^130)
pkg/security/auto_tls_init.go, line 36 at r8 (raw file):
// createCertificateSerialNumber is a helper function that generates a // random value between 1 and 2^130-1.
specify whether the bounds are inclusive or exclusive in the comment.
pkg/security/auto_tls_init.go, line 51 at r8 (raw file):
// with cluster auto certificate generation. func CreateCACertAndKey(lifespan time.Duration, service string) (certPEM []byte, keyPEM []byte, err error) {
nit: remove empty space
e212b6d to
18fcec9
Compare
|
I think I got the serial right this time; also added intent to the comments. Thanks for the patience. |
|
ok you got yourself some linter errors now For the unused functions you can have some empty assignments in a |
aaron-crl
left a comment
There was a problem hiding this comment.
Thanks! I wasn't using my local linters right. I'll wait to squash things until we get a green light from the CI linters.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @knz, and @stevendanna)
pkg/security/auto_tls_init.go, line 39 at r7 (raw file):
Previously, knz (kena) wrote…
rand.Intgenerates a value between 0 included and max excludedI think you want to call
rand.Intlike you do today, this will generate a value between [0, 2^130-1)
Then add one to the result to get a value between [1, 2^130)
Thanks, updated comment and code.
pkg/security/auto_tls_init.go, line 36 at r8 (raw file):
Previously, knz (kena) wrote…
specify whether the bounds are inclusive or exclusive in the comment.
Done.
pkg/security/auto_tls_init.go, line 51 at r8 (raw file):
Previously, knz (kena) wrote…
nit: remove empty space
Done.
knz
left a comment
There was a problem hiding this comment.
Reviewed 7 of 7 files at r10.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @knz, and @stevendanna)
…tion server: added utility function for bundling init certs server: added init function that uses a recieved bundle to provision a node security: added helper functions to support automatic certificate generation security: added ClientCAKeyPath helper to align with ClientCACertPath This is part of cockroachdb#60632 and provides functions for cockroachdb#60636. Release note: None
38848b2 to
595be33
Compare
|
bors r=knz |
|
Build succeeded: |
60766: server: Implement auto-init handshake to negotiate CA certs between nodes r=aaron-crl a=itsbilal This PR implements the init protocol phase of the cert-free setup described in #51991. A lot of the code is pulled out of Aaron's reference implementation of this protocol: https://github.com/aaron-crl/toy-secure-init-handshake/tree/n-way-join One part of #60632. Ready for review, but these parts are still TODO: - [x] Unit tests, or at least manual test steps - [x] Address some of the TODOs, including rebasing on latest version of #60705 and only sending over the CAs in the trust bundle. Release note: None. Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
security: added helper functions to support automatic certificate generation
This doesn't do anything yet but will be updated to trigger off:
#60636 when it lands.
This is part of the work supporting #60632
Release note: None
Epic: CRDB-6663