xds: restructure CertificateProvider fields#17201
Conversation
Signed-off-by: Mark D. Roth <roth@google.com>
Signed-off-by: Mark D. Roth <roth@google.com>
|
Some more context from my memory. About the comment:
It is true but it also happens to be true for the field In any case this change might still be desirable for the reasons stated in the PR. |
|
After this change is made, and deprecated fields removed, the control plane (Traffic Director for example) will then have to send the So tagging @karthikbox ... |
The fact that Istio's SDS server may happen to populate only the
This PR does not change anything at all for Envoy. It was already the case that Envoy had not implemented any of the CertificateProvider fields, and control planes were already populating the SDS fields. |
| } | ||
|
|
||
| // Certificate provider for fetching CA certs. This will populate the | ||
| // :ref:`CertificateValidationContext.trusted_ca field |
There was a problem hiding this comment.
Why not put this in CertificateValidationContext to be a sibling of trusted_ca?
There was a problem hiding this comment.
For the same reason tls_certificate_certificate_provider is not part of TlsCertificate to be a sibling of certificate_chain field?
There was a problem hiding this comment.
I agree with Eric, it makes a lot more sense to put it in CertificateValidationContext. There it will effectively be a oneof, rather than being something we need to merge in later.
Sanjay, a CertificateProvider is an alternative to TlsCertificate; it provides the same information in a different way. It would not make sense to have it inside of TlsCertificate.
I've gone ahead and made this change.
There was a problem hiding this comment.
...CertificateProvider is an alternative to TlsCertificate...
CertificateValidationContext does not use TlsCertificate but instead uses trusted_ca (which is of type DataSource) so the field trusted_ca in CertificateValidationContext serves the exact same purpose as certificate_chain in TlsCertificate for sending the actual certificate.
Your comment also says "Only one of trusted_ca, ca_certificate_provider, and ca_certificate_provider_instance".
There was a problem hiding this comment.
TlsCertificate contains both the certs and the private keys. A CertificateProvider provdes both, so it's an alternative to the entire TlsCertificate message.
There was a problem hiding this comment.
That's right, I forgot the private key part. So the TlsCertificate and CertificateValidationContext are clearly differentiated that way but a CertificateProvider is used to get a cert + key for the identity certificate (like a TlsCertificate) or just the trust/root certificate (like the trusted_ca of CertificateValidationContext). Makes sense
|
Since these are "not implemented hide", you can avoid deprecation if gRPC isn't already using this in the wild FWIW. |
Signed-off-by: Mark D. Roth <roth@google.com>
My initial commit actually did that, but it turns out that all of the fields are actually in use, so we need to deprecate them. |
It doesn't change anything for Envoy but it changes a few things for the control plane that is driving both Envoy and gRPC. So the current config sent by a control plane looks like this: But with the changes in this PR and the deprecated fields actually removed the config will have to be: So something for them to be aware of. |
|
I'm fine with the api changes overall. What i'm worried about is that how does the mgmt server decides which API to use for a xDS client? (NOTE: This is not a new problem introduced by this PR.) Seems like there's no single API which would work for both grpc and envoy.
yes, these are in use. |
|
@sanjaypujare The exact change will be a little different than that, but yes, the change does affect management servers that support gRPC (regardless of whether they also support Envoy). So what you're pointing out is basically that management servers that use the fields that are being deprecated here need to be changed to use the new fields instead... which is the whole point of this PR. :) @karthikbox As you note, this concern is not actually related to this PR, so I don't think we need to hash it out here, but we have had a bunch of discussions about this. In the long term, I think the goal should be to migrate Envoy to the CertificateProviderInstance model, because everything else can be supported via that API. An individual Envoy can have a CertificateProvider plugin that speaks SDS if that's what they want to use, or it can have a plugin that explicitly specifies a static set of certs, or reads it from files on local disk, or whatever. The management server should not actually need to know or care how the client is getting its certs, as long as it can tell the client which cert to get via which logical mechanism. This eliminates the need for any sort of negotiation between the client and server. |
Yes, a control plane like TD would need to temporarily populate both. But since the gRPC feature is only in beta right now, we would not need to maintain the code for the old fields for very long. |
How are they used? Since the message moved out is keeping same wire format, and as they are "not implemented hide" so if you don't need generated API compatibility we can still avoid deprecation. |
We do need generated API compatibility, unfortunately. |
| message CertificateProviderPlugin { | ||
| // opaque name used to specify certificate instances or types. For example, "ROOTCA" to specify | ||
| // a root-certificate (validation context) or "TLS" to specify a new tls-certificate. | ||
| string name = 1 [(validate.rules).string = {min_len: 1}]; |
There was a problem hiding this comment.
What is the difference of this name and the one in typed_config?
There was a problem hiding this comment.
The one in typed_config is simply a name or an opaque identifier for the extension. So it is purely config associated with the plugin. Whereas this name is meant to serve as an identification of the certificate that is meant to be provided by the certificate provider. Similar to the certificate_name on line 275. May be we should call this certificate_name too?
There was a problem hiding this comment.
After thinking about this for a bit, I've elected to just remove CertificateProviderPlugin completely. We added that before we added CertificateProviderPluginInstance, but it's not actually being used anywhere, since all of our code uses the newer type. We can always add this later if we actually need it for something.
Signed-off-by: Mark D. Roth <roth@google.com>
|
Let's hold off on this for a bit. We're having some discussions about how we want to structure this for SPIFFE. |
|
It looks like the SPIFFE discussion is getting kicked down the road a bit, so let's proceed with this PR as-is. |
| // delivered via SDS. | ||
| config.core.v3.DataSource trusted_ca = 1; | ||
| // | ||
| // Only one of *trusted_ca*, *ca_certificate_provider*, and *ca_certificate_provider_instance* |
There was a problem hiding this comment.
I don't see ca_certificate_provider defined in this struct. Either that needs to be defined here or the comment should be updated to reflect it. IIRC @coolnsu wanted to keep it for future use.
There was a problem hiding this comment.
Good catch. I forgot to update this comment when I decided not to include that field. (We can always add it later if we need it.)
Fixed.
|
|
||
| // Certificate provider instance for fetching TLS certificates. | ||
| // | ||
| // Only one of *trusted_ca*, *ca_certificate_provider*, and *ca_certificate_provider_instance* |
There was a problem hiding this comment.
Same comment as above about the reference to ca_certificate_provider.
| // The plugin allows certificates to be fetched/refreshed over the network asynchronously with | ||
| // respect to the TLS handshake. | ||
| // [#not-implemented-hide:] | ||
| message CertificateProviderPluginInstance { |
There was a problem hiding this comment.
Apart from moving from tls.proto it seems we have also renamed it from CertificateProviderInstance. Are we forcing all certificate providers to be plugins thru this nomenclature? Also I think "extension" is preferred over "plugin" IIRC from a previous PR/comment but I don't share that preference.
There was a problem hiding this comment.
I would have preferred to keep the existing name, but I was forced to rename it due to #17216. There is no intention to establish a new nomenclature convention here.
| // fetched/refreshed over the network asynchronously with respect to the TLS handshake. | ||
| // | ||
| // DEPRECATED: This message was moved outside of CommonTlsContext | ||
| // and now lives in common.proto. |
There was a problem hiding this comment.
I don't see that in common.proto. So we have decided to drop it?
There was a problem hiding this comment.
Yes. I've updated the comment.
| // Only one of *tls_certificates*, *tls_certificate_sds_secret_configs*, | ||
| // and *tls_certificate_provider_instance* may be used. | ||
| // [#not-implemented-hide:] | ||
| CertificateProviderPluginInstance tls_certificate_provider_instance = 14; |
There was a problem hiding this comment.
Now it uses the new type (which is fair since that's the intent) but the field is now called tls_certificate_provider_instance and the original field was tls_certificate_certificate_provider_instance which combines tls_certificate (indicating this is for the identity certificate) and certificate_provider_instance (indicating this is a certificate provider instance). To be really consistent with other fields providing the identity certificate the field would be called tls_certificate_certificate_provider_plugin_instance . But I am okay with the name.
There was a problem hiding this comment.
I think longer names are useful when they avoid ambiguity, but I don't see any ambiguity in the shorter name in this case.
|
Let's say a control is sending the following config ( Note the After merging the PR then the same control plane will need to change the config as follows to conform to the new structure and also be Envoy compatible: The only 2 changes I see are:
CC @karthikbox |
Signed-off-by: Mark D. Roth <roth@google.com>
|
@lizan, I think this PR is ready to be merged. What do you think? |
Signed-off-by: Mark D. Roth <roth@google.com>
…bridge-stream * upstream/main: (32 commits) tls: move ssl connection info into SocketAddressProvider (envoyproxy#17334) conn pool: default enable runtime feature `conn_pool_delete_when_idle` (envoyproxy#17577) api: LEDS api introduction (envoyproxy#17419) kafka: add support for api versions request in mesh-filter (envoyproxy#17475) ext_proc: Implement BUFFERED_PARTIAL processing mode (envoyproxy#17531) tooling: Async/pathlib/mypy cleanups and utils (envoyproxy#17505) xds: restructure CertificateProvider fields (envoyproxy#17201) Refactor OverloadIntegrationTest breaking out a test base, and the fake resource monitors. (envoyproxy#17530) listener: move active connection collection out of active tcp listener (envoyproxy#16947) tools: format checks for backticks (envoyproxy#17566) coverage: set lower limit for common/quic and common (envoyproxy#17573) v2: final source removal (envoyproxy#17565) test: bumping coverage (envoyproxy#17564) quic: enforcing header size and contents (envoyproxy#17520) Support for canonicalizing URI properly for AWS SigV4 signer (envoyproxy#17137) listener: add a stat for transport socket connect timeout (envoyproxy#17458) listener: add listen() error handling (envoyproxy#17427) http: return per route config when direct response is set (envoyproxy#17449) removing most v2 references from source/ (envoyproxy#17415) bug fix: return bootstrap when validating config (envoyproxy#17499) ... Signed-off-by: Garrett Bourg <bourg@squareup.com>
When these fields were originally added, we didn't fully understand how the `CertificateValidationContext` configuration worked. A `CertificateProvider` cannot provide a full `CertificateValidationContext` proto; it provides only the CA certs (i.e., the `trusted_ca` field within the `CertificateValidationContext` proto). So this PR makes the following changes: - Duplicates the `CertificateProvider` and `CertificateProviderInstance` messages from `CommonTlsContext` as `CertificateProviderPlugin` and `CertificateProviderPluginInstance` outside of `CommonTlsContext` (in common.proto instead of tls.proto). (Note that this also paves the way for reusing these messages for SPIFFE validation in the future.) - Adds new `CertificateProviderPlugin` and `CertificateProviderPluginInstance` fields to `CertificateValidationContext`, as an alternative to the existing `trusted_ca` field. (Added annotations to make this a `oneof` in the future.) - In `CommonTlsContext`, I have deprecated the `validation_context_certificate_provider` and `validation_context_certificate_provider_instance` fields both in the `validation_context_type` oneof and in `CombinedValidationContext`. - In `CommonTlsContext`, I have replaced the `tls_certificate_certificate_provider` and `tls_certificate_certificate_provider_instance` fields with new fields that use the new messages outside of `CommonTlsContext`. Risk Level: Low Testing: N/A Docs Changes: Included in PR Release Notes: N/A Platform Specific Features: N/A Signed-off-by: Mark D. Roth <roth@google.com>
Signed-off-by: Mark D. Roth roth@google.com
Commit Message: xds: restructure CertificateProvider fields
Additional Description: When these fields were originally added, we didn't fully understand how the
CertificateValidationContextconfiguration worked. ACertificateProvidercannot provide a fullCertificateValidationContextproto; it provides only the CA certs (i.e., thetrusted_cafield within theCertificateValidationContextproto). So this PR makes the following changes:CertificateProviderandCertificateProviderInstancemessages fromCommonTlsContextasCertificateProviderPluginandCertificateProviderPluginInstanceoutside ofCommonTlsContext(in common.proto instead of tls.proto). (Note that this also paves the way for reusing these messages for SPIFFE validation in the future.)CertificateProviderPluginandCertificateProviderPluginInstancefields toCertificateValidationContext, as an alternative to the existingtrusted_cafield. (Added annotations to make this aoneofin the future.)CommonTlsContext, I have deprecated thevalidation_context_certificate_providerandvalidation_context_certificate_provider_instancefields both in thevalidation_context_typeoneof and inCombinedValidationContext.CommonTlsContext, I have replaced thetls_certificate_certificate_providerandtls_certificate_certificate_provider_instancefields with new fields that use the new messages outside ofCommonTlsContext.Risk Level: Low
Testing: N/A
Docs Changes: Included in PR
Release Notes: N/A
Platform Specific Features: N/A
CC @sanjaypujare @ejona86 @dfawley @yashykt @easwars @jaychenatr