Conversation
Signed-off-by: Kuat Yessenov <kuat@google.com>
|
can you add detail on how these new attributes impact the calculation of the connection security policy labels, etc.? |
|
Thanks @kyessenov for this PR. Can you add in a comment few examples of how the newly added attributes show up in mixer access logs or telemetry? |
| static const char kConnectionId[]; | ||
| // Record TCP connection status: open, continue, close | ||
| static const char kConnectionEvent[]; | ||
| static const char kUpstreamMtls[]; |
There was a problem hiding this comment.
how does work with this attribute kConnectionMtls
There was a problem hiding this comment.
connection.mtls means downstream connection mTLS status.
upstream.mtls means upstream connection mTLS status.
Depending on whether the proxy acts like an outbound or an inbound sidecar, you use one or the other.
There was a problem hiding this comment.
i see, may be this attribute kConnectionMtls needs to explicitly say that it's related to downstream. It's can be confusing specially after adding this new attribute.
There was a problem hiding this comment.
We can't change the meaning of existing attributes. There are UIs that have expectations. We can change the metric, however, in istio config.
|
We should modify the mtls label to be |
nrjpoddar
left a comment
There was a problem hiding this comment.
What are the types of error messages that can be reported via GetUpstreamFailureReason? I'm worried about cardinality here if we add this on a metric specially if it is directly tied to the Envoy message.
One way to deal with it would be to create known types based of the set of error messages we care about/know in proxy repo and then skip the other kinds to decouple label cardinality explosion from Envoy string message directly.
| static const char kConnectionId[]; | ||
| // Record TCP connection status: open, continue, close | ||
| static const char kConnectionEvent[]; | ||
| static const char kUpstreamMtls[]; |
There was a problem hiding this comment.
i see, may be this attribute kConnectionMtls needs to explicitly say that it's related to downstream. It's can be confusing specially after adding this new attribute.
|
@nrjpoddar yes, we'll have to update the attribute vocabulary once this gets in. I think this is better than changing the meaning of the attribute. It was always about downstream, just never properly documented. Upstream failure reason is coming straight from boringSSL. Yes, it should not be used for metrics, unless we can catalog all the error kinds in the library. |
|
Upstream failure reason is coming straight from boringSSL. Yes, it should not be used for metrics, unless we can catalog all the error kinds in the library. -> I can work with you if needed and we can divide and conquer. Getting this information in metrics, even at low resolution where all SSL related errors are reported as boolean is very useful. An alert via metrics + more information in logs/tracing will be an acceptable short time solution for me. |
|
@nrjpoddar Sure. I still think we need to propagate full-fidelity info from envoy, in case we get our catalog wrong, or there is a need for more details. As a first pass, we should use a CEL expression ( |
|
Hey @PiotrSikora , as a WG lead, can you review or assign someone to review? We need this for 1.4 in case Mixer v2 slips. |
| builder.FlattenMapOfStringToStruct(report_data->GetDynamicFilterState()); | ||
|
|
||
| builder.AddBool(utils::AttributeName::kUpstreamMtls, | ||
| report_data->IsUpstreamMutualTLS()); |
There was a problem hiding this comment.
just a nit. instead of having a bool for mutual TLs, why not have an enum of sorts? upstream: plaintext, tls, mutualTLS ? We are coming across a lot of setups where the sidecar has to terminate one way TLS on VMs.
There was a problem hiding this comment.
Enums are generally painful to deal with in proto. mTLS is special since that's the end state, so it's needed to track the progress on getting to that state. We can populate other TLS properties later (e.g. version, cipher suite ID, etc).
| std::string* trust_domain); | ||
|
|
||
| // Returns true if connection is mutual TLS enabled. | ||
| // Returns true if downstream connection is mutual TLS enabled. |
There was a problem hiding this comment.
rename this to IsDownstreamMutualTLS?
|
|
||
| bool IsUpstreamMutualTLS(const StreamInfo::StreamInfo& stream_info) { | ||
| const auto ssl = stream_info.upstreamSslConnection(); | ||
| return ssl != nullptr && ssl->peerCertificatePresented(); |
There was a problem hiding this comment.
I'm a little confused on the usage of ssl->peerCertificatePresented(). If this flag is set to true, does this imply client was using TLS for talking to upstream and got a successful certificate from upstream or it should mean that client is using TLS irrespective of how the negotiation went and what upstream server presented?
There was a problem hiding this comment.
Ouch, I think this might be an issue. We should be checking whether a local certificate was presented, not peer.
There was a problem hiding this comment.
that's what I thought. I don't see an Envoy API for that. @lizan may be you know better?
There was a problem hiding this comment.
We can extend envoy to report whether a local certificate is presented. That might not mean that the server validated it. Not sure how to get that info.
There was a problem hiding this comment.
we didn't have this in Envoy API because this has been simply reflecting what in the TLS context config, though we're adding more dynamic nature to the config for incremental adoption etc, so it makes sense to add, thanks for posting envoyproxy/envoy#8464.
|
@kyessenov in general, from a user point of view I only care about if the connection over which this transaction is happening is plaintext, TLS or mTLS. In all of these states, I care about errors if any occur. I'm struggling to see how users can get benefit from this low level separation of upstream/downstream without getting more confused. |
|
@nrjpoddar We cannot conflate the two for gateways. We only have one set of attributes we report from the gateways, and downstream/upstream connections have very different characteristics. |
|
@kyessenov: The following test failed, say
DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
🧭 This issue or pull request has been automatically marked as stale because it has not had activity from an Istio team member since 2019-10-02. It will be closed on 2019-12-01 unless an Istio team member takes action. Please see this wiki page for more information. Thank you for your contributions. Created by the issue and PR lifecycle manager. |
|
🚧 This issue or pull request has been closed due to not having had activity from an Istio team member since 2019-10-02. If you feel this issue or pull request deserves attention, please reopen the issue. Please see this wiki page for more information. Thank you for your contributions. Created by the issue and PR lifecycle manager. |
Signed-off-by: Kuat Yessenov kuat@google.com
Adds:
upstream.mtlsa bool indicating whether client-side connection is mTLSupstream.failure_reasona string recording connection reset log line (usually due to TLS)/cc @nrjpoddar @douglas-reid