Skip to content

Extensions: Network filter to forward SNI from the downstream TLS connection to the upstream TLS connection#4334

Closed
vadimeisenbergibm wants to merge 116 commits intoenvoyproxy:masterfrom
vadimeisenbergibm:forward_original_sni
Closed

Extensions: Network filter to forward SNI from the downstream TLS connection to the upstream TLS connection#4334
vadimeisenbergibm wants to merge 116 commits intoenvoyproxy:masterfrom
vadimeisenbergibm:forward_original_sni

Conversation

@vadimeisenbergibm
Copy link
Copy Markdown
Contributor

@vadimeisenbergibm vadimeisenbergibm commented Sep 4, 2018

Signed-off-by: Vadim Eisenberg vadime@il.ibm.com

Description: Add forward_original_sni field to the upstream TLS Context. It will allow forwarding SNI from the downstream connection (extracted by the TLS inspector filter) to the upstream. One of the use cases described in #4076. This flag should be mutually exclusive with the sni filed of UpstreamTlsContext.

Risk Level:Low

Testing: unit tests

Docs Changes: yes

Release Notes: yes

Related to #4076

Depends on #4973

Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
@vadimeisenbergibm
Copy link
Copy Markdown
Contributor Author

^ @lizan @rshriram I started to work on it.

//
// TLS renegotiation is considered insecure and shouldn't be used unless absolutely necessary.
bool allow_renegotiation = 3;
bool allow_renegotiation = 4;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is changing the option number for an existing command a breaking change? Can the new option be 4 instead?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, please don't change field number, use 4 in new field.

// If true, the SNI from the downstream connection is forwarded to the upstream connection.
//
// TODO: handle oneof with the sni
bool forward_original_sni = 3;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need an option for this? why can't you always forward the downstream SNI if there is no SNI value specified in the cluster?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what @lizan wanted #4076 (comment)

Add explicit field forward_original_sni would be better than defaulting to forward.

I proposed:

Change tcp_proxy to set the SNI of the upstream connection in case the cluster's tls context does not have sni field.

@rshriram @lizan Could you please agree on the API ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's be explicit between forward downstream SNI vs no SNI. You should be able to use oneof without breaking wire format.

// If true, the SNI from the downstream connection is forwarded to the upstream connection.
//
// TODO: handle oneof with the sni
bool forward_original_sni = 3;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's be explicit between forward downstream SNI vs no SNI. You should be able to use oneof without breaking wire format.

//
// TLS renegotiation is considered insecure and shouldn't be used unless absolutely necessary.
bool allow_renegotiation = 3;
bool allow_renegotiation = 4;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, please don't change field number, use 4 in new field.

Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
@vadimeisenbergibm
Copy link
Copy Markdown
Contributor Author

@lizan I have fixed the API (this is just the beginning of the PR), do you know how should I specify that only one of the options, sni or forward_original_sni, may be used?

Copy link
Copy Markdown
Member

@rshriram rshriram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here is my rationale:
no incoming SNI and cluster has SNI - send cluster SNI
Has incoming SNI and no cluster SNI - send incoming SNI
Has incoming SNI and has cluster SNI (set through destinationRule for eg) : we have choices - ignore incoming SNI and use cluster SNI or ignore cluster SNI and use incoming SNI

Both of these scenarios are useful. When we setup Envoy as a plain pass through proxy like egress gateway, we want the inbound SNI to be forwarded if present, else use the SNI set in the cluster
So that the envoy on the other cluster that’s receiving traffic will have some SNI to route on.

From that point of view, this field should not be a one of. It should be independent and during upstream connection, we evaluate the cases above.

@vadimeisenbergibm
Copy link
Copy Markdown
Contributor Author

vadimeisenbergibm commented Sep 5, 2018

@rshriram

no incoming SNI and cluster has SNI - send cluster SNI
Has incoming SNI and no cluster SNI - send incoming SNI
Has incoming SNI and has cluster SNI (set through destinationRule for eg) : we have choices - ignore incoming SNI and use cluster SNI or ignore cluster SNI and use incoming SNI

"No incoming SNI" - do you mean the SNI was not received from the downstream connection? So the decision which SNI to send must be determined dynamically, correct?

So it should depend on three variables, sni, forward_original_sni (statically set in the configuration of the cluster) and let's call it sni_received _dynamically_from_the_downstream.

So the algorithm is:

if !sni_received_dynamically_from_the_downstream && sni {
    send sni
}

if sni_received_dynamically_from_the_downstream && !sni {
   send sni_received_dynamically_from_the_downstream
}

if sni_received_dynamically_from_the_downstream && sni {
   if forward_original_sni {
     send sni_received_dynamically_from_the_downstream
   } else {
     send sni
   } 
}

// if !sni_received _dynamically_from_the_downstream && !sni {
//      do not send any sni
// }

Did I understand you correct?

@ggreenway
Copy link
Copy Markdown
Member

What configurations do we expect this to work in? I think this only makes sense with tcp_proxy, because it has 1:1 downstream-connection:upstream-connection relationship. For http, if you had multiple requests from different downstream connections, with different SNI values, going over the same upstream connection, it isn't clear what the behavior should be.

If it only works with tcp_proxy, I'm not sure this is the right place for the configuration to live.

@vadimeisenbergibm
Copy link
Copy Markdown
Contributor Author

@ggreenway Good point in
#4334 (comment).

So @lizan, what do you think? Should I add this flag as a parameter to the tcp_proxy?

@rshriram
Copy link
Copy Markdown
Member

rshriram commented Sep 5, 2018

@ggreenway why isn't this valid for http as well? The SNI is outside of the HTTP semantics. It doesn't matter if we have a Mx1 or 1x1 right? the contract here is that if there is an SNI for the inbound connection that caused this cluster to be used, then use that SNI (if allowed).

I am assuming that you can get a handle over the connection on which a request arrived, for HTTP and TCP alike.

@ggreenway
Copy link
Copy Markdown
Member

I think it doesn't make sense because for HTTP we're doing request proxying not connection proxying, and the SNI value is a connection property. So you're taking a value from the first downstream request, and applying it to N upstream requests.

@lizan
Copy link
Copy Markdown
Member

lizan commented Sep 5, 2018

In HTTP (or any other supported Nx1 proxy L7 protocol) the connection pool can be based on upstream SNI. I think it makes sense to send all request from downstream connections with same SNI to same upstream connection. This doesn't have to be implemented initially, but please leave a comment.

Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
…ansportSocket()

Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
@rshriram
Copy link
Copy Markdown
Member

rshriram commented Sep 7, 2018

I think it doesn't make sense because for HTTP we're doing request proxying not connection proxying, and the SNI value is a connection property. So you're taking a value from the first downstream request, and applying it to N upstream requests.

Agree. But the SNI here is not for routing. Its for policy enforcement, where the policy system might want to look at the SNI header and the HTTP authority to make a decision.

@ggreenway
Copy link
Copy Markdown
Member

But the SNI here is not for routing. Its for policy enforcement

That may be true in this use-case, but you don't know how it will be used elsewhere.

I think if we want to do this for HTTP, @lizan's suggestion is the right way to do it.

Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
@stale
Copy link
Copy Markdown

stale bot commented Sep 16, 2018

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Sep 16, 2018
@stale
Copy link
Copy Markdown

stale bot commented Sep 23, 2018

This pull request has been automatically closed because it has not had activity in the last 14 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot closed this Sep 23, 2018
ronensc and others added 3 commits October 8, 2018 18:45
Signed-off-by: ronenschafferibm <ronen.schaffer@ibm.com>
This reverts commit 3cc57db.

Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
…createTransportSocket()"

This reverts commit 2f39451.

Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
@vadimeisenbergibm
Copy link
Copy Markdown
Contributor Author

@rshriram @lizan Could you please reopen this PR? I am back from a vacation and will continue working on it. Please disregard the implementation meanwhile, I am going to change it.

This reverts commit f90f771.

Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
…test

Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
…ning if statement

since it should be dependant on context or downstream connection

Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
@vadimeisenbergibm vadimeisenbergibm changed the title [WIP] Extensions: Network filter to forward SNI from the downstream TLS connection to the upstream TLS connection Extensions: Network filter to forward SNI from the downstream TLS connection to the upstream TLS connection Nov 6, 2018
@vadimeisenbergibm
Copy link
Copy Markdown
Contributor Author

@lizan Could you please review this PR? I wrote several unit tests, please let me know if more tests are required.

Copy link
Copy Markdown
Member

@lizan lizan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great and in this is in a good direction.

Can you pull out the non-extension part to a separate PR? That will help the review process.

We might want to wrap override_server_name in some class (perhaps named TransportSocketOptions, and that can be embedded directly in FilterState, but I'd like to iterate that on the separate PR.

],
)

envoy_cc_library(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be under tcp_proxy, see what is done for PerConnectionCluster, also this should be named override_server_name, "forward" what the filter does and this key only intended for override.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention was as follows:
The filter is not related directly to the TCP proxy. The filter specifies the intention that the SNI must be forwarded. Other filters, such as TCP proxy, can extract that information and forward the SNI.

So there are three different names here:

  1. Forward Requested Server Name in the Stream Info, means that there is some requested server name, not necessarrily SNI, maybe requested server name from some other protocol, that must be forwarded to the upstream, not necessarily by tcp_proxy.

  2. Forward Original SNI - the name of the filter, specifies the intention to forward the SNI, a specific kind of a server name

  3. Override Server Name in tcp_proxy, cluster manager etc. it means that the server name of the upstream cluster must be overridden by that value. It does not specify that it is SNI and does not specify where this server name comes from.

This is why I put Forward Requested Server Name in the stream info and not in TCP proxy, to make this artifact independent from the TCP proxy.

CODEOWNERS Outdated
# sni_cluster extension
/*/extensions/filters/network/sni_cluster @rshriram @lizan
# extension
/*/extensions/filters/network/forward_original_sni @lizan
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name yourself?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lizan Sure, let me extract the non-extension part into a separate PR, I am in favor of separate PRs.

Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
Signed-off-by: Vadim Eisenberg <vadime@il.ibm.com>
Forward Original SNI
=========================

The `forward_original_sni` is a network filter that instructs other filters,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given this explanation, I think this should be called use_downstream_sni .
The explanation also needs to be tweaked to state that when an inbound connection is using TLS, and the upstream cluster is using mTLS, this option will allow user to automatically set the SNI of upstream cluster to be the same as the downstream SNI.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rshriram Agreed, used_downstream_sni is more clear.

Regarding:

The explanation also needs to be tweaked to state that when an inbound connection is using TLS, and the upstream cluster is using mTLS, this option will allow user to automatically set the SNI of upstream cluster to be the same as the downstream SNI.

So why does the explanation not cover the case you are describing? If we can consider mTLS being a kind of TLS: a network filter that instructs other filters, such as tcp_proxy, to forward the SNI value from the downstream connection to the upstream connection

@rshriram
Copy link
Copy Markdown
Member

rshriram commented Nov 6, 2018

@lizan Since this has become an extension and this is very specific to Istio, why can't we move this to istio/proxy, just like the way we moved the tcp_cluster_rewrite to istio/proxy?

@mattklein123
Copy link
Copy Markdown
Member

@lizan Since this has become an extension and this is very specific to Istio, why can't we move this to istio/proxy, just like the way we moved the tcp_cluster_rewrite to istio/proxy?

+1, if this is Istio specific please move it.

@lizan
Copy link
Copy Markdown
Member

lizan commented Nov 7, 2018

Yes I'm thinking of that too. the non-extension part (#4973) have to be done before that anyway.

@stale
Copy link
Copy Markdown

stale bot commented Nov 14, 2018

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Nov 14, 2018
@stale
Copy link
Copy Markdown

stale bot commented Nov 21, 2018

This pull request has been automatically closed because it has not had activity in the last 14 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot closed this Nov 21, 2018
@vadimeisenbergibm
Copy link
Copy Markdown
Contributor Author

@lizan Could you please reopen this PR? I would like to complete it for reference, and then I will close it and move it to istio/proxy.

@lizan
Copy link
Copy Markdown
Member

lizan commented Nov 23, 2018

@vadimeisenbergibm feel free to open another PR, this one is already too heavy to open...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

stale stalebot believes this issue/PR has not been touched recently

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants