Skip to content

cds: Add "auto_http2" option.#399

Merged
mattklein123 merged 11 commits intoenvoyproxy:masterfrom
jrajahalme:auto-http2
Jan 10, 2018
Merged

cds: Add "auto_http2" option.#399
mattklein123 merged 11 commits intoenvoyproxy:masterfrom
jrajahalme:auto-http2

Conversation

@jrajahalme
Copy link
Copy Markdown
Contributor

Currently clusters can not open both HTTP1.1 and HTTP2 upstream
connections at the same time. When the new cluster option
"auto_http2" is set to "true", the cluster must open an HTTP2 upstream
connection if the downstream connection is HTTP2, and an HTTP1.1
upstream connection if the downstream connection is HTTP1.1. This option
is to have no effect if there is no corresponding downstream
connection.

This functionality removes the need to operate multiple clusters and
routing rules for them when the backends accept both HTTP1.1 and HTTP2
connections, and when the choice of the HTTP protocol is significant,
as with gRPC.

Signed-off-by: Jarno Rajahalme jarno@covalent.io

Currently clusters can not open both HTTP1.1 and HTTP2 upstream
connections at the same time.  When the new cluster option
"auto_http2" is set to "true", the cluster must open an HTTP2 upstream
connection if the downstream connection is HTTP2, and an HTTP1.1
upstream connection if the downstream connection is HTTP1.1. This option
is to have no effect if there is no corresponding downstream
connection.

This functionality removes the need to operate multiple clusters and
routing rules for them when the backends accept both HTTP1.1 and HTTP2
connections, and when the choice of the HTTP protocol is significant,
as with gRPC.

Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
@jrajahalme
Copy link
Copy Markdown
Contributor Author

I have the corresponding Envoy PR ready to go after this is merged.

Remove over-enthusiastic spaces.

Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
@mattklein123
Copy link
Copy Markdown
Member

I'm a little suspicious this is a common use case. Why wouldn't most users want to upgrade their incoming HTTP/1.1 connections to HTTP/2 if the backend supports it?

Also, like the other PR, if you have the code already written do you mind posting it so that we can look at the code at the same time? Thank you!

api/cds.proto Outdated
// Cluster can operate both HTTP1.1 and HTTP2 connections and the protocol
// is chosen based on the downstream protocol, if any. In case there is no
// downstream connection, this option will be ignored.
bool auto_http2 = 26;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming we do end up taking this upstream, I think we should make this field more specific, such as infer_http2_from_downstream_connection or something like that. I also wonder if this should go inside the Http2ProtocolOptions message.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The semantics looks complicated and the variable name is too terse, as @mattklein123 suggested.

Can we make this behavior automatic instead of a specific option? Every boolean option essentially double the testing matrix. It adds more work to monitoring and support as well. W need to justify the value before adding another option.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I don't think this can be the default behavior.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this more and I would suggest actually making this an enum, with the enum default being the current behavior (effectively, "select based on codec options presence). The reason for this is eventually we will want to support HTTP/1 -> HTTP/2 upgrade and this could be a future enum value. cc @rshriram

@jrajahalme
Copy link
Copy Markdown
Contributor Author

jrajahalme commented Jan 6, 2018 via email

@mattklein123
Copy link
Copy Markdown
Member

@jrajahalme you can just point the proxy PR to your fork of data-plane-api temporarily and it will compile/pass tests.

@htuch
Copy link
Copy Markdown
Member

htuch commented Jan 7, 2018

Thanks for explaining the use case, seems legit. Some comments:

  1. We've had a long standing position that the APIs are not intended to be particularly elegant for human generation, i.e. if generating multiple cluster/routes mechanically achieves the same result, why not just do that? It makes the API more minimal and pushes work to offline config generation tools rather than into the proxy. If there is a semantic effect that can't be achieved this way, then sure, I think this makes a lot of sense to bake into the API.

  2. What are we going to do about protocol_options in Cluster? Today these are a oneof, we should have both protocols options for H1 and H2 with this option. I think we can remove the oneof qualifier in a wire compatible way, that should probably be done in this PR (and related in envoy main repo).

@jrajahalme
Copy link
Copy Markdown
Contributor Author

jrajahalme commented Jan 8, 2018 via email

Remove the oneof qualifier from protocol options. If multiple options
are present, then the cluster can use multiple upstream protocols.

If both HTTP1.1 and HTTP2 options are present, then the cluster will
choose the upstream protocol based on the downstream connection.

Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
@mattklein123
Copy link
Copy Markdown
Member

Yeah I think we need to remove the oneof. We will need to do this in the future anyway to support HTTP/1 -> HTTP/2 upgrade.

@jrajahalme
Copy link
Copy Markdown
Contributor Author

jrajahalme commented Jan 8, 2018 via email

@mattklein123
Copy link
Copy Markdown
Member

@jrajahalme friendly request to please use the GH UI to respond to comments. It's a lot easier to keep track of comments threads vs. email.

Do you think we can deduce the auto http2 feature I proposed from the presence of both H1 and H2 options, or should I add the enum for it?

I think we could, but IMO it becomes very confusing. I would be in favor of just adding the enum now. It's clearer and easier to document.

@jrajahalme
Copy link
Copy Markdown
Contributor Author

jrajahalme commented Jan 8, 2018 via email

@jrajahalme
Copy link
Copy Markdown
Contributor Author

jrajahalme commented Jan 8, 2018 via email

@jrajahalme
Copy link
Copy Markdown
Contributor Author

jrajahalme commented Jan 8, 2018 via email

Add a new enum ClusterProtocolSelection and a corresponding
'protocol_selection' option.  Default value enforces backwards
compatible behavior of exclusively using HTTP/1 or HTTP/2 depending on
the protocol options present.

Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
…on'.

Use HTTP1.1 or HTTP2, depending on which one is used on the downstream connection.

Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we have any resolution on whether this PR is just syntactic sugar (to reduce the amount of duplicated cluster definitions) or if there is a fundamental needs in terms of API expressiveness?

api/cds.proto Outdated
// If http2_protocol_options are present, HTTP2 will be used, otherwise HTTP1.1 will be used.
EXCLUSIVE_AS_CONFIGURED = 0;
// Use HTTP1.1 or HTTP2, depending on which one is used on the downstream connection.
USE_DOWNSTREAM_PROTO = 1;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer USE_DOWNSTREAM_PROTOCOL to avoid confusion with protobufs (I read it as "use the protobuf supplied from downstream to specify which protocol to use" at first glance)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done :-)

api/cds.proto Outdated
Http2ProtocolOptions http2_protocol_options = 14;

// [#not-implemented-hide:]
GrpcProtocolOptions grpc_protocol_options = 15;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something we probably want medium-long term, so should keep. There are gRPC specific options, like how much to buffer for gRPC retries, that could belong here (I think; can't remember where we left that particular one @alyssawilk @mattklein123).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like my other comment, I would personally kill this until it's needed later. As it's no longer mutually exclusive, GrpcProtocolOptions having an embedded Http2ProtocolOptions I don't think makes sense?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to lean towards treating gRPC as H2 until we have a concrete reason not to (which we probably will, eventually) and add it back then

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 (but still suspicious we'll end up wanting it at some point).

api/cds.proto Outdated

enum ClusterProtocolSelection {
// Cluster can only operate on one of the possible upstream protocols (HTTP1.1, HTTP2).
// If http2_protocol_options are present, HTTP2 will be used, otherwise HTTP1.1 will be used.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any load balancing implications when using mixed upstream protocols?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current Envoy implementation assumes pooled connections are all either HTTP/1 or HTTP/2, as open connections are reused without any version checks. My implementation expands the containers currently used for connection pools to factor in the protocol version into the pool selection.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the default mode the cluster there is no change, so no impact.
In the 'USE_DOWNSTREAM_PROTOCOL' each request is load balanced on the set of hosts as before, and the connection pooling extension I mentioned above makes sure that v1&2 connections are not mixed up. With this I don't see any load balancing impact.

api/cds.proto Outdated
GrpcProtocolOptions grpc_protocol_options = 15;
}
// [#not-implemented-hide:]
TcpProtocolOptions tcp_protocol_options = 12;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK re: TCP proxying being mutually exclusive with HTTP for a cluster. Opening this comment thread for folks to respond.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, I think this was put in place for low level TCP options which are common to TCP proxy, HTTP proxy, etc. Personally, I would kill this until it's needed and add it back later.

@mattklein123
Copy link
Copy Markdown
Member

How do you see the 1->2 upgrade working? Is it different from the current behavior of forwarding downstream HTTP/1 connections to upstream HTTP/2 connections whenever the chosen cluster is configured for HTTP/2?

Like actual HTTP/1.1 -> HTTP/2 upgrade as an option: https://tools.ietf.org/html/rfc7540#section-3.2

Better avoid 'proto' when not meaning protobufs.

Suggested-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
Now that the protocol options are no longer mutually exclusive, HTTP2
options can also be used for gRPC. We can add these back in when
needed. The old code points remain unused.

Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
@jrajahalme
Copy link
Copy Markdown
Contributor Author

jrajahalme commented Jan 8, 2018 via email

@mattklein123
Copy link
Copy Markdown
Member

Thanks for the pointer. So this would apply when we don’t yet know if the upstream server supports HTTP2. My implementation separates the connection pools for HTTP/1 and HTTP/2, so any connection that is upgraded from 1 to 2 should be replaced to the HTTP/2 pool after such an upgrade.

I suspect the implementation of upgrade would probably use a new UpgradingConnectionPool type, but we can worry about that later. I think from the API standpoint everything we are doing here will allow us to seamlessly implement it at some later point.

@mattklein123
Copy link
Copy Markdown
Member

@jrajahalme was talking with @htuch and he suggested that if we had a route matching rule that could match on HTTP protocol this might not be needed. (You could have duplicate clusters, one for HTTP/1 and one for HTTP/2). Would that work?

@jrajahalme
Copy link
Copy Markdown
Contributor Author

jrajahalme commented Jan 9, 2018 via email

@jrajahalme
Copy link
Copy Markdown
Contributor Author

Posted the corresponding Envoy PR: envoyproxy/envoy#2328. Some new tests may still be needed, but the approach should be good for review.

@mattklein123
Copy link
Copy Markdown
Member

That would double the number of routes which could be bad for performance. Let’s see what you think of the PR once I get it out.

How many routes do you have in your setup? We are talking about a single static check when looping over a list (I think we are likely talking sub-microsecond).

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

envoyproxy/envoy#2328 seems a small enough change, so I don't have a huge objection to proceeding with the current PR. Would just like to make sure we fully understand the alternatives considered.

@mattklein123
Copy link
Copy Markdown
Member

envoyproxy/envoy#2328 seems a small enough change, so I don't have a huge objection to proceeding with the current PR. Would just like to make sure we fully understand the alternatives considered.

It is a small change, but I'm still not convinced it's a common enough use-case that it's worth the complexity, especially when I think routing on protocol is a trivial workaround. Do we think the extra complexity here is justified?

@jrajahalme
Copy link
Copy Markdown
Contributor Author

jrajahalme commented Jan 9, 2018 via email

@mattklein123
Copy link
Copy Markdown
Member

IMO most of envoyproxy/envoy#2328 <envoyproxy/envoy#2328> should be reusable towards HTTP 1->2 upgrade you mentioned earlier, don’t you think?

Some of it maybe, but if we implement upgrade it will be an entirely new connection pool implementation, so really only the config parts.

Ultimately, I don't feel super strongly about this, but I'm not convinced this feature is really necessary or useful in the general case. I would like to hear some opinions from the other maintainers. cc @envoyproxy/maintainers

@jrajahalme
Copy link
Copy Markdown
Contributor Author

jrajahalme commented Jan 9, 2018 via email

@mattklein123
Copy link
Copy Markdown
Member

In our application we don’t have direct control of the routing rules, as we create them from application level policy description that we do not control. In some instances the number to routes will be small, in others it can be big. I’d rather not double the configuration state (routes, clusters) and the execution time in Envoy routing.

OK that's fair. I had assumed that you were basically using catch-all routes so the number of routes would increase from 1 to 2. :)

@jrajahalme
Copy link
Copy Markdown
Contributor Author

jrajahalme commented Jan 9, 2018 via email

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrajahalme I discussed with the other maintainers and I think we are all good moving forward on this. Thank you for taking the time to discuss it with us. I have a few small comments but otherwise LGTM.


oneof protocol_options {
// [#not-implemented-hide:]
TcpProtocolOptions tcp_protocol_options = 12;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wora @htuch for locked protos, when we delete a field, should we do something like:
// deleted = 12;
or something? Just curious what the norm is here. I think something like that would help me.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added these comments for both 12 and 15.

api/cds.proto Outdated

enum ClusterProtocolSelection {
// Cluster can only operate on one of the possible upstream protocols (HTTP1.1, HTTP2).
// If http2_protocol_options are present, HTTP2 will be used, otherwise HTTP1.1 will be used.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please :ref: link http2_protocol_options

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Requested-by: Matt Klein <mklein@lyft.com>
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
Requested-by: Matt Klein <mklein@lyft.com>
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
mattklein123
mattklein123 previously approved these changes Jan 9, 2018
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Will defer to @htuch for merge and @wora for any further comments.

Copy link
Copy Markdown
Contributor

@wora wora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, it looks good.

api/cds.proto Outdated
// Cluster can only operate on one of the possible upstream protocols (HTTP1.1, HTTP2).
// If :ref:`http2_protocol_options <envoy_api_field_Cluster.http2_protocol_options>` are
// present, HTTP2 will be used, otherwise HTTP1.1 will be used.
EXCLUSIVE_AS_CONFIGURED = 0;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

USE_CONFIGURED_PROTOCOL sounds better.

Please avoid subtle concept like exclusive unless it is really needed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks for the review!

api/cds.proto Outdated
// connections to happen over plain text.
Http2ProtocolOptions http2_protocol_options = 14;

// deleted = 15;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use proto language syntax as follows. If you are not familiar, please see proto spec.

reserved 13, 15;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, used the reserved keyword for 12 and 15 separately.

Requested-by: Hong Zhang <wora2000@gmail.com>
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
@mattklein123 mattklein123 merged commit a4b80d6 into envoyproxy:master Jan 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants