http: send GOAWAY for H2 downstreams for envoy.overload_actions.disable_http_keepalive#12498
Conversation
When the overload manager is configured with the "disable keepalive" action, if the client is speaking H2, send a GOAWAY frame when the trigger state is active. This is guarded behind a runtime configuration flag. Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
|
I don't know the nuances of h2-goaway well enough to say if this is right or not. Reassigning to Matt. |
mattklein123
left a comment
There was a problem hiding this comment.
Thanks for adding this. A few comments. Thank you!
/wait
| * decompressor: headers-only requests were incorrectly not advertising accept-encoding when configured to do so. This is now fixed. | ||
| * http: added :ref:`headers_to_add <envoy_v3_api_field_extensions.filters.network.http_connection_manager.v3.ResponseMapper.headers_to_add>` to :ref:`local reply mapper <config_http_conn_man_local_reply>` to allow its users to add/append/override response HTTP headers to local replies. | ||
| * http: added HCM level configuration of :ref:`error handling on invalid messaging <envoy_v3_api_field_extensions.filters.network.http_connection_manager.v3.HttpConnectionManager.stream_error_on_invalid_http_message>` which substantially changes Envoy's behavior when encountering invalid HTTP/1.1 defaulting to closing the connection instead of allowing reuse. This can temporarily be reverted by setting `envoy.reloadable_features.hcm_stream_error_on_invalid_message` to false, or permanently reverted by setting the :ref:`HCM option <envoy_v3_api_field_extensions.filters.network.http_connection_manager.v3.HttpConnectionManager.stream_error_on_invalid_http_message>` to true to restore prior HTTP/1.1 beavior and setting the *new* HTTP/2 configuration :ref:`override_stream_error_on_invalid_http_message <envoy_v3_api_field_config.core.v3.Http2ProtocolOptions.override_stream_error_on_invalid_http_message>` to false to retain prior HTTP/2 behavior. | ||
| * http: changed Envoy to send GOAWAY to HTTP2 downstreams when the disable_keepalive overload action is active. This behavior may be temporarily reverted by setting `envoy.reloadable_features.connection_manager_drain_http2` to false. |
There was a problem hiding this comment.
can you ref link to the option?
There was a problem hiding this comment.
This is a runtime flag. Is there a place to link it to?
There was a problem hiding this comment.
Sorry I meant disable_keepalive overload action. Can we reflink that somewhere useful?
There was a problem hiding this comment.
Can we name this feature flag something more specific to overload manager? We already do drain HTTP/2 connections.
| "envoy.reloadable_features.connection_manager_drain_http2")) { | ||
| // For H2, send a GOAWAY frame to tell the client that no further requests will be accepted | ||
| // on this connection. | ||
| connection_manager_.codec_->goAway(); |
There was a problem hiding this comment.
I think you probably want to call startDrainSequence() here? This will do the 2 stage goaway and eventually close the connection?
There was a problem hiding this comment.
startDrainSequence() asserts that the drain state is NotDraining, but on line 1884, we check that the connection is already draining, so startDrainSequence()'s ASSERT will be tripped.
There was a problem hiding this comment.
How did the connection enter draining in the first place? We would have already started the H2 drain sequence. I'm actually not sure why this change is needed.
There was a problem hiding this comment.
Answering based on the re-jiggered implementation. This change ensures that a GOAWAY frame is sent to H2 clients when the proxy wants to stop accepting connections from clients, as opposed to waiting longer and giving them time to start new requests.
Instead of sending send an ad-hoc GOAWAY, use the existing startDrainSequence method to force a drain either immediately or on a delay. Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Verify that if the disable keepalive action is active, HTTP/2 and HTTP/3 connections receive GOAWAY frames since the server wants to reject new streams. Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
|
I've reworked this to use the exising |
Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
|
/retest |
|
Retrying Azure Pipelines, to retry CircleCI checks, use |
mattklein123
left a comment
There was a problem hiding this comment.
Thanks for making the code more consistent. Flushing a few more comments.
/wait
| * decompressor: headers-only requests were incorrectly not advertising accept-encoding when configured to do so. This is now fixed. | ||
| * http: added :ref:`headers_to_add <envoy_v3_api_field_extensions.filters.network.http_connection_manager.v3.ResponseMapper.headers_to_add>` to :ref:`local reply mapper <config_http_conn_man_local_reply>` to allow its users to add/append/override response HTTP headers to local replies. | ||
| * http: added HCM level configuration of :ref:`error handling on invalid messaging <envoy_v3_api_field_extensions.filters.network.http_connection_manager.v3.HttpConnectionManager.stream_error_on_invalid_http_message>` which substantially changes Envoy's behavior when encountering invalid HTTP/1.1 defaulting to closing the connection instead of allowing reuse. This can temporarily be reverted by setting `envoy.reloadable_features.hcm_stream_error_on_invalid_message` to false, or permanently reverted by setting the :ref:`HCM option <envoy_v3_api_field_extensions.filters.network.http_connection_manager.v3.HttpConnectionManager.stream_error_on_invalid_http_message>` to true to restore prior HTTP/1.1 beavior and setting the *new* HTTP/2 configuration :ref:`override_stream_error_on_invalid_http_message <envoy_v3_api_field_config.core.v3.Http2ProtocolOptions.override_stream_error_on_invalid_http_message>` to false to retain prior HTTP/2 behavior. | ||
| * http: changed Envoy to send GOAWAY to HTTP2 downstreams when the disable_keepalive overload action is active. This behavior may be temporarily reverted by setting `envoy.reloadable_features.connection_manager_drain_http2` to false. |
There was a problem hiding this comment.
Sorry I meant disable_keepalive overload action. Can we reflink that somewhere useful?
| drain_timer_ = read_callbacks_->connection().dispatcher().createTimer( | ||
| [this]() -> void { onDrainTimeout(); }); | ||
| drain_timer_->enableTimer(config_.drainTimeout()); | ||
| if (wait_for_timeout) { |
There was a problem hiding this comment.
Why would we not do the timer in all cases? This timer is designed to avoid races in which the client races with the goaway and then has the streams reset. Given that we will need to wait until all existing streams drain anyway I don't see any reason to not do the normal drain sequence here?
There was a problem hiding this comment.
The goal of this PR is to make H2 clients more responsive to overload conditions. We actually want to stop the clients from making any more requests, even if that means racing with incoming requests. If we wait for the timer, then there is a lag between when the proxy is overloaded and when it actually takes action.
There was a problem hiding this comment.
The default drain timer is I think 5s, and in general we try to make things graceful, similar to what happens with H1. I'm very skeptical that it's worth the extra code complexity and non-graceful behavior for this special case when the default is only around 5s?
There was a problem hiding this comment.
Agreed, I've switched this to drain in the normal way, sending the shutdown notice first and then the GOAWAY after the delay.
| TestScopedRuntime runtime_; | ||
| }; | ||
|
|
||
| TEST_P(DrainH2HttpConnectionManagerImplTest, DisableHttp2KeepAliveWhenOverloaded) { |
There was a problem hiding this comment.
Can you add a '//' comment above the test with a summary of what it tests? Also, I think we should have a test that covers when there are multiple in flight streams. Even better if there were an integration test somehow.
Avoid races by sending the shutdown notice GOAWAY frame before the real one. Signed-off-by: Alex Konradi <akonradi@google.com>
The tests were expecting a GOAWAY frame, but Envoy's QUIC codec doesn't send those on shutdown notification. Instead, wait for the connection to be closed, and check for the presence of a GOAWAY frame after. Signed-off-by: Alex Konradi <akonradi@google.com>
mattklein123
left a comment
There was a problem hiding this comment.
Thanks looks great. One small comment and we can ship!
/wait
| * decompressor: headers-only requests were incorrectly not advertising accept-encoding when configured to do so. This is now fixed. | ||
| * http: added :ref:`headers_to_add <envoy_v3_api_field_extensions.filters.network.http_connection_manager.v3.ResponseMapper.headers_to_add>` to :ref:`local reply mapper <config_http_conn_man_local_reply>` to allow its users to add/append/override response HTTP headers to local replies. | ||
| * http: added HCM level configuration of :ref:`error handling on invalid messaging <envoy_v3_api_field_extensions.filters.network.http_connection_manager.v3.HttpConnectionManager.stream_error_on_invalid_http_message>` which substantially changes Envoy's behavior when encountering invalid HTTP/1.1 defaulting to closing the connection instead of allowing reuse. This can temporarily be reverted by setting `envoy.reloadable_features.hcm_stream_error_on_invalid_message` to false, or permanently reverted by setting the :ref:`HCM option <envoy_v3_api_field_extensions.filters.network.http_connection_manager.v3.HttpConnectionManager.stream_error_on_invalid_http_message>` to true to restore prior HTTP/1.1 beavior and setting the *new* HTTP/2 configuration :ref:`override_stream_error_on_invalid_http_message <envoy_v3_api_field_config.core.v3.Http2ProtocolOptions.override_stream_error_on_invalid_http_message>` to false to retain prior HTTP/2 behavior. | ||
| * http: changed Envoy to send GOAWAY to HTTP2 downstreams when the disable_keepalive overload action is active. This behavior may be temporarily reverted by setting `envoy.reloadable_features.connection_manager_drain_http2` to false. |
There was a problem hiding this comment.
Can we name this feature flag something more specific to overload manager? We already do drain HTTP/2 connections.
Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Commit Message: Send H2 GOAWAY for disable keepalive overload action
Additional Description:
When the overload manager is configured with the "disable keepalive" action, if the client is speaking H2, send a GOAWAY frame when the trigger state is active. This can be disabled via runtime override.
Risk Level: low
Testing: added unit tests, ran integration tests
Docs Changes: updated overload action description
Release Notes: http: changed Envoy to send GOAWAY to HTTP2 downstreams when the disable_keepalive overload action is active. This behavior may be temporarily reverted by setting
envoy.reloadable_features.connection_manager_drain_http2to false.Runtime guard: set envoy.reloadable_features.connection_manager_drain_http2 to false to disable the new behavior