router: set correct timeout for egress->ingress envoys#8051
router: set correct timeout for egress->ingress envoys#8051mattklein123 merged 14 commits intoenvoyproxy:masterfrom
Conversation
snowp
left a comment
There was a problem hiding this comment.
Nice, this is a good start
source/common/router/router.cc
Outdated
There was a problem hiding this comment.
we probably need this to be guarded by a config flag as this could potentially break existing deployments
There was a problem hiding this comment.
@snowp, you mean by runtime guarding or by simple bool config property?
There was a problem hiding this comment.
In the past I think we've been doing xds config for these kind of changes, but I guess this case could be considered a bug fix. @mattklein123 do you have any thoughts around what kind of feature guarding we'll need for this?
There was a problem hiding this comment.
My quick thought is this should probably be a full on config option in the router?
test/common/router/router_test.cc
Outdated
There was a problem hiding this comment.
since the route timeout and the expected-rq-timeout-ms is the same in this test case, this line doesn't really verify that we're honoring the expected-rq-timeout-ms. You probably want to make these values different so that you can verify that we're capping the global timeout by the incoming deadline
source/common/router/router.cc
Outdated
There was a problem hiding this comment.
it might be worthwhile to add an integration test to verify that we actually have this header at this point due to all the possible header sanitization that happens at various points during request processing
|
added config guard, may need to fix tests later. |
|
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
|
still working on this (was on vacation). |
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
a3a1551 to
be4d596
Compare
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
|
/retest |
|
🔨 rebuilding |
snowp
left a comment
There was a problem hiding this comment.
Looking pretty good overall, just a few comments
| ] | ||
| }]; | ||
|
|
||
| // If set to true, envoy first will check if `x-envoy-expected-timeout-ms` header is present |
There was a problem hiding this comment.
nit: "will first check if the `x-envoy"
There was a problem hiding this comment.
Also use *x-envoy-expected-timeout-ms* formatting for headers.
source/common/router/router.cc
Outdated
| Http::HeaderEntry* header_expected_timeout_entry = | ||
| request_headers.EnvoyExpectedRequestTimeoutMs(); | ||
| if (header_expected_timeout_entry) { | ||
| // This will prevent from overriding `x-envoy-expected-rq-timeout-ms` header. |
There was a problem hiding this comment.
do we need this? won't this be set based on the global_timeout_ later on which should match the value we're extracting from the expected-rq-timeout-ms header?
There was a problem hiding this comment.
@snowp this was protection against this code path:
if (insert_envoy_expected_request_timeout_ms && expected_timeout > 0) {
request_headers.insertEnvoyExpectedRequestTimeoutMs().value(expected_timeout);
}
We do indeed derive timeout and put it into a separate data structure (with global timeout), so there is not a big gain from setting insert_envoy_expected_request_timeout_ms to false.
There was a problem hiding this comment.
changed my mind, there is a gain, so that we use same value in timeout.global_timeout (derived from x-envoy-expected-timeout-ms) and observe same value in header x-envoy-expected-timeout-ms by not overriding it.
There was a problem hiding this comment.
Hmm, I think we want the expected header to reflect the timeout used by the router, which is affected by more than just those two headers. If you look further down in that function you'll see that we'll use the per try timeout instead of the global timeout if it's set. It seems like we want the expected timeout header to always reflect the timeout enforced by the router for the outgoing request, and just use the incoming expected timeout header to infer the global timeout. Does that make sense?
There was a problem hiding this comment.
will adjust the test as well
| // If set to true, envoy first will check if `x-envoy-expected-timeout-ms` header is present | ||
| // and use it's value as timeout to upstream cluster. If header is not present or | ||
| // `respect_expected_rq_timeout` is set to false, envoy will derive timeout value from | ||
| // `x-envoy-upstream-rq-timeout-ms` header. |
There was a problem hiding this comment.
i dont think this is completely correct, as there might not be a rq-timeout header in which case the route specified timeout is used. Might be better to avoid listing what the behavior is without this flag to help ensure that this comment doesn't have to be updated whenever the timeout decision logic changes
| ASSERT_TRUE(upstream_request_->waitForEndStream(*dispatcher_)); | ||
|
|
||
| // Trigger global timeout, populated from `x-envoy-expected-rq-timeout-ms` header. | ||
| timeSystem().sleep(std::chrono::milliseconds(501)); |
There was a problem hiding this comment.
since the upstream-rq-timeout header is greater than the expected-rq-timeout, you can't tell based on this test that the timeout is due to upstream-rq-timeout. I suggest making upstream-rq-timeout 300ms instead so that we know we're timing out before the expected-rq-timeout timer would hit
There was a problem hiding this comment.
@snowp made x-envoy-upstream-rq-timeout-ms smaller than x-envoy-expected-rq-timeout-ms
| ] | ||
| }]; | ||
|
|
||
| // If set to true, envoy first will check if `x-envoy-expected-timeout-ms` header is present |
There was a problem hiding this comment.
Also use *x-envoy-expected-timeout-ms* formatting for headers.
|
|
||
| // If set to true, envoy first will check if `x-envoy-expected-timeout-ms` header is present | ||
| // and use it's value as timeout to upstream cluster. If header is not present or | ||
| // `respect_expected_rq_timeout` is set to false, envoy will derive timeout value from |
There was a problem hiding this comment.
Can you :ref: internal link to the relevant field in the API docs?
|
|
||
| // If set to true, envoy first will check if `x-envoy-expected-timeout-ms` header is present | ||
| // and use it's value as timeout to upstream cluster. If header is not present or | ||
| // `respect_expected_rq_timeout` is set to false, envoy will derive timeout value from |
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
mattklein123
left a comment
There was a problem hiding this comment.
Thanks, great work. Just a few small comments. Great feature!
/wait
source/common/router/router.cc
Outdated
| if (header_expected_timeout_entry) { | ||
| if (absl::SimpleAtoi(header_expected_timeout_entry->value().getStringView(), | ||
| &header_timeout)) { | ||
| timeout.global_timeout_ = std::chrono::milliseconds(header_timeout); |
There was a problem hiding this comment.
nit: can you lift this part out into a helper which takes the header, etc.? It's repeated 3 times in this function and it would help readability.
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
|
hi All, Attaching my config below. static_resources: |
|
@Sooryaa-A please create a dedicated issue in envoy repo |
Signed-off-by: Kateryna Nezdolii nezdolik@spotify.com
Description: In case of egress->ingress envoy setup, ingress envoy currently does not respect
x-envoy-expected-rq-timeout-msheader, set by egress envoy and overrides the header with it's own timeout value. This change makes ingress envoy to respectx-envoy-expected-rq-timeout-msheader value, if it's present in request.Risk Level: Low
Testing: Added unit and integration test to make sure header is not sanitised and not ignored.
Docs Changes: Updated API v2 docs
Release Notes: Updated version history
Fixes #7358