-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Inconsistent timeout handling between max_stream_timeout and router global timeout #16129
Description
Due to recent changes to the timeout logic (#15585, https://github.com/envoyproxy/envoy/pull/13302/files) there seems to be some inconsistencies in how timeouts are enforced, leading to surprising behavior when updating Envoy:
The max_stream_timeout falls back to using the route timeout under certain conditions: https://github.com/envoyproxy/envoy/blob/main/source/common/http/conn_manager_impl.cc#L1178-L1195
Prior to all these changes, users were able to override the route timeout by setting https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/router_filter#x-envoy-upstream-rq-timeout-ms. This breaks with the above changes: even if the router takes this into account when setting its global timeout, the max_stream_timeout
fires and ends the request early.
It seems to me like we should either 1) not check route->timeout on the HCM level, and only read this as part of the router where it can be overriden by the header. To address @ramaraochavali's bug, I think we would have to make sure that we configure a global timeout on the router level, or 2) include checks for x-envoy-upstream-rq-timeout-ms in the HCM refreshDurationTimeout computation.
The second option seems to be the right thing directionally (continues moving this timeout handling to the HCM), but has a few drawbacks: the semantics of x-envoy-upstream-rq-timeout-ms setting the timeout used for the upstream processing is lost, and we'd require a refreshCache in order to apply the new value of the header (previously this would not be required since it gets read by the router). The first option preserves the previous behavior better, but it becomes less clear how we'd unify these the two code paths.