Title: Envoy resets TCP connection when HTTP/1.1 upstream times out
Description:
When an HTTP/1.1 request to an upstream times out, Envoy resets the TCP connection to that upstream (it sends a RST).
Expected behaviour:
I'd expect Envoy to keep the connection open. A timeout of an HTTP request doesn't seem to me to indicate that the TCP connection is unhealthy and necessarily should be reset. My understanding is that's what circuit breakers/outlier detection are for!
I got a message from mklein earlier saying:
For HTTP/1, the protocol dictates that the connection must be closed in cases in which the upstream connection state is unknown (e.g., partial response, etc.).
but I haven't been able to find this in the RFC. If the RFC does mandate that the connection must be closed in such cases, I'd love a pointer to it!
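For illustration, here is a minimal sketch of the "unknown connection state" concern in the quoted message. HTTP/1.1 responses carry no request identifier and are matched to requests purely by order on the connection, so a client that gives up on one request and then reuses the same connection can read the late response as the answer to its next request. This is not Envoy's code: `slow_backend`, `read_response`, and `reuse_after_timeout` are hypothetical names, and the backend is a toy running over a local socket pair.

```python
import socket
import threading
import time


def slow_backend(conn, delay):
    """Toy HTTP/1.1 backend (hypothetical): answers each request after `delay` seconds."""
    buf = b""
    try:
        with conn:
            while True:
                # Read until the end of the request headers.
                while b"\r\n\r\n" not in buf:
                    chunk = conn.recv(4096)
                    if not chunk:
                        return
                    buf += chunk
                head, _, buf = buf.partition(b"\r\n\r\n")
                path = head.split(b" ")[1]
                time.sleep(delay)
                body = b"response for " + path
                conn.sendall(
                    b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body) + body
                )
    except OSError:
        # The client went away while we were still working; nothing to do.
        pass


def read_response(sock):
    """Read one Content-Length-delimited response and return its body."""
    buf = b""
    while b"\r\n\r\n" not in buf:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("connection closed before response headers")
        buf += chunk
    head, _, body = buf.partition(b"\r\n\r\n")
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip().decode())
    while len(body) < length:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("connection closed mid-body")
        body += chunk
    return body[:length]


def reuse_after_timeout():
    """Time out on /first, reuse the connection for /second, and see what comes back."""
    client, server = socket.socketpair()
    threading.Thread(target=slow_backend, args=(server, 0.2), daemon=True).start()
    with client:
        client.sendall(b"GET /first HTTP/1.1\r\nHost: example\r\n\r\n")
        client.settimeout(0.05)  # shorter than the backend's delay
        try:
            read_response(client)
            timed_out = False
        except socket.timeout:
            timed_out = True
        # Reuse the same connection instead of closing it.
        client.settimeout(2.0)
        client.sendall(b"GET /second HTTP/1.1\r\nHost: example\r\n\r\n")
        body = read_response(client)
    return timed_out, body
```

In this sketch, `reuse_after_timeout()` times out on the first request and then receives the body for `/first` as the reply to `/second`. Closing the connection is one simple way to rule that out; whether the RFC requires it is the open question here.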
Repro steps:
To start Envoy & the backend:
git clone https://gist.github.com/julia-stripe/41fdc36f96b33c1e0e1708376fb7540d
cd 41fdc36f96b33c1e0e1708376fb7540d
ruby ./server.rb # start a slow backend server
envoy -c ./envoy-config.yaml --service-cluster test --service-node test
To cause the connection resets:
curl localhost:8190 -H "x-envoy-upstream-rq-timeout-ms: 1"
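To trigger the reset several times in a row (for example while a capture is running), the curl call above can be scripted. This is a sketch under assumptions: it targets the same `localhost:8190` listener as the curl command, and `build_request`, `send`, and `repro` are hypothetical helper names, not part of the gist.

```python
import urllib.error
import urllib.request

TIMEOUT_HEADER = "x-envoy-upstream-rq-timeout-ms"


def build_request(url="http://localhost:8190", timeout_ms=1):
    """Build a GET request carrying the per-request upstream timeout header."""
    return urllib.request.Request(url, headers={TIMEOUT_HEADER: str(timeout_ms)})


def send(req):
    """Send the request and return the HTTP status code, including error statuses."""
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what we want to see.
        return err.code


def repro(count=5):
    """Send `count` requests that are expected to hit the upstream timeout."""
    return [send(build_request()) for _ in range(count)]
```

Calling `repro()` with Envoy and the slow backend running returns the list of status codes Envoy answered with, one per request.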
tcpdump
Here's a tcpdump of the connection between Envoy and the upstream. You can see that Envoy resets (with a RST) the TCP connection after every request.
Admin and Stats Output:
In this gist:
https://gist.github.com/julia-stripe/1071e8f2df89d3227d860332d4e94914
Config: