Description:
I have a fleet of Envoy instances running in Docker, calling into an upstream that is a Google Cloud-hosted service, over h2.
Envoy periodically (very hard to repro) returns a 503, and the upstream_cx_destroy_remote_with_active_rq counter is bumped.
I tried using retriable-status-codes (https://www.envoyproxy.io/docs/envoy/v1.9.0/configuration/http_filters/router_filter#config-http-filters-router-x-envoy-retry-on) and made 503 a retriable status code; however, the retry does not take place.
The retry takes place only if I explicitly make the upstream send the 503 (without upstream_cx_destroy_remote_with_active_rq being bumped).
Is this expected? How can I make Envoy retry in this case?
This only happens once every 24-36 hours under a 200 QPS load, but when it does, hundreds of requests fail at a time, causing significant damage. Turning on debug logs might slow everything down. Should I not be concerned about the I/O and CPU increase, and about the effect of Envoy's overhead on redirecting those requests?
Also, 5xx as a retry policy does not work. I explicitly want 503s to be retried, so in 1.9 I am using the configuration below, but it has no effect for Envoy-generated 503s resulting from upstream_cx_destroy_remote_with_active_rq. How do I proceed to give you meaningful info?
```yaml
prefix: "/"
route:
  timeout: 120s
  cluster: test_cluster
  host_rewrite: "x.y.com"
  retry_policy:
    retry_on: connect-failure,refused-stream,retriable-status-codes
    num_retries: 2
    retriable_status_codes: [503]
```
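To make the observed behavior concrete, here is a minimal model (plain Python, not Envoy source) of the retry decision as described above: `retriable-status-codes` only inspects status codes the upstream actually sent, while a mid-stream remote close is handled as a reset, so the 503 that Envoy synthesizes for upstream_cx_destroy_remote_with_active_rq never reaches the status-code check. The function and reset-reason names here are illustrative assumptions, not Envoy identifiers.

```python
# Hedged model of the retry decision described in this issue.
# Mirrors the retry_policy from the YAML config above.
RETRY_ON = {"connect-failure", "refused-stream", "retriable-status-codes"}
RETRIABLE_STATUS_CODES = {503}

def would_retry(event_kind, status=None, reset_reason=None):
    """event_kind: 'headers' when the upstream actually responded,
    'reset' when the stream/connection was torn down mid-request
    (Envoy then synthesizes a local 503)."""
    if event_kind == "headers":
        # Status-code policies only see codes the upstream really sent.
        return ("retriable-status-codes" in RETRY_ON
                and status in RETRIABLE_STATUS_CODES)
    if event_kind == "reset":
        # Only reset-oriented policies are consulted here; the
        # synthesized 503 bypasses the status-code check entirely.
        if reset_reason == "connect_failure":
            return "connect-failure" in RETRY_ON
        if reset_reason == "refused_stream":
            return "refused-stream" in RETRY_ON
        return False
    return False
```

Under this model, an upstream-sent 503 is retried, but a remote close with an active request is not, which matches what I observe.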
From Harvey Tuch:

> Looking at `RetryStateImpl::wouldRetryFromReset()` and `RetryStateImpl::wouldRetry()` I'm not sure if they're handling the case of a mid-stream reset. Please file a GH issue and we can discuss further there.
Relevant Links:
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/envoy-users/fV87_K4cMks/LvGkWbSZGgAJ
Is this the same as #5023?