set a default retry budget and host retry predicate #2754
Description:
Currently, EG does not expose retry limit settings (#2322) and does not configure a default value, so the Envoy default of max_retries: 3 applies.
While working on #2725, I noticed that the default Envoy max_retries value leads to retry overflow when a significant number of requests are sent to a restarting backend.
Envoy provides a sample that increases this value to 10 to deal with restart scenarios. Additionally, a host retry predicate is used, to allow dispatching retries to a different host, increasing the chance of a successful retry.
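To illustrate the idea behind a host retry predicate, here is a minimal Go sketch. It is not Envoy's implementation: `pickRetryHost`, the round-robin candidate order, and the `maxAttempts` bound are hypothetical names and simplifications chosen for illustration. The sketch rejects hosts that were already attempted and, once the selection attempts are used up, accepts the last candidate anyway.

```go
package main

import "fmt"

// pickRetryHost is a hypothetical helper (not an Envoy or EG API).
// It walks the host list starting at index start and rejects any host
// that was already attempted for this request. After maxAttempts
// selection attempts it gives up and returns the last candidate, even
// if that host was used before.
func pickRetryHost(hosts []string, attempted map[string]bool, start, maxAttempts int) string {
	if len(hosts) == 0 || maxAttempts <= 0 {
		return ""
	}
	candidate := ""
	for i := 0; i < maxAttempts; i++ {
		candidate = hosts[(start+i)%len(hosts)]
		if !attempted[candidate] {
			return candidate // fresh host: accept it for the retry
		}
	}
	return candidate // selection attempts exhausted: accept a reused host
}

func main() {
	hosts := []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}
	attempted := map[string]bool{"10.0.0.1": true} // the original request failed here
	fmt.Println(pickRetryHost(hosts, attempted, 0, 3))
}
```

With a restarting backend, sending the retry to a host that has not failed yet is what raises the chance that the retry succeeds.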
Istio sets concurrent request limit and retry limit to max_uint32 by default: https://github.com/istio/istio/blob/7f25d965f21154a32ab5e29aef9ae501fd8ef7a8/pilot/pkg/networking/core/v1alpha3/cluster_traffic_policy.go#L348. The high retry limit is required to support rolling update and restart scenarios. Istio also configures a host retry predicate.
I propose that, for now, EG set the retry_budget to 100% by default, meaning the total upstream request volume would be capped at 200% of max_parallel_requests. Additionally, EG would configure a host retry predicate by default, aligned with the Envoy example (rejecting previously used hosts when selecting a host for a retry).
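The arithmetic behind the 200% figure can be sketched in Go. This is a simplified model, not Envoy's code: the function name `retryLimit` is hypothetical, and truncating the percentage to an integer is an assumption. It follows the documented meaning of `budget_percent` (a percentage of active plus pending requests) and `min_retry_concurrency` (a floor on allowed concurrent retries, 3 by default in Envoy).

```go
package main

import "fmt"

// retryLimit is a hypothetical helper modelling a retry budget.
// It returns how many retries may be in flight at once, given the
// number of active and pending requests. The result never drops
// below minRetryConcurrency. Rounding down is an assumption made
// for this sketch.
func retryLimit(active, pending uint64, budgetPercent float64, minRetryConcurrency uint64) uint64 {
	budget := uint64(budgetPercent * float64(active+pending) / 100.0)
	if budget < minRetryConcurrency {
		return minRetryConcurrency
	}
	return budget
}

func main() {
	// With a 100% budget, 100 in-flight requests allow up to 100
	// concurrent retries, so upstream load peaks at roughly 2x.
	fmt.Println(retryLimit(100, 0, 100, 3)) // 100
	// At low traffic the floor applies instead of the percentage.
	fmt.Println(retryLimit(1, 0, 100, 3)) // 3
}
```

Unlike a fixed max_retries value, the budget scales with traffic, which is why it copes better with a burst of retries during a backend restart.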
Relevant Links:
- https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/circuit_breaker.proto
- https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/circuit_breaker.proto#config-cluster-v3-circuitbreakers-thresholds-retrybudget
- https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/route/v3/route_components.proto.html#envoy-v3-api-msg-config-route-v3-retrypolicy-retryhostpredicate
- https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/http/http_connection_management#retry-plugin-configuration