set a default retry budget and host retry predicate #2754

@guydc

Description:
Currently, EG does not expose retry limit settings (#2322) and does not configure a default value, so the Envoy default of max_retries: 3 applies.

While working on #2725, I noticed that the default Envoy max_retries value leads to retry overflow when a significant number of requests are sent to a restarting backend.

Envoy provides a sample that increases this value to 10 to deal with restart scenarios. The sample also uses a host retry predicate so that retries can be dispatched to a different host, increasing the chance of a successful retry.
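To illustrate what a "previous hosts" style predicate does, here is a minimal Python sketch. It is not Envoy's implementation: `pick_retry_host`, `round_robin`, and `max_reselect_attempts` are hypothetical names, and the fallback behavior (route to the last selected host once re-selection attempts run out) is my understanding of how Envoy bounds host re-selection.

```python
import itertools
from typing import Callable, Sequence, Set


def pick_retry_host(
    hosts: Sequence[str],
    attempted: Set[str],
    choose: Callable[[Sequence[str]], str],
    max_reselect_attempts: int = 3,
) -> str:
    """Pick a host for a retry, rejecting hosts that were already attempted.

    `choose` stands in for the load balancer. If every re-selection also
    lands on an attempted host, the last candidate is used anyway.
    """
    candidate = choose(hosts)
    for _ in range(max_reselect_attempts):
        if candidate not in attempted:
            break  # predicate accepts this host
        candidate = choose(hosts)  # predicate rejected it: select again
    return candidate


def round_robin(hosts: Sequence[str]) -> Callable[[Sequence[str]], str]:
    """Deterministic stand-in for a load balancer, for illustration only."""
    cycle = itertools.cycle(hosts)
    return lambda _hosts: next(cycle)


if __name__ == "__main__":
    hosts = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    lb = round_robin(hosts)
    first = lb(hosts)  # original request goes to the first host
    retry = pick_retry_host(hosts, {first}, lb)
    print(first, retry)
```

With a single-host cluster the predicate cannot help: every re-selection returns the same host, and the retry goes back to it once the attempts are exhausted.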

Istio sets the concurrent request limit and the retry limit to max_uint32 by default: https://github.com/istio/istio/blob/7f25d965f21154a32ab5e29aef9ae501fd8ef7a8/pilot/pkg/networking/core/v1alpha3/cluster_traffic_policy.go#L348. The high retry limit is required to support rolling update and restart scenarios. Istio also configures a host retry predicate.

I propose that, for now, EG set the retry_budget to 100% by default, meaning the total upstream request volume would be capped at 200% of max_parallel_requests. Additionally, EG would configure a host retry predicate by default, aligned with the Envoy example (rejecting previously used hosts when selecting a host for a retry).
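The arithmetic behind the 200% figure can be sketched as follows. This is a rough model, not Envoy code: it assumes the budget is a percentage of active plus pending requests with a floor given by a minimum retry concurrency, as I understand Envoy's retry_budget fields (budget_percent, min_retry_concurrency). The function names, the floor value of 3, and the rounding are assumptions made for illustration.

```python
import math


def max_concurrent_retries(
    active_requests: int,
    pending_requests: int,
    budget_percent: float,
    min_retry_concurrency: int = 3,  # assumed floor, illustration only
) -> int:
    """Upper bound on concurrent retries under a percentage-based budget."""
    in_flight = active_requests + pending_requests
    budget = math.floor(in_flight * budget_percent / 100.0)  # rounding is an assumption
    # The floor keeps low-traffic clusters from being denied retries entirely.
    return max(min_retry_concurrency, budget)


def max_total_upstream(active_requests: int, pending_requests: int,
                       budget_percent: float) -> int:
    """Original requests plus the retries the budget would allow."""
    retries = max_concurrent_retries(active_requests, pending_requests, budget_percent)
    return active_requests + pending_requests + retries


if __name__ == "__main__":
    # 1000 in-flight requests with a 100% budget: up to 1000 retries,
    # so at most 2000 upstream requests (200% of the original load).
    print(max_total_upstream(1000, 0, 100.0))
```

Unlike a fixed max_retries value, the allowed number of retries scales with load, which is what makes a budget a better fit for a restarting backend that receives many requests.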

Labels

area/translator: Issues related to Gateway's translation service, e.g. translating Gateway APIs into the IR.
kind/enhancement: New feature or request
