Description
Today, several projects in the Envoy ecosystem disable initial_fetch_timeout (Envoy default: 15s) by setting it to 0:
- Istio (LDS, CDS, RDS & EDS):
- AWS: https://docs.aws.amazon.com/app-mesh/latest/userguide/envoy-config.html
- Consul (LDS & CDS):
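For illustration, here is a minimal sketch of what disabling the timeout could look like in an ADS-based bootstrap. It builds the JSON form of a ConfigSource with initial_fetch_timeout set to "0s". The helper names ads_config_source and dynamic_resources are hypothetical. The exact configuration each project above emits may differ.

```python
import json


def ads_config_source(initial_fetch_timeout_s: int = 15) -> dict:
    """Build a ConfigSource-shaped dict (hypothetical helper).

    A value of 0 disables the initial fetch timeout, so Envoy waits
    indefinitely for the first response on that subscription.
    """
    return {
        "ads": {},
        "resource_api_version": "V3",
        # protobuf Duration in its JSON form, e.g. "0s" or "15s"
        "initial_fetch_timeout": f"{initial_fetch_timeout_s}s",
    }


def dynamic_resources(timeout_s: int) -> dict:
    """Apply the same timeout to the LDS and CDS subscriptions."""
    return {
        "lds_config": ads_config_source(timeout_s),
        "cds_config": ads_config_source(timeout_s),
    }


if __name__ == "__main__":
    print(json.dumps({"dynamic_resources": dynamic_resources(0)}, indent=2))
```

RDS and EDS config sources are set on the listeners and clusters delivered over xDS, so a control plane that wants to disable the timeout there has to set the same field in those resources.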
The Envoy docs on xDS imply that a timed-out resource request is treated as an indication that the resource does not exist, so it is the control plane's responsibility to eventually send these resources to the proxy:
As a result, clients are expected to use a timeout (recommended duration is 15 seconds) after sending a request for a new resource, after which they will consider the requested resource to not exist if they have not received the resource. In Envoy, this is done for RouteConfiguration and ClusterLoadAssignment resources during resource warming.
Note that even if a requested resource does not exist at the moment when the client requests it, that resource could be created at any time. Management servers must remember the set of resources being requested by the client, and if one of those resources springs into existence later, the server must send an update to the client informing it of the new resource. Clients that initially see a resource that does not exist must be prepared for the resource to be created at any time.
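The behavior quoted above can be sketched as a toy simulation. This is not Envoy or control-plane code: the Client and ManagementServer classes, their method names, and the logical clock passed as now are all assumptions made for illustration. The client marks a resource as missing once the timeout elapses (a timeout of 0 means it keeps waiting), and the server remembers subscriptions so that it can push a resource created later.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    timeout_s: float = 15.0  # 0 means wait indefinitely
    state: dict = field(default_factory=dict)  # name -> "warming" | "missing" | "ready"
    requested_at: dict = field(default_factory=dict)

    def request(self, name: str, now: float) -> None:
        self.state[name] = "warming"
        self.requested_at[name] = now

    def tick(self, now: float) -> None:
        """Treat resources still warming after the timeout as nonexistent."""
        if self.timeout_s <= 0:
            return
        for name, status in self.state.items():
            if status == "warming" and now - self.requested_at[name] >= self.timeout_s:
                self.state[name] = "missing"

    def on_update(self, name: str) -> None:
        # A resource may arrive at any time, even after it was marked missing.
        self.state[name] = "ready"


@dataclass
class ManagementServer:
    resources: set = field(default_factory=set)
    subscriptions: dict = field(default_factory=dict)  # name -> subscribed clients

    def subscribe(self, client: Client, name: str, now: float) -> None:
        client.request(name, now)
        # Remember the request even if the resource does not exist yet.
        self.subscriptions.setdefault(name, []).append(client)
        if name in self.resources:
            client.on_update(name)

    def create(self, name: str) -> None:
        self.resources.add(name)
        for client in self.subscriptions.get(name, []):
            client.on_update(name)


if __name__ == "__main__":
    server = ManagementServer()
    client = Client(timeout_s=15)
    server.subscribe(client, "route-a", now=0)
    client.tick(now=15)
    print(client.state["route-a"])  # missing
    server.create("route-a")
    print(client.state["route-a"])  # ready
```

If the server never calls create for a resource the client timed out on, the client stays in the missing state, which is the partial-configuration case this issue is concerned with.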
In large systems, an initial fetch timeout may cause the proxy to start with a partial configuration, leading to traffic failures. It is not clear whether Envoy Gateway (EG) behaves as the xDS protocol expects in order to recover from such errors.