xds: Consistency while handling CDS updates and EDS #13009

@sschepens

When a cluster is created or updated, Envoy puts it into a warming phase, and the cluster needs a matching ClusterLoadAssignment response to fully initialize.
During startup, Envoy sends requests for those resources to the management server, so the management server knows it has to respond to them.
But when a Cluster is updated via CDS, no EDS re-request is sent, so the management server doesn't really know it should send a ClusterLoadAssignment for that Cluster, even if the resource hasn't actually changed.
This behavior introduces subtle bugs in management servers. In our case, our resource versioning scheme happened to include the cluster version. Yesterday we introduced a change to remove that, and it unexpectedly broke our cluster updates, leaving some clusters without traffic.
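
To make the failure mode concrete, here is a minimal toy model of it. Everything in it (`ToyEnvoy`, `ToyManagementServer`, `force_eds_on_cds`) is hypothetical and only mirrors the behavior described above; it is not Envoy's or any control plane's actual code. It assumes a management server that skips an EDS push when the EDS version it last sent is unchanged.

```python
class ToyEnvoy:
    """Hypothetical model of cluster warming, not Envoy's real implementation."""

    def __init__(self):
        self.warming = set()

    def on_cds_update(self, cluster):
        # A created or updated cluster warms until a ClusterLoadAssignment arrives.
        self.warming.add(cluster)

    def on_eds_response(self, cluster):
        # Receiving endpoints for a warming cluster completes its initialization.
        self.warming.discard(cluster)


class ToyManagementServer:
    """Hypothetical server that only pushes EDS when its EDS version changes."""

    def __init__(self, envoy, force_eds_on_cds=False):
        self.envoy = envoy
        self.force_eds_on_cds = force_eds_on_cds
        self.sent_eds_version = {}  # cluster name -> last EDS version pushed

    def push_cluster(self, cluster, eds_version):
        self.envoy.on_cds_update(cluster)
        return self.push_endpoints(cluster, eds_version, self.force_eds_on_cds)

    def push_endpoints(self, cluster, eds_version, force=False):
        if not force and self.sent_eds_version.get(cluster) == eds_version:
            return False  # unchanged, so a version-diffing server stays silent
        self.sent_eds_version[cluster] = eds_version
        self.envoy.on_eds_response(cluster)
        return True


def run(force_eds_on_cds):
    envoy = ToyEnvoy()
    server = ToyManagementServer(envoy, force_eds_on_cds)
    server.push_cluster("backend", eds_version="v1")  # initial creation
    server.push_cluster("backend", eds_version="v1")  # CDS-only update
    return sorted(envoy.warming)


print(run(force_eds_on_cds=False))  # cluster left warming
print(run(force_eds_on_cds=True))   # cluster finishes warming
```

In this sketch, including the cluster version in the EDS version (as our old scheme accidentally did) has the same effect as `force_eds_on_cds=True`, which is why removing it exposed the problem.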

This should probably be handled by Envoy. Currently go-control-plane and java-control-plane don't really handle this, since it's hard to make them always push an EDS update for a Cluster even when the resource hasn't changed. And when not using ADS, Envoy may be connected to multiple management servers, which makes it even harder.

@htuch suggested that Envoy should probably unsubscribe from EDS for the updated cluster and immediately subscribe again, since this is what's actually happening inside Envoy: a new Cluster is being created and wants to subscribe to the resources, and the old one wants to unsubscribe. This idea needs some more thought, in particular about the consequences of the old cluster being unsubscribed from its resources.
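
A rough sketch of how the unsubscribe/resubscribe idea could look from the management server's side. The names `ToyEdsStream` and `resubscribe` are made up for illustration, and the sketch assumes a server that answers any newly subscribed resource name regardless of version; whether a given control plane behaves that way is an assumption, not something verified here.

```python
class ToyEdsStream:
    """Hypothetical server-side view of one EDS subscription."""

    def __init__(self, endpoints):
        self.endpoints = endpoints  # cluster name -> list of endpoint addresses
        self.subscribed = set()

    def on_request(self, resource_names):
        names = set(resource_names)
        newly_added = names - self.subscribed
        self.subscribed = names
        # Assumption: newly subscribed names are always answered,
        # even if the same resource version was sent earlier.
        return {n: self.endpoints[n] for n in sorted(newly_added) if n in self.endpoints}


def resubscribe(stream, current_names, updated_cluster):
    """Drop the updated cluster from the subscription, then add it back."""
    without = [n for n in current_names if n != updated_cluster]
    stream.on_request(without)                # old cluster unsubscribes
    return stream.on_request(current_names)   # new cluster subscribes


stream = ToyEdsStream({"a": ["10.0.0.1:80"], "b": ["10.0.0.2:80"]})
stream.on_request(["a", "b"])               # startup: both resources sent
print(stream.on_request(["a", "b"]))        # same names again: nothing to send
print(resubscribe(stream, ["a", "b"], "a")) # only "a" is re-sent
```

The open question from above shows up in the first `on_request` call inside `resubscribe`: between the two requests, the server believes nobody wants the resource for `"a"`, and it is not obvious what the still-active old cluster should do in that window.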

Some discussion already happened on Slack, opening this issue to continue discussion.

This issue also applies to LDS updates with RDS and probably SRDS with RDS.
