Improve EG Gateway xDS & startup reliability (custom k8s health probe) #2810

@alexwo

Description

The proposed enhancement changes the controller's "ready" status so that it accurately reflects whether xDS discovery has completed and synchronized.

Specifically, the "ready" status would transition to "true" once xDS discovery has fully completed and stored its initial snapshot, or when no reconciliation is required (an empty or new deployment).

This can ensure that Envoy proxies are always in sync with the latest xDS state and that an EG instance that has started is able to reconcile.

-> If there is nothing to reconcile -> ready = true
-> If there are changes to reconcile, wait for xDS to complete -> ready = true
-> Otherwise -> ready = false
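
The rules above could be sketched roughly as follows. This is only an illustration of the proposed behavior, not Envoy Gateway's actual code: the `readiness` type, the state names, and the `/readyz` path are all hypothetical, and the wiring into the real xDS runner is left out. It assumes Go 1.19+ for `atomic.Int32`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// syncState is a hypothetical model of where startup reconciliation stands.
type syncState int32

const (
	stateUnknown            syncState = iota // startup: resources not inspected yet
	stateNothingToReconcile                  // empty or new deployment
	stateReconciling                         // changes found, initial snapshot not stored yet
	stateSnapshotStored                      // xDS has stored its initial snapshot
)

// readiness holds the current state; it is safe for concurrent use.
type readiness struct {
	state atomic.Int32
}

// Set records a new state (would be called by the reconcile/xDS code).
func (r *readiness) Set(s syncState) { r.state.Store(int32(s)) }

// Ready applies the three rules from the issue:
// nothing to reconcile -> true, snapshot stored -> true, otherwise -> false.
func (r *readiness) Ready() bool {
	switch syncState(r.state.Load()) {
	case stateNothingToReconcile, stateSnapshotStored:
		return true
	default:
		return false
	}
}

// ServeHTTP lets a Kubernetes readinessProbe query the state over HTTP:
// 200 when ready, 503 otherwise.
func (r *readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	if r.Ready() {
		w.WriteHeader(http.StatusOK)
		fmt.Fprintln(w, "ready")
		return
	}
	http.Error(w, "xDS not synced", http.StatusServiceUnavailable)
}

// probe issues an in-memory request against the handler and returns the status code.
func probe(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/readyz", nil))
	return rec.Code
}

func main() {
	r := &readiness{}

	r.Set(stateReconciling)
	fmt.Println("while reconciling:", probe(r))

	r.Set(stateSnapshotStored)
	fmt.Println("after snapshot stored:", probe(r))
}
```

In a real deployment the handler would be registered on the controller's health endpoint and referenced from the pod's `readinessProbe`, so that Kubernetes only routes proxies to an instance after the first snapshot is in place.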

Currently, there may be cases where xDS is not completely synchronized at startup, which could cause new Envoy proxies to work with an incomplete xDS configuration.

This can provide better guarantees that an operational EG consistently maintains an up-to-date xDS, and it could also help avoid situations where instances start up but fail during the initial reconcile.

Leader election and multiple-instance use case:
This would improve consistency in environments where multiple instances of EG run simultaneously, by ensuring they start only once the xDS server has persisted the latest state snapshot. See #1953.
