Improve EG Gateway xDS & startup reliability (custom k8s health probe) #2810
Description
This enhancement proposes changing the controller's "ready" status so that it accurately reflects whether xDS discovery has completed and synchronized.
Specifically, the "ready" status would transition to "true" once xDS discovery has fully completed and stored its initial snapshot, or when no reconciliation is required (an empty or new deployment).
This would ensure that Envoy proxies are always in sync with the latest xDS state and that an EG instance that has started is able to reconcile.
-> If there is nothing to reconcile -> ready = true
-> If there are changes to reconcile, wait for xDS to complete -> ready = true
-> Otherwise -> ready = false
Currently, there may be cases where xDS is not completely synchronized at startup, which could cause new Envoy proxies to work with an incomplete xDS configuration.
The change would give a stronger guarantee that an operational EG consistently maintains an up-to-date xDS snapshot, and it could also avoid situations where instances start up but fail during the initial reconcile.
Leader election and multiple-instance use case:
It would improve consistency in environments where multiple instances of EG run simultaneously by ensuring they start only once the xDS server has persisted the latest state snapshot. #1953