Description:
When setting metrics.clusterStatName for an EnvoyProxy resource, an incorrectly formatted value (missing a % symbol) caused Envoy Gateway to delete and recreate the Envoy Proxy Deployment with a different name, resulting in traffic disruption.
For example:
Incorrect (missing % after ROUTE_RULE_NAME):
metrics:
  clusterStatName: '%ROUTE_KIND%/%ROUTE_NAMESPACE%/%ROUTE_NAME%/rule/%ROUTE_RULE_NAME/%ROUTE_RULE_NUMBER%'
Correct:
metrics:
  clusterStatName: '%ROUTE_KIND%/%ROUTE_NAMESPACE%/%ROUTE_NAME%/rule/%ROUTE_RULE_NAME%/%ROUTE_RULE_NUMBER%'
After applying the incorrect format, the existing Envoy Proxy Deployment (envoy-proxy-private) was deleted and replaced with a new Deployment named envoy-kube-system-private. All xRoutes failed until I corrected the typo, which triggered yet another redeployment to restore the original Deployment name.
Expected behavior:
A format error in metrics.clusterStatName should raise a warning in the logs.
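To illustrate the kind of check I have in mind, here is a minimal Go sketch. It is not Envoy Gateway's actual code: validateStatName is a hypothetical helper, and the rule that an operator name consists only of upper-case letters and underscores is an assumption taken from the examples in this issue. It only checks that every % is paired and that the text between each pair looks like an operator name, then logs a warning instead of proceeding.

```go
package main

import (
	"fmt"
	"log"
	"regexp"
	"strings"
)

// tokenName matches operator names shaped like the ones in this issue
// (upper-case letters and underscores). This shape is an assumption.
var tokenName = regexp.MustCompile(`^[A-Z_]+$`)

// validateStatName is a hypothetical helper, not part of Envoy Gateway.
// It reports an error when the '%' delimiters are unbalanced or when the
// text between a pair of '%' does not look like an operator name.
func validateStatName(format string) error {
	parts := strings.Split(format, "%")
	// n '%' characters produce n+1 parts, so an even number of parts
	// means an odd number of '%' characters.
	if len(parts)%2 == 0 {
		return fmt.Errorf("unbalanced %% in %q", format)
	}
	// Odd-indexed parts are the segments enclosed by a pair of '%'.
	for i := 1; i < len(parts); i += 2 {
		if !tokenName.MatchString(parts[i]) {
			return fmt.Errorf("invalid operator %q in %q", parts[i], format)
		}
	}
	return nil
}

func main() {
	formats := []string{
		// Incorrect value from this issue (missing % after ROUTE_RULE_NAME).
		"%ROUTE_KIND%/%ROUTE_NAMESPACE%/%ROUTE_NAME%/rule/%ROUTE_RULE_NAME/%ROUTE_RULE_NUMBER%",
		// Corrected value from this issue.
		"%ROUTE_KIND%/%ROUTE_NAMESPACE%/%ROUTE_NAME%/rule/%ROUTE_RULE_NAME%/%ROUTE_RULE_NUMBER%",
	}
	for _, f := range formats {
		if err := validateStatName(f); err != nil {
			// Warn and leave the existing configuration untouched.
			log.Printf("warning: ignoring metrics.clusterStatName: %v", err)
			continue
		}
		log.Printf("metrics.clusterStatName accepted: %s", f)
	}
}
```

With a check like this, the malformed value would be rejected with a warning and the existing Deployment could be left untouched rather than recreated.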
Repro steps:
1. Deploy Envoy Gateway with an EnvoyProxy resource using the incorrect format for metrics.clusterStatName (missing a % symbol as shown above).
2. Apply the change and observe the name of the Envoy Proxy Deployment.
3. Fix the format string and apply again.
Expected:
The Deployment name remains the same, and Envoy is updated in-place.
Actual:
Deployment is deleted and recreated with a different name, causing traffic disruption.
Environment:
Envoy Gateway version: 1.5.0
Kubernetes version: 1.32
Logs:
N/A — issue observed from Kubernetes resource deletion/recreation