Enable HPA for Envoy Proxy introduce unexpected behavior #2807
Description:
Relates to #2257
Once envoyHpa is enabled on the EnvoyProxy API, some unexpected behavior is observed.
During reconciliation, Envoy Gateway creates the HorizontalPodAutoscaler resource successfully. However, on any operation that touches a Gateway API resource, Envoy Gateway unexpectedly resets the replicas of the Envoy Proxy deployment to the value specified in EnvoyProxy.spec.provider.kubernetes.envoyDeployment.replicas (or to 1 replica if that field is undefined). After some time, when the HPA controller kicks in again, the deployment is scaled back up to at least its minReplicas, but I am concerned that under high traffic, any operation on a Gateway API resource forces the Envoy Proxy deployment to be scaled down.
Repro steps:
Apply the EnvoyProxy manifest below. Then, assuming some HTTPRoute resource exists, perform any operation on it, such as a modification. You will observe a scale-down of the Envoy Proxy deployment from 3 replicas (the HPA's minReplicas) to 1 replica (the default EnvoyProxy replica count).
```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: default
  namespace: envoy-gateway-system
spec:
  logging:
    level:
      default: warn
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        pod:
          labels:
            service_name: envoy-proxy
            service_group: envoy
        container:
          resources:
            limits:
              cpu: 2
              memory: 2Gi
            requests:
              cpu: 1
              memory: 1Gi
      envoyHpa:
        maxReplicas: 12
        minReplicas: 3
        metrics:
          - type: Resource
            resource:
              name: cpu
              target:
                type: Utilization
                averageUtilization: 60
```

Proposal:
Even though the code is already set up to avoid reverting replicas to their original value when the HPA controller kicks in (to adjust the Envoy Proxy deployment replicas), Envoy Gateway still does not account for this in the createOrUpdateDeployment function during reconciliation.
gateway/internal/infrastructure/kubernetes/infra_resource.go, lines 66 to 86 at af72b32:

```go
func (i *Infra) createOrUpdateDeployment(ctx context.Context, r ResourceRender) error {
	deployment, err := r.Deployment()
	if err != nil {
		return err
	}

	current := &appsv1.Deployment{}
	key := types.NamespacedName{
		Namespace: deployment.Namespace,
		Name:      deployment.Name,
	}

	hpa, err := r.HorizontalPodAutoscaler()
	if err != nil {
		return err
	}

	var opts cmp.Options
	if hpa != nil {
		opts = append(opts, cmpopts.IgnoreFields(appsv1.DeploymentSpec{}, "Replicas"))
	}
```
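As a self-contained sketch of what the IgnoreFields option is meant to achieve (plain structs stand in for appsv1.DeploymentSpec, and specsEqual is a hypothetical stand-in for cmp.Equal with cmpopts.IgnoreFields): a drift in Replicas alone should not make the reconciler consider the deployment out of date.

```go
package main

import "fmt"

// DeploymentSpec is a minimal stand-in for appsv1.DeploymentSpec.
type DeploymentSpec struct {
	Replicas *int32
	Image    string
}

// specsEqual mimics cmp.Equal with cmpopts.IgnoreFields(..., "Replicas"):
// when ignoreReplicas is true, only the remaining fields are compared.
func specsEqual(a, b DeploymentSpec, ignoreReplicas bool) bool {
	if !ignoreReplicas {
		ar, br := int32(1), int32(1) // nil replicas default to 1, as in Kubernetes
		if a.Replicas != nil {
			ar = *a.Replicas
		}
		if b.Replicas != nil {
			br = *b.Replicas
		}
		if ar != br {
			return false
		}
	}
	return a.Image == b.Image
}

func main() {
	three := int32(3)
	current := DeploymentSpec{Replicas: &three, Image: "envoy"} // scaled up by the HPA
	desired := DeploymentSpec{Replicas: nil, Image: "envoy"}    // rendered from EnvoyProxy

	fmt.Println(specsEqual(current, desired, false)) // false: replica drift alone looks like a diff
	fmt.Println(specsEqual(current, desired, true))  // true: drift ignored, no update needed
}
```

This is only the comparison half of the story; as described below, the rendered desired spec still carries a concrete replica count, which is what gets written back.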
The createOrUpdateDeployment function appears to reset the replicas from deploymentConfig at line 239 of gateway/internal/infrastructure/kubernetes/proxy/resource_provider.go, lines 238 to 240 at af72b32:

```go
Spec: appsv1.DeploymentSpec{
	Replicas: deploymentConfig.Replicas,
	Strategy: *deploymentConfig.Strategy,
```
It seems to me that we need to set the replicas to nil when envoyHpa is utilized, something like the following:

```go
if hpa != nil {
	deployment.Spec.Replicas = nil
	opts = append(opts, cmpopts.IgnoreFields(appsv1.DeploymentSpec{}, "Replicas"))
}
```
This way the Envoy Gateway would refrain from further replica adjustments when envoyHpa is utilized.
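The intended behavior can be sketched as follows (a minimal model with plain structs and a hypothetical reconcileReplicas helper, not the actual Envoy Gateway code): when an HPA manages the deployment, the reconciler should keep whatever replica count the HPA last set instead of writing back the rendered value.

```go
package main

import "fmt"

// reconcileReplicas is a hypothetical helper modeling the proposal: it returns
// the replica count the reconciler should write. When an HPA is enabled, the
// desired spec leaves Replicas nil and the current (HPA-managed) value is kept.
func reconcileReplicas(current, desired *int32, hpaEnabled bool) *int32 {
	if hpaEnabled {
		return current // never fight the HPA controller over replicas
	}
	if desired == nil {
		one := int32(1)
		return &one // Kubernetes defaults unset replicas to 1
	}
	return desired
}

func main() {
	three := int32(3)

	// HPA has scaled the proxy fleet to 3; EnvoyProxy leaves replicas unset.
	got := reconcileReplicas(&three, nil, true)
	fmt.Println(*got) // 3: the HPA-managed value survives reconciliation

	// Without an HPA, unset replicas fall back to the default of 1.
	got = reconcileReplicas(&three, nil, false)
	fmt.Println(*got) // 1
}
```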
Environment:
latest