Loadbalancers created via aws-load-balancer-controller for EnvoyProxy instances are leaked once Gateway is removed #1820
Description:
When creating a Gateway in an EKS cluster that uses the aws-load-balancer-controller to provision load balancers, subsequent deletion of the Gateway resource can leak the underlying AWS resources (load balancer, target group, security group rules, security groups, etc.), which are then never deleted.
The underlying reason appears to be that EG is stripping the .metadata.finalizers section of the managed Service object. Because aws-load-balancer-controller (and probably every other load balancer provider, for that matter) creates non-k8s resources that must be cleaned up, it injects a finalizer onto the Service so that the controller is guaranteed time to reconcile the deletion of those resources before the Service object is gone (at which point it loses track of those resources and they are leaked).
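To illustrate why the missing finalizer matters, here is a minimal toy model of finalizer semantics. It is only a sketch: `ToyStore` and its methods are hypothetical names invented for illustration, not Kubernetes or Envoy Gateway code. It assumes the usual behaviour that a delete request only marks the object, and the object disappears once its finalizer list is empty.

```python
# Toy model of finalizer semantics (hypothetical, not real API-server code).
class ToyStore:
    def __init__(self):
        self.objects = {}

    def create(self, name, finalizers=()):
        self.objects[name] = {"finalizers": list(finalizers), "deleting": False}

    def delete(self, name):
        # A delete request only marks the object; removal waits on finalizers.
        self.objects[name]["deleting"] = True
        self._collect(name)

    def remove_finalizer(self, name, finalizer):
        # What a controller does once its external cleanup has finished.
        self.objects[name]["finalizers"].remove(finalizer)
        self._collect(name)

    def _collect(self, name):
        obj = self.objects[name]
        if obj["deleting"] and not obj["finalizers"]:
            del self.objects[name]


store = ToyStore()
store.create("with-finalizer", ["service.k8s.aws/resources"])
store.create("stripped")  # finalizers removed, as observed on the EG Service

store.delete("with-finalizer")
store.delete("stripped")

# The first object is still present, so a controller can clean up cloud
# resources; the second is gone immediately, with nothing left to reconcile.
print("with-finalizer" in store.objects)
print("stripped" in store.objects)
```

In this model the object named `stripped` vanishes the moment it is deleted, which mirrors the window the load balancer controller loses when the finalizer is not on the Service.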
EnvoyProxy service object created
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internal
    service.beta.kubernetes.io/aws-load-balancer-type: external
  creationTimestamp: "2023-08-23T14:18:56Z"
  labels:
    app.kubernetes.io/component: proxy
    app.kubernetes.io/managed-by: envoy-gateway
    app.kubernetes.io/name: envoy
    gateway.envoyproxy.io/owning-gateway-name: ...
    gateway.envoyproxy.io/owning-gateway-namespace: ...
  name: envoy-...-72616262
  namespace: envoy-gateway
  resourceVersion: ...
  uid: ...
spec:
  allocateLoadBalancerNodePorts: true
  clusterIP: 172.20.132.225
  clusterIPs:
  - 172.20.132.225
  externalTrafficPolicy: Local
  healthCheckNodePort: 30042
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  ...
  selector:
    app.kubernetes.io/component: proxy
    app.kubernetes.io/managed-by: envoy-gateway
    app.kubernetes.io/name: envoy
    gateway.envoyproxy.io/owning-gateway-name: ...
    gateway.envoyproxy.io/owning-gateway-namespace: ...
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - hostname: ....elb.us-east-1.amazonaws.com
A service object created manually from the following yaml
apiVersion: v1
kind: Service
metadata:
  name: loadbalancer
  namespace: envoy-gateway
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internal
    service.beta.kubernetes.io/aws-load-balancer-type: external
spec:
  type: LoadBalancer
  ports:
  - name: test
    port: 8080
    protocol: TCP
yields
apiVersion: v1
kind: Service
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: ...
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internal
    service.beta.kubernetes.io/aws-load-balancer-type: external
  creationTimestamp: "2023-08-23T15:04:39Z"
  finalizers:
  - service.kubernetes.io/load-balancer-cleanup
  - service.k8s.aws/resources
  name: loadbalancer
  namespace: envoy-gateway
  resourceVersion: ...
  uid: ...
spec:
  allocateLoadBalancerNodePorts: true
  clusterIP: 172.20.204.23
  clusterIPs:
  - 172.20.204.23
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: test
    nodePort: 30456
    port: 8080
    protocol: TCP
    targetPort: 8080
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - hostname: ....elb.us-east-1.amazonaws.com
Unlike the EG-managed Service, the manually created Service has its finalizers section intact (even though finalizers were not part of the original yaml). So it seems as though the reconciliation process is stripping them out. Without them, our loadbalancer provider is unable to guarantee cleanup of the cloud resources.
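As a rough illustration of the suspected failure mode, the sketch below shows one way a reconciler that overwrites an object with its desired state could carry over finalizers that other controllers added to the live object. This is not Envoy Gateway's implementation; `merge_preserving_finalizers` and the Service name `envoy-example` are hypothetical, and the objects are plain dicts rather than real API types.

```python
import copy


def merge_preserving_finalizers(desired, live):
    """Return a copy of `desired` that keeps any finalizers found on `live`."""
    merged = copy.deepcopy(desired)
    meta = merged.setdefault("metadata", {})
    combined = list(meta.get("finalizers", []))
    # Finalizers injected by other controllers exist only on the live object.
    for finalizer in live.get("metadata", {}).get("finalizers", []):
        if finalizer not in combined:
            combined.append(finalizer)
    if combined:
        meta["finalizers"] = combined
    return merged


# Desired state as rendered by a reconciler: no finalizers at all.
desired = {
    "metadata": {"name": "envoy-example"},  # hypothetical name
    "spec": {"type": "LoadBalancer"},
}
# Live object after the load balancer controllers have added their finalizers.
live = {
    "metadata": {
        "name": "envoy-example",
        "finalizers": [
            "service.kubernetes.io/load-balancer-cleanup",
            "service.k8s.aws/resources",
        ],
    },
    "spec": {"type": "LoadBalancer"},
}

merged = merge_preserving_finalizers(desired, live)
print(merged["metadata"]["finalizers"])
```

Writing `desired` back unchanged would drop both finalizers, which matches what is observed on the EG-managed Service; writing `merged` would keep them.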
Repro steps:
- set up gatewayclass, envoyproxy and gateway resources as per normal with service type LoadBalancer (default) and the requisite annotations to activate aws-load-balancer-controller (should also be reproducible for any other loadbalancer providers that rely on finalizers)
apiVersion: "config.gateway.envoyproxy.io/v1alpha1"
kind: "EnvoyProxy"
metadata:
  name: "aws-loadbalancer-controller-ep"
spec:
  provider:
    type: "Kubernetes"
    kubernetes:
      envoyService:
        annotations:
          service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
          service.beta.kubernetes.io/aws-load-balancer-scheme: internal
          service.beta.kubernetes.io/aws-load-balancer-type: external
- delete the Gateway object and see that the aws loadbalancer is not removed
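Before deleting the Gateway, the symptom can be confirmed by checking the Service for the finalizers seen on the manually created one. The helper below is a small sketch that takes the JSON text of a Service (for example, as printed by `kubectl get service ... -o json`) and lists which expected finalizers are absent. `missing_finalizers` and the sample object are hypothetical; the two finalizer names are the ones shown above.

```python
import json

# Finalizers observed on the manually created Service above.
EXPECTED_FINALIZERS = [
    "service.kubernetes.io/load-balancer-cleanup",
    "service.k8s.aws/resources",
]


def missing_finalizers(service_json, expected=EXPECTED_FINALIZERS):
    """Return the expected finalizers that are absent from a Service's JSON."""
    service = json.loads(service_json)
    present = service.get("metadata", {}).get("finalizers") or []
    return [f for f in expected if f not in present]


# Hypothetical sample standing in for real kubectl output.
sample = json.dumps({
    "kind": "Service",
    "metadata": {
        "name": "envoy-example",
        "finalizers": ["service.k8s.aws/resources"],
    },
})
print(missing_finalizers(sample))
```

An empty result means both finalizers are present; for a Service whose finalizers were stripped, both names are returned.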
Environment:
- aws based kubernetes cluster with aws-load-balancer-controller available to be used
- eg 0.5.0
Logs:
Have not been able to locate specific logs indicating finalizer is being removed.
Relates to: