When cilium-agent runs alongside kube-proxy with KubeProxyReplacement=Partial, both kube-proxy and cilium-agent will try to serve a health check for the service on the port specified by HealthCheckNodePort. This will cause the following error:
level=error msg="ListenAndServe failed for service health server" error="listen tcp :31659: bind: address already in use" serviceName=test-lb-local-k8s2 serviceNamespace=default subsys=service-healthserver svcHealthCheckNodePort=31659
The error itself means that the health check is currently served by kube-proxy. We should add an option to disable Cilium's HealthCheckNodePort service health server for Cilium deployments where kube-proxy is running and intended to serve the service health checks.
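The bind failure above can be reproduced outside Cilium with two listeners competing for one TCP port. The following is a minimal sketch, not code from cilium-agent or kube-proxy: the `bindTwice` helper is hypothetical, and it uses an ephemeral loopback port instead of a real HealthCheckNodePort.

```go
package main

import (
	"fmt"
	"net"
)

// bindTwice is a hypothetical helper. It listens on an ephemeral loopback
// port (standing in for the first health server) and then tries to bind the
// same address again (standing in for the second one).
// It returns the error from the second bind and, separately, any setup error.
func bindTwice() (conflict error, setupErr error) {
	first, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, fmt.Errorf("first listen failed: %w", err)
	}
	defer first.Close()

	// The port is still held by the first listener, so this bind is
	// expected to fail.
	second, err := net.Listen("tcp", first.Addr().String())
	if err == nil {
		second.Close()
		return nil, nil
	}
	return err, nil
}

func main() {
	conflict, setupErr := bindTwice()
	if setupErr != nil {
		fmt.Println("setup error:", setupErr)
		return
	}
	fmt.Println("second bind error:", conflict)
}
```

Whichever process binds the port first keeps serving it, which matches the observation that kube-proxy is the one answering the health check.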
Proposal (after discussion with @brb):
- Introduce an enable-health-check-nodeport flag (default: true) that lets users opt out of the service health server running inside cilium-agent when NodePort BPF is enabled.
- If the user selects kube-proxy-replacement=partial, we will disable enable-health-check-nodeport, as we assume that kube-proxy is intentionally running in parallel to cilium-agent.
- If NodePort BPF is enabled (i.e. EnableNodePort=true) via either kube-proxy-replacement=probe or kube-proxy-replacement=strict, we will keep the service health server enabled unless the user explicitly opts out.
- If NodePort BPF is disabled, cilium-agent will not start the HealthCheckNodePort server anyway, so no action should be required there.
- We should extend the above error message with a notice that the service health server can be disabled via flag.