$ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.7", GitCommit:"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4", GitTreeState:"clean", BuildDate:"2021-11-17T14:41:19Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.7", GitCommit:"f74784f1eaf1e02b651778d6ee2df1ae5ee729ae", GitTreeState:"clean", BuildDate:"2022-03-10T07:58:41Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
What happened?
When starting Windows nodes on a cluster with a high number of HNS LB policies/rules, there is a delay in processing them. Services are unreachable during the delay, which takes about half a minute per policy. This can be substantial given enough rules.
This occurs both when restarting kube-proxy and when rebooting the host. Once the system reaches a state where all the policy lists are processed, incremental updates to the services (i.e. endpoint changes) are handled fine.
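To give a feel for the scale, here is a rough back-of-the-envelope estimate. It assumes the policies are processed one after another at the roughly 30 seconds each mentioned above; the function name and the policy counts are illustrative, not measurements from the cluster.

```python
from datetime import timedelta

# Assumption: policies are processed serially at ~30 s each
# (the approximate per-policy figure from this report).
SECONDS_PER_POLICY = 30


def estimated_delay(policy_count: int,
                    seconds_per_policy: float = SECONDS_PER_POLICY) -> timedelta:
    """Estimate how long services stay unreachable after a restart."""
    if policy_count < 0:
        raise ValueError("policy_count must be non-negative")
    return timedelta(seconds=policy_count * seconds_per_policy)


if __name__ == "__main__":
    # Hypothetical policy counts, for illustration only.
    for count in (10, 100, 500):
        print(f"{count} policies -> {estimated_delay(count)}")
```

Under that assumption, 100 policies would already mean around 50 minutes before all services are reachable again.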
What did you expect to happen?
HNS policies should not cause a large delay for Windows nodes.
How can we reproduce it (as minimally and precisely as possible)?
With a large number of HNS policies in place, restart kube-proxy on a Windows node.
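One possible way to get a large number of policies in place is to create many Services. The sketch below only generates the manifests; it assumes that each Service with ready endpoints results in at least one HNS LB policy on the node, and that pods labelled `app: hns-repro` already exist. The `hns-repro` names, the label, and the count of 200 are hypothetical choices for illustration.

```python
import json


def service_manifest(index: int, namespace: str = "default") -> dict:
    """Build one ClusterIP Service manifest (names and label are hypothetical)."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": f"hns-repro-{index}", "namespace": namespace},
        "spec": {
            # Assumption: pods with this label are already running.
            "selector": {"app": "hns-repro"},
            "ports": [{"port": 80, "targetPort": 80, "protocol": "TCP"}],
        },
    }


def service_list(count: int) -> dict:
    """Wrap `count` Service manifests in a single v1 List object."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [service_manifest(i) for i in range(count)],
    }


if __name__ == "__main__":
    # Print JSON that can be piped to `kubectl apply -f -`.
    print(json.dumps(service_list(200), indent=2))
```

After applying the output and letting the node settle, restarting kube-proxy on the Windows node should show whether the delay scales with the number of Services.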
Anything else we need to know?
No response
Kubernetes version
Cloud provider
Azure AKS
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)