What happened?
#109981 introduced a regression for Kubernetes clusters that use dockershim as the runtime with CNIs that create endpoints through the v1 HNS APIs. The winkernel proxier cannot retrieve endpoints because this hcsshim call fails to return v1 HNS endpoints: https://pkg.go.dev/github.com/Microsoft/hcsshim@v0.8.22/hcn#ListEndpointsOfNetwork
As a result, local endpoints are not found and service proxy rules are not created.
Clusters using the containerd runtime and CNIs that use the HCN APIs are not impacted.
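To illustrate the failure mode, here is a minimal sketch of a lookup that prefers one endpoint API and falls back to another when the first returns nothing. It is not the kube-proxy code and not a proposed fix. `Endpoint`, `EndpointLister`, `fakeLister`, and `listWithFallback` are hypothetical names invented for this example, and the fakes stand in for the real hcsshim calls so the sketch runs on any platform.

```go
package main

import (
	"errors"
	"fmt"
)

// Endpoint is a simplified, hypothetical stand-in for an HNS/HCN endpoint record.
type Endpoint struct {
	ID string
	IP string
}

// EndpointLister is a hypothetical abstraction over one endpoint API version
// (for example, a wrapper around the v2 HCN call or around a v1 HNS call).
type EndpointLister interface {
	ListEndpointsOfNetwork(networkID string) ([]Endpoint, error)
}

// fakeLister returns canned results so the sketch does not need Windows.
type fakeLister struct {
	endpoints []Endpoint
	err       error
}

func (f fakeLister) ListEndpointsOfNetwork(string) ([]Endpoint, error) {
	return f.endpoints, f.err
}

// listWithFallback asks the primary lister first. If that call fails or
// returns no endpoints, it asks the fallback lister instead.
func listWithFallback(primary, fallback EndpointLister, networkID string) ([]Endpoint, error) {
	eps, err := primary.ListEndpointsOfNetwork(networkID)
	if err == nil && len(eps) > 0 {
		return eps, nil
	}
	fb, fbErr := fallback.ListEndpointsOfNetwork(networkID)
	if fbErr != nil {
		return nil, fmt.Errorf("primary lookup: %v; fallback lookup: %w", err, fbErr)
	}
	return fb, nil
}

func main() {
	// Assumption: the v2-style lookup finds nothing for endpoints created via v1 HNS.
	v2 := fakeLister{err: errors.New("no endpoints found")}
	v1 := fakeLister{endpoints: []Endpoint{{ID: "ep-1", IP: "10.240.0.5"}}}

	eps, err := listWithFallback(v2, v1, "net-1")
	fmt.Println(eps, err)
}
```

With only the primary lookup, the proxier in this sketch would see an empty endpoint list, which matches the symptom described above: no local endpoints, so no service proxy rules.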
Source:
We noticed this failing in our dockershim-related tests: https://testgrid.k8s.io/sig-windows-1.23-release#aks-engine-windows-dockershim-1.23
https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-aks-engine-azure-1-23-windows/1535071534270386176
What did you expect to happen?
Service proxy rules are created for all expected services
How can we reproduce it (as minimally and precisely as possible)?
Create a Kubernetes cluster with Windows nodes and the dockershim runtime.
Anything else we need to know?
This issue does not occur on clusters with the containerd runtime, which use the v2 CNI workflow and create HCN endpoints instead of HNS endpoints.
Kubernetes version
1.23, 1.22
Cloud provider
Reproduced on Azure, but all cloud providers are impacted.
OS version
Windows Server 2019
Install tools
Details
Container runtime (CRI) and version (if applicable)
Dockershim
Related plugins (CNI, CSI, ...) and versions (if applicable)
Azure CNI + dockershim. Any CNI creating v1 HNS endpoints will be impacted.