Fix winkernel proxy regression failing to query v1 endpoints created by dockershim CNI #110592
|
@daschott: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected; please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
@daschott: This cherry pick PR is for a release branch and has not yet been approved by Release Managers. To merge this cherry pick, it must first be approved. AFTER it has been approved by code owners, please leave the following comment on a line by itself, with no leading whitespace: /cc kubernetes/release-managers (This command will request a cherry pick review from Release Managers and should work for all GitHub users, whether they are members of the Kubernetes GitHub organization or not.) For details on the patch release process and schedule, see the Patch Releases page. |
|
@daschott: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label. |
|
Hi @daschott. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. |
|
/sig windows |
|
/test pull-kubernetes-e2e-aks-engine-windows-containerd-1-22 |
|
@daschott: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message. |
|
/ok-to-test |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: daschott, sbangari. |
|
/test pull-kubernetes-e2e-aks-engine-windows-containerd-1-22 |
|
/test pull-kubernetes-e2e-aks-engine-windows-dockershim-1-22 |
|
@daschott: PR needs rebase. |
|
Due to revert, closing in favor of #110701 |
What happened?
#109985 introduced a regression for Kubernetes clusters that use dockershim as the runtime and invoke CNIs through the v1 HNS APIs. The winkernel proxier cannot retrieve endpoints because this hcsshim call does not return v1 HNS endpoints: https://pkg.go.dev/github.com/Microsoft/hcsshim@v0.8.22/hcn#ListEndpointsOfNetwork
As a result, local endpoints are not found and service proxy rules are not created.
Clusters using the containerd runtime with CNIs that use the HCN APIs are not impacted.
Source:
We noticed this failing in our dockershim-related tests: https://testgrid.k8s.io/sig-windows-1.22-release#aks-engine-windows-dockershim-1.22
https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-aks-engine-azure-1-22-windows/1536521484288135168
What did you expect to happen?
Service proxy rules are created for all expected services.
How can we reproduce it (as minimally and precisely as possible)?
Create a Kubernetes cluster with Windows nodes and the dockershim runtime.
Anything else we need to know?
This issue does not occur on clusters with the containerd runtime, which use the v2 CNI workflow that creates HCN endpoints instead of HNS endpoints.
Kubernetes version
1.23, 1.22
Cloud provider
Reproduced on Azure, but all cloud providers are impacted.
OS version
Windows Server 2019
Install tools
Container runtime (CRI) and version (if applicable)
Dockershim
Related plugins (CNI, CSI, ...) and versions (if applicable)
Azure CNI + dockershim. Any CNI creating v1 HNS endpoints will be impacted.