Skip to content
This repository was archived by the owner on May 12, 2021. It is now read-only.

netmon: Fix bug in how routes are converted#1524

Merged
sboeuf merged 1 commit intokata-containers:masterfrom
amshinde:fix-netmon-routes-cidr
Apr 15, 2019
Merged

netmon: Fix bug in how routes are converted#1524
sboeuf merged 1 commit intokata-containers:masterfrom
amshinde:fix-netmon-routes-cidr

Conversation

@amshinde
Copy link
Copy Markdown
Member

The agent expects a IP CIDR for the route destination
rather than an IP address. netmon was incorrectly
converting route dest to an IP address and hence
exiting with an error.

Fixes #1523

Signed-off-by: Archana Shinde archana.m.shinde@intel.com

@amshinde
Copy link
Copy Markdown
Member Author

@sboeuf @egernst PTAL, with this the PR to make tc as default should now pass.

@amshinde
Copy link
Copy Markdown
Member Author

/test

Copy link
Copy Markdown

@sboeuf sboeuf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @amshinde 👍

@devimc
Copy link
Copy Markdown

devimc commented Apr 12, 2019

/retest

@jcvenegas
Copy link
Copy Markdown
Member

@amshinde I marked as stable candidate, please let me know if is fine and if it please backport for 1.5 and 1.6

@amshinde amshinde force-pushed the fix-netmon-routes-cidr branch from 99c6d56 to 429753d Compare April 12, 2019 15:41
@amshinde
Copy link
Copy Markdown
Member Author

/test

Copy link
Copy Markdown
Contributor

@mcastelino mcastelino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amshinde one question. How did this work today before tcFilter. Looks like it has been incorrect for a while?

It would be good to include that detail in the commit message.

@sboeuf
Copy link
Copy Markdown

sboeuf commented Apr 12, 2019

@mcastelino This code path was not exercised with macvtap unfortunately, which is why we didn't realize about the bug.
And the reason is, in case of macvtap, we delete the IP, which deletes the routes associated to it.

In case of tc, the IP is untouched, that's why we got some routes received by netmon.

@amshinde
Copy link
Copy Markdown
Member Author

@mcastelino Yes, this has been incorrect. This was not covered in our CI matrix - netmon +tc. I'll add that in the commit.

@amshinde
Copy link
Copy Markdown
Member Author

@sboeuf In case of macvtap, we could have netmon pick up the routes for the veth before the IP is deleted to fix the macvtap mode.

The agent expects a IP CIDR for the route destination
rather than an IP address. netmon was incorrectly
converting route dest to an IP address and hence
exiting with an error.

We did not have an integration test for netmon with tcfilter mode.
macvtap mode did not uncover this, as with macvtap routes are
not really passed to the agent.
We delete the IP on the veth device, and netmon looks at the
routes after the IP is deleted with macvtap.

Fixes kata-containers#1523

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
@amshinde amshinde force-pushed the fix-netmon-routes-cidr branch from 429753d to 8abd2ec Compare April 12, 2019 16:43
@amshinde
Copy link
Copy Markdown
Member Author

/test

@amshinde amshinde added the bug Incorrect behaviour label Apr 12, 2019
@amshinde
Copy link
Copy Markdown
Member Author

@jcvenegas Yup this needs backporting. I have raised PRs to do so. Lets merge them once this goes in.

@amshinde
Copy link
Copy Markdown
Member Author

@chavafg @jcvenegas Seeing this error in nemu CI. Have you seen this before?

 • Failure in Spec Teardown (AfterEach) [10.559 seconds]
15:32:19 [k8s.io] Container
15:32:19 /tmp/jenkins/workspace/kata-containers-runtime-ubuntu-nemu/go/src/github.com/kubernetes-sigs/cri-tools/pkg/framework/framework.go:72
15:32:19   runtime should support basic operations on container
15:32:19   /tmp/jenkins/workspace/kata-containers-runtime-ubuntu-nemu/go/src/github.com/kubernetes-sigs/cri-tools/pkg/validate/container.go:68
15:32:19     runtime should support creating container [Conformance] [AfterEach]
15:32:19     /tmp/jenkins/workspace/kata-containers-runtime-ubuntu-nemu/go/src/github.com/kubernetes-sigs/cri-tools/pkg/validate/container.go:83
15:32:19 
15:32:19     failed to pull image: rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/busybox:1.28": failed to unpack image on snapshotter overlayfs: failed to extract layer sha256:432b65032b9466b4dadcc5c7b11701e71d21c18400aae946b101ad16be62333a: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount642814998: link /var/lib/containerd/tmpmounts/containerd-mount642814998/bin/[ /var/lib/containerd/tmpmounts/containerd-mount642814998/bin/rpm: no such file or directory: unknown
15:32:19     Expected error:
15:32:19         <*status.statusError | 0xc000498630>: {
15:32:19             Code: 2,
15:32:19             Message: "failed to pull and unpack image \"docker.io/library/busybox:1.28\": failed to unpack image on snapshotter overlayfs: failed to extract layer sha256:432b65032b9466b4dadcc5c7b11701e71d21c18400aae946b101ad16be62333a: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount642814998: link /var/lib/containerd/tmpmounts/containerd-mount642814998/bin/[ /var/lib/containerd/tmpmounts/containerd-mount642814998/bin/rpm: no such file or directory: unknown",
15:32:19             Details: nil,
15:32:19         }
15:32:19         rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/busybox:1.28": failed to unpack image on snapshotter overlayfs: failed to extract layer sha256:432b65032b9466b4dadcc5c7b11701e71d21c18400aae946b101ad16be62333a: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount642814998: link /var/lib/containerd/tmpmounts/containerd-mount642814998/bin/[ /var/lib/containerd/tmpmounts/containerd-mount642814998/bin/rpm: no such file or directory: unknown
15:32:19     not to have occurred

@chavafg
Copy link
Copy Markdown
Contributor

chavafg commented Apr 15, 2019

@amshinde yeah, I have seen it a couple of times... Seems like a network issue pulling images. I've sent a restart.

@sboeuf sboeuf merged commit d75e7fc into kata-containers:master Apr 15, 2019
@egernst egernst mentioned this pull request Apr 16, 2019
@amshinde amshinde deleted the fix-netmon-routes-cidr branch July 11, 2019 22:27
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

bug Incorrect behaviour

Projects

None yet

Development

Successfully merging this pull request may close these issues.

netmon: Retain dest CIDR when passing routes to the runtime

8 participants