Skip to content

Revert #24288#24756

Closed
jibi wants to merge 15 commits intomasterfrom
pr/jibi/revert-24676
Closed

Revert #24288#24756
jibi wants to merge 15 commits intomasterfrom
pr/jibi/revert-24676

Conversation

@jibi
Copy link
Copy Markdown
Member

@jibi jibi commented Apr 5, 2023

This PR undoes a revert PR, #24288, which was reverting #24288 and #24577.

We believed the 2 aforementioned PRs introduced a regression on the upgrade from 1.13 to latest, but after the revert the CI is still failing.

jibi added 15 commits April 5, 2023 10:54
…ring format""

This reverts commit 53fef54.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
…b rules and routes""

This reverts commit 368ec8e.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
…fib rules""

This reverts commit ed5114d.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
This reverts commit dbce5f1.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
…nt""

This reverts commit 9e62a84.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
This reverts commit 63c2e2a.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
This reverts commit 5fb791d.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
…g routes""

This reverts commit 0f3e989.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
This reverts commit 3271cb2.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
…nd routes""

This reverts commit afdc51f.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
…es and rules""

This reverts commit 9b5e74b.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
…nd routes""

This reverts commit 2b6d5c4.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
…nel""

This reverts commit 05593ee.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
This reverts commit a9cad19.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
…roto kernel""

This reverts commit 9d60341.

Signed-off-by: Gilberto Bertin <jibi@cilium.io>
@jibi jibi added the release-note/misc This PR makes changes that have no direct user impact. label Apr 5, 2023
@jibi jibi requested review from NikAleksandrov and aanm April 5, 2023 08:59
@jibi jibi requested review from a team as code owners April 5, 2023 08:59
@jibi jibi requested review from asauber and dylandreimerink April 5, 2023 08:59
@jibi jibi added affects/v1.11 This issue affects v1.11 branch affects/v1.12 This issue affects v1.12 branch backport/author The backport will be carried out by the author of the PR. needs-backport/1.13 labels Apr 5, 2023
@jibi
Copy link
Copy Markdown
Member Author

jibi commented Apr 5, 2023

/test

Copy link
Copy Markdown
Member

@borkmann borkmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (approving on behalf on Nik who is currently out)!

@borkmann borkmann added the dont-merge/waiting-for-user-feedback Waiting for feedback from user before the PR should be merged. label Apr 5, 2023
@borkmann
Copy link
Copy Markdown
Member

borkmann commented Apr 5, 2023

Adding flag to hold merge given @aanm noticed an issue locally that we need to investigate before merging.

@aanm
Copy link
Copy Markdown
Member

aanm commented Apr 5, 2023

I got this diff in the before and after an upgrade. I wonder if this could be the reason why connectivity gets broken after upgrade.

before.txt
after.txt

@pchaigno
Copy link
Copy Markdown
Member

pchaigno commented Apr 5, 2023

I got this diff in the before and after an upgrade. I wonder if this could be the reason why connectivity gets broken after upgrade.

before.txt after.txt

That's just how the reverted patch works. AFAICS, nothing that looks out of the ordinary. Is that the "issue" you noticed locally?

@jibi
Copy link
Copy Markdown
Member Author

jibi commented Apr 6, 2023

not sure if just a red herring, but after the upgrade, beside the proto kernel changes, one other difference is that we are missing a bunch of IPv6 routes for cilium_host

diff <(cat before.txt | grep cilium_host) <(cat after.txt | grep cilium_host)
1,5c1,5
< default via 10.0.1.62 dev cilium_host table 2005
< 10.0.1.62 dev cilium_host table 2005 scope link
< 10.0.0.0/24 via 10.0.1.62 dev cilium_host src 10.0.1.62 mtu 1450
< 10.0.1.0/24 via 10.0.1.62 dev cilium_host src 10.0.1.62
< 10.0.1.62 dev cilium_host scope link
---
> default via 10.0.1.62 dev cilium_host table 2005 proto kernel
> 10.0.1.62 dev cilium_host table 2005 proto kernel scope link
> 10.0.0.0/24 via 10.0.1.62 dev cilium_host proto kernel src 10.0.1.62 mtu 1450
> 10.0.1.0/24 via 10.0.1.62 dev cilium_host proto kernel src 10.0.1.62
> 10.0.1.62 dev cilium_host proto kernel scope link
7,8c7,8
< fe80::8f9:c6ff:fe3d:3775 dev cilium_host table 2005 metric 1024 pref medium
< default via fe80::8f9:c6ff:fe3d:3775 dev cilium_host table 2005 metric 1024 pref medium
---
> fe80::8f9:c6ff:fe3d:3775 dev cilium_host table 2005 proto kernel metric 1024 pref medium
> default via fe80::8f9:c6ff:fe3d:3775 dev cilium_host table 2005 proto kernel metric 1024 pref medium
10,13c10,12
< fd02::/120 via fd02::1cf dev cilium_host src fc00:c111::4 metric 1024 mtu 1450 pref medium
< fd02::1cf dev cilium_host metric 1024 pref medium
< fd02::100/120 via fd02::1cf dev cilium_host src fc00:c111::4 metric 1024 pref medium
< fe80::/64 dev cilium_host proto kernel metric 256 pref medium
---
> fd02::/120 via fd02::1cf dev cilium_host proto kernel src fc00:c111::4 metric 1024 mtu 1450 pref medium
> fd02::1cf dev cilium_host proto kernel metric 1024 pref medium
> fd02::100/120 via fd02::1cf dev cilium_host proto kernel src fc00:c111::4 metric 1024 pref medium
15,16d13
< anycast fe80:: dev cilium_host table local proto kernel metric 0 pref medium
< local fe80::241d:fff:fe71:c0db dev cilium_host table local proto kernel metric 0 pref medium

@aanm
Copy link
Copy Markdown
Member

aanm commented Apr 6, 2023

I got this diff in the before and after an upgrade. I wonder if this could be the reason why connectivity gets broken after upgrade.
before.txt after.txt

That's just how the reverted patch works. AFAICS, nothing that looks out of the ordinary. Is that the "issue" you noticed locally?

No, the issue is that after an upgrade from 1.13, all the connectivity of all pods break, including hostNetwork pods, and they never recover. I just tried to understand if any of the route diff would explain this issue.

@pchaigno
Copy link
Copy Markdown
Member

pchaigno commented Apr 6, 2023

If it's IPv6 that breaks, then maybe Jibi has the explanation above. If IPv4 breaks as well, then I don't think the route changes are related. We probably need someone from datapath to dig into this.

Copy link
Copy Markdown
Member

@asauber asauber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving as a "revert revert" which doesn't appear to break anything. Reaching out to the original reviewers might help for additional approvals here.

@jibi
Copy link
Copy Markdown
Member Author

jibi commented Apr 13, 2023

@nickolaev do you think #24756 (comment) is related to the changes introduced in #24288? Should we reintroduce back that PR?

@jibi
Copy link
Copy Markdown
Member Author

jibi commented Apr 17, 2023

I'm closing this while we figure out what to do.

@jibi jibi closed this Apr 17, 2023
@jibi
Copy link
Copy Markdown
Member Author

jibi commented Apr 17, 2023

(cc @NikAleksandrov as I initially pinged the wrong Nikolay)

@jibi jibi deleted the pr/jibi/revert-24676 branch November 7, 2023 16:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

affects/v1.11 This issue affects v1.11 branch affects/v1.12 This issue affects v1.12 branch backport/author The backport will be carried out by the author of the PR. dont-merge/waiting-for-user-feedback Waiting for feedback from user before the PR should be merged. release-note/misc This PR makes changes that have no direct user impact.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants