Fix backend slot index mismatch in LB reconciler#43902
Fix backend slot index mismatch in LB reconciler#43902joestringer merged 1 commit intocilium:mainfrom
Conversation
|
Commit 7239fcd does not match "(?m)^Signed-off-by:". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin |
7239fcd to
282ddbe
Compare
282ddbe to
01bceb5
Compare
|
code makes sense to me. but can i ask you to adjust the commit to be something like edit: somehow i missed the commit comment. you already have it there. but can you adjust the description, still? |
01bceb5 to
9d7ee34
Compare
|
I’ve updated the commit title to |
|
/test |
|
im looking into these test failures |
|
@Aman-Cool , looks like the test failures are related to #43935 |
9d7ee34 to
282ddbe
Compare
|
@bersoare , Rebased on |
282ddbe to
f3bdce7
Compare
|
https://github.com/cilium/cilium/actions/runs/21313926203/job/61449143557?pr=43902 < this one failed on connectivity tests. need to take a look at the sysdump and see what's going on. |
|
Thanks for flagging this @bersoare, I’ll take a look on my side as well and see if anything stands out. |
|
if you're interested, i encourage you to download the sysdump and take a look at what's there and see if you can spot any signs that the code isnt behaving as you expect. it's a great opportunity to get familiar with how cilium works, in case that is something you'd like to do. i'll take a look when i get a chance and see if i can find something |
|
Thanks for the suggestion! I’ll take a look at the sysdump and see if anything stands out related to this change. Appreciate you taking a look as well. |
|
im seeing what it looks like envoy errors: not sure if this is the cause yet |
|
/test |
dylandreimerink
left a comment
There was a problem hiding this comment.
Changes look good to me. But CI isn't in a good state right now. I recommend rebasing on main to pull in the latest changes that might effect it. Then we can dig into the new results to see if its this PR or something else.
Use slotID instead of loop index when setting backend slots to avoid gaps when maintenance backends are skipped. Signed-off-by: Aman-Cool <aman017102007@gmail.com>
f3bdce7 to
85f1831
Compare
|
/test |
|
We didn't add a test for this bug, but the code in |
Summary
Fix a bug in the load balancer BPF reconciler where backend slots were incorrectly assigned when maintenance backends exist. The code used the loop index (
i + 1) instead of the sequential slot counter (slotID) when setting backend slots, causing gaps in slot assignments when maintenance backends are skipped.Bug: When backends in maintenance state are skipped during reconciliation, subsequent backends are assigned to wrong slots (e.g., slots 1, 3 instead of 1, 2), causing traffic misrouting or drops.
Fix: Use
slotID(which correctly tracks sequential slot numbers) instead ofi + 1(loop index).Release Note
Test Plan
slotID)