lbipam: do not reallocate IPs on operator restart#41147

Merged
joamaki merged 1 commit into main from pr/marseel/fix_lbipam_restart on Aug 14, 2025

@marseel (Member) commented Aug 14, 2025

If a pool was full and had unsatisfied Services, an operator restart was
likely to reshuffle the assignment of IPs in that pool. As a result,
previously satisfied Services either became unsatisfied or got a new IP.
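
To make the failure mode concrete, here is a toy model in Go. It is not Cilium's code: `naiveAllocate` is a hypothetical helper that hands out IPs strictly in the order Services are seen, and the assumption that Services are seen in a different order after a restart is ours, made only to illustrate the reshuffle described above.

```go
package main

import "fmt"

// naiveAllocate hands out pool IPs to services strictly in the order in
// which they are seen, ignoring any IP a service held before.
// Hypothetical helper for illustration only, not Cilium code.
func naiveAllocate(pool []string, order []string) map[string]string {
	assigned := make(map[string]string)
	next := 0
	for _, svc := range order {
		if next >= len(pool) {
			break // pool exhausted: remaining services stay unsatisfied
		}
		assigned[svc] = pool[next]
		next++
	}
	return assigned
}

func main() {
	pool := []string{"10.0.0.1", "10.0.0.2"}

	// Before the restart: svc-a and svc-b are satisfied, svc-c is not.
	before := naiveAllocate(pool, []string{"svc-a", "svc-b", "svc-c"})

	// After the restart the services happen to be seen in another order
	// (an assumption of this toy model).
	after := naiveAllocate(pool, []string{"svc-c", "svc-a", "svc-b"})

	fmt.Println("before:", before)
	fmt.Println("after: ", after)
}
```

In this model svc-a ends up with a different IP and svc-b loses its IP entirely, while the previously unsatisfied svc-c takes one over.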

The issue is fixed by not performing any operation on Services until the
full sync has happened. After that, we first try to reuse IPs for already
satisfied Services, and only then try to assign additional IPs to
unsatisfied Services.
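
The ordering described above can be sketched as a two-phase pass in Go. This is a simplified illustration under stated assumptions, not the actual LB IPAM implementation: the `service` type, its `existingIP` field, and `allocateAfterSync` are hypothetical names, and the function is assumed to be called only once the full sync has completed.

```go
package main

import "fmt"

// service is a hypothetical, simplified stand-in for a LoadBalancer Service.
type service struct {
	name       string
	existingIP string // IP held before the restart; "" if unsatisfied
}

// allocateAfterSync is assumed to run only after all services are known.
// Hypothetical helper for illustration only, not Cilium code.
func allocateAfterSync(pool []string, services []service) map[string]string {
	inPool := make(map[string]bool, len(pool))
	for _, ip := range pool {
		inPool[ip] = true
	}
	used := make(map[string]bool)
	assigned := make(map[string]string)

	// Phase 1: reuse the IPs of already satisfied services.
	for _, s := range services {
		if s.existingIP != "" && inPool[s.existingIP] && !used[s.existingIP] {
			assigned[s.name] = s.existingIP
			used[s.existingIP] = true
		}
	}

	// Phase 2: give any remaining free IPs to services still without one.
	for _, s := range services {
		if _, ok := assigned[s.name]; ok {
			continue
		}
		for _, ip := range pool {
			if !used[ip] {
				assigned[s.name] = ip
				used[ip] = true
				break
			}
		}
	}
	return assigned
}

func main() {
	pool := []string{"10.0.0.1", "10.0.0.2"}
	// svc-c is listed first on purpose: order no longer matters.
	services := []service{
		{name: "svc-c"},
		{name: "svc-a", existingIP: "10.0.0.1"},
		{name: "svc-b", existingIP: "10.0.0.2"},
	}
	fmt.Println(allocateAfterSync(pool, services))
}
```

Because the first phase only looks at IPs that Services already hold, the order in which Services are listed cannot displace a satisfied Service; the unsatisfied one gets an IP only if the pool still has a free address.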

Additionally, add a test that covers this case by simulating an operator
restart.

Related: #40358
Depends on: #41122

lbipam: do not reallocate IPs in LB IPAM on operator restart

@marseel marseel added kind/bug This is a bug in the Cilium logic. needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch labels Aug 14, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Aug 14, 2025
@marseel marseel added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label Aug 14, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Aug 14, 2025
@joamaki (Contributor) left a comment

LGTM

@marseel marseel force-pushed the pr/marseel/fix_lbipam_restart branch from 9520435 to d6d6a16 Compare August 14, 2025 09:29
Base automatically changed from pr/marseel/improve_lbipam to main August 14, 2025 09:37
If a pool was full and had unsatisfied Services, an operator restart was
likely to reshuffle the assignment of IPs in that pool. As a result,
previously satisfied Services either became unsatisfied or got a new IP.

The issue is fixed by not performing any operation on Services until the
full sync has happened. After that, we first try to reuse IPs for already
satisfied Services, and only then try to assign additional IPs to
unsatisfied Services.

Additionally, add a test that covers this case by simulating an operator
restart.

Related: #40358

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
@marseel marseel force-pushed the pr/marseel/fix_lbipam_restart branch from d6d6a16 to 31ce600 Compare August 14, 2025 09:41
@marseel (Member, Author) commented Aug 14, 2025

Fixed golang lint + rebased on main

@marseel (Member, Author) commented Aug 14, 2025

/test

@marseel (Member, Author) commented Aug 14, 2025

The Gateway API conformance workflow seems to be failing quite a lot after rebasing; see the related issue: #41150
It seems to have been failing pretty consistently for the last 2 days: https://github.com/cilium/cilium/actions/workflows/conformance-gateway-api.yaml?query=branch%3Amain+event%3Apush

@marseel marseel marked this pull request as ready for review August 14, 2025 11:19
@marseel marseel requested a review from a team as a code owner August 14, 2025 11:19
@marseel marseel requested review from aditighag and removed request for aditighag August 14, 2025 11:20
@joamaki joamaki added this pull request to the merge queue Aug 14, 2025
Merged via the queue into main with commit b904b9f Aug 14, 2025
391 of 401 checks passed
@joamaki joamaki deleted the pr/marseel/fix_lbipam_restart branch August 14, 2025 11:40
@maintainer-s-little-helper maintainer-s-little-helper bot added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Aug 14, 2025
@joamaki joamaki mentioned this pull request Aug 19, 2025
19 tasks
@joamaki joamaki added backport-pending/1.18 The backport for Cilium 1.18.x for this PR is in progress. and removed needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch labels Aug 19, 2025
@github-actions github-actions bot added backport-done/1.18 The backport for Cilium 1.18.x for this PR is done. and removed backport-pending/1.18 The backport for Cilium 1.18.x for this PR is in progress. labels Aug 21, 2025
terassyi pushed a commit to cybozu-go/cilium that referenced this pull request Dec 12, 2025
lbipam: do not reallocate IPs on operator restart

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
terassyi added a commit to cybozu-go/cilium that referenced this pull request Dec 12, 2025
* backport cilium/pull/41122

Signed-off-by: terashima <tomoya-terashima@cybozu.co.jp>

* backport cilium/pull/41147

lbipam: do not reallocate IPs on operator restart

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>

---------

Signed-off-by: terashima <tomoya-terashima@cybozu.co.jp>
Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Co-authored-by: Marcel Zieba <marcel.zieba@isovalent.com>
yokaze pushed a commit to cybozu-go/cilium that referenced this pull request Jan 30, 2026
* backport cilium/pull/41122

Signed-off-by: terashima <tomoya-terashima@cybozu.co.jp>

* backport cilium/pull/41147

lbipam: do not reallocate IPs on operator restart

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>

---------

Signed-off-by: terashima <tomoya-terashima@cybozu.co.jp>
Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Co-authored-by: Marcel Zieba <marcel.zieba@isovalent.com>
@cilium-release-bot cilium-release-bot bot moved this to Released in cilium v1.19.0 Feb 3, 2026
Labels

backport-done/1.18: The backport for Cilium 1.18.x for this PR is done.
kind/bug: This is a bug in the Cilium logic.
ready-to-merge: This PR has passed all tests and received consensus from code owners to merge.
release-note/minor: This PR changes functionality that users may find relevant to operating Cilium.

Projects

No open projects
Status: Released
