Skip to content

cilium-dbg: Add preflight checker for cilium-configmap#43025

Merged
gandro merged 1 commit intocilium:mainfrom
andy176631:preflight-check-cm
Feb 2, 2026
Merged

cilium-dbg: Add preflight checker for cilium-configmap#43025
gandro merged 1 commit intocilium:mainfrom
andy176631:preflight-check-cm

Conversation

@andy176631
Copy link
Copy Markdown
Contributor

@andy176631 andy176631 commented Nov 29, 2025

Description

When upgrading Cilium, the user may introduce unrecognized keys in Cilium config (ConfigMap) due to:

  • Keys deprecated in the new Cilium version
  • Keys renamed in the new Cilium version

Such upgrades are likely to succeed initially, but may later fail in connectivity tests or cause issues in production environments.

This PR adds a validator in cilium-dbg preflight that validates the Cilium config (ConfigMap) in a live cluster environment before an upgrade. This ensures that all configuration keys are recognized by both the new versions of the daemon and the operator, allowing users to review any unrecognized keys and evaluate their potential impact on the system.

The cilium-dbg preflight command runs in a Deployment created in the running cluster.

Fixes: #42781

Add detection of unknown keys in the cilium-config ConfigMap during preflight for the agent and operator.

@andy176631 andy176631 requested a review from a team as a code owner November 29, 2025 18:17
@andy176631 andy176631 requested a review from asauber November 29, 2025 18:17
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 29, 2025
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Nov 29, 2025
@tommyp1ckles tommyp1ckles added the area/helm Impacts helm charts and user deployment experience label Dec 17, 2025
@andy176631
Copy link
Copy Markdown
Contributor Author

It’s probably not the user’s job to do preflight checks. Instead, key consistency between Helm and the daemon/operator should be ensured during development.

Copy link
Copy Markdown
Member

@asauber asauber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is on the way towards what we need here but at least two significant changes are needed:

  1. Rather than check a .yaml file, this needs to grab the Cilium config (ConfigMap) from the current cluster, and check its contents. cilium-dbg is the CLI which is available in all running Cilium pods, such as those used by the preflight.enabled Deploymet.
  2. An invocation of this command needs to be added to the Cilium preflight Deployment Helm template, with the appropriate exit status upon failure.

See https://github.com/cilium/cilium/blob/main/install/kubernetes/cilium/templates/cilium-preflight/deployment.yaml#L52 for the example of cilium-dbg preflight validate-cnp and it's related implementation.

@joestringer joestringer added release-note/minor This PR changes functionality that users may find relevant to operating Cilium. dont-merge/wait-until-release Freeze window for current release is blocking non-bugfix PRs labels Jan 8, 2026
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 8, 2026
@maintainer-s-little-helper
Copy link
Copy Markdown

Commit 48f86b3 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Jan 10, 2026
@maintainer-s-little-helper
Copy link
Copy Markdown

Commits 48f86b3, 954ad2e do not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper
Copy link
Copy Markdown

Commit 48f86b3 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@andy176631 andy176631 marked this pull request as draft January 11, 2026 22:56
@maintainer-s-little-helper
Copy link
Copy Markdown

Commit 48f86b3 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Jan 12, 2026
@aanm aanm removed the dont-merge/wait-until-release Freeze window for current release is blocking non-bugfix PRs label Jan 14, 2026
@andy176631 andy176631 requested a review from asauber January 17, 2026 22:35
@andy176631 andy176631 closed this Jan 17, 2026
@andy176631 andy176631 reopened this Jan 17, 2026
@andy176631
Copy link
Copy Markdown
Contributor Author

andy176631 commented Jan 17, 2026

This is on the way towards what we need here but at least two significant changes are needed:

  1. Rather than check a .yaml file, this needs to grab the Cilium config (ConfigMap) from the current cluster, and check its contents. cilium-dbg is the CLI which is available in all running Cilium pods, such as those used by the preflight.enabled Deploymet.
  2. An invocation of this command needs to be added to the Cilium preflight Deployment Helm template, with the appropriate exit status upon failure.

See https://github.com/cilium/cilium/blob/main/install/kubernetes/cilium/templates/cilium-preflight/deployment.yaml#L52 for the example of cilium-dbg preflight validate-cnp and it's related implementation.

Hi @asauber Thanks for the feedback!
I’ve addressed both points by updating the preflight Deployment to invoke cilium-dbg and validating the deployed Cilium ConfigMap from the cluster. Please take another look.
FYI, this commit (205c589) fixes the validator readiness change introduced by the Helm refactor in PR #16896 (file-diff).

@andy176631 andy176631 marked this pull request as ready for review January 17, 2026 23:05
@andy176631 andy176631 requested a review from a team as a code owner January 17, 2026 23:05
@andy176631 andy176631 requested a review from gandro January 17, 2026 23:05
@asauber
Copy link
Copy Markdown
Member

asauber commented Jan 20, 2026

/test

Copy link
Copy Markdown
Member

@asauber asauber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation LGTM. Let's check if the preflight tests pass.

@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Jan 23, 2026
@andy176631
Copy link
Copy Markdown
Contributor Author

andy176631 commented Jan 23, 2026

Hi @asauber @gandro
The CI issues have been fixed and the checks are now passing on my fork.
Could you please approve the CI workflows, Many thanks!

@gandro
Copy link
Copy Markdown
Member

gandro commented Jan 26, 2026

/test

@gandro
Copy link
Copy Markdown
Member

gandro commented Jan 27, 2026

Build seems broken

445.9 cmd/preflight_k8s_valid_configmap.go:46:17: cannot use operatorCmd.Operator (value of type func() "github.com/cilium/hive/cell".Cell) as "github.com/cilium/hive/cell".Cell value in argument to hive.New: func() "github.com/cilium/hive/cell".Cell does not implement "github.com/cilium/hive/cell".Cell (missing method Apply)

@andy176631
Copy link
Copy Markdown
Contributor Author

Build seems broken

Thanks for pointing this out. I've pushed a fix and the CI is now green.

@andy176631 andy176631 requested a review from gandro January 28, 2026 02:08
@gandro
Copy link
Copy Markdown
Member

gandro commented Jan 28, 2026

Thanks. Please don't forget to squash your commits again

@gandro
Copy link
Copy Markdown
Member

gandro commented Jan 28, 2026

/test

@andy176631 andy176631 force-pushed the preflight-check-cm branch 2 times, most recently from 0e7d0e7 to 718e129 Compare January 29, 2026 06:43
@gandro
Copy link
Copy Markdown
Member

gandro commented Jan 29, 2026

/test

@andy176631
Copy link
Copy Markdown
Contributor Author

andy176631 commented Jan 29, 2026

The CI: Conformance L3-L4 only run failed. It seems unlikely that this failure is related to this PR.

Most PodToCIDR scenarios fail due to client-side timeouts when connecting to the server.

  [.] Action [no-policies/pod-to-cidr:external-172201100-0: cilium-test-1/client3-74887fc475-b64ks (10.244.3.246) -> external-172201100 (172.20.1.100:443)]
  ❌ command "curl --silent --fail --show-error --connect-timeout 2 --max-time 10 -4 -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code}\n --output /dev/null --retry 3 --retry-all-errors --retry-delay 3 https://172.20.1.100:443" failed: command failed (pod=cilium-test-1/client3-74887fc475-b64ks, container=client3): command terminated with exit code 28

This appears to be similar to #43787.

Introduce a new validate-configmap subcommand under cilium-dbg preflight.

Also update the Helm chart to enable the validator and improve error readability.

Signed-off-by: andy176631 <andy17663@gmail.com>
@gandro
Copy link
Copy Markdown
Member

gandro commented Jan 29, 2026

Please note that pushing to the branch invalidates the test result (thus making the PR unmergable). Unless there's something to fix, I recommend not pushing to the branch.

@andy176631
Copy link
Copy Markdown
Contributor Author

Understood. I now have a better understanding of the timing around rebasing and pushing. Thanks for the guidance!

@gandro
Copy link
Copy Markdown
Member

gandro commented Jan 29, 2026

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Feb 2, 2026
@gandro gandro added this pull request to the merge queue Feb 2, 2026
Merged via the queue into cilium:main with commit 72ccfe7 Feb 2, 2026
78 checks passed
@andy176631 andy176631 deleted the preflight-check-cm branch February 2, 2026 12:15
@andy176631
Copy link
Copy Markdown
Contributor Author

Thanks to gandro, asauber, and squeed for the guidance in helping me complete my first PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/helm Impacts helm charts and user deployment experience kind/community-contribution This was a contribution made by a community member. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Specifying unknown arguments in Cilium Agent config-dir

6 participants