v1.18 Backports 2025-08-25#41365

Merged
pippolo84 merged 22 commits into v1.18 from pr/v1.18-backport-2025-08-25-03-03
Sep 1, 2025
@pippolo84
Member

@pippolo84 pippolo84 commented Aug 25, 2025

@pippolo84 pippolo84 added the kind/backports ("This PR provides functionality previously merged into master.") and backport/1.18 ("This PR represents a backport for Cilium 1.18.x of a PR that was merged to main.") labels Aug 25, 2025
Contributor

@Artyop Artyop left a comment

I don't see the changes of #41299

Edit: after checking it out, the v1.18 branch still has the old value, and no changes are displayed when I check the commit alone.

Contributor

@smagnani96 smagnani96 left a comment

LGTM, thanks Fabio!

Contributor

@antonipp antonipp left a comment

LGTM for bpf/bpf_lxc.c

@pippolo84 pippolo84 marked this pull request as ready for review August 25, 2025 14:38
@pippolo84 pippolo84 requested review from a team as code owners August 25, 2025 14:38
@pippolo84 pippolo84 force-pushed the pr/v1.18-backport-2025-08-25-03-03 branch from 2c3fc2e to 67c0e9e Compare August 25, 2025 15:14
@pippolo84
Member Author

/test

> I don't see the changes of #41299
>
> Edit: after checking it out, the v1.18 branch still has the old value, and no changes are displayed when I check the commit alone.

Now it should be fixed, thanks for the heads up!

@pippolo84 pippolo84 requested a review from Artyop August 25, 2025 15:15
Contributor

@Artyop Artyop left a comment

lgtm, thank you 🙏

@pippolo84 pippolo84 force-pushed the pr/v1.18-backport-2025-08-25-03-03 branch 2 times, most recently from b677930 to 5d7dd54 Compare August 26, 2025 09:05
@pippolo84
Member Author

@Artyop I've updated the CILIUM_NODEINIT_DIGEST value in install/kubernetes/Makefile.values to reflect the one in quay.io (the same we have in the main branch following the renovatebot commit). This should fix the ImagePullBackOff error for cilium-node-init image in the Smoke Test.

@pippolo84
Member Author

/test

alimehrabikoshki and others added 18 commits August 26, 2025 11:22
[ upstream commit e69bd64 ]

When shrinking a CiliumPodIPPool the operator could crash with a nil pointer
dereference in updateCIDRSets.  The loop deletes entries from the slice it is
iterating over, leaving behind nil slots that are dereferenced in the next
iteration. This change skips over nil items in the slice.
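
The failure mode described above can be sketched in a few lines of Go. This is a minimal illustration, not the operator's actual `updateCIDRSets` code: the `cidrSet` type, the `pruneCIDRSets` function, and the `keep` map are hypothetical names chosen for the example. It assumes that released entries are set to nil in place, so a later pass must check each slot before dereferencing it.

```go
package main

import "fmt"

// cidrSet is a hypothetical stand-in for the per-CIDR state of a pool.
type cidrSet struct {
	cidr string
}

// pruneCIDRSets clears the slots whose CIDR is no longer wanted and returns
// the CIDRs that remain. Slots cleared by an earlier pass are nil, so each
// entry is checked before it is dereferenced.
func pruneCIDRSets(sets []*cidrSet, keep map[string]bool) []string {
	var kept []string
	for i, s := range sets {
		if s == nil {
			continue // already released; reading s.cidr here would panic
		}
		if !keep[s.cidr] {
			sets[i] = nil // release the slot in place
			continue
		}
		kept = append(kept, s.cidr)
	}
	return kept
}

func main() {
	sets := []*cidrSet{{cidr: "10.0.0.0/24"}, {cidr: "10.0.1.0/24"}}
	keep := map[string]bool{"10.0.0.0/24": true}

	// The first pass clears the second slot.
	fmt.Println(pruneCIDRSets(sets, keep))
	// The second pass meets the nil slot and skips it instead of crashing.
	fmt.Println(pruneCIDRSets(sets, keep))
}
```

Without the nil check, the second call would dereference the cleared slot and panic, which matches the crash the commit message describes.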

Signed-off-by: alimehrabikoshki <79400736+alimehrabikoshki@users.noreply.github.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 2b39a67 ]

The correct command is `cilium clustermesh
inspect-policy-default-local-cluster --all-namespaces`.

It appears that the command name was changed, but this reference to it
was never updated.

Fixes: 260af0e ("doc: add early warning for
policy-default-local-cluster")

Signed-off-by: Florian Ströger <florian@florianstroeger.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 0b63209 ]

This commit fixes some error-handling cases in the neighbor calculator
where the error value is not checked before being joined with other errors.

This leads to the incorrect health status being reported.

e.g.

```
desired neighbor calculator errored: failed to insert desired neighbor: %!w(<nil>)\nfailed to insert desired neighbor: %!w(<nil>)
```

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit f180e6e ]

This PR removes the global `nCPU` variable from the `pkg/maps/policymap`
package and replaces usage of
[`ebpf.MustPossibleCPU`](https://pkg.go.dev/github.com/cilium/ebpf#MustPossibleCPU)
with [`ebpf.PossibleCPU`](https://pkg.go.dev/github.com/cilium/ebpf#PossibleCPU). If the call fails, the error is logged and a single CPU is assumed to be available.

Note: the underlying implementation of `ebpf.PossibleCPU` uses [`sync.OnceValues`](https://github.com/cilium/ebpf/blob/ae226118949d4e3de64520195b66a09591116ea0/cpu_other.go#L11-L13), so there's no overhead to calling it multiple times.
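
The fallback behavior can be illustrated with a self-contained sketch. To keep it runnable without the `cilium/ebpf` module, the CPU query is passed in as a function with the same `(int, error)` shape; `cpuLookup`, `possibleCPUsOrOne`, `fixedLookup`, `failingLookup`, and `quietLogger` are hypothetical names, and the log message is an assumption rather than the one used in the PR.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"log/slog"
)

// cpuLookup has the (int, error) shape of a CPU-count query. It is a
// hypothetical stand-in so the example does not depend on cilium/ebpf.
type cpuLookup func() (int, error)

// possibleCPUsOrOne returns the looked-up CPU count, or logs the error and
// assumes a single CPU instead of panicking.
func possibleCPUsOrOne(lookup cpuLookup, logger *slog.Logger) int {
	n, err := lookup()
	if err != nil {
		logger.Warn("unable to get the number of possible CPUs, assuming 1", "error", err)
		return 1
	}
	return n
}

// fixedLookup returns a lookup that always succeeds with n.
func fixedLookup(n int) cpuLookup {
	return func() (int, error) { return n, nil }
}

// failingLookup simulates a platform where the query fails.
func failingLookup() (int, error) {
	return 0, errors.New("cannot read possible CPUs")
}

// quietLogger discards output; swap in slog.Default() to see the warning.
func quietLogger() *slog.Logger {
	return slog.New(slog.NewTextHandler(io.Discard, nil))
}

func main() {
	fmt.Println(possibleCPUsOrOne(fixedLookup(8), quietLogger()))
	fmt.Println(possibleCPUsOrOne(failingLookup, quietLogger()))
}
```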

Signed-off-by: Hadrien Patte <hadrien.patte@datadoghq.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit a5242c5 ]

Signed-off-by: Ashwin Pillai <pillaiashwin96@gmail.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 4a33221 ]

Bump the checkpatch version, and explicitly pass the GH token with read
permissions to retrieve the list of commits for the target PR, instead
of relying on the fact that the target repository is public.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 9acb306 ]

[ backporter's notes: Updated cilium-node-init sha256 digest according
to the version uploaded to quay.io. ]

This first bump of startup-script to the new tagging scheme will allow Renovate to handle future updates.

Signed-off-by: Antony Reynaud <antony.reynaud@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit b5ed1e0 ]

Some environment variables were missing in our CI for some images. We
should enable them not only for the agent.

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit b25102d ]

Sanitizing a network policy modifies the policy object, so we need to
make sure we DeepCopy the object before sanitizing it.
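
A minimal sketch of the copy-before-sanitize pattern follows. The `policy` type and its `DeepCopy` and `Sanitize` methods are hypothetical simplifications for illustration; the real Cilium policy types and their sanitization logic are more involved.

```go
package main

import (
	"errors"
	"fmt"
)

// policy is a hypothetical, simplified policy object.
type policy struct {
	Name   string
	Labels []string
}

// DeepCopy returns a copy that shares no memory with the receiver.
func (p *policy) DeepCopy() *policy {
	return &policy{
		Name:   p.Name,
		Labels: append([]string(nil), p.Labels...),
	}
}

// Sanitize validates the policy and, as a side effect, mutates it.
func (p *policy) Sanitize() error {
	if p.Name == "" {
		return errors.New("policy has no name")
	}
	p.Labels = append(p.Labels, "sanitized")
	return nil
}

// validate sanitizes a private copy, so a shared object (for example one
// held in an informer cache) is left untouched.
func validate(p *policy) error {
	return p.DeepCopy().Sanitize()
}

func main() {
	shared := &policy{Name: "allow-dns", Labels: []string{"team=a"}}
	fmt.Println(validate(shared), shared.Labels)
}
```

Calling `shared.Sanitize()` directly would append to `shared.Labels`, which is the kind of in-place modification the commit guards against.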

Fixes: 38f30ae ("policy: parse policies in the operator, update informational conditions")
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit af352a2 ]

This key is already defined in the logger, so we don't need to set it
again when creating a sub-logger. Since slog stores keys in a slice,
not in a map as logrus does, the key is appended to the existing keys,
which results in duplicated keys.
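
The duplication is easy to reproduce with the standard `log/slog` package. In this sketch the key name `subsys`, its value, and the `logLine` and `countKey` helpers are assumptions made for the example; only the `slog` calls themselves are real API.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// logLine builds a base logger that already carries "subsys", derives a
// sub-logger from it, logs one record, and returns the rendered text.
func logLine(repeatKey bool) string {
	var buf bytes.Buffer
	base := slog.New(slog.NewTextHandler(&buf, nil)).With("subsys", "operator")

	sub := base
	if repeatKey {
		// Setting the key again appends a second attribute; the built-in
		// text handler does not deduplicate attributes.
		sub = base.With("subsys", "operator")
	}
	sub.Info("reconciling")
	return buf.String()
}

// countKey counts how many times key appears as an attribute in line.
func countKey(line, key string) int {
	return strings.Count(line, key+"=")
}

func main() {
	fmt.Println(countKey(logLine(false), "subsys"))
	fmt.Println(countKey(logLine(true), "subsys"))
}
```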

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 06da5d7 ]

Instead of using the duplicated log key "resource" we should be more
specific and use the "parentResource" log key instead.

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 45f3085 ]

Previously, the checkpatch script used to internally skip the checks
when not targeting the main branch. The blamed commit bumped it to
a more zealous version that runs against all target branches, and
explicitly fails in case any internal command fails.

However, this check is now failing both on push and merge queue
events, as the GITHUB_REF does not point to a PR. Let's prevent
this by making it run on PR events only, to restore the previous
behavior in this case. There's not much point in running checkpatch
on already merged commits anyways; similarly, no reason for running
it as part of the merge queue, given that the check is not required,
and the result does not depend on whether the branch is rebased.

Fixes: 4a33221 ("checkpatch: bump checkpatch version, and minor adaptations")
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit bf1d7e5 ]

Make it more clear that the Cilium agent never pulls in this code, but
that this is really only used from unit tests.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 26b5bbc ]

Dmitriy says:

  When the --bpf-lb-algorithm-annotation feature is enabled, the BPF
  code may fail at runtime to select any load balancer algorithm for
  previously existing services. This broke all new network connections
  to a service whose BPF service map entry had an unknown algorithm
  value. The bug is reliably reproducible for the NodePort, HostPort,
  and LocalRedirect service types.

  This commit solves the problem as follows: in the situation where
  the BPF service map contains an unknown LB algorithm, it simply uses
  the default LB algorithm from the --bpf-lb-algorithm option.

lb{4,6}_algorithm() should return a proper algorithm that is going to
be used in the datapath. Either the service has an annotation, or, if
not, the user-configured default should be picked.

One corner case is when we come from a future Cilium version where
users had an algorithm annotation on the service, which the current
Cilium version does not support. In that case we can only treat this
as a hint and need to fall back to the default.

Note that in the old code before the LB control plane rework, the
GetAnnotationServiceLoadBalancingAlgorithm() was pushing through
any annotation which was not loadbalancer.SVCLoadBalancingAlgorithmUndef
and otherwise the function was returning the default selected algorithm.
After the rework we just propagate loadbalancer.ToSVCLoadBalancingAlgorithm()
directly.

This meant that handling of loadbalancer.SVCLoadBalancingAlgorithmUndef
was pushed to runtime, therefore for services with no explicit annotation
this triggered the default case which led to drops.

Another side-note: loadbalancer.ToSVCLoadBalancingAlgorithm() does not
translate LB_SELECTION_FIRST. The latter is only ever used in BPF unit
tests.
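
The selection rule can be summarized with a short sketch. The actual fix lives in the BPF C datapath (`lb{4,6}_algorithm()`); the Go below only illustrates the decision logic, and the `lbAlgorithm` type, its constants, and `effectiveAlgorithm` are hypothetical names that do not mirror the real identifiers or values.

```go
package main

import "fmt"

// lbAlgorithm is a hypothetical enum; the values do not mirror the real
// datapath constants.
type lbAlgorithm uint8

const (
	algUndef lbAlgorithm = iota
	algRandom
	algMaglev
)

// effectiveAlgorithm returns the per-service algorithm when it is one this
// version understands, and the configured default otherwise. "Otherwise"
// covers both a missing annotation (undefined value) and a value written
// by a newer version that this one does not support.
func effectiveAlgorithm(fromService, configuredDefault lbAlgorithm) lbAlgorithm {
	switch fromService {
	case algRandom, algMaglev:
		return fromService
	default:
		return configuredDefault
	}
}

func main() {
	fmt.Println(effectiveAlgorithm(algUndef, algMaglev) == algMaglev)
	fmt.Println(effectiveAlgorithm(lbAlgorithm(42), algRandom) == algRandom)
	fmt.Println(effectiveAlgorithm(algRandom, algMaglev) == algRandom)
}
```

The key design point is that the default branch never drops traffic: an unrecognized value is treated as a hint and replaced by the configured default.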

Co-developed-by: Dmitriy Andreychenko <dmitriy.andreychenko@flant.com>
Signed-off-by: Dmitriy Andreychenko <dmitriy.andreychenko@flant.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit ec378db ]

Some distributions (e.g. AWS EKS clusters without AWS VPC CNI plugin) do
not install the `portmap` binary on the nodes, leading to confusion when
trying to use the portmap plugin. This commit documents the requirement
and hints at a solution for providing binaries if needed.

Co-authored-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit ff9a230 ]

Internal ARP-handling functions such as those in lib/arp.h expect
linearized ARP packets. However, no code exists to linearize this
packet type. This means user-facing features such as L2 Announcements
break easily if the kernel decides to split an ARP request into chunks.

Implement revalidate_data_arp_pull() and call it the same way it is done
for IPv4/IPv6 packets to prevent this from happening and keep things
consistent across protocols.

Fixes: #40419

Signed-off-by: Valentine Sinitsyn <valentine.sinitsyn@gmail.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 4415e13 ]

This change adds error-handling logic that fixes routing for local addresses which have been passed in --exclude-local-address.
Previously, the routing code would always attempt a FIB lookup for packets destined to these addresses which would always fail with BPF_FIB_LKUP_RET_NOT_FWDED because the addresses are local.

The routing code will now remediate this by passing the packet to the kernel's routing stack when encountering this scenario.

The additional revalidate_data check on the ipv4 pointer is needed because the pointer can be invalidated in fib_redirect_v4, so it should be checked before being passed further.

Signed-off-by: Anton Ippolitov <anton.ippolitov@datadoghq.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
[ upstream commit 56a0504 ]

[ backporter's notes: Fixes minor conflicts due to different signature
for setupIPSecSuitePrivileged in stable branch. ]

As in #41006, this commit adds the TestPrivileged prefix to some xfrm
tests that we missed in the latest PR. With this, all the
non-parallel tests should be executed properly in CI.

Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
@pippolo84 pippolo84 force-pushed the pr/v1.18-backport-2025-08-25-03-03 branch from 5d7dd54 to 1888655 Compare August 26, 2025 09:22
@pippolo84
Member Author

/test

@viktor-kurchenko
Contributor

@borkmann @antonipp @pillai-ashwin kind ping)

@antonipp
Contributor

antonipp commented Sep 1, 2025

Hi, I already approved my change: #41365 (review)
LMK if there's anything else I need to do

@pippolo84 pippolo84 added this pull request to the merge queue Sep 1, 2025
Merged via the queue into v1.18 with commit fbc250b Sep 1, 2025
323 of 326 checks passed
@pippolo84 pippolo84 deleted the pr/v1.18-backport-2025-08-25-03-03 branch September 1, 2025 17:23