
v1.19 Backports 2026-01-19#43866

Merged
giorio94 merged 10 commits into v1.19 from
pr/v1.19-backport-2026-01-19-10-01
Jan 19, 2026

Conversation

@giorio94
Member

@giorio94 giorio94 commented Jan 19, 2026

Once this PR is merged, a GitHub action will update the labels of these PRs:

 43790 43776 43731 43782 43691 43069 43775 43748

mhofstetter and others added 9 commits January 19, 2026 10:01
[ upstream commit d3ad556 ]

According to the documentation of the flag `endpoint-regen-interval`, the
periodic endpoint regeneration should only be configured if the interval
is bigger than `0`.

```
Periodically recalculate and re-apply endpoint configuration. Set to 0 to disable
```

But the current implementation has two issues:

* If the interval is configured as `0`, the regeneration is executed once at startup.
* If the endpoint garbage collection is disabled, the periodic regeneration isn't
  configured at all.

This commit fixes these two issues by splitting the controller registrations
and only applying them if their interval is bigger than `0`.

Note: Always executing the stop hook that removes the controllers isn't a problem,
      even if no controller was registered.

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
[ upstream commit 13bed6d ]

Currently, the periodic controllers for endpoint garbage collection and endpoint
regeneration are registered at startup, without waiting for the completion
of the endpoint restoration (incl. regeneration).

This might lead to problems where the premature endpoint regeneration interferes
with the endpoint restoration (incl. processing the deletion queue).

This commit fixes this by moving the controller registration into a hive job
that waits for the completion of the endpoint restoration.

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
[ upstream commit 6739285 ]

Signed-off-by: Robin Gögge <r.goegge@isovalent.com>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
[ upstream commit 561161f ]

[ backporter's notes: skipped the release.yaml hunk, as the target file
  doesn't exist in v1.19. ]

As Helm uses a different file to log in to the registries, we also need
to use `cosign login` so that the OCI Helm charts can be properly signed by
cosign.

Fixes: 7494f7c ("workflows/release: release helm charts as OCI packages")
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
[ upstream commit bf0dca5 ]

Previously, the ESP rule was too broad, and automation was deleting the
firewall rule. Switch to only allowing ESP traffic between nodes.

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
[ upstream commit 640e955 ]

Julian spotted that setting ctx_skip_nodeport_set(ctx) is not having
the desired effect for the case when there is XDP used on the node.

The ctx_skip_nodeport_set() marker is not transferred to the skb, and as
a result the service lookup would happen twice instead of just in the XDP
layer, resulting in a higher per-packet cost. The latter lookup in the tcx
layer for such packets is unnecessary.

Therefore, use the correct ctx_set_xfer(ctx, XFER_PKT_NO_SVC).

Reported-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
[ upstream commit 7f95906 ]

Add a test where the backend is local with the service L7 proxy delegate.
We expect the service to be passed up the stack unmodified. XFER_PKT_NO_SVC
is set to skip tcx service handling a second time.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
[ upstream commit 4fe778b ]

Add a test where the backend is remote with the service L7 proxy delegate.
We expect the service to be NATed and sent out the node. XFER_PKT_NO_SVC
is /not/ set in this case given the backend is not part of the local
endpoint map.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
[ upstream commit a1afcb4 ]

Reason is a field defined with a fixed set of possible values, so it
makes a good candidate to be added to those metrics.

Signed-off-by: Arthur Outhenin-Chalandre <git@mrfreezeex.fr>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
@giorio94 giorio94 added kind/backports This PR provides functionality previously merged into master. backport/1.19 This PR represents a backport for Cilium 1.19.x of a PR that was merged to main. labels Jan 19, 2026
[ upstream commit ae1e694 ]

[ backporter's notes: hit a minor conflict in `pkg/endpoint/restore_test.go`
  due to changes in the surrounding context, resolved by accepting the combination. ]

In the past, the metric `agent_bootstrap_seconds` with label
`scope=restore` was used to publish some endpoint restoration
related metrics.

With the modularization of the legacy daemon init logic, some of the
endpoint restoration logic no longer reported its duration to that metric.

In addition to this, the label `scope=restore` was pretty generic, because
other restoration logic was using the same label (e.g. identity restoration).
Moreover, the metric didn't take into account that the actual regeneration of
the restored endpoints was executed asynchronously.

To shed some more light on the endpoint restoration process, this
commit introduces two endpoint-restoration-specific metrics.

* `endpoint_restoration_endpoints` - with label `phase` & `outcome`
* `endpoint_restoration_duration_seconds` - with label `phase`

The following phases of the endpoint restoration process report these metrics:

* `read_from_disk`: Reads old endpoints from the state dir
* `restoration`: Restores old endpoints, including validation & IP re-allocation
* `prepare_regeneration`: Triggers the asynchronous regeneration
* `initial_policy_computation`: Duration until the initial policy for all restored endpoints is computed
* `regeneration`: Duration until all restored endpoints are regenerated

e.g.

```
root@kind-control-plane:/home/cilium# cilium-dbg shell metrics endpoint_restoration
Metric                                         Labels                                                Value
cilium_endpoint_restoration_duration_seconds   phase=initial_policy_computation                      0.006262
cilium_endpoint_restoration_duration_seconds   phase=prepare_regeneration                            2.506187
cilium_endpoint_restoration_duration_seconds   phase=read_from_disk                                  0.002105
cilium_endpoint_restoration_duration_seconds   phase=regeneration                                    2.270423
cilium_endpoint_restoration_duration_seconds   phase=restoration                                     0.101983
cilium_endpoint_restoration_endpoints          outcome=failed phase=read_from_disk                   0.000000
cilium_endpoint_restoration_endpoints          outcome=failed phase=regeneration                     0.000000
cilium_endpoint_restoration_endpoints          outcome=failed phase=restoration                      1.000000
cilium_endpoint_restoration_endpoints          outcome=skipped phase=restoration                     1.000000
cilium_endpoint_restoration_endpoints          outcome=successful phase=initial_policy_computation   4.000000
cilium_endpoint_restoration_endpoints          outcome=successful phase=prepare_regeneration         4.000000
cilium_endpoint_restoration_endpoints          outcome=successful phase=read_from_disk               6.000000
cilium_endpoint_restoration_endpoints          outcome=successful phase=regeneration                 4.000000
cilium_endpoint_restoration_endpoints          outcome=successful phase=restoration                  4.000000
cilium_endpoint_restoration_endpoints          outcome=total phase=initial_policy_computation        4.000000
cilium_endpoint_restoration_endpoints          outcome=total phase=prepare_regeneration              4.000000
cilium_endpoint_restoration_endpoints          outcome=total phase=read_from_disk                    6.000000
cilium_endpoint_restoration_endpoints          outcome=total phase=regeneration                      4.000000
cilium_endpoint_restoration_endpoints          outcome=total phase=restoration                       6.000000
```

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
@giorio94 giorio94 force-pushed the pr/v1.19-backport-2026-01-19-10-01 branch from 990a388 to a6b3c65 Compare January 19, 2026 09:14
Member

@aanm aanm left a comment

That was exactly the intention, thank you!

@giorio94
Member Author

/test

@giorio94 giorio94 marked this pull request as ready for review January 19, 2026 09:39
@giorio94 giorio94 requested review from a team as code owners January 19, 2026 09:39
@giorio94 giorio94 requested a review from nbusseneau January 19, 2026 09:39
Member

@mhofstetter mhofstetter left a comment

Thanks!

@giorio94 giorio94 added this pull request to the merge queue Jan 19, 2026
Merged via the queue into v1.19 with commit c129565 Jan 19, 2026
402 of 406 checks passed
@giorio94 giorio94 deleted the pr/v1.19-backport-2026-01-19-10-01 branch January 19, 2026 13:42
@cilium-release-bot cilium-release-bot bot moved this to Released in cilium v1.19.0 Feb 3, 2026

Labels

backport/1.19 This PR represents a backport for Cilium 1.19.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master.

Projects

No open projects
Status: Released

Development

Successfully merging this pull request may close these issues.

8 participants