Skip to content

Probe for socket diag & termination functionality#42867

Merged
tommyp1ckles merged 3 commits intocilium:mainfrom
tommyp1ckles:pr/tp/better-socket-diag-probing
Jan 21, 2026
Merged

Probe for socket diag & termination functionality#42867
tommyp1ckles merged 3 commits intocilium:mainfrom
tommyp1ckles:pr/tp/better-socket-diag-probing

Conversation

@tommyp1ckles
Copy link
Copy Markdown
Contributor

@tommyp1ckles tommyp1ckles commented Nov 19, 2025

The loadbalancer control plane will attempt to use the functionality in pkg/datapath/sockets/ in order to perform termination of sockets to backends that are stale/no-longer-exist.

This requires certain kernel config options to be enabled for correct functioning. Currently, when we fail to terminate a socket we will report an error indicating that this may be caused by such issues. However, this is insufficient as some of the functionality (i.e. CONFIG_INET_UDP_DIAG & CONFIG_INET_TCP_DIAG) will not actually return an error if the config isn't enabled - instead the netlink call will silently return nothing and the socket and no sockets will be terminated).

These changes add a probe that detects this functionality explicitly, and we use that to either completely avoid the loadbalancer socket termination job if it is not supported.

Testing

Hard to unit test this to any extent as this probes kernel functionality, I wrote a test utility and tested this under various kernels built with the specified config:

package main

import (
        "errors"
        "fmt"
        "log/slog"

        "github.com/cilium/cilium/pkg/datapath/linux/probes"
        "github.com/cilium/cilium/pkg/datapath/sockets"
)

func main() {
        err := sockets.InetDiagDestroyEnabled(slog.Default(), true, false)
        if err == nil {
                fmt.Println("ok!")
                return
        }
        if errors.Is(err, probes.ErrNotSupported) {
                fmt.Println("NOT SUPPORTED:", err)
        } else {
                panic(err)
        }
}
thadlaw@lima-grouse ~/little-vm-helper/_data/kernels/bpf-next $ grep CONFIG_INET_ .config
CONFIG_INET_TUNNEL=y
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
CONFIG_INET_UDP_DIAG=y
CONFIG_INET_RAW_DIAG=y
# CONFIG_INET_DIAG_DESTROY is not set

make -j$(nproc)

# Running inside qemu with kernel built above:

root@kind-bpf-net:/host# ./main
NOT SUPPORTED: failed while iterating sockets: not supported: operation to destroy probe socket is unsupported. This likely means that kernel CONFIG_INET_DIAG_DESTROY must be set in order for this functionality to work
# All enabled - fully supported
thadlaw@lima-grouse ~/little-vm-helper/_data/kernels/bpf-next $ grep CONFIG_INET .config
CONFIG_INET_TABLE_PERTURB_ORDER=16
CONFIG_INET_TUNNEL=y
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
CONFIG_INET_UDP_DIAG=y
CONFIG_INET_RAW_DIAG=y
CONFIG_INET_DIAG_DESTROY=y

# Running inside qemu with kernel built above:

root@kind-bpf-net:/host# ./main
ok!
# Note:  INET_UDP_DIAG is omitted and therefore disabled here.
thadlaw@lima-grouse ~/little-vm-helper/_data/kernels/bpf-next $ grep CONFIG_INET .config | grep -v "^#"
CONFIG_INET=y
CONFIG_INET_TABLE_PERTURB_ORDER=16
CONFIG_INET_TUNNEL=y
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
CONFIG_INET_RAW_DIAG=y
CONFIG_INET_DIAG_DESTROY=y

# Running inside qemu with kernel built above:

root@kind-bpf-net:/host# ./main
NOT SUPPORTED: not supported: no netlink messages testing INET_DIAG listing for udp. This indicates that the kernel does not have the appropriate kernel config set (CONFIG_INET_UDP_DIAG)
Improve probing for necessary kernel functionality required for service backend termination to avoid false positives.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 19, 2025
@tommyp1ckles tommyp1ckles marked this pull request as ready for review November 19, 2025 00:41
@tommyp1ckles tommyp1ckles requested a review from a team as a code owner November 19, 2025 00:41
@tommyp1ckles
Copy link
Copy Markdown
Contributor Author

/test

@tommyp1ckles tommyp1ckles marked this pull request as draft November 19, 2025 00:55
@tommyp1ckles tommyp1ckles force-pushed the pr/tp/better-socket-diag-probing branch from 905f65c to 5bf1620 Compare November 19, 2025 04:42
@tommyp1ckles tommyp1ckles marked this pull request as ready for review November 19, 2025 04:42
@tommyp1ckles
Copy link
Copy Markdown
Contributor Author

/test

@tommyp1ckles tommyp1ckles changed the title Probe for socket termination functionality Probe for socket diag & termination functionality Nov 19, 2025
@tommyp1ckles tommyp1ckles force-pushed the pr/tp/better-socket-diag-probing branch from 5bf1620 to 4129631 Compare November 20, 2025 00:26
Copy link
Copy Markdown
Member

@aditighag aditighag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm missing something, but how is this different from - https://github.com/cilium/cilium/blob/main/pkg/loadbalancer/reconciler/termination.go#L240-L240?

@tommyp1ckles
Copy link
Copy Markdown
Contributor Author

tommyp1ckles commented Nov 20, 2025

@aditighag This will probe for CONFIG_INET_DIAG_DESTROY being set, but not CONFIG_INET_{TCP,UDP}_DIAG but set false with CONFIG_INET_DIAG=y.

Specifically, CONFIG_INET_{TCP,UDP}_DIAG extends the netlink functionality, so if one of them is not enabled it doesn't return ENOTSUPP as the underlying functionality is on, but rather will just send no socket nl messages.

Also, this will now probe prior to starting the Job and just avoid running that if it fails, so that we don't continuously retry this if it's destined to fail.

(we can maybe get rid of the messages you linked to, however)

@tommyp1ckles tommyp1ckles force-pushed the pr/tp/better-socket-diag-probing branch from 4129631 to 59487cd Compare November 20, 2025 21:31
@tommyp1ckles
Copy link
Copy Markdown
Contributor Author

To add to this a bit, this is also intended to provide a single reusable way to do this probing for any other uses/vendor usage of the functionality usage in pkg/datapath/sockets/

@tommyp1ckles tommyp1ckles marked this pull request as draft November 24, 2025 18:19
Copy link
Copy Markdown
Member

@aditighag aditighag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we really need granular configuration checks? Is it not sufficient to document the additional required configurations in the kube-proxy guide?

Edit: Synced up offline. I think it makes sense to probe the configurations, and log warnings if not found. 👍 👍

@tommyp1ckles tommyp1ckles force-pushed the pr/tp/better-socket-diag-probing branch 4 times, most recently from 09df1f5 to 780ba48 Compare December 16, 2025 19:44
@tommyp1ckles
Copy link
Copy Markdown
Contributor Author

/test

@tommyp1ckles tommyp1ckles added release-note/minor This PR changes functionality that users may find relevant to operating Cilium. area/loadbalancing Impacts load-balancing and Kubernetes service implementations labels Dec 16, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 16, 2025
@tommyp1ckles tommyp1ckles marked this pull request as ready for review December 16, 2025 19:45
@tommyp1ckles tommyp1ckles requested a review from a team as a code owner December 16, 2025 19:45
@tommyp1ckles tommyp1ckles requested a review from qmonnet December 16, 2025 19:45
Copy link
Copy Markdown
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack for doc change

Copy link
Copy Markdown
Member

@aditighag aditighag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except for the go linter issues.

@julianwiedmann julianwiedmann added the feature/socket-lb Impacts the Socket-LB part of Cilium's kube-proxy replacement. label Dec 19, 2025
@joestringer joestringer added the dont-merge/wait-until-release Freeze window for current release is blocking non-bugfix PRs label Jan 9, 2026
@aanm aanm removed the dont-merge/wait-until-release Freeze window for current release is blocking non-bugfix PRs label Jan 14, 2026
@tommyp1ckles tommyp1ckles force-pushed the pr/tp/better-socket-diag-probing branch from 780ba48 to 08e9efc Compare January 17, 2026 01:43
@tommyp1ckles
Copy link
Copy Markdown
Contributor Author

/test

@tommyp1ckles tommyp1ckles force-pushed the pr/tp/better-socket-diag-probing branch from 08e9efc to 5ea2839 Compare January 17, 2026 05:56
@tommyp1ckles
Copy link
Copy Markdown
Contributor Author

/test

Functionality provided by the sockets library, namely iterating and
closing sockets, requires certain kernel config options being
enabled.

Because there is no standard way for distributions to store kernel
config values; the most reliable way to probe for these being
enabled is to simply try to use the functionality.

Therefore, this adds a probing function that will test for functions
provided by:

* CONFIG_INET_DIAG_DESTROY
* CONFIG_INET_UDP_DIAG
* CONFIG_INET_TCP_DIAG

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
Previously we would just provide an error message when
the terminate failed. Instead, we can proactively probe
for this behavior using the new probe that checks for
all necessary functionality required for socket termination.

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
@tommyp1ckles tommyp1ckles force-pushed the pr/tp/better-socket-diag-probing branch from 5ea2839 to fdc4b29 Compare January 20, 2026 05:55
@tommyp1ckles
Copy link
Copy Markdown
Contributor Author

/test

@tommyp1ckles tommyp1ckles added this pull request to the merge queue Jan 21, 2026
Merged via the queue into cilium:main with commit 52db796 Jan 21, 2026
83 of 90 checks passed
@tommyp1ckles tommyp1ckles deleted the pr/tp/better-socket-diag-probing branch January 21, 2026 01:25
Comment on lines +133 to +134
lo := net.IP{127, 0, 0, 1}
if err := Iterate(uint8(probe.proto), unix.AF_INET, probe.filterMask, func(s *netlink.Socket, err error) error {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

late drive-by - is there any chance of this falling apart on an IPv6-only kernel?

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jan 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/loadbalancing Impacts load-balancing and Kubernetes service implementations feature/socket-lb Impacts the Socket-LB part of Cilium's kube-proxy replacement. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants