Introduce support for automatic datapath mode selection (bpf.datapathMode=auto) #43062
Merged
julianwiedmann merged 16 commits into cilium:main on Jan 22, 2026
This commit implements a refactor to the datapath ConnectorConfig structure so it can be embedded into the Orchestrator like other datapath configs (Wireguard, IPsec, Tunnel, etc.)
Previously, the ConnectorConfig would have a dependency on the Orchestrator in that the Hive startup hook entry would wait for the datapath initialized signal exported from the Orchestrator. This was done to provide a guarantee that we only probe for buffer margins when the Loader has completed at least one initialization pass.
In preparation for the ConnectorConfig to express an operational datapath mode down into the Loader, we need to reverse this dependency so that the Loader can access it when setting up the data path.
Ultimately, the Orchestrator now just calls the ConfigConnector Reinitialize() routine every time like other functions.
Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
Remove the datapath mode switch in ciliumHealthManager cleanupEndpoint() because they all do the same thing anyway. Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
This commit relocates the datapathMode validation logic from the daemon component into the datapath/connector package. This seems appropriate given we are separating 'configured' mode from 'operational' mode to facilitate auto-discovery.
This commit also includes a minor tweak to how the daemon pulls in the datapath ConnectorConfig interface to simplify imports.
Finally, this commit includes updates to the basic connector config tests so that they function better on test hosts that do not provide netkit.
Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
Minor refactor to internal HeaderfileWriter.writeTemplateConfig() function to just accept the local node configuration and endpoint configuration as input parameters, rather than specific values from these structures. This simplifies the function signature and allows us to replace datapath mode checking for netkit with local node configuration in the next commit. No functional change in this commit. Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
This commit introduces DatapathIsLayer2 into the local node configuration structure. This carries true if the operational datapath mode requires that workload-facing network interfaces process ARP, and false otherwise. Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
This commit migrates the /config API from DaemonConfig.DatapathMode to
the ConnectorConfig, which means this API now expresses configured mode
separately to operational mode.
The existing JSON properties ("datapathMode", "datapath-mode") now
carry the operational datapath mode. This has been done for backwards
compatibility - e.g. a newer client can interface with an older API.
The new JSON properties ("configuredDatapathMode", "configured-datapath-mode")
carry the configured datapath mode. At the time of writing, these will
carry identical values. However, in a future commit, an "auto" mode will
be added as a valid option for the configured datapath mode only.
Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
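A minimal sketch of how a client might read both properties from the status portion of the /config response. The `configStatus` struct and `decodeModes` helper are hypothetical names used for illustration only, and the fallback to the operational mode when the new property is absent is an assumption about how a newer client could handle an older agent, not behaviour taken from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// configStatus is a hypothetical, trimmed-down view of the status object.
// Only the two JSON property names come from the commit message.
type configStatus struct {
	DatapathMode           string `json:"datapathMode"`
	ConfiguredDatapathMode string `json:"configuredDatapathMode,omitempty"`
}

// decodeModes returns the operational and configured modes. If the
// configured property is missing (assumed: an older agent), it falls
// back to the operational mode.
func decodeModes(raw []byte) (operational, configured string, err error) {
	var s configStatus
	if err = json.Unmarshal(raw, &s); err != nil {
		return "", "", err
	}
	configured = s.ConfiguredDatapathMode
	if configured == "" {
		configured = s.DatapathMode
	}
	return s.DatapathMode, configured, nil
}

func main() {
	raw := []byte(`{"datapathMode":"netkit","configuredDatapathMode":"auto"}`)
	op, cfg, err := decodeModes(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("operational=%s configured=%s\n", op, cfg)
}
```

Because the existing property keeps carrying the operational mode, a client that only knows "datapathMode" continues to see the mode that is actually in use.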
Expands the Cilium status output to correctly differentiate between configured and operational datapath modes. This will alter the output when a future commit introduces support for datapathMode=auto so the modes are visible to administrators. Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
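As a rough illustration, the Device Mode line could be rendered as below. The `formatDeviceMode` function is a hypothetical helper, not the code from this commit; the bracketed format is copied from the sample `cilium-dbg status` output in this PR, and the rule of showing the split only when the modes differ follows the PR description.

```go
package main

import "fmt"

// formatDeviceMode is a hypothetical helper. It prints only the
// operational mode when both modes agree, and appends the configured
// mode in brackets when they differ.
func formatDeviceMode(operational, configured string) string {
	if operational == configured {
		return operational
	}
	return fmt.Sprintf("%s [Configured: %s]", operational, configured)
}

func main() {
	fmt.Println("Device Mode:", formatDeviceMode("veth", "veth"))
	fmt.Println("Device Mode:", formatDeviceMode("netkit", "auto"))
}
```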
Introduces the start of a streamlined datapath link creation mechanism for use when creating new endpoints within Cilium.
The mechanism is simply a NewLinkPair() method of ConnectorConfig, which accepts a LinkConfig and associated sysctls. The aspiration is that other packages can remain ignorant of the operational datapath mode, and over time, support mixed-mode datapaths for migration purposes (e.g. veth to netkit).
The LinkConfig structure is repositioned so it can be imported by other components via the standard datapath types package. It is also extended with other fields, such as EndpointID, HostIfName, PeerIfName, and PeerNamespace. These values alter the behaviour of the underlying implementation.
- If EndpointID is not specified, a HostIfName and PeerIfName must be.
- If EndpointID is specified, Cilium will auto-generate HostIfName and PeerIfName.
- The peer link will automatically be switched into the NetNS provided by PeerNamespace.
This commit also adds a new LinkPair type, which is returned by the NewLinkPair() method, and provides some ancillary helpers.
Finally, this also provides a DeleteLinkPair() function which operates on the same LinkConfig structure. This is provided for consistency with the creation routine, to abstract translation of EndpointID.
Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
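The naming rules above can be sketched roughly as follows. This is not the real LinkConfig: `linkConfig` is a hypothetical subset with only the three name-related fields mentioned in the commit message, `resolveIfNames` is an invented helper, and the "host-"/"peer-" prefixes are placeholders standing in for whatever name generation Cilium actually performs.

```go
package main

import (
	"errors"
	"fmt"
)

// linkConfig is a hypothetical subset of the fields named in the
// commit message; the real structure carries more.
type linkConfig struct {
	EndpointID string
	HostIfName string
	PeerIfName string
}

// resolveIfNames applies the rules from the commit message:
// with an EndpointID the names are generated, without one both
// names must be supplied by the caller.
func resolveIfNames(cfg linkConfig) (host, peer string, err error) {
	if cfg.EndpointID != "" {
		// Placeholder naming scheme, for illustration only.
		return "host-" + cfg.EndpointID, "peer-" + cfg.EndpointID, nil
	}
	if cfg.HostIfName == "" || cfg.PeerIfName == "" {
		return "", "", errors.New("HostIfName and PeerIfName are required when EndpointID is empty")
	}
	return cfg.HostIfName, cfg.PeerIfName, nil
}

func main() {
	host, peer, err := resolveIfNames(linkConfig{EndpointID: "1234"})
	fmt.Println(host, peer, err)

	_, _, err = resolveIfNames(linkConfig{HostIfName: "only-host"})
	fmt.Println(err)
}
```

Keeping this translation inside the connector is what lets callers such as the CNI plugin pass an EndpointID without knowing how interface names are derived.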
This commit migrates the Cilium Health Manager to utilise the datapath connector.NewLinkPair() method, rather than calling specific driver functions directly. Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
This commit migrates the Cilium CNI plugin to use the new datapath connector NewLinkPair() function, rather than calling specific driver functions directly. Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
…ector. This commit migrates the Cilium docker plugin to use the new datapath connector NewLinkPair() function, rather than calling specific driver functions directly. Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
…fig. This commit introduces new DatapathIsNetkit field into the local node config structure, which is set by the ConfigConnector when instantiating a new instance of the structure. This commit also migrates the loader logic to carry the value of this field into the loader bpf config structures, rather than using the daemon config. Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
This commit migrates the endpoint restore logic to use the new datapath connector logic to validate that detected endpoints are compatible with the current operational datapath mode of the agent.
Previous logic assumed that if the underlying link of an endpoint was netkit, we are compatible. However, it's not possible to derive the mode of the driver from this. This commit introduces logic in the connector that provides the necessary compatibility checks while also probing the netkit structure returned by the kernel to verify the mode.
This commit also tweaks the error log raised if incompatible endpoints are detected, to include a list of "detected" incompatible modes. While it's probably safe to assume the agent will never detect more than 1 type of incompatible link, this was modified so failures in this assumption (e.g. bugs) won't produce incorrect logs.
Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
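A sketch of how the list of "detected" incompatible modes might be collected for the error log. The `incompatibleModes` helper is hypothetical, and treating compatibility as strict equality between the detected link mode and the operational mode is an assumption made for the example; the actual checks in the connector may be more nuanced.

```go
package main

import (
	"fmt"
	"sort"
)

// incompatibleModes is a hypothetical helper. It returns the sorted,
// de-duplicated set of detected link modes that differ from the
// agent's operational mode (assumed: compatible means equal).
func incompatibleModes(operational string, detected []string) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, mode := range detected {
		if mode == operational {
			continue
		}
		if _, dup := seen[mode]; dup {
			continue
		}
		seen[mode] = struct{}{}
		out = append(out, mode)
	}
	sort.Strings(out)
	return out
}

func main() {
	detected := []string{"netkit", "veth", "netkit-l2", "veth"}
	if bad := incompatibleModes("netkit", detected); len(bad) > 0 {
		fmt.Printf("incompatible endpoint link modes detected: %v\n", bad)
	}
}
```

Reporting every distinct mode, rather than the first one found, keeps the log accurate even if more than one incompatible link type is ever present.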
This commit introduces a new automatic datapath mode. If enabled, the
connector will probe the host kernel for netkit support when creating
a new instance of the ConnectorConfig.
If configured mode is auto: if netkit support is found, then the
operational mode will be set to netkit. Otherwise, it will default
back to veth.
The difference is visible through cilium-dbg:
$ cilium-dbg status | grep Mode
Attach Mode: TCX
Device Mode: netkit [Configured: auto]
And REST API:
$ curl -s --unix-socket /var/run/cilium/cilium.sock http://localhost/v1/config | jq '.status | {datapathMode, configuredDatapathMode}'
{
"datapathMode": "netkit",
"configuredDatapathMode": "auto"
}
This does not affect the behaviour of hard-coding the datapath-mode
to either netkit or netkit-l2. If either mode is manually configured,
and the netkit probe fails, Cilium will fail to start.
Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
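The selection rules in this commit message can be sketched as a small function. `resolveOperationalMode` is a hypothetical name, and the boolean parameter stands in for the result of the kernel netkit probe; neither is taken from the actual connector code.

```go
package main

import "fmt"

// resolveOperationalMode is a hypothetical helper that maps a
// configured mode and a netkit probe result to an operational mode.
func resolveOperationalMode(configured string, netkitSupported bool) (string, error) {
	switch configured {
	case "auto":
		// Auto prefers netkit and falls back to veth.
		if netkitSupported {
			return "netkit", nil
		}
		return "veth", nil
	case "netkit", "netkit-l2":
		// Explicit netkit modes fail hard when the probe fails.
		if !netkitSupported {
			return "", fmt.Errorf("datapath mode %q requires netkit support", configured)
		}
		return configured, nil
	case "veth":
		return "veth", nil
	default:
		return "", fmt.Errorf("unknown datapath mode %q", configured)
	}
}

func main() {
	for _, supported := range []bool{true, false} {
		mode, err := resolveOperationalMode("auto", supported)
		fmt.Printf("configured=auto netkit=%v operational=%q err=%v\n", supported, mode, err)
	}
}
```

Only the auto mode ever falls back; an explicitly configured netkit or netkit-l2 mode surfaces the probe failure as an error so that the agent can refuse to start.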
This commit updates the feature metrics logic so we correctly report the configured vs. operational datapath mode. For example, when configured in automatic mode and the connector detects netkit support, metrics will now show this as:
$ cilium-dbg metrics list | grep datapath_config
cilium_feature_datapath_config configured_mode=auto operational_mode=netkit
This commit also updates all other feature metric tests to explicitly set datapathMode=veth in the DaemonConfig structure, to avoid complications with the first entry now being "auto."
Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
borkmann approved these changes on Jan 20, 2026 and left a comment:
lgtm, great work! one small nit but could also be follow-up is to document this option in https://github.com/cilium/cilium/blob/main/Documentation/operations/performance/tuning.rst#netkit-device-mode
I'd say let's merge this big chunk, and address docs in a follow-up PR.
High Level Summary:
- Abstracts datapath mode selection logic into a new ConnectorMode type that expresses methods such as IsVeth(), IsNetkit() and IsLayer2().
- Splits the current single datapath mode into configured mode and operational mode, which are encapsulated in the datapath ConnectorConfig.
- Migrates all manual checks of DaemonConfig.DatapathMode to ConfigConnector via queries to the operational mode.
- Introduces an automatic datapath mode, which probes the underlying host for netkit support. When automatic mode is enabled, netkit is used if the probe succeeds, with veth as the fallback, all at runtime. The default mode is still veth.
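A minimal sketch of what a mode type with these predicates could look like. Only the type name and the three method names come from the summary; the constant names and the mapping of modes to Layer 2 behaviour (veth and netkit-l2 treated as Layer 2, plain netkit as Layer 3) are assumptions made for illustration.

```go
package main

import "fmt"

// ConnectorMode here is an illustrative stand-in, not the real type.
type ConnectorMode string

const (
	ModeVeth     ConnectorMode = "veth"
	ModeNetkit   ConnectorMode = "netkit"
	ModeNetkitL2 ConnectorMode = "netkit-l2"
)

// IsVeth reports whether the mode uses veth devices.
func (m ConnectorMode) IsVeth() bool { return m == ModeVeth }

// IsNetkit reports whether the mode uses netkit devices (either flavour).
func (m ConnectorMode) IsNetkit() bool { return m == ModeNetkit || m == ModeNetkitL2 }

// IsLayer2 reports whether workload-facing devices handle ARP.
// Assumption: veth and netkit-l2 do, plain netkit does not.
func (m ConnectorMode) IsLayer2() bool { return m == ModeVeth || m == ModeNetkitL2 }

func main() {
	for _, m := range []ConnectorMode{ModeVeth, ModeNetkit, ModeNetkitL2} {
		fmt.Printf("%-9s veth=%v netkit=%v l2=%v\n", m, m.IsVeth(), m.IsNetkit(), m.IsLayer2())
	}
}
```

Callers can then ask a question such as IsLayer2() instead of comparing mode strings, which is what allows components to stay unaware of the specific mode in use.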
Additional notes:
- Updates to cilium-dbg status:
  - Device Mode unchanged where configured/operational modes are equal (e.g. veth/veth, netkit/netkit).
  - Device Mode exposes split only where the modes differ (e.g. auto/veth, auto/netkit).
- Updates to cilium-dbg metrics:
  - datapath_config expresses both operational_mode and configured_mode even if they are equal.
- Fake datapath connector logic is provided for testing, that can still express different modes, where another component within Cilium may want to adapt its behaviour based on a specific datapath mode.
- Previous endpoint restore logic was updated to detect incompatible datapath modes of existing pods - ref: daemon: Fail agent startup on incompatible datapath mode #42482. This logic has been updated to further probe between netkit and netkit-l2 modes, which were not previously detected by the referenced change.
Example outputs where automatic mode is used
Log outputs
Release Notes