cilium, bigtcp: Rework initialization flow #43891
Merged
joestringer merged 5 commits into main on Jan 26, 2026
While we're at it, add references to the upstream kernel commits required for the feature checks to pass.

Signed-off-by: Alice Mikityanska <alice@isovalent.com>
Signed-off-by: Joe Stringer <joe@cilium.io>
1. On errors, revert changes to the original values, rather than defaults. There are devices for which gso_max_size=65536 is too big.
2. In error handling flow, modify IPv6 values first, similarly to how it's done during the configuration. Modifying gso_max_size also affects gso_ipv4_max_size when setting values below 64k, so it should be done before IPv4.
3. In error handling flow, go over the devices in the reverse order, because there might be weird dependencies between them, e.g., tso_max_size of one device depends on gso_max_size of another.

Signed-off-by: Alice Mikityanska <alice@isovalent.com>
Signed-off-by: Joe Stringer <joe@cilium.io>
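The three points above can be sketched in Go as follows. This is a minimal illustration, not the code from this PR: the `changed` record, the `setFunc` type, and `rollback` itself are hypothetical names, and the two setter callbacks stand in for whatever actually writes the limits to a device.

```go
package main

// changed records a device that was already modified, together with the
// values it had before the change. All names here are hypothetical.
type changed struct {
	Name     string
	OrigIPv6 uint32 // original gso_max_size / gro_max_size
	OrigIPv4 uint32 // original gso_ipv4_max_size / gro_ipv4_max_size
}

// setFunc stands in for whatever call writes a limit to a device.
type setFunc func(dev string, value uint32) error

// rollback restores the original limits on every device in done.
// It walks the devices in reverse order and restores IPv6 before IPv4.
// It continues past failures and returns the first error it saw.
func rollback(done []changed, setIPv6, setIPv4 setFunc) error {
	var first error
	for i := len(done) - 1; i >= 0; i-- {
		d := done[i]
		if err := setIPv6(d.Name, d.OrigIPv6); err != nil && first == nil {
			first = err
		}
		if err := setIPv4(d.Name, d.OrigIPv4); err != nil && first == nil {
			first = err
		}
	}
	return first
}
```

Continuing past a failed restore is a design choice of this sketch, so that one failing device does not leave the others in a modified state.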
Adjust the configuration flow of startBIGTCP() to use a more typical detect, modify, update pattern.

1. Move the loop for device GSO limit detection into startBIGTCP().
2. Modify the configuration at the end, upon successful config.
3. Change the configuration to the default if no devices are selected.

Signed-off-by: Alice Mikityanska <alice@isovalent.com>
Signed-off-by: Joe Stringer <joe@cilium.io>
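A detect, modify, update flow of this kind might look roughly like the sketch below. It is an illustration under assumptions, not the body of startBIGTCP(): `state`, `configure`, and the `detect`/`apply` callbacks are hypothetical, and `defaultMaxSize` assumes the default is the 64k limit.

```go
package main

import "fmt"

// defaultMaxSize is assumed here to be the pre-BIG TCP limit of 64k.
const defaultMaxSize uint32 = 65536

// state is a hypothetical stand-in for the agent's published configuration.
type state struct {
	MaxSize uint32
}

// configure follows a detect, modify, update sequence:
// read every device first, then write, and publish only on success.
func configure(
	cfg *state,
	devs []string,
	want uint32,
	detect func(dev string) (uint32, error),
	apply func(dev string, value uint32) error,
) error {
	if len(devs) == 0 {
		// No devices selected: fall back to the default configuration.
		cfg.MaxSize = defaultMaxSize
		return nil
	}

	// Detect: collect current limits before touching anything.
	current := make(map[string]uint32, len(devs))
	for _, dev := range devs {
		v, err := detect(dev)
		if err != nil {
			return fmt.Errorf("detect limits on %s: %w", dev, err)
		}
		current[dev] = v
	}

	// Modify: write only where the value differs.
	for _, dev := range devs {
		if current[dev] == want {
			continue
		}
		if err := apply(dev, want); err != nil {
			return fmt.Errorf("set limits on %s: %w", dev, err)
		}
	}

	// Update: publish the new configuration only after every write succeeded.
	cfg.MaxSize = want
	return nil
}
```

Because the published configuration is written last, a failure in the detect or modify step leaves `cfg` untouched.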
Make the initialization flow of BIG TCP more robust by exposing all the logic explicitly in startBIGTCP(). The robustness changes include:

1. Return errors from startBIGTCP(). Previously, it would always return nil.
2. Fall back to older kernels' defaults when probing for potentially unsupported parameters.
3. Revert the change from commit fcdbf6d ("cilium, bigtcp: Allow raising GRO/GSO size without BIG TCP"), which set gso_max_size=64k regardless of tso_max_size, which might be smaller, failing the operation in that case. Restore the old logic (in non-BIG TCP, keep values lower than 64k as is), but make it more robust: instead of hiding the check inside SetGROGSOIPv6MaxSize() and pretending that it set 64k, let startBIGTCP() check explicitly whether lowering to 64k is needed. At the same time, store the lowest value among all netdevs to be used by the Cilium tunnel netdev.

Fixes: #43737
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
Signed-off-by: Joe Stringer <joe@cilium.io>
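The restored non-BIG TCP logic (keep values below 64k, lower only what exceeds 64k, and remember the lowest value for the tunnel netdev) could be expressed along these lines. The helper `planWithoutBigTCP` is a hypothetical name used for illustration only, and the sketch takes 64k to mean 65536.

```go
package main

// legacyMaxSize is the 64k limit that applies when BIG TCP is disabled.
const legacyMaxSize uint32 = 65536

// planWithoutBigTCP decides, per device, which value to keep when BIG TCP
// is off: values above 64k are lowered to 64k, lower values stay as they are.
// It also returns the lowest resulting value, which a tunnel device stacked
// on top of these devices could use. Hypothetical helper, not Cilium's code.
func planWithoutBigTCP(current []uint32) (targets []uint32, lowest uint32) {
	lowest = legacyMaxSize
	targets = make([]uint32, len(current))
	for i, v := range current {
		if v > legacyMaxSize {
			v = legacyMaxSize
		}
		targets[i] = v
		if v < lowest {
			lowest = v
		}
	}
	return targets, lowest
}
```

For example, given current values of 196608, 65536 and 32768, only the first device would need a write, and the tunnel device would use 32768.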
BIG TCP initialization code refuses to proceed with enabling the feature if Cilium is set to tunneling mode, but the admin doesn't declare kernel support for BIG TCP for VXLAN and GENEVE tunnels. However, tunneling mode isn't the only case when a GENEVE tunnel can be created. Another case is dsrDispatch=geneve. Currently, BIG TCP proceeds to increase gso_max_size and gro_max_size, but the following creation of the GENEVE tunnel fails.

Detect this configuration in advance and block BIG TCP. Also block BIG TCP in dsrDispatch=ipip, because IPIP tunnels don't support gso_max_size > 64k either.

Fixes: #43938
Reported-by: Chris Bannister <c.bannister@gmail.com>
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
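The up-front check described above can be summarized as a small predicate. The `options` struct and `checkBigTCPAllowed` below are hypothetical and only mirror the conditions named in the commit message; the real option names and plumbing in Cilium may differ.

```go
package main

import "errors"

// options is a hypothetical view of the settings that matter for this check.
type options struct {
	Tunneling          bool   // tunneling mode (VXLAN or GENEVE) is enabled
	DSRDispatch        string // e.g. "ipip" or "geneve"; empty when unused
	TunnelBigTCPKernel bool   // admin declared kernel BIG TCP support for tunnels
}

// checkBigTCPAllowed returns an error when BIG TCP must not be enabled.
func checkBigTCPAllowed(o options) error {
	// IPIP tunnels do not support gso_max_size > 64k, so always block.
	if o.DSRDispatch == "ipip" {
		return errors.New("BIG TCP is not supported with dsrDispatch=ipip")
	}
	// A GENEVE tunnel is created both in tunneling mode and with
	// dsrDispatch=geneve, so both need the declared kernel support.
	needsTunnel := o.Tunneling || o.DSRDispatch == "geneve"
	if needsTunnel && !o.TunnelBigTCPKernel {
		return errors.New("BIG TCP with tunnels requires declared kernel support")
	}
	return nil
}
```

Running such a check before any gso_max_size or gro_max_size change avoids the situation where the limits are raised first and the tunnel creation fails afterwards.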
joestringer (Member) approved these changes on Jan 26, 2026 and left a comment:
Thanks, LGTM. As part of review I split the first commit into a few smaller pieces, so I'll push those shortly and merge. There's no diff.
Make the initialization flow of BIG TCP more robust by exposing all the logic explicitly in startBIGTCP(). The robustness changes include:

1. On errors, revert changes to the original values, rather than defaults. There are devices for which gso_max_size=65536 is too big.
2. In error handling flow, modify IPv6 values first, similarly to how it's done during the configuration. Modifying gso_max_size also affects gso_ipv4_max_size when setting values below 64k, so it should be done before IPv4.
3. Return errors from startBIGTCP(). Previously, it would always return nil.
4. In error handling flow, go over the devices in the reverse order, because there might be weird dependencies between them, e.g., tso_max_size of one device depends on gso_max_size of another.
5. Fall back to older kernels' defaults when probing for potentially unsupported parameters.
6. Revert the change from commit fcdbf6d ("cilium, bigtcp: Allow raising GRO/GSO size without BIG TCP"), which set gso_max_size=64k regardless of tso_max_size, which might be smaller, failing the operation in that case. Restore the old logic (in non-BIG TCP, keep values lower than 64k as is), but make it more robust: instead of hiding the check inside SetGROGSOIPv6MaxSize() and pretending that it set 64k, let startBIGTCP() check explicitly whether lowering to 64k is needed. At the same time, store the lowest value among all netdevs to be used by the Cilium tunnel netdev.
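The fallback to older kernels' defaults when probing can be pictured with the following sketch. It assumes a probe that reports whether the kernel exposes the attribute at all, and it assumes 65536 as the older kernels' default; `probeFunc` and `probeOrDefault` are hypothetical names, not functions from this PR.

```go
package main

// legacyMaxSize is assumed to be the limit older kernels use for any
// GSO/GRO size attribute they do not expose separately.
const legacyMaxSize uint32 = 65536

// probeFunc reads one attribute of a device. ok is false when the running
// kernel does not report that attribute. Hypothetical signature.
type probeFunc func(dev string) (value uint32, ok bool)

// probeOrDefault returns the probed value, or the older kernels' default
// when the attribute is not available.
func probeOrDefault(probe probeFunc, dev string) uint32 {
	if v, ok := probe(dev); ok {
		return v
	}
	return legacyMaxSize
}
```

With this shape, the calling code can treat supported and unsupported kernels uniformly instead of failing on a missing attribute.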
Fixes: #43737
Also fix another bug: block BIG TCP with dsrDispatch=geneve (when no kernel support is present) and dsrDispatch=ipip (as there is no pending kernel support yet).
Fixes: #43938