wireguard: connectivity issues with ipv6-only clusters #23917

@giorio94

Description

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

While running the connectivity tests on an IPv6-only cluster with WireGuard encryption, I noticed several failures affecting pod-to-pod communication. For instance:

❌ 1/1 tests failed (2/4 actions), 30 tests skipped, 9 scenarios skipped:
Test [no-policies]:
  ❌ no-policies/pod-to-pod/curl-1: cilium-test/client-7b78db77d5-2cpmk (fd00:10:242:1::a9b2) -> cilium-test/echo-other-node-78f77b57f8-qnpq7 (fd00:10:242::786a:8080)
  ❌ no-policies/pod-to-pod/curl-3: cilium-test/client2-78f748dd67-w6dhk (fd00:10:242:1::69e4) -> cilium-test/echo-other-node-78f77b57f8-qnpq7 (fd00:10:242::786a:8080)

The issue is related to fragmentation, as confirmed by running ping with increasing packet sizes: ping -s 1360 works, while ping -s 1361 fails and triggers fragmentation, although it shouldn't. Pod interfaces have an MTU of 1420 (set through the default route), cilium_wg0 has an MTU of 1420, and the host interface has an MTU of 1500.

More specifically, the problem is triggered by the padding that WireGuard adds to align the packet to a multiple of 16 bytes [1]. This padding should be capped by the MTU to prevent fragmentation. That does not happen, however, because the MTU is detected here [2] based on the wrong device (eth0 rather than cilium_wg0): after Cilium performs bpf_redirect, dev is correctly set to cilium_wg0 [3], while skb_dst(skb)->dev does not seem to get updated.

[1]: https://lxr.missinglinkelectronics.com/linux+v5.19/drivers/net/wireguard/send.c#L141
[2]: https://lxr.missinglinkelectronics.com/linux+v5.19/drivers/net/wireguard/device.c#L171
[3]: https://lxr.missinglinkelectronics.com/linux+v5.19/net/core/filter.c#L2110
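
As a rough sanity check of the numbers above, the following Python sketch models the padding arithmetic. It is only an illustration loosely based on the padding logic referenced in [1], not a copy of the kernel code. The function names are hypothetical, and it assumes an 80-byte WireGuard overhead over IPv6 (40 bytes outer IPv6 header, 8 bytes UDP, 16 bytes data message header, 16 bytes authentication tag), which is consistent with the 1420 vs. 1500 MTUs reported here.

```python
IPV6_HEADER = 40        # inner IPv6 header
ICMPV6_HEADER = 8       # ICMPv6 echo header
PADDING_MULTIPLE = 16   # WireGuard pads the plaintext to a multiple of 16
# Assumed outer overhead over IPv6: IPv6 + UDP + data header + auth tag.
WG_OVERHEAD_IPV6 = 40 + 8 + 16 + 16


def align(n: int, multiple: int) -> int:
    """Round n up to the next multiple."""
    return (n + multiple - 1) // multiple * multiple


def padding(inner_len: int, mtu: int) -> int:
    """Padding bytes added to an inner packet, capped by the detected MTU."""
    if mtu == 0:
        return align(inner_len, PADDING_MULTIPLE) - inner_len
    last_unit = inner_len % mtu if inner_len > mtu else inner_len
    return min(mtu, align(last_unit, PADDING_MULTIPLE)) - last_unit


def outer_size(ping_payload: int, detected_mtu: int) -> int:
    """Size of the encrypted packet sent on the host interface."""
    inner = ping_payload + ICMPV6_HEADER + IPV6_HEADER
    return inner + padding(inner, detected_mtu) + WG_OVERHEAD_IPV6


HOST_MTU = 1500
for payload, mtu in [(1360, 1500), (1361, 1500), (1361, 1420)]:
    size = outer_size(payload, mtu)
    verdict = "fits" if size <= HOST_MTU else "needs fragmentation"
    print(f"ping -s {payload}, detected MTU {mtu}: outer {size} bytes, {verdict}")
```

Under these assumptions, ping -s 1360 gives a 1408-byte inner packet that is already aligned, so the outer packet is 1488 bytes. ping -s 1361 gives 1409 bytes, which is padded to 1424 when the MTU is taken from eth0 (1500), producing a 1504-byte outer packet that exceeds the host MTU. With the MTU taken from cilium_wg0 (1420), the padding would be capped and the outer packet would be exactly 1500 bytes.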

Cilium Version

Recent version on master

Kernel Version

Linux 6.1.0-3-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.8-1 (2023-01-29) x86_64 GNU/Linux

Kubernetes Version

Client Version: v1.26.1
Kustomize Version: v4.5.7
Server Version: v1.25.3

Sysdump

No response

Relevant log output

No response

Anything else?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Assignees

No one assigned

    Labels

    area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
    area/encryption: Impacts encryption support such as IPSec, WireGuard, or kTLS.
    area/mtu: Relates to MTU management in Cilium.
    feature/ipv6: Relates to IPv6 protocol support
    feature/ipv6-only: Relates to single-stack IPv6 support.
    feature/wireguard: Relates to Cilium's Wireguard feature
    kind/bug: This is a bug in the Cilium logic.
    pinned: These issues are not marked stale by our issue bot.
