Masquerading fails when devices are set without our kube-proxy replacement #12141

@pchaigno

When Cilium (v1.8) is configured with global.devices set, global.autoDirectNodeRoutes=true, and global.masquerade=true, multi-node connectivity seems to fail. In particular, attempts to resolve DNS names fail with the following errors:

$ dig kubernetes.default.svc.cluster.local @10.0.1.157
;; reply from unexpected source: 192.168.36.12#53, expected 10.0.1.157#53
;; reply from unexpected source: 192.168.36.12#53, expected 10.0.1.157#53

The query reaches the second node, which runs kube-dns, as expected, but the answer then leaves that node with source IP address 192.168.36.12 instead of the 10.0.1.157 the client expects. So masquerading seems to fail, which is why setting global.masquerade=false works around the bug.
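To make the symptom easier to spot in captured output, the mismatch reported by dig can be extracted programmatically. The following is only a minimal sketch: the function name find_source_mismatches and the regular expression are illustrative assumptions, not part of Cilium or dig, and the pattern is based solely on the two warning lines quoted above.

```python
import re

# Matches dig's warning line as quoted in this report, e.g.
# ";; reply from unexpected source: 192.168.36.12#53, expected 10.0.1.157#53"
# (pattern is an assumption derived from those lines only).
UNEXPECTED_SOURCE = re.compile(
    r"^;; reply from unexpected source: (?P<actual>[^#\s]+)#\d+, "
    r"expected (?P<expected>[^#\s]+)#\d+\s*$"
)


def find_source_mismatches(dig_output):
    """Return (actual_ip, expected_ip) pairs, one per warning line found."""
    mismatches = []
    for line in dig_output.splitlines():
        match = UNEXPECTED_SOURCE.match(line.strip())
        if match:
            mismatches.append((match.group("actual"), match.group("expected")))
    return mismatches


# Sample input copied from the dig output in this report.
sample = """\
;; reply from unexpected source: 192.168.36.12#53, expected 10.0.1.157#53
;; reply from unexpected source: 192.168.36.12#53, expected 10.0.1.157#53
"""

for actual, expected in find_source_mismatches(sample):
    print(f"reply came from {actual}, client expected {expected}")
```

A non-empty result means at least one reply arrived from an address other than the one that was queried, which is the behavior described above.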

@brb added some more information:

One more data point: when we do NOT set --device(s), a reply from bpf_lxc does not enter the nat/POSTROUTING chain (in which the MASQ rule is installed).

Running tc filter del dev enp0s8 ingress fixes the issue.

Related: #11969

Labels

area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
kind/bug: This is a bug in the Cilium logic.
needs/triage: This issue requires triaging to establish severity and next steps.
pinned: These issues are not marked stale by our issue bot.