Difference in cilium bpf masq compared to upstream ip-masq-agent #35823

@tamilmani1989

Description

Is there an existing issue for this?

  • I have searched the existing issues

Version

equal or higher than v1.16.0 and lower than v1.17.0

What happened?

Cilium's eBPF-based masquerading does not SNAT packets destined for a remote node, even when the node CIDR is not in the non-masquerade list. This differs from the upstream ip-masq-agent, which SNATs all packets except those whose destination falls in a CIDR listed under nonMasqueradeCIDRs in the ip-masq-agent ConfigMap. If the node CIDR is omitted from that ConfigMap, the upstream ip-masq-agent SNATs packets destined for remote node IPs, as expected.
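To make the expected behavior concrete, here is a minimal Python sketch of the upstream decision as described above: masquerade unless the destination falls in an exempt CIDR. The function name `upstream_should_masquerade` is hypothetical and this is a model of the rule, not the ip-masq-agent implementation. It assumes the link-local range 169.254.0.0/16 is exempt only when masqLinkLocal is false.

```python
from ipaddress import ip_address, ip_network

LINK_LOCAL = ip_network("169.254.0.0/16")


def upstream_should_masquerade(dst, non_masq_cidrs, masq_link_local=False):
    """Model of the upstream rule: SNAT unless dst is in an exempt CIDR.

    Hypothetical helper for illustration only.
    """
    addr = ip_address(dst)
    exempt = [ip_network(cidr) for cidr in non_masq_cidrs]
    if not masq_link_local:
        # Assumption: link-local traffic is exempt unless masqLinkLocal is true.
        exempt.append(LINK_LOCAL)
    return not any(addr in net for net in exempt)


# Values from this report: only the pod CIDR is listed as non-masquerade.
config = ["192.168.0.0/16"]
print(upstream_should_masquerade("10.10.0.5", config, masq_link_local=True))      # remote node IP
print(upstream_should_masquerade("192.168.1.101", config, masq_link_local=True))  # pod IP
```

Under this model the remote node IP 10.10.0.5 is masqueraded and the pod IP is not, which matches the iptables behavior I see with the upstream agent.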

I looked at the Cilium BPF SNAT code and found the following check on the remote identity, which appears to skip SNAT. I could be wrong.

if (identity_is_remote_node(remote_ep->sec_identity))
It looks like the BPF program deliberately skips SNAT when the destination is a remote node IP.
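A rough Python model of the behavior I think I am seeing is below. It is only my reading of the datapath, not the actual BPF logic. The function name, the `remote_node_ips` set (standing in for the identity lookup behind `identity_is_remote_node`), and the order of the two checks are all assumptions.

```python
from ipaddress import ip_address, ip_network


def reported_bpf_should_masquerade(dst, non_masq_cidrs, remote_node_ips):
    """Model of the reported behavior: remote nodes are never SNATed.

    Hypothetical helper; remote_node_ips stands in for the identity lookup.
    """
    addr = ip_address(dst)
    # Assumption: stand-in for identity_is_remote_node(remote_ep->sec_identity).
    if addr in {ip_address(ip) for ip in remote_node_ips}:
        return False
    # Otherwise fall back to the non-masquerade CIDR list from the ConfigMap.
    return not any(addr in ip_network(cidr) for cidr in non_masq_cidrs)


non_masq = ["192.168.0.0/16"]   # pod CIDR from this report
remote_nodes = ["10.10.0.5"]    # remote node IP from this report
print(reported_bpf_should_masquerade("10.10.0.5", non_masq, remote_nodes))
print(reported_bpf_should_masquerade("8.8.8.8", non_masq, remote_nodes))
```

In this model the remote node IP is skipped even though 10.10.0.0/16 is not in the non-masquerade list, while an arbitrary external address is still masqueraded.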

How can we reproduce the issue?

  1. Enable BPF masquerading in the Cilium agent.
  2. Deploy this ConfigMap (192.168.0.0/16 is the pod CIDR and 10.10.0.0/16 is the node CIDR):
apiVersion: v1
kind: ConfigMap
metadata:
  name: ip-masq-agent
  namespace: kube-system
data:
  config: |
    nonMasqueradeCIDRs:
    - 192.168.0.0/16 
    masqLinkLocal: true

  3. Ping a remote node IP from any pod.
    The traffic is not SNATed, which is not expected.
k exec -it -n kube-system cilium-zg2mz -- cilium bpf ipmasq list
Defaulted container "cilium-agent" out of: cilium-agent, install-cni-binaries (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), systemd-networkd-overrides (init), block-wireserver (init)
IP PREFIX/ADDRESS
192.168.0.0/16

00:01:46.635718 00:0d:3a:fa:50:9b > 12:34:56:78:9a:bc, ethertype IPv4 (0x0800), length 98: 192.168.1.101 > 10.10.0.5: ICMP echo request, id 37444, seq 39, length 64

In case of upstream ip-masq-agent:

root@aks-nodepool1-30074333-vmss000000:/# iptables -t nat -L  IP-MASQ-AGENT
Chain IP-MASQ-AGENT (1 references)
target     prot opt source               destination
RETURN     all  --  anywhere             192.168.0.0/16       /* ip-masq-agent: local traffic is not subject to MASQUERADE */
MASQUERADE  all  --  anywhere             anywhere             /* ip-masq-agent: outbound traffic is su

00:22:36.014827 00:0d:3a:fa:50:9b > 12:34:56:78:9a:bc, ethertype IPv4 (0x0800), length 98: 10.10.0.4 > 10.10.0.5: ICMP echo request, id 40417, seq 9, length 64

Here 10.10.0.4 is the source node IP: the pod traffic is SNATed to the node IP.
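The two captures can be compared with a small script. This is a sketch with a hypothetical `classify` helper. It assumes the tcpdump line format shown in this report and assumes the captured packet originated from a pod, so a source address outside the pod CIDR is treated as SNATed.

```python
import re
from ipaddress import ip_address, ip_network

POD_CIDR = ip_network("192.168.0.0/16")  # pod CIDR from this report
# Matches the "length N: SRC > DST:" part of the tcpdump lines above.
PKT = re.compile(r"length \d+: (\d+(?:\.\d+){3}) > (\d+(?:\.\d+){3}):")


def classify(line):
    """Return (src, dst, state) for a tcpdump line, or None if it does not match."""
    match = PKT.search(line)
    if match is None:
        return None
    src, dst = match.groups()
    # Assumption: the packet came from a pod, so a pod-CIDR source means no SNAT.
    state = "not masqueraded" if ip_address(src) in POD_CIDR else "masqueraded"
    return src, dst, state


cilium_line = (
    "00:01:46.635718 00:0d:3a:fa:50:9b > 12:34:56:78:9a:bc, ethertype IPv4 (0x0800), "
    "length 98: 192.168.1.101 > 10.10.0.5: ICMP echo request, id 37444, seq 39, length 64"
)
upstream_line = (
    "00:22:36.014827 00:0d:3a:fa:50:9b > 12:34:56:78:9a:bc, ethertype IPv4 (0x0800), "
    "length 98: 10.10.0.4 > 10.10.0.5: ICMP echo request, id 40417, seq 9, length 64"
)
print(classify(cilium_line))
print(classify(upstream_line))
```

The Cilium capture keeps the pod IP as the source, while the upstream capture shows the node IP.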

Cilium Version

1.16

Kernel Version

5.15

Kubernetes Version

1.28

Regression

No response

Sysdump

No response

Relevant log output

No response

Anything else?

No response

Metadata

    Labels

    • area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
    • feature/snat: Relates to SNAT or Masquerading of traffic
    • kind/bug: This is a bug in the Cilium logic.
    • kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
    • needs/triage: This issue requires triaging to establish severity and next steps.
    • stale: The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale.
