Bug: ICMP error packets have incorrect checksum after SNAT RevNAT in LB-node #40827
Description
Is there an existing issue for this?
- I have searched the existing issues
Version
equal or higher than v1.18.0 and lower than v1.19.0
What happened?
ICMP error packets returned from an LB backend pod on another node have an incorrect checksum after SNAT RevNAT on the LB node.
This occurs when the external traffic policy is Cluster and the LB mode is SNAT.
Here is how I think the error occurs:
- During SNAT RevNAT on the LB node, snat_v4_rev_nat() is called.
- snat_v4_rev_nat() rewrites the IP address of the inner header in the ICMP error packet's payload by calling snat_v4_rev_nat_handle_icmp_error().
- snat_v4_rev_nat_handle_icmp_error() rewrites the inner header, but it does not update the checksum in the ICMP header.
- After that, snat_v4_rev_nat() calls snat_v4_rewrite_headers() to rewrite the outer header.
- snat_v4_rewrite_headers() has some checksum update logic, but it does not account for the payload change. (It only covers the address or port update.)
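To illustrate the effect described in the steps above, here is a minimal user-space sketch. It is not Cilium's code. The packet layout, the addresses, and the helper names (`inet_csum`, `csum_replace16`, `build_icmp_error`, `rewrite_and_verify`) are all assumptions made for this example. It rewrites the source address of the IPv4 header embedded in an ICMP error and shows that the ICMP checksum only stays valid if it is adjusted for the changed payload words, here with the incremental update from RFC 1624. For brevity the sketch leaves the embedded IP and UDP checksum fields alone; a real fix would also have to account for any change to those fields and to the inner port.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ICMP_HDR_LEN    8                   /* type, code, checksum, rest of header */
#define ICMP_CSUM_OFF   2                   /* checksum field inside the ICMP header */
#define INNER_SADDR_OFF (ICMP_HDR_LEN + 12) /* saddr of the embedded IPv4 header */
#define MSG_LEN         (ICMP_HDR_LEN + 20 + 8)

static uint16_t load16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static void store16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* RFC 1071 checksum over an even-length buffer.
 * Returns 0 when the buffer already carries a valid checksum. */
static uint16_t inet_csum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += load16(buf + i);
    return (uint16_t)~fold(sum);
}

/* RFC 1624, eqn. 3: HC' = ~(~HC + ~m + m') for one 16-bit word. */
static uint16_t csum_replace16(uint16_t csum, uint16_t old_word, uint16_t new_word)
{
    uint32_t sum = (uint32_t)(uint16_t)~csum +
                   (uint32_t)(uint16_t)~old_word + new_word;

    return (uint16_t)~fold(sum);
}

/* Hypothetical ICMP "port unreachable" carrying an embedded UDP packet
 * whose source address is the SNAT address (10.0.0.1 in this example). */
static void build_icmp_error(uint8_t msg[MSG_LEN])
{
    static const uint8_t tmpl[MSG_LEN] = {
        /* ICMP header: type 3, code 3, checksum 0, unused */
        3, 3, 0, 0, 0, 0, 0, 0,
        /* embedded IPv4 header: 10.0.0.1 -> 10.0.1.20, UDP, checksum left 0 */
        0x45, 0, 0, 28, 0, 1, 0, 0, 64, 17, 0, 0, 10, 0, 0, 1, 10, 0, 1, 20,
        /* first 8 bytes of the embedded UDP header: sport 40000, dport 53 */
        0x9c, 0x40, 0, 53, 0, 8, 0, 0,
    };

    memcpy(msg, tmpl, MSG_LEN);
    store16(msg + ICMP_CSUM_OFF, inet_csum(msg, MSG_LEN));
}

/* Rewrite the embedded saddr to a hypothetical client address and report
 * whether the ICMP checksum still verifies (1) or not (0). */
int rewrite_and_verify(int fix_icmp_csum)
{
    static const uint8_t client[4] = { 192, 0, 2, 10 };
    uint8_t msg[MSG_LEN];
    uint16_t csum;

    build_icmp_error(msg);
    csum = load16(msg + ICMP_CSUM_OFF);
    for (int i = 0; i < 4; i += 2)
        csum = csum_replace16(csum, load16(msg + INNER_SADDR_OFF + i),
                              load16(client + i));
    memcpy(msg + INNER_SADDR_OFF, client, 4);
    if (fix_icmp_csum)
        store16(msg + ICMP_CSUM_OFF, csum);
    return inet_csum(msg, MSG_LEN) == 0;
}
```

With these example values, `rewrite_and_verify(0)` returns 0 because the ICMP checksum is left stale after the payload rewrite, and `rewrite_and_verify(1)` returns 1 because the checksum is adjusted for the two changed 16-bit words.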
Note:
Because DNAT RevNAT of ICMP errors is currently not supported, the reply packet (4 in the figure above) cannot reach the client correctly with or without this checksum bug.
(The reply packet should have the LB's VIP as saddr, but currently it has the backend pod's address as saddr.)
That might be why this bug wasn't detected.
How can we reproduce the issue?
Use a setup like the one in the figure above.
Cilium Version
cilium image (running): 1.19.0-dev
Kernel Version
6.8.0-60-generic
Kubernetes Version
Client Version: v1.33.3
Kustomize Version: v5.6.0
Server Version: v1.29.0
Regression
No response
Sysdump
No response
Relevant log output
Anything else?
No response
Cilium Users Document
- Are you a user of Cilium? Please add yourself to the Users doc
Code of Conduct
- I agree to follow this project's Code of Conduct