Is there an existing issue for this?
What happened?
We use Cilium in VXLAN mode and don't enable the NodePort/externalIPs features, so we depend on kube-proxy to handle service external IPs.
I found that when a pod accesses a service external IP whose backend is a pod on another node, the source IP is changed to the node IP by kube-proxy's masquerade rule.
Because the cilium_host address has scope link, the kernel's masquerade source selection won't choose cilium_host even when the destination is another pod, so the packet is sent to the server pod through cilium_vxlan with the node IP as its source.
The server pod then sends the reply packet back to the client pod's node directly, not through cilium_vxlan; if a network device along that path (like a router) doesn't recognize the server pod's IP, it drops the packet. The packet flow looks like this (the source-selection check after the list shows why the node IP is picked):
- client pod ip -> service external ip
- client pod ip -> server pod ip (DNAT)
- client node ip -> server pod ip (Masquerade) through cilium_vxlan
- server pod ip -> client node ip (direct route, not through cilium_vxlan)
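This matches how the kernel picks a masquerade source: as far as I can tell, the MASQUERADE target selects an address on the outgoing device with inet_select_addr(..., RT_SCOPE_UNIVERSE), which skips scope link addresses, so it falls back to an address on another interface, i.e. the node IP. A quick way to check which source the kernel would pick (a sketch with placeholder IPs; the expected output in the comments is my reading, not captured output):

$ ip route get <server-pod-ip>
# with cilium_host's /32 at scope link, expect "... src <node-ip>"
# with the /32 at scope global, expect "... src <cilium_host-ip>"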
I tried changing the cilium_host address scope to global, and it works.
So I think we should remove scope link in init.sh to fix this issue.
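For reference, a minimal sketch of the change I mean, assuming init.sh adds the address roughly like this (CILIUM_HOST_IP is a placeholder, not the actual variable name in init.sh):

# before: explicit link scope keeps the address out of masquerade source selection
ip addr add "${CILIUM_HOST_IP}/32" dev cilium_host scope link
# after: no explicit scope, so the address defaults to scope global
ip addr add "${CILIUM_HOST_IP}/32" dev cilium_host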
Cilium Version
v1.10.15
Kernel Version
5.4.119
Kubernetes Version
v1.22.5
Sysdump
KubeProxyReplacement Details:
Status: Partial
Session Affinity: Enabled
Services:
- ClusterIP: Enabled
- NodePort: Disabled
- LoadBalancer: Disabled
- externalIPs: Disabled
- HostPort: Disabled
Relevant log output
No response
Anything else?
- client pod: 172.17.0.3
- client node: 192.168.2.219
- client cilium_host: 172.17.0.28
- external ip: 119.28.229.32
- server pod: 172.17.0.177
$ ip a show cilium_host
4: cilium_host@cilium_net: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether da:5d:4c:d9:e4:7e brd ff:ff:ff:ff:ff:ff
inet 172.17.0.28/32 scope link cilium_host
valid_lft forever preferred_lft forever
inet6 fe80::d85d:4cff:fed9:e47e/64 scope link
valid_lft forever preferred_lft forever
iptables
# Warning: iptables-legacy tables present, use iptables-legacy-save to see them
-A KUBE-SERVICES -d 119.28.229.32/32 -p tcp -m comment --comment "default/kubernetes-extranet:https loadbalancer IP" -m tcp --dport 443 -j KUBE-FW-JRLK3S5QDR5VE4WN
-A KUBE-FW-JRLK3S5QDR5VE4WN -m comment --comment "default/kubernetes-extranet:https loadbalancer IP" -j KUBE-MARK-MASQ
-A KUBE-FW-JRLK3S5QDR5VE4WN -m comment --comment "default/kubernetes-extranet:https loadbalancer IP" -j KUBE-SVC-JRLK3S5QDR5VE4WN
-A KUBE-FW-JRLK3S5QDR5VE4WN -m comment --comment "default/kubernetes-extranet:https loadbalancer IP" -j KUBE-MARK-DROP
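For context, the KUBE-MARK-MASQ mark only leads to SNAT later, in kube-proxy's POSTROUTING chain, which looks roughly like this (exact rules vary by kube-proxy version; newer versions split this into several rules):

-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE

This MASQUERADE is where the kernel performs the source selection described above.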
$ conntrack -E -d 119.28.229.32
[NEW] tcp 6 120 SYN_SENT src=172.17.0.3 dst=119.28.229.32 sport=42614 dport=443 [UNREPLIED] src=172.17.0.177 dst=192.168.2.219 sport=443 dport=34294
After removing scope link (deleting and re-adding the address without an explicit scope defaults it to scope global):
$ ip a del 172.17.0.28/32 dev cilium_host
$ ip a add 172.17.0.28/32 dev cilium_host
$ ip a show cilium_host
4: cilium_host@cilium_net: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether da:5d:4c:d9:e4:7e brd ff:ff:ff:ff:ff:ff
inet 172.17.0.28/32 scope global cilium_host
valid_lft forever preferred_lft forever
inet6 fe80::d85d:4cff:fed9:e47e/64 scope link
valid_lft forever preferred_lft forever
$ conntrack -E -d 119.28.229.32
[NEW] tcp 6 120 SYN_SENT src=172.17.0.3 dst=119.28.229.32 sport=43612 dport=443 [UNREPLIED] src=172.17.0.177 dst=172.17.0.28 sport=443 dport=27012
- client pod ip -> service external ip
- client pod ip -> server pod ip (DNAT)
- client's cilium_host ip -> server pod ip (Masquerade) through cilium_vxlan
- server pod ip -> client's cilium_host ip through cilium_vxlan
We can see the kernel masquerade now chooses cilium_host's IP 172.17.0.28 instead of the node IP 192.168.2.219, and the reply packet also goes through cilium_vxlan.
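To double-check the new reply path, one can watch the overlay interface on the client node while repeating the request (a verification sketch, not captured output):

# reply packets from the server pod should now arrive on cilium_vxlan,
# addressed to cilium_host's IP instead of the node IP
$ tcpdump -ni cilium_vxlan host 172.17.0.177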
Code of Conduct