Decrease IPCache CEP update window #11751

@brb

It has been observed that the IPCache entry update for a pod on a remote node can take up to 10s: #11698 (comment). During that window, a NodePort service backend is not reachable from outside through an intermediate node (in vxlan mode), because CILIUM_CALL_IPV{4,6}_NODEPORT_NAT does a lookup in the IPCache map to determine whether traffic to a remote backend should be sent via the tunnel. If the IPCache entry is not found, the packet takes the direct routing path, which fails because fib_lookup() fails (expected).
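To make the failure mode concrete, here is a minimal Go model of that forwarding decision. It is only a sketch: the real logic is BPF C in nodeport.h, and every name below (`choosePath`, `ipcacheEntry`, `tunnelEndpoint`, the sample IPs) is hypothetical and not taken from the Cilium code base.

```go
package main

// Hypothetical model of the NodePort NAT forwarding decision.
// The real datapath code is BPF C; this only mirrors the described behavior.

type path int

const (
	pathTunnel path = iota // encapsulate and send to the backend's node
	pathDirect             // fall back to direct routing (fib_lookup)
)

// ipcacheEntry is a simplified stand-in for an IPCache map value.
type ipcacheEntry struct {
	tunnelEndpoint string // IP of the node hosting the backend
}

// choosePath looks up the backend IP in the (modelled) IPCache.
// A missing entry means the datapath cannot know the backend is remote,
// so it picks the direct routing path, which is the case that fails
// in vxlan mode while the IPCache entry has not been propagated yet.
func choosePath(ipcache map[string]ipcacheEntry, backendIP string) (path, string) {
	if e, ok := ipcache[backendIP]; ok && e.tunnelEndpoint != "" {
		return pathTunnel, e.tunnelEndpoint
	}
	return pathDirect, ""
}
```

In this model, a backend at `10.0.1.5` resolves to `pathDirect` until an entry such as `ipcacheEntry{tunnelEndpoint: "192.168.0.2"}` is inserted, after which it resolves to `pathTunnel`.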

To fix this, we could check the tunnel map in nodeport.h instead. However, there is a plan (and a PR) to get rid of the tunnel map and use the IPCache map instead.

According to @aanm, the update duration can be decreased:

this controller has a run interval of 10 seconds: https://github.com/cilium/cilium/blob/816b893a71997351c77667e2cff54fd33dbf98c8/pkg/k8s/watchers/endpointsynchronizer.go#L86
when we create the CEP, it does not contain all the information the ipcache needs, since we do an early return: https://github.com/cilium/cilium/blob/816b893a71997351c77667e2cff54fd33dbf98c8/pkg/k8s/watchers/endpointsynchronizer.go#L173

also, namedPorts are not immediately known when a pod is created (Jarno will know the whys better than me), so we do another, earlier return if the named ports are not available: https://github.com/cilium/cilium/blob/816b893a71997351c77667e2cff54fd33dbf98c8/pkg/k8s/watchers/endpointsynchronizer.go#L112

Labels

kind/bug: This is a bug in the Cilium logic.
