Kube-proxy facing locking timeout in large clusters during load test with services enabled #48107
Closed
Labels
area/kube-proxy · kind/bug (categorizes issue or PR as related to a bug) · lifecycle/frozen (indicates that an issue or PR should not be auto-closed due to staleness) · sig/network (relevant to SIG Network) · sig/scalability (relevant to SIG Scalability)
Description
Follows from discussion in #48052
We noticed this while performing a load test on a 4000-node cluster with services enabled. The iptables-restore step in the proxier fails with:
E0625 09:03:14.873338 5 proxier.go:1574] Failed to execute iptables-restore: failed to acquire old iptables lock: timed out waiting for the condition
The likely cause is the huge size of the iptables ruleset (tens of MBs): we run 30 pods per node, and each pod belongs to exactly one service,
so 30 * 4000 = 120k service endpoints (and these updates happen on all 4000 nodes).
cc @kubernetes/sig-network-misc @kubernetes/sig-scalability-misc @danwinship @wojtek-t