CI: BPF Checks --TestBPF/bpf_nat_tests.o/nat4_port_allocation_tcp #42599

@Bigdelle

Description

CI failure

In the past week, ~2.5% of BPF Checks workflow runs have been failing with this error.

--- FAIL: TestBPF (39.98s)
--- FAIL: TestBPF/bpf_nat_tests.o (1.43s)
--- FAIL: TestBPF/bpf_nat_tests.o/nat4_port_allocation_tcp (0.58s)

    bpf_test.go:480: bpf_nat_tests.c:1412: assert failed at bpf_nat_tests.c:1412

┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│  STATUS │ ELAPSED │                  PACKAGE                   │ COVER │ PASS │ FAIL │ SKIP  │
│─────────┼─────────┼────────────────────────────────────────────┼───────┼──────┼──────┼───────│
│  FAIL   │ 40.01s  │ github.com/cilium/cilium/bpf/tests/bpftest │  --   │ 536  │  3   │  0    │
└──────────────────────────────────────────────────────────────────────────────────────────────┘

Looking at the code, this is the failing assertion:

/* Only occasional failures at 50% of the test. */
assert(retries_50percent[SNAT_COLLISION_RETRIES] < 15);

This tests the SNAT port allocation algorithm, which has some randomness built into it. Because of that randomness, it seems that occasional failures of this assertion are to be expected, but more investigation is required.
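One way to frame that investigation is sketched below. It assumes (and this is only an assumption, not something taken from the test) that each allocation attempt exhausts its retries independently with some probability p. Under that assumption the count in retries_50percent[SNAT_COLLISION_RETRIES] is binomially distributed, and the chance of the assertion firing is the tail P(X >= 15). The helper name binom_tail and its parameters are hypothetical and are not part of the Cilium tree.

```c
#include <assert.h>

/*
 * P(X >= k) for X ~ Binomial(n, p), using the pmf recurrence
 *   pmf(i + 1) = pmf(i) * (n - i) / (i + 1) * p / (1 - p).
 *
 * Hypothetical helper for reasoning about the flake rate. It is only
 * accurate while (1 - p)^n does not underflow a double, which holds
 * for small p or moderate n.
 */
double binom_tail(int n, double p, int k)
{
	double pmf = 1.0, tail = 0.0;
	int i;

	assert(n >= 0 && p >= 0.0 && p <= 1.0);

	if (k <= 0)
		return 1.0;	/* X >= 0 always holds */
	if (k > n)
		return 0.0;	/* X can never exceed n */
	if (p == 0.0)
		return 0.0;	/* X is always 0, and k >= 1 here */
	if (p == 1.0)
		return 1.0;	/* X is always n, and k <= n here */

	/* pmf(0) = (1 - p)^n */
	for (i = 0; i < n; i++)
		pmf *= 1.0 - p;

	/* Walk the pmf from 0 to n, summing the terms at or above k. */
	for (i = 0; i <= n; i++) {
		if (i >= k)
			tail += pmf;
		if (i < n)
			pmf *= (double)(n - i) / (i + 1) * p / (1.0 - p);
	}
	return tail;
}
```

Calling binom_tail(n, p, 15), with n set to the number of allocations the test performs at 50% occupancy and p to an estimated per-allocation exhaustion probability, gives a predicted flake rate to compare against the ~2.5% observed. If the prediction is far below that, the independence assumption probably does not hold and the allocator's retries are correlated.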

Here are the two runs that have hit this:

  1. https://github.com/cilium/cilium/actions/runs/18934561156/job/54058216154
  2. https://github.com/cilium/cilium/actions/runs/19071506098/job/54475532408

The only artifact generated during the test was the junits zip:
cilium-junits.zip

Labels

area/CI: Continuous Integration testing issue or flake
ci/flake: This is a known failure that occurs in the tree. Please investigate me!