CI: TestScript/healthserver-proxy-redirect #44414
Closed
Labels
area/CI: Continuous Integration testing issue or flake
area/loadbalancing: Impacts load-balancing and Kubernetes service implementations
ci/flake: This is a known failure that occurs in the tree. Please investigate me!
Description
The TestScript/healthserver-proxy-redirect.txtar test is flaky in the CI pipeline. It failed in roughly 5% of runs over the last week.
Here is a snippet of the failure output:
--- FAIL: TestScript/healthserver-proxy-redirect.txtar (9.46s)
scripttest.go:253: 2026-02-17T21:01:39Z
scripttest.go:255: $WORK=/tmp/TestScripthealthserver-proxy-redirect.txtar4279584420/001
scripttest.go:72:
[stdout]
HEALTHADDR=127.1.220.95
DATADIR=/home/runner/work/cilium/cilium/pkg/loadbalancer/healthserver/testdata
***
WORK=/tmp/TestScripthealthserver-proxy-redirect.txtar4279584420/001
TMPDIR=/tmp/TestScripthealthserver-proxy-redirect.txtar4279584420/001/tmp
scripttest.go:72: #! --enable-health-check-nodeport
# Add a node address. (0.002s)
> db/insert node-addresses addrv4.yaml
> hive start
> env HEALTHPORT=40002
> replace '$HEALTHPORT' $HEALTHPORT service.yaml
scripttest.go:72: # Add endpoints first to avoid races with health server setup. (0.001s)
> k8s/add endpointslice.yaml
scripttest.go:72: # Add the service and verify the health server response.
> k8s/add service.yaml
> * http/get http://$HEALTHADDR:$HEALTHPORT healthserver.before
scripttest.go:72: (command "* http/get http://$HEALTHADDR:$HEALTHPORT healthserver.before" failed, retrying in 20ms...)
scripttest.go:72: (command "* http/get http://$HEALTHADDR:$HEALTHPORT healthserver.before" failed, retrying in 40ms...)
scripttest.go:72: (command "* http/get http://$HEALTHADDR:$HEALTHPORT healthserver.before" failed, retrying in 80ms...)
scripttest.go:72: (command "* http/get http://$HEALTHADDR:$HEALTHPORT healthserver.before" succeeded after 3 retries in 0.146s)
> * cmp healthserver.expected healthserver.before
scripttest.go:72: # Simulate proxy redirection (0.001s)
> svc/set-proxy-redirect test/echo 1000
[stdout]
Set ProxyRedirect (port=1000) on service test/echo
scripttest.go:72: # Verify synthetic endpoint count of 1
> * http/get http://$HEALTHADDR:$HEALTHPORT healthserver.after
> * cmp healthserver-proxy.expected healthserver.after
diff healthserver-proxy.expected healthserver.after
--- healthserver-proxy.expected
+++ healthserver.after
@@ -3,6 +3,6 @@
Content-Type=application/json
Date=<omitted>
X-Content-Type-Options=nosniff
-X-Load-Balancing-Endpoint-Weight=1
+X-Load-Balancing-Endpoint-Weight=3
---
-{"service":{"namespace":"test","name":"echo"},"localEndpoints":1}
+{"service":{"namespace":"test","name":"echo"},"localEndpoints":3}
Here are a few instances of this test failing in CI workflows:
https://github.com/cilium/cilium/actions/runs/22114971162
https://github.com/cilium/cilium/actions/runs/22085231113