Speed up CI and fix flaky E2E tests#4104
Merged
ChrisJBurns merged 7 commits into main on Mar 11, 2026
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main    #4104      +/-   ##
==========================================
- Coverage   68.70%   68.65%   -0.05%
==========================================
  Files         454      454
  Lines       46051    46051
==========================================
- Hits        31641    31618      -23
- Misses      11968    11992      +24
+ Partials     2442     2441       -1
Signed-off-by: Chris Burns <29541485+ChrisJBurns@users.noreply.github.com>
Pre-pull Docker images (osv-mcp, gofetch, egress-proxy) in the E2E CI workflow so workload creation does not pay the image-pull cost inside the 60s API middleware timeout. This eliminates the class of flakiness where the first workload creation in a CI matrix bucket fails with HTTP 500 because the image pull exceeds the timeout.

Also replace the hardcoded port 60000 in TestRunConfigBuilder with networking.FindAvailable() to avoid failures when that port is already in use on the CI runner.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
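The hardcoded-port fix above relies on asking the system for a free port instead of assuming one. A minimal sketch of that idea, assuming the common bind-to-port-0 technique: `findAvailablePort` is a hypothetical helper written for illustration and is not the implementation of `networking.FindAvailable()`, whose signature this PR page does not show.

```go
package main

import (
	"fmt"
	"net"
)

// findAvailablePort is a hypothetical helper (not the project's
// networking.FindAvailable). It binds to port 0 on loopback so the kernel
// picks a free TCP port, then closes the listener and returns the number.
func findAvailablePort() (int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, fmt.Errorf("listen on ephemeral port: %w", err)
	}
	defer ln.Close()

	addr, ok := ln.Addr().(*net.TCPAddr)
	if !ok {
		return 0, fmt.Errorf("unexpected address type %T", ln.Addr())
	}
	return addr.Port, nil
}

func main() {
	port, err := findAvailablePort()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("free port:", port)
}
```

One caveat of this approach: the port is released before the caller uses it, so another process could claim it in between. For test setup on a CI runner that window is usually acceptable, and it is still far safer than a fixed port such as 60000.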
Upgrade CPU/memory-intensive CI jobs from ubuntu-latest (2 cores, 7 GB) to ubuntu-8cores-32gb to speed up PR turnaround:
- Linting: golangci-lint is CPU-bound, scales well with cores
- Operator tests, integration tests, build: Go compilation benefits from more cores
- Operator E2E tests: KIND cluster creation and Chainsaw tests are CPU/memory hungry (3 parallel jobs at ~8 min each)
- Helm chart tests: KIND cluster + ko builds benefit from more CPU

Quick jobs (generate-crds, generate-crd-docs) are left on ubuntu-latest since they complete in under a minute.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The v1.34.3 and v1.35.1 jobs were hitting the 30-minute timeout on ubuntu-latest (2 cores). The job does ko builds (3 images), Docker pulls (6 images), kind load operations, helm deploy, and E2E tests, which is too much work for a 2-core runner to finish within the time limit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The VirtualMCP lifecycle E2E tests run sequentially by default, taking ~24 minutes as each test suite creates/waits/tears down its own K8s resources. All test suites use unique resource names and auto-assigned NodePorts, so they are safe to run concurrently.

Adding --procs=4 to the ginkgo command allows 4 test suites to run simultaneously, which should cut the test phase to ~6-8 minutes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
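The ~6-8 minute estimate above follows from dividing ~24 minutes of sequential work across 4 processes, with the longest single piece of work setting the floor. A rough model of that arithmetic is sketched below. It is not Ginkgo's scheduler, and the per-suite durations are hypothetical numbers chosen only to sum to 24 minutes.

```go
package main

import "fmt"

// estimateWallClock hands each work item, in order, to the worker that is
// free first and returns the finish time of the slowest worker.
// Durations are whole minutes to keep the sketch simple.
func estimateWallClock(minutes []int, procs int) int {
	if procs < 1 {
		procs = 1
	}
	workers := make([]int, procs) // accumulated minutes per worker
	for _, m := range minutes {
		idx := 0
		for i := range workers {
			if workers[i] < workers[idx] {
				idx = i
			}
		}
		workers[idx] += m
	}
	longest := 0
	for _, w := range workers {
		if w > longest {
			longest = w
		}
	}
	return longest
}

func main() {
	// Hypothetical suite durations (minutes) that total 24.
	suites := []int{8, 4, 4, 4, 2, 2}
	fmt.Println("sequential minutes:", estimateWallClock(suites, 1))
	fmt.Println("4-proc minutes:    ", estimateWallClock(suites, 4))
}
```

With perfectly even suites the 4-process figure would be 24 / 4 = 6 minutes; an uneven split like the one above lands at the duration of the longest suite instead, which is why a range rather than a single number is a sensible expectation.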
The mock OIDC servers in the session_management_v2 and auth_discovery tests used NodePorts 30013 and 30010, which collide with auto-assigned NodePorts when tests run in parallel. Move them to 30913 and 30910 to avoid the Kubernetes auto-assignment range (which starts low).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Hardcoded NodePorts (30013, 30010) for mock OIDC test servers collided with auto-assigned NodePorts when operator E2E tests run in parallel. Instead of picking "safe" high ports, let Kubernetes auto-assign NodePorts and read them back after service creation.
- DeployParameterizedOIDCServer: remove nodePort parameter, return the allocated port instead
- auth_discovery_test: hoist oidcNodePort to outer var block, read back from service after creation, use dynamic port in getOIDCToken
- session_management_v2_test: capture returned port from helper

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
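The read-back pattern described above can be sketched without a real cluster. Everything in this example is a hypothetical stand-in: `service`, `fakeCluster`, and `deployOIDCServer` are illustrative names, not the types or the `DeployParameterizedOIDCServer` helper from this repository, and the fake allocator scans upward from the bottom of the default NodePort range (30000-32767) purely for simplicity.

```go
package main

import (
	"errors"
	"fmt"
)

// Default Kubernetes NodePort range.
const (
	nodePortMin = 30000
	nodePortMax = 32767
)

// service is a simplified, hypothetical stand-in for a NodePort Service.
type service struct {
	Name     string
	NodePort int // 0 asks the cluster to pick a port
}

// fakeCluster imitates NodePort allocation for illustration only.
// It takes the first free port in the range; a real cluster may pick differently.
type fakeCluster struct {
	used map[int]string // nodePort -> owning service name
}

func newFakeCluster() *fakeCluster {
	return &fakeCluster{used: make(map[int]string)}
}

// create registers svc. A requested port that is already taken is rejected;
// a zero port is replaced with the first free port in the range.
func (c *fakeCluster) create(svc *service) error {
	if svc.NodePort != 0 {
		if owner, taken := c.used[svc.NodePort]; taken {
			return fmt.Errorf("nodePort %d already allocated to %q", svc.NodePort, owner)
		}
		c.used[svc.NodePort] = svc.Name
		return nil
	}
	for p := nodePortMin; p <= nodePortMax; p++ {
		if _, taken := c.used[p]; !taken {
			c.used[p] = svc.Name
			svc.NodePort = p
			return nil
		}
	}
	return errors.New("no free nodePort in range")
}

// deployOIDCServer creates the mock server's service without naming a port
// and returns whatever port the cluster allocated.
func deployOIDCServer(c *fakeCluster, name string) (int, error) {
	svc := &service{Name: name}
	if err := c.create(svc); err != nil {
		return 0, err
	}
	return svc.NodePort, nil
}

func main() {
	c := newFakeCluster()
	for _, name := range []string{"mock-oidc-a", "mock-oidc-b"} {
		port, err := deployOIDCServer(c, name)
		if err != nil {
			fmt.Println("deploy failed:", err)
			return
		}
		fmt.Printf("%s reachable on NodePort %d\n", name, port)
	}
}
```

The design point is that the caller never chooses a number: it only learns the port after creation, so two parallel test processes cannot both request the same one. Requesting a fixed port that is already in use fails in this model, which mirrors the collision the commit describes.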
JAORMX approved these changes on Mar 11, 2026
Summary
- Upgrade CPU/memory-intensive CI jobs from ubuntu-latest to ubuntu-8cores-32gb.
- Run operator E2E tests with 4-way Ginkgo parallelism.
- Use dynamic NodePort allocation for mock OIDC servers so parallel test processes never collide on port numbers.
- Replace a hardcoded port in TestRunConfigBuilder with dynamic allocation.

Type of change
Test plan
- Verified all CI workflows pass on the PR branch.
- Confirmed E2E Lifecycle jobs now complete in ~10 min (previously timing out at 30 min).
- Confirmed E2E tests no longer fail with HTTP 500 on first workload creation.
- Confirmed operator E2E tests pass with 8-way parallelism and dynamic NodePort allocation.
Changes
- .github/workflows/e2e-tests.yml
- .github/workflows/lint.yml: use ubuntu-8cores-32gb runner
- .github/workflows/operator-ci.yml: use ubuntu-8cores-32gb
- .github/workflows/helm-charts-test.yml: use ubuntu-8cores-32gb runner
- .github/workflows/test-e2e-lifecycle.yml: use ubuntu-8cores-32gb runner
- cmd/thv-operator/Taskfile.yml: add --procs=8 to ginkgo for parallel operator E2E tests
- pkg/runner/config_test.go: replace hardcoded port with networking.FindAvailable()
- test/e2e/thv-operator/virtualmcp/helpers.go: DeployParameterizedOIDCServer uses auto-assigned NodePort and returns it
- test/e2e/thv-operator/virtualmcp/virtualmcp_session_management_v2_test.go
- test/e2e/thv-operator/virtualmcp/virtualmcp_auth_discovery_test.go

Special notes for reviewers
Quick jobs (generate-crds, generate-crd-docs) are left on ubuntu-latest since they complete in under a minute.

Generated with Claude Code