Rebalance E2E test CI matrix from 4 to 8 buckets #4074
Merged
amirejaz
previously approved these changes
Mar 10, 2026
ChrisJBurns
previously approved these changes
Mar 10, 2026
reyortiz3
previously approved these changes
Mar 10, 2026
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff

| | main | #4074 | +/- |
|---|---|---|---|
| Coverage | 68.56% | 68.49% | -0.07% |
| Files | 446 | 446 | |
| Lines | 45574 | 45574 | |
| Hits | 31246 | 31215 | -31 |
| Misses | 11914 | 11947 | +33 |
| Partials | 2414 | 2412 | -2 |
The `api` bucket had 191 specs (nearly 4x the ~53-test comfort zone for the 15m timeout), and 3 test files were orphaned from CI entirely because their labels didn't match any matrix filter.

Split the matrix into 8 balanced buckets (33-57 specs each), fix the orphaned tests, and preserve backward compatibility so the original parent labels (mcp, api, core, etc.) still work for local development.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The proxy-mw bucket was timing out at 15 minutes after 3 previously orphaned test files were added to it. Split into two buckets (proxy with ~25 specs, middleware+stability with ~28 specs) so each fits comfortably within the timeout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
9b466d1
force-pushed from 901db47 to 9b466d1
ChrisJBurns
approved these changes
Mar 10, 2026
aponcedeleonch
approved these changes
Mar 10, 2026
rdimitrov
approved these changes
Mar 10, 2026
Contributor
FYI — the new
These tests were orphaned from CI before this PR brought them back, so the bugs have been there since the tests were written in January (#3210). Separate from the port allocation race being addressed in #4078.
ChrisJBurns
added a commit
that referenced
this pull request
Mar 10, 2026
The SSE endpoint rewrite tests were orphaned from CI until #4074 rebalanced the E2E buckets. Now that they run, all 4 fail:

- 3 tests used `--remote-url`, which is not a CLI flag; the URL should be passed as the positional argument (the run command auto-detects URLs and treats them as remote servers).
- 1 test used `thv list --registry`, which is not valid; the correct invocation is `thv registry list`.

Both flags were silently ignored because Cobra's UnknownFlags whitelist is enabled on the run command, causing the "requires at least 1 arg" error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
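The failure mode in this commit message (an unknown flag is tolerated, yet the command then complains about a missing argument) can be illustrated with a simplified model. This is a stdlib-only sketch, not Cobra's implementation. It assumes flags are written as `--name value` or `--name=value`, and that a tolerated unknown flag is dropped together with the value that follows it. The names `positionalArgs` and `requireOneArg`, the `transport` flag, and the URL are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// positionalArgs is a simplified model of flag parsing, not Cobra's code.
// Assumptions: every flag is written as "--name value" or "--name=value",
// and when unknown flags are tolerated, an unknown flag is dropped together
// with the value that follows it.
func positionalArgs(args []string, known map[string]bool, tolerateUnknown bool) ([]string, error) {
	var positional []string
	for i := 0; i < len(args); i++ {
		arg := args[i]
		if !strings.HasPrefix(arg, "--") {
			positional = append(positional, arg)
			continue
		}
		name, _, inline := strings.Cut(strings.TrimPrefix(arg, "--"), "=")
		if !known[name] && !tolerateUnknown {
			return nil, fmt.Errorf("unknown flag: --%s", name)
		}
		// Consume the next token as the flag's value unless the value was
		// given inline or the next token is itself a flag.
		if !inline && i+1 < len(args) && !strings.HasPrefix(args[i+1], "-") {
			i++
		}
	}
	return positional, nil
}

// requireOneArg mirrors the "requires at least 1 arg" check named in the commit.
func requireOneArg(positional []string) error {
	if len(positional) < 1 {
		return errors.New("requires at least 1 arg")
	}
	return nil
}

func main() {
	known := map[string]bool{"transport": true} // hypothetical known flag
	url := "http://localhost:8080/sse"          // hypothetical remote URL

	// URL passed behind an unknown flag: it is swallowed as that flag's value.
	pos, _ := positionalArgs([]string{"--remote-url", url}, known, true)
	fmt.Println(pos, requireOneArg(pos))

	// URL passed as the positional argument: the check passes.
	pos, _ = positionalArgs([]string{url}, known, true)
	fmt.Println(pos, requireOneArg(pos))
}
```

In this model the first call leaves no positional arguments, which is why tolerating the unknown flag surfaces as an argument-count error rather than an "unknown flag" error.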
Collaborator
@gkatz2 Yep, I already had a PR in the works: https://github.com/stacklok/toolhive/pull/4080/changes. Hopefully this removes the errors.
JAORMX
added a commit
that referenced
this pull request
Mar 11, 2026
These SSE endpoint rewrite tests were orphaned from CI until the matrix rebalance (#4074), then had stale flags fixed (#4080), but have never actually passed. Four root causes are addressed:

1. GenerateMCPServerURL omits /sse for remote SSE URLs with empty path, causing waitForInitializeSuccess to GET / instead of /sse (404).
2. getSSERewriteConfig reads X-Forwarded-* from the outbound request, which has auto-injected headers from SetXForwarded(), converting relative endpoint URLs into absolute ones with the proxy's own host.
3. Test 3 mock server lacks a 404 fallback for non-/sse paths.
4. Test 4 (OSV + endpoint-prefix) fails due to #3372; skip it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
aponcedeleonch
pushed a commit
that referenced
this pull request
Mar 11, 2026
Fix deterministic E2E proxy SSE endpoint rewrite test failures

These SSE endpoint rewrite tests were orphaned from CI until the matrix rebalance (#4074), then had stale flags fixed (#4080), but have never actually passed. Four root causes are addressed:

1. GenerateMCPServerURL omits /sse for remote SSE URLs with empty path, causing waitForInitializeSuccess to GET / instead of /sse (404).
2. getSSERewriteConfig reads X-Forwarded-* from the outbound request, which has auto-injected headers from SetXForwarded(), converting relative endpoint URLs into absolute ones with the proxy's own host.
3. Test 3 mock server lacks a 404 fallback for non-/sse paths.
4. Test 4 (OSV + endpoint-prefix) fails due to #3372; skip it.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- The `api` E2E bucket had 191 test specs crammed into one 15m CI job, nearly 4x the ~53-test safe ceiling given compilation overhead. This caused flaky timeouts and random test kills.
- 3 test files (`telemetry_metrics_validation`, `sse_endpoint_rewrite`, `network_isolation`) had labels that didn't match any CI matrix filter, so they never ran in CI.
- The matrix is split into 8 balanced buckets and the orphaned tests are fixed, while the original parent labels (`mcp`, `api`, `core`, etc.) still work for local development.

Type of change

Test plan
Verified all 21 modified test files compile (`go build ./test/e2e/...`). Confirmed no test file is orphaned by checking every `Describe` label against the CI matrix filters. Label changes are additive (sub-labels added alongside existing parent labels), so no test logic is affected.

Changes

| File | Change |
|---|---|
| `.github/workflows/e2e-tests.yml` | |
| `test/e2e/telemetry_metrics_validation_e2e_test.go` | `"middleware"` label (was orphaned from CI) |
| `test/e2e/sse_endpoint_rewrite_test.go` | `"proxy"` label (was orphaned from CI) |
| `test/e2e/network_isolation_test.go` | `"proxy"` label (was orphaned from CI) |
| `test/e2e/fetch_mcp_server_test.go` | `"mcp-run"` sub-label |
| `test/e2e/osv_mcp_server_test.go` | `"mcp-run"` sub-label |
| `test/e2e/osv_streamable_http_mcp_server_test.go` | `"mcp-protocol"` sub-label |
| `test/e2e/remote_mcp_server_test.go` | `"mcp-protocol"` sub-label |
| `test/e2e/protocol_builds_e2e_test.go` | `"mcp-protocol"` sub-label |
| `test/e2e/inspector_test.go` | `"mcp-protocol"` sub-label |
| `test/e2e/inspector_autocleanup_test.go` | `"mcp-protocol"` sub-label |
| `test/e2e/api_registry_test.go` | `"api-registry"` sub-label |
| `test/e2e/api_workloads_test.go` | `"api-workloads"` sub-label |
| `test/e2e/api_workload_lifecycle_test.go` | `"api-workloads"` sub-label |
| `test/e2e/api_clients_test.go` | `"api-clients"` sub-label |
| `test/e2e/api_clients_validation_test.go` | `"api-clients"` sub-label |
| `test/e2e/api_skills_test.go` | `"api-clients"` sub-label |
| `test/e2e/api_discovery_test.go` | `"api-misc"` sub-label |
| `test/e2e/api_groups_test.go` | `"api-misc"` sub-label |
| `test/e2e/api_healthcheck_test.go` | `"api-misc"` sub-label |
| `test/e2e/api_version_test.go` | `"api-misc"` sub-label |
| `test/e2e/api_secrets_test.go` | `"api-misc"` sub-label |
| `test/e2e/README.md` | |

Special notes for reviewers
Bucket distribution:
CI cost tradeoff: 8 parallel jobs instead of 4 (more billable minutes from fixed overhead), but the longest job drops from 191 specs to 57 specs, so wall-clock time should decrease significantly.
Generated with Claude Code