Add multi-backend optimizer acceptance test #4049

Merged
aponcedeleonch merged 1 commit into main from aponce/optimizer-multibackend-acceptance-test on Mar 9, 2026
@aponcedeleonch
Member

Summary

  • Issue #3759 requires acceptance tests for cold-start request latency when optimizing over 200 tools. This PR adds the test infrastructure and an initial multi-backend test.
  • Adds a new E2E test (virtualmcp_optimizer_multibackend_test.go) that deploys a VirtualMCPServer with 4 backends (yardstick, gofetch, osv, github-mcp-server) and validates optimizer behavior: tool exposure, cold-start latency (<5s), and search quality across backends.
  • Adds GitHubMCPServerImage constant for the github-mcp-server image used as a test backend.
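
The cold-start latency check described above can be sketched roughly as follows. This is a minimal illustration, not the code in virtualmcp_optimizer_multibackend_test.go: `findToolFunc`, `checkColdStart`, `instantCall`, and `slowCall` are hypothetical names, and the fake calls stand in for a real first find_tool request against a freshly deployed VirtualMCPServer. Only the 5-second budget comes from this PR.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// coldStartBudget mirrors the <5s threshold stated in the PR summary.
const coldStartBudget = 5 * time.Second

// findToolFunc is a hypothetical stand-in for the first find_tool request
// sent to a freshly deployed server.
type findToolFunc func(ctx context.Context, query string) error

// checkColdStart times a single call and returns an error if the call
// fails or takes longer than the budget.
func checkColdStart(call findToolFunc, query string, budget time.Duration) (time.Duration, error) {
	// The context deadline lets a well-behaved call give up at the budget.
	ctx, cancel := context.WithTimeout(context.Background(), budget)
	defer cancel()

	start := time.Now()
	err := call(ctx, query)
	elapsed := time.Since(start)

	if err != nil {
		return elapsed, fmt.Errorf("cold-start call failed after %s: %w", elapsed, err)
	}
	if elapsed > budget {
		return elapsed, fmt.Errorf("cold-start call took %s, budget is %s", elapsed, budget)
	}
	return elapsed, nil
}

// instantCall is a fake backend that answers immediately.
func instantCall(_ context.Context, _ string) error { return nil }

// slowCall is a fake backend that needs 200ms unless the context expires first.
func slowCall(ctx context.Context, _ string) error {
	select {
	case <-time.After(200 * time.Millisecond):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	elapsed, err := checkColdStart(instantCall, "fetch a web page", coldStartBudget)
	fmt.Printf("instant: ok=%v elapsed=%s\n", err == nil, elapsed)

	// A deliberately tiny budget shows the failure path.
	_, err = checkColdStart(slowCall, "fetch a web page", 10*time.Millisecond)
	fmt.Printf("slow: err=%v\n", err)
}
```

Measuring only the very first request after deployment is what makes this a cold-start check; repeating the call would measure warm latency instead.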

Addresses #3759

Type of change

  • New feature

Test plan

  • Linting (task lint-fix)
  • Manual testing (describe below)

Verified the new test file compiles cleanly with go vet. The test is designed to run in a kind cluster with task test-e2e (operator tests). CI will validate compilation and linting.

Special notes for reviewers

The issue requests 200+ tools for scale testing. This PR uses 4 real MCP server backends, which together provide fewer than 200 tools. Scaling to 200+ tools is planned as a follow-up (likely by adding a tool-generation fixture or additional backends).
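
One possible shape for the tool-generation fixture mentioned above is sketched below. Nothing here is part of this PR: `syntheticTool`, `generateTools`, the topic list, and the naming scheme are all assumptions made for illustration, and a real fixture would still need to register these tools with a backend that the VirtualMCPServer can aggregate.

```go
package main

import "fmt"

// syntheticTool is a hypothetical, minimal tool definition used only to
// inflate the tool count for scale testing.
type syntheticTool struct {
	Name        string
	Description string
}

// topics gives the generated descriptions some variety so that a search
// over them is not trivially uniform. The values are arbitrary.
var topics = []string{"weather", "billing", "calendar", "search", "storage"}

// generateTools returns n tools with unique, zero-padded names.
func generateTools(n int) []syntheticTool {
	if n < 0 {
		n = 0
	}
	tools := make([]syntheticTool, 0, n)
	for i := 0; i < n; i++ {
		topic := topics[i%len(topics)]
		tools = append(tools, syntheticTool{
			Name:        fmt.Sprintf("synthetic_tool_%03d", i),
			Description: fmt.Sprintf("Synthetic %s tool number %d", topic, i),
		})
	}
	return tools
}

func main() {
	tools := generateTools(200)
	fmt.Println(len(tools), tools[0].Name, tools[len(tools)-1].Name)
}
```

Deterministic names keep test failures reproducible, which matters more here than realistic tool descriptions.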

Generated with Claude Code

@github-actions github-actions bot added the size/S Small PR: 100-299 lines changed label Mar 9, 2026
codecov bot commented Mar 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.69%. Comparing base (f7cc727) to head (dfd699d).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4049      +/-   ##
==========================================
+ Coverage   68.63%   68.69%   +0.06%     
==========================================
  Files         446      446              
  Lines       45424    45424              
==========================================
+ Hits        31175    31205      +30     
+ Misses      11840    11810      -30     
  Partials     2409     2409              

@aponcedeleonch aponcedeleonch force-pushed the aponce/optimizer-multibackend-acceptance-test branch from e12142e to d428c32 Compare March 9, 2026 12:34
@aponcedeleonch aponcedeleonch force-pushed the aponce/optimizer-multibackend-acceptance-test branch from d428c32 to aa23404 Compare March 9, 2026 13:12
@aponcedeleonch aponcedeleonch force-pushed the aponce/optimizer-multibackend-acceptance-test branch from aa23404 to c8e9eb7 Compare March 9, 2026 13:42
@github-actions github-actions bot added size/M Medium PR: 300-599 lines changed and removed size/S Small PR: 100-299 lines changed labels Mar 9, 2026
@aponcedeleonch aponcedeleonch force-pushed the aponce/optimizer-multibackend-acceptance-test branch from c8e9eb7 to ed15e61 Compare March 9, 2026 14:02
yrobla previously approved these changes Mar 9, 2026
@aponcedeleonch aponcedeleonch force-pushed the aponce/optimizer-multibackend-acceptance-test branch from ed15e61 to ba9d4d5 Compare March 9, 2026 15:11
@aponcedeleonch aponcedeleonch force-pushed the aponce/optimizer-multibackend-acceptance-test branch from ba9d4d5 to ff3a867 Compare March 9, 2026 16:38
@aponcedeleonch aponcedeleonch force-pushed the aponce/optimizer-multibackend-acceptance-test branch from ff3a867 to 9510729 Compare March 9, 2026 19:08
@github-actions github-actions bot added size/L Large PR: 600-999 lines changed and removed size/M Medium PR: 300-599 lines changed labels Mar 9, 2026
Adds an E2E test that deploys a VirtualMCPServer with 4 backends
(yardstick, gofetch, osv, github) in a kind cluster, then validates:
- Only find_tool and call_tool are exposed in optimizer mode
- Cold-start FindTool request completes under 5 seconds
- Search results are semantically relevant across all backends

Addresses #3759 (200+ tools count to be added in a follow-up).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
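
The first validation in the commit message (only find_tool and call_tool exposed in optimizer mode) amounts to comparing the advertised tool list against a fixed set. A minimal sketch, assuming the tool names have already been retrieved as a slice of strings; `diffTools` and `optimizerTools` are hypothetical names, not identifiers from the test added in this PR:

```go
package main

import (
	"fmt"
	"sort"
)

// optimizerTools lists the two tools the commit message says are exposed
// in optimizer mode.
var optimizerTools = []string{"call_tool", "find_tool"}

// diffTools compares an advertised tool list against the expected set and
// returns the expected names that are missing and the advertised names
// that should not be there. Both results are sorted.
func diffTools(exposed, expected []string) (missing, unexpected []string) {
	want := make(map[string]bool, len(expected))
	for _, name := range expected {
		want[name] = true
	}

	seen := make(map[string]bool, len(exposed))
	for _, name := range exposed {
		if seen[name] {
			continue // ignore duplicates in the advertised list
		}
		seen[name] = true
		if !want[name] {
			unexpected = append(unexpected, name)
		}
	}

	for _, name := range expected {
		if !seen[name] {
			missing = append(missing, name)
		}
	}

	sort.Strings(missing)
	sort.Strings(unexpected)
	return missing, unexpected
}

func main() {
	// Hypothetical tool list from a server that leaks a backend tool.
	exposed := []string{"find_tool", "call_tool", "fetch"}
	missing, unexpected := diffTools(exposed, optimizerTools)
	fmt.Println("missing:", missing, "unexpected:", unexpected)
}
```

Reporting both directions separately makes a failure message say whether the optimizer hid too much or leaked a backend tool.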
@aponcedeleonch aponcedeleonch force-pushed the aponce/optimizer-multibackend-acceptance-test branch from 9510729 to dfd699d Compare March 9, 2026 19:09
@aponcedeleonch aponcedeleonch merged commit 86fcca2 into main Mar 9, 2026
36 checks passed
@aponcedeleonch aponcedeleonch deleted the aponce/optimizer-multibackend-acceptance-test branch March 9, 2026 19:44
Labels

size/L Large PR: 600-999 lines changed

Development

Successfully merging this pull request may close these issues.

Acceptance testing — cold start request latency when optimizing over 200 tools

3 participants