fix(serve): sync E2E baseline capabilities with registry #4284
PRs #4249 (workspace memory + agents CRUD) and #4269 (workspace file read routes) added `workspace_memory`, `workspace_agents`, and `workspace_file_read` to `SERVE_CAPABILITY_REGISTRY` and updated the unit-level `EXPECTED_STAGE1_FEATURES` in `packages/cli/src/serve/server.test.ts`, but missed the matching integration-test expectation. The E2E `qwen serve — capabilities envelope > advertises all baseline capabilities` assertion has been failing on `main` since those PRs landed. Append the three tags in the same positions as `SERVE_CAPABILITY_REGISTRY` and the unit-level constant (`workspace_memory` + `workspace_agents` after `workspace_providers`, `workspace_file_read` after `mcp_guardrails`). No production code changes — same shape as #4268.
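The ordering rule described above can be illustrated with a minimal sketch. The two-entry starting list and the `insertAfter` helper are hypothetical and are not taken from the repository; the real baseline list is longer and is not reproduced here.

```typescript
// Hypothetical stand-in for the pre-PR integration-test baseline.
// Only the two anchor tags named in the description are included.
const baselineBefore: string[] = ["workspace_providers", "mcp_guardrails"];

// Return a new array with `tags` inserted immediately after `anchor`.
// Throws if the anchor is absent so a renamed tag is noticed right away.
function insertAfter(
  list: readonly string[],
  anchor: string,
  tags: readonly string[],
): string[] {
  const index = list.indexOf(anchor);
  if (index === -1) {
    throw new Error(`anchor not found: ${anchor}`);
  }
  return [...list.slice(0, index + 1), ...tags, ...list.slice(index + 1)];
}

// Apply the two insertions in the positions the description names.
const withMemoryAndAgents = insertAfter(baselineBefore, "workspace_providers", [
  "workspace_memory",
  "workspace_agents",
]);
const baselineAfter = insertAfter(withMemoryAndAgents, "mcp_guardrails", [
  "workspace_file_read",
]);

console.log(baselineAfter);
```

In the actual test file the list is edited by hand; the helper only makes the intended relative order explicit.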
Pull request overview
This PR updates the qwen serve integration-test capability baseline so the E2E expectation matches the current serve capability registry after recently added workspace features. No production code changes are included.
Changes:
- Adds `workspace_memory` and `workspace_agents` to the expected baseline features.
- Adds `workspace_file_read` to the expected baseline features.
- Preserves ordering consistent with `SERVE_CAPABILITY_REGISTRY` and unit-level expectations.
📋 Review Summary
This PR fixes a test synchronization issue by updating the E2E integration test baseline to include …
Code Coverage Summary
For detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run.
wenshao left a comment
No issues found. LGTM! ✅ — DeepSeek/deepseek-v4-pro via Qwen Code /review
Two regressions introduced by #4271 (MCP guardrail push events) had been failing every main E2E run since the PR landed. Both fixes are in integration tests; no source changes.

1. `qwen serve — capabilities envelope > advertises all baseline capabilities`. `mcp_guardrail_events` was added to `SERVE_CAPABILITY_REGISTRY` and to the unit baseline list (`packages/cli/src/serve/server.test.ts:119`) but not to the integration test's hand-maintained list. Same drift class as #4268 / #4284. Fix: append the tag in registry order.

2. `MCP child amplification (P1 baseline) > clientCount matches external pgrep observation`. The test (added by #4271, never passed CI) asserted `pgrep_observed === MCP_SERVERS_CONFIGURED`, ignoring that an ACP child runs TWO `Config` objects: bootstrap (`runAcpAgent` → `config.initialize`) and per-session (`newSessionConfig` → `config.initialize`), each with its own `McpClientManager`. After one session, pgrep observes 2×N grandchildren while the `/workspace/mcp` snapshot (`buildWorkspaceMcpStatus(this.config)`) reads only the bootstrap manager (=N). Fix: encode the 2× architectural amplification literally so a future follow-up that unifies the managers fails this assertion and forces an explicit update; keep `clientCount === MCP_SERVERS_CONFIGURED` and the original `clientCount ≤ pgrep` over-report guard intact.

Verified locally: both tests pass on first attempt (no retries) via `vitest run --root ./integration-tests cli/qwen-serve-routes.test.ts cli/qwen-serve-baseline.test.ts`.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
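The three invariants in the second fix can be written out as a small sketch. The `McpObservation` shape, the `checkAmplification` helper, and the value N = 3 are hypothetical and chosen for illustration; the real test reads these numbers from a running `qwen serve` process and from pgrep.

```typescript
// Hypothetical container for the three numbers the commit message compares.
interface McpObservation {
  serversConfigured: number; // N, i.e. MCP_SERVERS_CONFIGURED
  clientCount: number; // reported by the /workspace/mcp snapshot
  pgrepObserved: number; // grandchildren counted externally via pgrep
}

// Return a list of violated invariants; an empty list means all hold.
function checkAmplification(o: McpObservation): string[] {
  const failures: string[] = [];
  // The snapshot reads only the bootstrap manager, so it reports exactly N.
  if (o.clientCount !== o.serversConfigured) {
    failures.push("clientCount must equal the configured server count");
  }
  // Over-report guard: the snapshot may never claim more than pgrep sees.
  if (o.clientCount > o.pgrepObserved) {
    failures.push("clientCount must not exceed the pgrep count");
  }
  // Two Config objects, each with its own manager, spawn 2 x N children.
  if (o.pgrepObserved !== 2 * o.serversConfigured) {
    failures.push("pgrep count must equal 2 x the configured server count");
  }
  return failures;
}

// Current architecture after one session, assuming N = 3: no failures.
const current = checkAmplification({
  serversConfigured: 3,
  clientCount: 3,
  pgrepObserved: 6,
});

// If a follow-up unified the managers, pgrep would see N and the 2x check trips.
const unified = checkAmplification({
  serversConfigured: 3,
  clientCount: 3,
  pgrepObserved: 3,
});

console.log(current.length, unified.length); // prints "0 1"
```

Encoding the factor of two as an equality rather than a lower bound is what makes a later unification of the managers fail loudly instead of passing unnoticed.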
Summary
- Add `workspace_memory`, `workspace_agents`, and `workspace_file_read` to the integration-test `caps.features` baseline so it matches the current `SERVE_CAPABILITY_REGISTRY` and the unit-level `EXPECTED_STAGE1_FEATURES`.
- `qwen serve — capabilities envelope > advertises all baseline capabilities` has been failing on `main` since #4249 ("feat(serve): workspace memory and agents CRUD (#4175 Wave 4 PR 16)") and #4269 ("feat(serve): safe workspace file read routes (#4175 PR 19)") landed without updating the integration mirror. This is the same shape of drift fixed by #4268 ("fix(serve): add mcp_guardrails to E2E capabilities expectation") for `mcp_guardrails`.

Failing run that motivated this: https://github.com/QwenLM/qwen-code/actions/runs/26026289503
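As a rough illustration of the drift class described above, the sketch below compares a registry list against a hand-maintained mirror and reports what each side lacks. The `diffCapabilities` helper and both sample arrays are hypothetical; they are short stand-ins, not the real contents of `SERVE_CAPABILITY_REGISTRY` or the integration-test list.

```typescript
// Result of comparing the registry with a hand-maintained expectation list.
interface CapabilityDrift {
  missing: string[]; // in the registry but absent from the expectation
  stale: string[]; // in the expectation but no longer in the registry
}

// Compare two tag lists by membership, preserving each list's own order.
function diffCapabilities(
  registry: readonly string[],
  expected: readonly string[],
): CapabilityDrift {
  const registrySet = new Set(registry);
  const expectedSet = new Set(expected);
  return {
    missing: registry.filter((tag) => !expectedSet.has(tag)),
    stale: expected.filter((tag) => !registrySet.has(tag)),
  };
}

// Hypothetical fragment of the registry after #4249 and #4269.
const registryFragment = [
  "workspace_providers",
  "workspace_memory",
  "workspace_agents",
  "mcp_guardrails",
  "workspace_file_read",
];

// Hypothetical fragment of the integration-test list before this PR.
const integrationFragment = ["workspace_providers", "mcp_guardrails"];

const drift = diffCapabilities(registryFragment, integrationFragment);
console.log(drift.missing); // the three tags this PR appends
console.log(drift.stale); // empty: nothing was removed from the registry
```

A check of this kind only detects membership drift; the PR itself also keeps the tags in registry order, which a set comparison does not verify.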
Test plan
- `npm run test:integration:sandbox:none -- integration-tests/cli/qwen-serve-routes.test.ts` passes locally.
- `qwen serve — capabilities envelope > advertises all baseline capabilities` passes in CI on Linux (sandbox:none + sandbox:docker) and macOS.

🤖 Generated with Qwen Code