test(perf): skip daemon baseline harness under sandbox #4234
The qwen-serve-baseline harness walks the daemon process tree using host-side `pgrep -P`. Under the Docker/Podman sandbox the daemon's `qwen --acp` child and its MCP grandchildren run inside the container's PID namespace, which host `pgrep` cannot observe, so the MCP-grandchild descendant walk always sees zero and times out. The test passes in the no-sandbox job but failed every retry in the Docker release job. Extend the existing Windows `SKIP` gate to also skip when sandbox is enabled, matching the precedent in acp-integration.test.ts and cron-tools.test.ts. Refs #4205
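The failure mode described above can be illustrated with a minimal sketch of a `pgrep -P`-style descendant walk. This is not the harness's actual code: `parsePgrepOutput`, `collectDescendants`, `ChildLookup`, and the PID tables are hypothetical names and made-up data, and the child lookup is injected rather than spawning `pgrep`, so the example runs anywhere. The sandbox case is modelled as an empty host-side table, which is a simplification of "the host cannot observe the container's processes".

```typescript
// Parse the stdout of `pgrep -P <pid>`, which prints one child PID per line.
function parsePgrepOutput(stdout: string): number[] {
  return stdout
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => /^\d+$/.test(line))
    .map((line) => Number(line));
}

// Hypothetical lookup type: returns the direct children of a PID.
type ChildLookup = (pid: number) => number[];

// Breadth-first walk that collects every descendant of rootPid.
function collectDescendants(rootPid: number, childrenOf: ChildLookup): number[] {
  const seen = new Set<number>([rootPid]);
  const queue: number[] = [rootPid];
  const descendants: number[] = [];
  while (queue.length > 0) {
    const pid = queue.shift() as number;
    for (const child of childrenOf(pid)) {
      if (seen.has(child)) continue;
      seen.add(child);
      descendants.push(child);
      queue.push(child);
    }
  }
  return descendants;
}

// Made-up process tables keyed by parent PID, holding fake `pgrep -P` output.
// No sandbox: daemon (100) -> ACP child (101) -> two MCP grandchildren.
const noSandboxView: Record<number, string> = { 100: '101\n', 101: '102\n103\n' };
// Sandbox (simplified): the host sees none of the container's processes.
const sandboxHostView: Record<number, string> = {};

const lookupIn = (table: Record<number, string>): ChildLookup =>
  (pid) => parsePgrepOutput(table[pid] ?? '');

const visibleWithoutSandbox = collectDescendants(100, lookupIn(noSandboxView)); // [101, 102, 103]
const visibleUnderSandbox = collectDescendants(100, lookupIn(sandboxHostView)); // []
```

A test that waits for the descendant count to reach some positive number can never succeed against the second table, which matches the timeout seen in the Docker release job.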
📋 Review Summary
This PR fixes a test-environment regression introduced in #4205 by skipping the
🔍 General Feedback
🎯 Specific Feedback
🔵 Low
✅ Highlights
doudouOUC
left a comment
Reviewed against the follow-up note on #4205 and the Docker release-job failure it called out. The change is narrowly scoped to the baseline harness skip gate, and the QWEN_SANDBOX predicate matches the existing sandbox skip precedent in the ACP / cron integration tests.
The tradeoff is intentional: this host-side ps / pgrep harness has no meaningful signal under Docker/Podman PID namespace isolation, while the no-sandbox path still runs unchanged. CI is green, and I do not see any blocking issue here.
The shared skip-helper idea from the automated comment is reasonable future cleanup, but not necessary for this targeted release-job fix.
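For reference, one possible shape for such a shared skip helper is sketched below. It is only an illustration under stated assumptions: `isSandboxEnabled` and `shouldSkipHostProcessTest` are hypothetical names, not functions from this repository, and treating an unset or empty `QWEN_SANDBOX` as "no sandbox" is an assumption about the existing check. The functions take the platform and environment as parameters so they stay pure and easy to test.

```typescript
// Hypothetical environment shape; in a real test this would be process.env.
type Env = Record<string, string | undefined>;

// Assumption: the sandbox counts as enabled when QWEN_SANDBOX is set to any
// non-empty value other than "false" (case-insensitive).
function isSandboxEnabled(env: Env): boolean {
  const value = env['QWEN_SANDBOX'];
  if (value === undefined || value === '') return false;
  return value.toLowerCase() !== 'false';
}

// Skip host-side process-inspection tests on Windows or under the sandbox.
function shouldSkipHostProcessTest(platform: string, env: Env): boolean {
  return platform === 'win32' || isSandboxEnabled(env);
}
```

A test file could then compute its gate once, for example `const SKIP = shouldSkipHostProcessTest(process.platform, process.env);`, instead of repeating the predicate in each suite.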
Summary
- The `qwen-serve-baseline.test.ts` harness is now skipped when running under the Docker/Podman sandbox, in addition to the existing Windows skip.
- The harness walks the daemon process tree using host-side `pgrep -P`. Under the sandbox the daemon's `qwen --acp` child and its MCP server grandchildren run inside the container's PID namespace, which a host `pgrep` cannot see. The "MCP child amplification" test therefore waits for MCP grandchildren that are structurally invisible to it, and times out on every retry. The test was added in #4205 (test(perf): add daemon baseline harness (#4175 Wave 1 PR 1)) and first ran in a Docker release job on the 2026-05-17 scheduled release, where it was the sole failure; it passes in the no-sandbox job in ~3.6s.
- `SKIP` predicate: it uses the exact same `QWEN_SANDBOX` check shape already established in `acp-integration.test.ts`, `acp-cron.test.ts`, and `cron-tools.test.ts`, which skip for the same PID-namespace reason.

This is a test-environment fix, not a product change. The baseline harness is by design a host-side `ps`/`pgrep` measurement tool (stated in its own file header); it has no meaningful definition inside a container whose process subtree lives in a separate namespace. Skipping under sandbox is the same decision already made for every other test in this suite that inspects spawned child processes; this test simply missed the gate when it was introduced. No production code path is touched, and no-sandbox coverage (the environment where these metrics are actually meaningful) is unchanged.

Validation
- With `QWEN_SANDBOX=docker`/`podman` the suite reports skipped instead of failing; under no-sandbox it runs exactly as before.
- `SKIP` is now true when `QWEN_SANDBOX` is set to a non-`false` value, false otherwise (Windows behavior unchanged).
- Review the `SKIP` constant diff and compare to `integration-tests/cli/acp-integration.test.ts` (`IS_SANDBOX`).

Scope / Risk
- `pgrep`-based measurements are not valid across a container PID-namespace boundary, so a Docker run produced no signal anyway, only a false failure. Full coverage remains in the no-sandbox job.

Testing Matrix
Testing matrix notes:
Linked Issues / Bugs
Fixes the Docker release-job regression introduced by #4205. Refs #4175 (Mode B v0.16 rollout tracking issue).