test: stabilize main e2e flakes #3992
Merged
Code Coverage Summary
CLI Package - Full Text Report
Core Package - Full Text Report
For detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run.
wenshao
approved these changes
May 10, 2026
wenshao
left a comment
Collaborator
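E2E test fix review passed ✅
- `tool-control.test.ts`: temporary directory isolation, `updatedInput` passthrough fix, and a stronger `read_file` assertion
- `file-system.test.ts`: the `args.includes(fileName)` filter narrows the telemetry assertion
- `cron-interactive.test.ts`: `waitForScreen` replaces `idle(8000)` to eliminate a race condition
- `single-turn.test.ts` / `system-control.test.ts` / `cron-tools.test.ts`: minor adjustments
Build, typecheck, and ESLint all pass. LGTM!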
— DeepSeek/deepseek-v4-pro via Qwen Code /review
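The temporary-directory isolation noted in the review can be illustrated with a minimal sketch. It assumes Node's built-in `fs.mkdtempSync`; the helper name `createIsolatedTestDir` and the `updated-input` prefix are hypothetical and are not taken from this PR's diff.

```typescript
import { existsSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper (not from the PR): creates a fresh directory per test
// case so files retained from an earlier run are never visible to this one.
function createIsolatedTestDir(baseDir: string, testName: string): string {
  // mkdtempSync appends six random characters to the prefix, so two calls
  // with the same test name still produce distinct directories.
  return mkdtempSync(join(baseDir, `${testName}-`));
}

// Simulate an earlier run that left test.txt behind.
const earlierRun = createIsolatedTestDir(tmpdir(), "updated-input");
writeFileSync(join(earlierRun, "test.txt"), "stale content");

// A later case gets its own directory and does not see the stale file.
const laterRun = createIsolatedTestDir(tmpdir(), "updated-input");
console.log(existsSync(join(laterRun, "test.txt"))); // false

// Clean up both directories.
rmSync(earlierRun, { recursive: true, force: true });
rmSync(laterRun, { recursive: true, force: true });
```

Because each case writes into its own directory, retaining output for debugging no longer changes what a later case observes.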
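The narrowed telemetry assertion mentioned in the review (`args.includes(fileName)`) might look roughly like the sketch below. The `ToolCallLog` shape, the `findToolCallsForFile` helper, and the sample records are assumptions made for illustration; the project's real telemetry records and test helpers may differ.

```typescript
// Hypothetical shape of one telemetry record; assumed for this sketch only.
interface ToolCallLog {
  name: string;
  args: string; // serialized tool arguments
  success: boolean;
}

// Keep only calls to the given tool whose serialized arguments mention the file.
function findToolCallsForFile(
  logs: ToolCallLog[],
  toolName: string,
  fileName: string,
): ToolCallLog[] {
  return logs.filter(
    (log) => log.name === toolName && log.args.includes(fileName),
  );
}

// Made-up sample data: the model read an unrelated file before the target one.
const sampleLogs: ToolCallLog[] = [
  { name: "read_file", args: JSON.stringify({ path: "/work/package.json" }), success: true },
  { name: "read_file", args: JSON.stringify({ path: "/work/non_existent.txt" }), success: false },
  { name: "list_directory", args: JSON.stringify({ path: "/work" }), success: true },
];

// Filtering by tool name alone would match two records; filtering by file
// name as well leaves only the call the test cares about.
const targeted = findToolCallsForFile(sampleLogs, "read_file", "non_existent.txt");
console.log(targeted.length); // 1
```

The design point is that an assertion over every `read_file` call depends on whatever else the model chose to read, while an assertion over the filtered call does not.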
Summary
What changed:
- Run the `updatedInput` E2E cases in unique temporary subdirectories so retained CI output cannot leak an old `test.txt` into later runs.
- Narrow the telemetry assertion to the `read_file` call that targets `non_existent.txt`.
- Replace `idle(8000)` in the cron interactive test with a condition-based screen wait.
Why it changed:
- `tool-control` `updatedInput` failures, confirming this is not a one-off CI failure.
Reviewer focus:
- `KEEP_OUTPUT=true`.
Validation
Commands run:
Prompts / inputs used:
- 25595712775.
- 25597002146, where Linux docker still reproduced the `tool-control` `updatedInput` failures.
Expected result:
- `tool-control` `updatedInput` cases are not affected by retained files from earlier tests.
Observed result:
- `OPENAI_API_KEY`, `OPENAI_BASE_URL`, and `OPENAI_MODEL` are not configured.
Quickest reviewer verification path:

    npm run build && npm run bundle
    npm run test:integration:sdk:sandbox:none -- tool-control
    npm run test:integration:cli:sandbox:none -- file-system
    npm run test:integration:interactive:sandbox:none -- cron-interactive

Evidence:
Scope / Risk
Main risk or tradeoff:
Not covered / not validated:
- `npx tsc --noEmit -p integration-tests/tsconfig.json` is currently blocked by existing integration-test type errors outside this change, so it is not listed as a passing check.
Breaking changes / migration notes:
Testing Matrix
Testing matrix notes:
Linked Issues / Bugs
Related to main E2E run https://github.com/QwenLM/qwen-code/actions/runs/25595712775
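To illustrate why replacing the fixed `idle(8000)` sleep with a condition-based wait removes the race, here is a minimal polling sketch. `waitForCondition` and the simulated `screen` string are hypothetical stand-ins; this is not the project's `waitForScreen` implementation, whose signature is not shown in this PR.

```typescript
// Hypothetical helper: polls a predicate until it holds or a deadline passes.
async function waitForCondition(
  check: () => boolean,
  timeoutMs = 5000,
  intervalMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (!check()) {
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    // Yield to the event loop before checking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

async function main(): Promise<void> {
  // Simulated terminal screen that receives its text after a short delay.
  let screen = "";
  setTimeout(() => {
    screen = "Scheduled job created";
  }, 200);

  // Returns as soon as the text appears instead of sleeping a fixed time.
  await waitForCondition(() => screen.includes("Scheduled"), 2000, 20);
  console.log("screen is ready:", screen);
}

main().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});
```

A fixed sleep is either too short on a slow runner, which makes the test flaky, or too long on a fast one, which makes it slow. Polling finishes as soon as the condition holds and fails with a clear timeout error otherwise.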