fix(core): honor output language in side queries#4519
Conversation
wenshao
left a comment
There was a problem hiding this comment.
[Suggestion] Missed callers that produce user-visible text
Three additional runSideQuery callers produce user-visible output but were not updated with respectOutputLanguagePreference: true:
packages/core/src/tools/web-fetch.ts:154— web content summaries shown to userspackages/core/src/agents/arena/ArenaManager.ts:1690— agent approach summaries for side-by-side comparisonpackages/core/src/utils/subagentGenerator.ts:129— generated agent name/description visible to users
Consider adding the flag to these call sites for consistency with the four callers already updated.
[Suggestion] Test gaps in sideQuery.test.ts
The new tests cover the happy path (flag=true, valid file with content) but miss three realistic edge cases:
- File does not exist (exercises the
catch {}path atsideQuery.ts:157) - File is empty or whitespace-only (exercises
content.trim() || undefined) getOutputLanguageFilePath()returnsundefinedwithrespectOutputLanguagePreference: true(exercises the early return atsideQuery.ts:152)
— qwen3.7-max via Qwen Code /review
| config: Config, | ||
| ): Promise<string | undefined> { | ||
| const filePath = config.getOutputLanguageFilePath?.(); | ||
| if (!filePath) return undefined; |
There was a problem hiding this comment.
[Suggestion] readOutputLanguagePreference calls readFile on every runSideQuery invocation with no caching. The file path is static for the Config instance lifetime, and generateToolUseSummary fires on every completed tool batch — so the same small file is read from disk dozens of times per session.
Consider memoizing the result (keyed by filePath):
let cachedPreference: { filePath: string; value: string | undefined } | undefined;
async function readOutputLanguagePreference(
config: Config,
): Promise<string | undefined> {
const filePath = config.getOutputLanguageFilePath?.();
if (!filePath) return undefined;
if (cachedPreference?.filePath === filePath) return cachedPreference.value;
try {
const content = await readFile(filePath, 'utf8');
const value = content.trim() || undefined;
cachedPreference = { filePath, value };
return value;
} catch {
cachedPreference = { filePath, value: undefined };
return undefined;
}
}— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Handled in 84eec74. readOutputLanguagePreference now memoizes by file path, and the new regression mutates the same output-language.md after the first side query to verify subsequent calls reuse the cached preference instead of re-reading.
| } as unknown as Config; | ||
| }); | ||
|
|
||
| function writeOutputLanguageFile(content: string): string { |
There was a problem hiding this comment.
[Suggestion] writeOutputLanguageFile is duplicated verbatim across 5 test files (sideQuery.test.ts, sessionRecap.test.ts, sessionTitle.test.ts, toolUseSummary.test.ts, suggestionGenerator.test.ts). Only the temp directory prefix string differs.
Consider extracting to a shared test utility:
// packages/core/src/test-utils/outputLanguage.ts
export function writeOutputLanguageFile(content: string): string {
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'qwen-output-lang-'));
const file = path.join(dir, 'output-language.md');
fs.writeFileSync(file, content, 'utf8');
return file;
}— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
I left this one out of the focused bug fix for now. Extracting the duplicated test helper would touch several unrelated test files, while this PR's behavioral changes stay limited to side-query output-language plumbing and its direct coverage.
| { role: 'user', parts: [{ text: RECAP_USER_PROMPT }] }, | ||
| ], | ||
| systemInstruction: RECAP_SYSTEM_PROMPT, | ||
| respectOutputLanguagePreference: true, |
There was a problem hiding this comment.
[Suggestion] RECAP_SYSTEM_PROMPT (line 20) and TITLE_SYSTEM_PROMPT (line 26 in sessionTitle.ts) both contain the instruction "Match the dominant language of the conversation (English or Chinese)." When respectOutputLanguagePreference: true appends a conflicting directive like "You MUST always respond in Chinese", the model receives two contradictory language instructions. The outcome is model-dependent and non-deterministic.
Consider either removing the "Match the dominant language" sentence from these prompts, or appending an explicit override (e.g., "This overrides any prior language instructions in this prompt.").
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Handled in 84eec74. The output-language append now includes an explicit override sentence, so a configured output-language preference supersedes earlier language-selection guidance in prompts like recap/title without editing each caller prompt separately.
pomelo-nwu
left a comment
There was a problem hiding this comment.
Hi @Jerry2003826, thank you for your continued contributions — 9 PRs in a short time is impressive! 🎉
As we review your changes, we'd like to ask you to update each PR to follow the latest PR template on the main branch. The most important section is the Reviewer Test Plan, which significantly accelerates the review and merge process.
Specifically, for each PR please include:
- How to verify — clear reproduction steps so a reviewer can confirm the fix/feature
- Evidence (Before & After) — use the
tmux-real-user-testingskill (or manual tmux capture) to show before/after screenshots of the TUI behavior. Side-by-side evidence makes it much faster for maintainers to validate and merge - Tested on — fill in the OS table (macOS / Windows / Linux)
PRs with a complete Reviewer Test Plan are prioritized for review — without it, review may be delayed.
You can see the full template at: .github/pull_request_template.md
Thanks again for your effort — looking forward to getting these merged! 🚀
中文说明
你好 @Jerry2003826,感谢你的持续贡献——短时间内提交了 9 个 PR,非常高效!🎉
在 review 过程中,我们希望你能按照 main 分支上最新的 PR 模版更新每个 PR 的描述。其中最关键的部分是 Reviewer Test Plan,它能显著加速审核和合并流程。
具体来说,请为每个 PR 补充:
- How to verify — 清晰的复现步骤,让 reviewer 能确认修复/功能的效果
- Evidence (Before & After) — 使用
tmux-real-user-testingskill(或手动 tmux 截取)展示修改前后的 TUI 截图对比,前后对比能让维护者更快地验证和合并 - Tested on — 填写操作系统测试表格(macOS / Windows / Linux)
有完整 Reviewer Test Plan 的 PR 会被优先审核——缺少该部分可能会导致审核延迟。
完整模版见:.github/pull_request_template.md
再次感谢你的付出,期待尽快把这些 PR 合并!🚀
— Qwen Code
|
Updated in a22d233. This addresses the review comments by adding
I also added coverage for the requested Validation run locally on Windows: npm run test --workspace=packages/core -- src/tools/web-fetch.test.ts -t "output language"
npm run test --workspace=packages/core -- src/utils/subagentGenerator.test.ts -t "output language"
npm run test --workspace=packages/core -- src/agents/arena/ArenaManager.test.ts -t "routes all approach summaries"
npm run test --workspace=packages/core -- src/utils/sideQuery.test.ts -t "output language path|output language file is blank"
npm run test --workspace=packages/core -- src/utils/sideQuery.test.ts src/services/sessionTitle.test.ts src/services/sessionRecap.test.ts src/services/toolUseSummary.test.ts src/followup/suggestionGenerator.test.ts src/tools/web-fetch.test.ts src/agents/arena/ArenaManager.test.ts src/utils/subagentGenerator.test.ts
npx prettier --check packages/core/src/utils/sideQuery.test.ts packages/core/src/tools/web-fetch.ts packages/core/src/tools/web-fetch.test.ts packages/core/src/agents/arena/ArenaManager.ts packages/core/src/agents/arena/ArenaManager.test.ts packages/core/src/utils/subagentGenerator.ts packages/core/src/utils/subagentGenerator.test.ts
npx eslint packages/core/src/utils/sideQuery.test.ts packages/core/src/tools/web-fetch.ts packages/core/src/tools/web-fetch.test.ts packages/core/src/agents/arena/ArenaManager.ts packages/core/src/agents/arena/ArenaManager.test.ts packages/core/src/utils/subagentGenerator.ts packages/core/src/utils/subagentGenerator.test.ts
npm run lint --workspace=packages/core
npm run typecheck --workspace=packages/core
git diff --checkNote: when I initially ran several vitest commands in parallel locally on Windows, two runs hit the known |
|
Added aaf6883 for the remaining I also rechecked the three user-visible call sites from the review against the current branch head; they already pass
Validation: npm run test --workspace=@qwen-code/qwen-code-core -- src/utils/sideQuery.test.ts -t "output language"
npm run test --workspace=@qwen-code/qwen-code-core -- src/utils/sideQuery.test.ts
npx eslint packages/core/src/utils/sideQuery.ts packages/core/src/utils/sideQuery.test.ts packages/core/src/tools/web-fetch.ts packages/core/src/agents/arena/ArenaManager.ts packages/core/src/utils/subagentGenerator.ts
npx prettier --check packages/core/src/utils/sideQuery.ts packages/core/src/utils/sideQuery.test.ts packages/core/src/tools/web-fetch.ts packages/core/src/agents/arena/ArenaManager.ts packages/core/src/utils/subagentGenerator.ts
npm run lint --workspace=@qwen-code/qwen-code-core
npm run typecheck --workspace=@qwen-code/qwen-code-core
git diff --check |
| ); | ||
| }); | ||
|
|
||
| it('leaves JSON system instruction unchanged when output language path is missing', async () => { |
There was a problem hiding this comment.
[Suggestion] JSON-mode negative test coverage is asymmetric with text mode. Text mode has three negative tests (path undefined, file missing, file blank), but JSON mode has only one (file missing). Also: the test name says "path is missing" when the file at the path is missing — the text-mode tests correctly distinguish these cases. And the bare path.join(os.tmpdir(), 'qwen-side-query-missing.md') is not parallel-safe; the text-mode "file is missing" test at line ~515 correctly uses fs.mkdtempSync for isolation.
Consider adding two JSON-mode tests mirroring text mode, and using mkdtempSync for the existing one:
it('leaves JSON system instruction unchanged when output language path is undefined', async () => {
vi.mocked(mockBaseLlmClient.generateJson).mockResolvedValue({ ok: true });
// mockConfig already returns undefined for getOutputLanguageFilePath
await runSideQuery<{ ok: boolean }>(mockConfig, { /* ... */ respectOutputLanguagePreference: true });
expect(callArg.systemInstruction).toBe('custom JSON side query prompt');
});
it('leaves JSON system instruction unchanged when output language file is missing', async () => {
vi.mocked(mockBaseLlmClient.generateJson).mockResolvedValue({ ok: true });
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'qwen-side-query-missing-'));
vi.mocked(mockConfig.getOutputLanguageFilePath).mockReturnValue(
path.join(dir, 'output-language.md'),
);
// ...
});
it('leaves JSON system instruction unchanged when output language file is blank', async () => {
vi.mocked(mockBaseLlmClient.generateJson).mockResolvedValue({ ok: true });
vi.mocked(mockConfig.getOutputLanguageFilePath).mockReturnValue(
writeOutputLanguageFile(' \n\t '),
);
// ...
});— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Added in e0c28da. JSON mode now mirrors the text-mode negative coverage: undefined output-language path, missing file using an isolated temp dir, and blank file all leave the original JSON systemInstruction unchanged.
| const callArg = mockGenerateContent.mock.calls[0]?.[0]; | ||
| expect(callArg.systemInstruction).toContain( | ||
| 'You MUST always respond in Chinese.', | ||
| ); |
There was a problem hiding this comment.
[Suggestion] This assertion only checks that the language preference is present, but doesn't verify the original systemInstruction is preserved. Compare with existing tests in toolUseSummary.test.ts and sessionRecap.test.ts which assert both:
// Established pattern (e.g. toolUseSummary.test.ts:322-326):
expect(options.systemInstruction).toContain(TOOL_USE_SUMMARY_SYSTEM_PROMPT);
expect(options.systemInstruction).toContain('You MUST always answer in Chinese.');If appendSystemInstructionText had a bug that replaced rather than appended, this test would still pass. Add:
| ); | |
| expect(callArg.systemInstruction).toContain( | |
| 'Extract and summarize the requested information', | |
| ); | |
| expect(callArg.systemInstruction).toContain( | |
| 'You MUST always answer in Chinese.', | |
| ); |
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Added in e0c28da. The web-fetch output-language test now asserts both that the original web-fetch system instruction (Extract and summarize the requested information...) is preserved and that the configured language preference is appended.
| const generateJsonCall = (mockClient.generateJson as Mock).mock.calls[0]; | ||
| const callParams = generateJsonCall[0]; | ||
| expect(callParams.systemInstruction).toContain( | ||
| 'You MUST always respond in Chinese.', |
There was a problem hiding this comment.
[Suggestion] Same as web-fetch.test.ts — this only asserts the language preference is present. The established pattern in toolUseSummary.test.ts and sessionRecap.test.ts also verifies the original systemInstruction is preserved:
| 'You MUST always respond in Chinese.', | |
| expect(callParams.systemInstruction).toContain( | |
| 'You are an elite AI agent architect', | |
| ); | |
| expect(callParams.systemInstruction).toContain( | |
| 'You MUST always answer in Chinese.', | |
| ); |
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Added in e0c28da. The subagent output-language test now follows the same pattern: it verifies the original You are an elite AI agent architect system prompt remains present and the configured language preference is appended.
aaf6883 to
e0c28da
Compare
|
Updated in e0c28da for the latest review suggestions. Changes made:
Validation run locally on Windows: npm run test --workspace=packages/core -- src/utils/sideQuery.test.ts src/tools/web-fetch.test.ts src/utils/subagentGenerator.test.ts -t "output language|JSON system instruction"
npm run test --workspace=packages/core -- src/utils/sideQuery.test.ts src/tools/web-fetch.test.ts src/utils/subagentGenerator.test.ts
npx prettier --check packages/core/src/utils/sideQuery.test.ts packages/core/src/tools/web-fetch.test.ts packages/core/src/utils/subagentGenerator.test.ts
npx eslint packages/core/src/utils/sideQuery.test.ts packages/core/src/tools/web-fetch.test.ts packages/core/src/utils/subagentGenerator.test.ts
npm run typecheck --workspace=packages/core
npm run lint --workspace=packages/core
git diff --check |
wenshao
left a comment
There was a problem hiding this comment.
R3: All R2 suggestions addressed. No new issues found. ✅ — qwen3.7-max via Qwen Code /review
wenshao
left a comment
There was a problem hiding this comment.
R4 incremental review (delta since R3: sideQuery.ts memoization + override sentence, sideQuery.test.ts new tests). The memoization and override sentence address the remaining R1 Suggestions. One new Suggestion below regarding cache invalidation for long-running sessions.
— qwen3.7-max via Qwen Code /review
| 'User output language preference from output-language.md:'; | ||
| const OUTPUT_LANGUAGE_PREFERENCE_OVERRIDE = | ||
| 'This preference overrides any earlier language-selection rule in this system instruction.'; | ||
| const outputLanguagePreferenceCache = new Map< |
There was a problem hiding this comment.
[Suggestion] outputLanguagePreferenceCache is a module-level Map with no invalidation path. When a user runs /language mid-session, languageCommand.ts rewrites the file and refreshes the main conversation's system instruction — but this cache still holds the old resolved Promise. All 7 respectOutputLanguagePreference: true consumers (session title, recap, tool-use summary, follow-up suggestions, web-fetch, subagent generation, Arena) will serve the stale language rule for the remainder of the process.
The new memoization test (sideQuery.test.ts) correctly asserts this behavior, but it codifies a regression for daemon mode or any long-lived session where the user changes language.
Suggested fix: Export a clearOutputLanguagePreferenceCache() function and call it from the /language command handler (alongside the existing config.refreshHierarchicalMemory() call), or scope the cache to the Config instance lifetime rather than the module:
export function clearOutputLanguagePreferenceCache(): void {
outputLanguagePreferenceCache.clear();
}Then in languageCommand.ts:
import { clearOutputLanguagePreferenceCache } from '../utils/sideQuery.js';
// ...after updateOutputLanguageFile():
clearOutputLanguagePreferenceCache();— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Fixed in 28aee11. Side-query output-language caching now invalidates on file signature changes, and /language output ... explicitly clears the cached path after rewriting output-language.md. Added regressions for both the side-query refresh and the live /language command cache clear.
wenshao
left a comment
There was a problem hiding this comment.
R5 incremental review (delta since R4: cache invalidation via stat-based signatures + clearOutputLanguagePreferenceCache export). CI 10/10 all pass. 75 tests pass.
Missed caller (out of diff scope): packages/cli/src/ui/commands/summaryCommand.ts:107 calls runSideQuery to generate a user-visible project summary (displayed in TUI and saved to .qwen/PROJECT_SUMMARY.md) but does not set respectOutputLanguagePreference: true. If the user configures output language to Chinese, the project summary would still be generated in English. Consider adding respectOutputLanguagePreference: true to match the PR's stated goal.
— qwen3.7-max via Qwen Code /review
|
|
||
| // Update the rule file with the resolved language | ||
| updateOutputLanguageFile(settingValue); | ||
| clearOutputLanguagePreferenceCache(config?.getOutputLanguageFilePath?.()); |
There was a problem hiding this comment.
[Suggestion] Path mismatch between write target and cache invalidation target.
updateOutputLanguageFile(settingValue) writes to the global path (~/.qwen/output-language.md, hardcoded in languageUtils.ts), but clearOutputLanguagePreferenceCache(config?.getOutputLanguageFilePath?.()) clears the cache for the config's path, which may be a project-level path (<cwd>/.qwen/output-language.md). When a project-level file exists, the write modifies the global file while side queries re-read the unchanged project-level file — the user's /language command has no visible effect on side-query outputs.
Additionally, when config?.getOutputLanguageFilePath?.() returns undefined (e.g., config lacks the method), clearOutputLanguagePreferenceCache(undefined) fires the clear-all branch, wiping every cached path — including entries from other sessions in daemon mode.
| clearOutputLanguagePreferenceCache(config?.getOutputLanguageFilePath?.()); | |
| clearOutputLanguagePreferenceCache(); |
Using the no-arg version clears all entries as a belt-and-suspenders approach. The stat-based invalidation in readOutputLanguagePreference handles correctness automatically on the next call, so the only cost is one extra readFile per session.
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Addressed in the current branch. /language output now calls clearOutputLanguagePreferenceCache() without a path after updateOutputLanguageFile(), so project/global path differences cannot leave a stale side-query language preference cached.
| } | ||
| >(); | ||
|
|
||
| export function clearOutputLanguagePreferenceCache(filePath?: string): void { |
There was a problem hiding this comment.
[Suggestion] This new public API has two branches (specific-path deletion and clear-all) but no behavioral unit tests. It is only verified via vi.fn() mock in languageCommand.test.ts, which confirms the function was called but not that it works.
Consider adding tests in sideQuery.test.ts:
- Specific-path deletion: populate cache via
runSideQuery, callclearOutputLanguagePreferenceCache(path), verify next call re-reads the file. - Clear-all: populate cache with entries, call
clearOutputLanguagePreferenceCache(), verify all entries are removed.
Also consider adding a JSDoc note that this function is a performance hint — stat-based invalidation in readOutputLanguagePreference handles correctness automatically.
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Covered in the current branch. sideQuery.test.ts now exercises both cache-clear branches: specific-path deletion preserves other cached entries, and no-argument clearing refreshes all cached preferences. Verified with npm run test --workspace=packages/core -- src/utils/sideQuery.test.ts.
| ], | ||
| }, | ||
| ], | ||
| respectOutputLanguagePreference: true, |
There was a problem hiding this comment.
[Suggestion] Output language rule is appended twice to this side query. The main session's systemInstruction (extracted at line 99-105 via getGenerationConfig().systemInstruction) already embeds output-language.md content as part of userMemory (loaded through getExtensionContextFilePaths() → loadServerHierarchicalMemory()). Setting respectOutputLanguagePreference: true causes applyOutputLanguagePreference in sideQuery.ts to read the same file and append the language rule a second time.
The model receives the instruction duplicated in every summary generation — redundant tokens and a confusing artifact for anyone debugging system instructions.
Consider either:
- Removing
respectOutputLanguagePreference: truehere since the carried-overchatSystemInstructionalready contains the rule, or - Stripping the output-language section from
chatSystemInstructionbefore passing it, and keeping the explicit flag.
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Addressed in the current branch. summaryCommand no longer sets respectOutputLanguagePreference: true for the project-summary side query because it carries over the main chat system instruction, which already includes the configured output-language rule through user memory.
| expect(secondCallArg.systemInstruction).not.toContain('Chinese'); | ||
| }); | ||
|
|
||
| it('clears one cached output language preference by file path', async () => { |
There was a problem hiding this comment.
[Suggestion] This test creates only a single file and verifies its cache was cleared, but does not include a negative assertion — it never checks that a different cached file's entry survived the selective clear. If clearOutputLanguagePreferenceCache(file) were accidentally changed to call .clear() instead of .delete(filePath), this test would still pass.
Consider creating a second file, caching it via runSideQuery, then after clearOutputLanguagePreferenceCache(firstFile), running a side query for the second file and asserting it still contains the original (stale) preference — proving its cache was not disturbed. This would make the two new tests properly complementary (one proves selective clear, the other proves blanket clear).
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Covered in the current branch. The specific-path cache test now populates two files, clears only the first path, and asserts the second cached preference survives instead of being globally cleared. Verified with npm run test --workspace=packages/core -- src/utils/sideQuery.test.ts.
| signature, | ||
| preference: readFile(filePath, 'utf8') | ||
| .then((content) => content.trim() || undefined) | ||
| .catch(() => undefined), |
There was a problem hiding this comment.
[Suggestion] Transient readFile failure permanently caches undefined until the file's mtime or size changes.
When stat() succeeds but readFile() fails (EMFILE, permission flap, transient IO error), .catch(() => undefined) stores a Promise<undefined> as cached.preference with the current signature. On subsequent calls, stat() returns the same mtimeMs:size, the cache hits, and the language preference is silently lost.
| .catch(() => undefined), | |
| .catch(() => { | |
| outputLanguagePreferenceCache.delete(filePath); | |
| return undefined; | |
| }), |
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Addressed in the current branch. readOutputLanguagePreference() now deletes the cache entry when readFile() fails for the current signature, so transient read failures do not permanently cache undefined.
Verification Report — PR #4519Reviewer: wenshao 1. Build & Type Check
2. Unit TestsPR-related tests (all pass):
Full core suite: 9438 passed, 1 failed, 3 skipped (344 files). The single failure ( 3. Code ReviewArchitecture (sideQuery.ts)The design is clean and well-layered:
Callers — Correct Opt-in7 side-query callers correctly opt in with
Test CoverageThe new tests thoroughly cover:
4. Issues Found
5. SummaryThis is a well-designed, well-tested fix. The opt-in Recommendation: ✅ Approve. Verified locally on 2026-05-28 |
|
Updated in b86309b for the transient read failure cache issue. If Validation run locally on Windows: npm run test --workspace=packages/core -- src/utils/sideQuery.test.ts -t "output language"
npm run test --workspace=packages/core -- src/utils/sideQuery.test.ts
npx prettier --check packages/core/src/utils/sideQuery.ts packages/core/src/utils/sideQuery.test.ts
npx eslint packages/core/src/utils/sideQuery.ts packages/core/src/utils/sideQuery.test.ts
npm run typecheck --workspace=packages/core |
|
CI note for the latest run: the macOS failure appears unrelated to this PR's side-query change. The failed test is in This PR only changes npm run test --workspace=packages/core -- src/utils/sideQuery.test.ts -t "output language"
npm run test --workspace=packages/core -- src/utils/sideQuery.test.ts
npm run test --workspace=packages/cli -- src/ui/components/messages/AskUserQuestionDialog.test.tsx -t "keeps bare k/j"I tried to rerun the failed workflow, but GitHub rejected it because I do not have repository admin rights. Could a maintainer rerun the failed macOS job? |
BZ-D
left a comment
There was a problem hiding this comment.
LGTM. The respectOutputLanguagePreference plumbing is thorough — correctly wired into session title, recap, tool-use summary, suggestion generator, web fetch, subagent generator, and arena manager. The stat-based cache with mtime+size signature is a good balance of freshness vs performance, and clearOutputLanguagePreferenceCache gives the /language command an explicit flush path. Edge cases (missing file, blank content, transient read failures) are all covered.
6ae137a to
21f8a79
Compare
21f8a79 to
63c46de
Compare
|
|
||
| import type { Content } from '@google/genai'; | ||
| import { describe, expect, it } from 'vitest'; | ||
| import { buildSessionRecapSideQueryOptions } from './sessionRecap.js'; |
There was a problem hiding this comment.
[Suggestion] The test was refactored from calling generateSessionRecap end-to-end to only testing buildSessionRecapSideQueryOptions. The old test exercised the full pipeline — config resolution, file read, system-instruction append, LLM call, extractRecap tag parsing, and the result.text / abort / error-swallowing branches. None of those paths are covered now.
Consider keeping the old integration test alongside the new unit test, or adding at least one test that calls generateSessionRecap through to the mocked LLM and asserts the final returned string (including the <recap> tag unwrap). The describe block name ('generateSessionRecap') also no longer matches what is actually tested.
— qwen3.7-max via Qwen Code /review
There was a problem hiding this comment.
Addressed in the current branch. sessionRecap.test.ts now keeps an end-to-end generateSessionRecap() test that exercises the side-query call and verifies tagged <recap> extraction, alongside the focused options-builder coverage for output-language opt-in.
BZ-D
left a comment
There was a problem hiding this comment.
Reviewed the diff and ran the targeted side-query language preference tests; no blocking issues found.
Possible regression:
|
7ef958e to
f9c3199
Compare
f9c3199 to
52f785e
Compare
wenshao
left a comment
There was a problem hiding this comment.
No high-confidence review findings. The implementation is sound: stat-based cache invalidation is correct, Promise deduplication across concurrent callers is efficient, and the opt-in flag is consistently applied to all user-visible callers. The session recap E2E test coverage gap flagged in R9 has been addressed in this commit. 255 tests pass, ESLint and TypeScript clean.
Needs Human Review (low-confidence, terminal-only):
readOutputLanguagePreferencecatch blocks silently returnundefinedwith no debug logging — consider addingcreateDebugLoggerfor troubleshooting.- Em-dash regressed to ASCII hyphen in
sessionRecap.tscomment during extraction. languageCommand.tscomment slightly overstatesclearOutputLanguagePreferenceCachenecessity.
Note: PR has merge conflicts with main that need resolution before merging.
— qwen3.7-max via Qwen Code /review
…output-language # Conflicts: # packages/cli/src/services/insight/generators/DataProcessor.test.ts
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
DragonnZhang
left a comment
There was a problem hiding this comment.
Review Summary
Verdict: No new findings. This PR has been reviewed exhaustively (8+ prior review rounds, 23 inline comments, 9 issue comments). After reading all 24 changed files in full and cross-referencing every existing comment, I found no high-confidence issues that are not already tracked.
What was checked
- sideQuery.ts core logic:
respectOutputLanguagePreferenceopt-in gate,readOutputLanguagePreferencecaching with stat-based invalidation,appendSystemInstructionTextfor all 5systemInstructionvariants,clearOutputLanguagePreferenceCacheselective and full clear,isJsonOptionstype guard, concurrent-call safety via sharedPromisededuplication. Implementation is correct. - All 11 caller sites: sessionTitle, sessionRecap, toolUseSummary, suggestionGenerator, web-fetch, subagentGenerator, ArenaManager, DataProcessor (both
analyzeSessionandgenerateQualitativeInsights), LlmRewriter, summaryCommand (correctly skipped with explanatory comment). Each user-visible caller opts in; internal/classifier callers do not. - Test coverage: ~512 new lines of tests covering positive opt-in for text and JSON modes, negative paths (missing file, blank file, undefined path),
Part[]/Content/single-Partsystem instruction variants, cache refresh on file change, selective and full cache clear, transient read failure recovery, and opt-in gate negative test. Coverage is thorough. - languageCommand.ts:
clearOutputLanguagePreferenceCache()called afterupdateOutputLanguageFile()with clear-all (no path arg) to handle global-vs-project path differences. Correct. - config.ts:
getOutputLanguageFilePathrelocated for grouping; no behavioral change.
CI status
CI is currently failing (Lint, CodeQL, Test, Post Coverage Comment, Classify PR). Some failures appear environment-related (the macOS AskUserQuestionDialog test failure mentioned by the author). Approve downgraded to Comment per presubmit policy.
Open items from prior reviewers
Several open inline comments remain valid and tracked:
- Path mismatch between
updateOutputLanguageFile(global) andgetOutputLanguageFilePath(project-scoped) — mitigated by clear-all strategy sessionRecap.test.tsreduced from end-to-end to builder-only testingsummaryCommand.tsdouble-injection concern — addressed by inline comment explaining the carried system instruction already includes the language rule
— qwen-code via Qwen Code /review
|
Updated this branch against current The
Local validation on Windows:
Note: full |
DragonnZhang
left a comment
There was a problem hiding this comment.
Incremental Review (R2) — delta since R1
HEAD: 8730d9f06579 (two new commits: merge from origin/main + test: normalize dev launcher paths on Windows)
CI: 13/13 checks pass.
Delta reviewed
scripts/tests/dev.test.js— AddsnormalizePathhelper that converts Windows backslashes to forward slashes before substring assertions. Correct fix:String(filePath).endsWith('node_modules/tsx/dist/cli.mjs')would fail on native Windows paths wherefilePathcontains backslashes. The refactor to destructurespawnMock.mock.calls[0]and assert each field individually is clearer than the prior monolithictoHaveBeenCalledWith.
Verdict
No high-confidence findings in the delta. The Windows path normalization is a clean, well-scoped test fix.
The six stale inline comments from prior reviewers (IDs 3300572305, 3301123073, 3307003468, 3308736776, 3312297210, 3337943627) are all addressed in the current code:
- sessionRecap.ts language override — the appended text includes an explicit override sentence.
- JSON negative coverage — mirrored text-mode negative tests now present.
- languageCommand path mismatch —
clearOutputLanguagePreferenceCache()(no-arg) correctly clears the entire map. - summaryCommand double injection — correctly omits
respectOutputLanguagePreference: truesince the carried-over system instruction already embeds the rule. - Transient readFile caching — the
.catchhandler now deletes the cache entry on read failure. - sessionRecap test refactor — the
generateSessionRecapend-to-end test is retained withrunSideQuerymocked;buildSessionRecapSideQueryOptionsgets separate unit coverage.
— qwen3-coder via Qwen Code /review
…output-language # Conflicts: # scripts/tests/dev.test.js
DragonnZhang
left a comment
There was a problem hiding this comment.
Review: APPROVE
Scope: 30 files, +1151/-101 — adds opt-in respectOutputLanguagePreference to runSideQuery for user-visible side queries (session title, recap, tool-use summary, suggestions, web-fetch summaries, subagent artifacts, arena summaries, /insight reports, LLM rewriter). Includes stat-based file cache with invalidation on language change. Bundles two small fixes (cronScheduler creation-minute guard, chatCompression hard-threshold formula) from upstream main merges.
CI: All checks pass (Lint, Test x3 platforms, CodeQL).
Findings: No high-confidence bugs, security issues, or critical logic errors identified.
Analysis
- Side-query opt-in coverage: Correctly applied to all 10 user-visible call sites (sessionTitle, sessionRecap, toolUseSummary, suggestionGenerator, web-fetch, subagentGenerator, ArenaManager, LlmRewriter, DataProcessor x2). Intentionally omitted from internal classifiers (permissions/classifier, relevanceSelector, forget, nextSpeakerChecker, chatCompression) and from summaryCommand (which already carries the language rule through user memory).
- Caching: Stat-based signature (
mtimeMs:size) with module-level Map keyed by file path. Race-safe: concurrent reads share the same Promise; readFile failures clean up only the current entry.clearOutputLanguagePreferenceCache()called on language change in languageCommand, clearing all entries to handle global/project path differences. appendSystemInstructionText: Correctly handles all system instruction shapes (string, Part, Part[], Content with parts array, and single Part fallback).- Session recap refactor: Extracted
buildSessionRecapSideQueryOptionsfor testability without behavioral change. - Bundled fixes from main merges: cronScheduler
lastFiredAtinitialization and chatCompressionhardthreshold formula — both sound and already validated by their own PRs.
What this PR does
Honors the configured output language for user-visible side queries without duplicating the language instruction in project-summary prompts.
Why it's needed
Some side-query results are shown directly to users, such as session titles, session recaps, tool-use summaries, suggestions, web-fetch summaries, generated subagent artifacts, arena summaries, and
/insightreports. These should followoutput-language.md. Internal structured side queries and summary prompts that already carry the language instruction should not receive duplicate language rules.Reviewer Test Plan
How to verify
Review the changed
runSideQuerycall sites and confirm only user-visible side queries passrespectOutputLanguagePreference: true, while internal classifier/selector-style calls and project-summary prompts do not. Then run the targeted checks below.npm run test:ci --workspace=packages/core -- src/permissions/classifier.test.ts src/services/sessionRecap.test.ts src/utils/sideQuery.test.ts npm run test --workspace=packages/cli -- src/services/insight/generators/DataProcessor.test.ts npx prettier --check packages/core/src/services/sessionRecap.test.ts packages/cli/src/services/insight/generators/DataProcessor.ts packages/cli/src/services/insight/generators/DataProcessor.test.ts npx eslint packages/core/src/services/sessionRecap.test.ts packages/cli/src/services/insight/generators/DataProcessor.ts packages/cli/src/services/insight/generators/DataProcessor.test.ts npm run typecheck --workspace=packages/core git diff --checkExpected result: the core side-query tests pass, the insight generator tests pass, and the DataProcessor calls for
insight-session-analysisandinsight-qualitative-generateboth includerespectOutputLanguagePreference: true.Evidence (Before & After)
Before: switching
runSideQueryfrom unconditional output-language injection to opt-in could make/insightreports ignoreoutput-language.md, and the session recap path lacked direct coverage thatgenerateSessionRecap()wires the recap side query with output-language opt-in.After:
DataProcessoropts in for both user-visible/insightJSON side queries,DataProcessor.test.tsverifies both call boundaries, andsessionRecap.test.tsverifiesgenerateSessionRecap()calls the recap side query withrespectOutputLanguagePreference: trueand extracts the tagged recap. This is prompt-construction behavior, so TUI screenshots are N/A.Tested on
Environment (optional)
Local Windows/PowerShell checkout with repository npm workspaces. Full local
packages/clitypecheck is not listed as a passing local check because this checkout currently reports unrelated generated/package-link errors in serve/acp/channel/web-template areas; the changed files pass targeted tests and eslint, and CI runs the repository matrix.Risk & Scope
Linked Issues
Fixes #4494
中文说明
这个 PR 做了什么
让用户可见的 side query 遵守
output-language.md,同时避免在 project summary prompt 里重复注入语言要求。为什么需要
session title、session recap、tool-use summary、follow-up suggestion、web-fetch summary、subagent artifact、arena summary 和
/insightreport 这些内容会展示给用户,因此应该遵守输出语言偏好。内部 classifier/selector 类 side query 不应该被扩大影响。Reviewer Test Plan
审核时可以检查用户可见的
runSideQuery调用是否传入respectOutputLanguagePreference: true,内部结构化调用是否没有 opt in。已在 Windows 本地验证 core side-query/sessionRecap 组合测试、CLI DataProcessor 测试、changed-file prettier/eslint、core typecheck 和git diff --check。证据
修改前:opt-in 切换后
/insight这类用户可见报告可能漏掉输出语言偏好,session recap 外层调用也缺少直接覆盖。修改后:DataProcessor两个/insightside query 都 opt in,测试覆盖 call boundary;generateSessionRecap()测试确认 recap side query opt in 并提取<recap>内容。该改动是 prompt 构造逻辑,TUI 截图不适用。风险和范围
风险主要在 opt-in 覆盖范围:需要覆盖用户可见 side query,但不扩大到内部 classifier/selector。没有公共 API 或迁移影响。