fix(core): use per-model settings for fast model side queries#3815
Conversation
2af695d to
3bbad53
Compare
| // Note: there is currently no "fallback mode" model routing; the model used | ||
| // is always the one explicitly requested by the caller. | ||
| }); | ||
|
|
There was a problem hiding this comment.
[Critical] createContentGeneratorForModel has zero test coverage for its success path.
The test "should resolve per-model config when the requested model differs" explicitly acknowledges reliance on createContentGenerator implicitly failing in the test environment ("Since createContentGenerator will fail in the test env (no auth), it falls back to the main content generator"), but never mocks createContentGenerator to verify the success path. The core purpose of this PR — calling buildAgentContentGeneratorConfig to construct the correct per-model config for the target model — is untested.
Impact: Regressions in buildAgentContentGeneratorConfig or createContentGenerator integration would pass all existing tests. Per-model extra_body, samplingParams, and other behaviors that customers rely on are untested.
| // Mock createContentGenerator to return a test double, then assert: | |
| // (a) buildAgentContentGeneratorConfig receives the correct arguments (authType, baseUrl, envKey from resolvedModel) | |
| // (b) the returned dedicated content generator is actually called instead of mockContentGenerator |
— deepseek-v4-pro via Qwen Code /review
| ); | ||
|
|
||
| return await createContentGenerator(targetConfig, this.config); | ||
| } catch { |
There was a problem hiding this comment.
[Suggestion] The catch {} block in createContentGeneratorForModel is completely silent — zero logging.
If createContentGenerator throws due to validation failure (missing API key, unsupported authType), all diagnostic information is discarded. In production, the fast model appears to "work" (falling back to the main generator), but abnormal behavior becomes difficult to diagnose.
| } catch { | |
| } catch (err: unknown) { | |
| debugLogger.warn( | |
| `Failed to create content generator for model "${model}", falling back to main generator.`, | |
| err instanceof Error ? err.message : String(err), | |
| ); | |
| return this.getContentGeneratorOrFail(); | |
| } |
— deepseek-v4-pro via Qwen Code /review
| COMPRESSION_TOKEN_THRESHOLD, | ||
| } from '../services/chatCompressionService.js'; | ||
| import { LoopDetectionService } from '../services/loopDetectionService.js'; | ||
| import { buildAgentContentGeneratorConfig } from '../models/content-generator-config.js'; |
There was a problem hiding this comment.
[Suggestion] buildAgentContentGeneratorConfig is imported from ../models/content-generator-config.js, but placed under the // Services section comment.
This import comes from ../models/, while all other imports are consistent with their section titles. This breaks the carefully maintained import grouping convention in the file.
| import { buildAgentContentGeneratorConfig } from '../models/content-generator-config.js'; | |
| // Move this import to after the // Services section and before the // Tools section, or create a new // Models section |
— deepseek-v4-pro via Qwen Code /review
…ueries Three changes from the review: 1. Test coverage for createContentGenerator success path — Added a test that mocks createContentGenerator to return a dedicated content generator and verifies it is used instead of the main content generator. Includes a module-level vi.mock for createContentGenerator with beforeEach override. 2. Debug logging in createContentGeneratorForModel catch block — Added debugLogger.warn with model name and error details so production failures (missing API key, unsupported authType) are diagnosable instead of silently falling back to the main generator. 3. Import grouping — Moved buildAgentContentGeneratorConfig import from under // Services to a new // Models section. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Thanks for the review — all three items have been addressed.
Silent catch block — Added Import grouping — Moved Validation:
|
wenshao
left a comment
There was a problem hiding this comment.
Suggestion: retryWithBackoff uses main model's authType for fast model queries
In generateContent (line ~1117), retryWithBackoff receives this.config.getContentGeneratorConfig()?.authType — the main model's authType. After this PR, the content generator is now per-model, but the retry logic still uses the main model's authType. In retry.ts, QWEN_OAUTH quota exhaustion detection checks against the wrong provider when the fast model differs from the main model. Currently protected by getFastModel() which validates same authType, but this is a future maintenance risk if cross-provider fast models are ever supported.
Suggestion: Missing test for !resolvedModel fallback path
The createContentGeneratorForModel method has a if (!resolvedModel) early return branch that is not covered by tests. The existing fallback test covers the catch block (when createContentGenerator rejects), but not the case where getResolvedModel returns undefined. Consider adding a test that mocks getResolvedModel to return undefined and verifies the main generator is used and buildAgentContentGeneratorConfig is not called.
— deepseek-v4-pro via Qwen Code /review
| .getModelsConfig() | ||
| .getResolvedModel(authType, model); | ||
|
|
||
| if (!resolvedModel) { |
There was a problem hiding this comment.
[Suggestion] The !resolvedModel early return path has no logging, unlike the catch block which now has debugLogger.warn. When getResolvedModel returns undefined (model not found in registry), the fallback to the main content generator is completely silent. This makes model configuration errors invisible in production.
| if (!resolvedModel) { | |
| if (!resolvedModel) { | |
| debugLogger.warn( | |
| `Model "${model}" not found in registry for authType "${authType}", falling back to main generator.`, | |
| ); | |
| return this.getContentGeneratorOrFail(); | |
| } |
— deepseek-v4-pro via Qwen Code /review
|
Commit: 1c53cce33a2fd53a0a8b5204ce58cf12d87751d0 I have addressed the reviewer feedback regarding per-model settings risk:
This change ensures that fast model configuration is consistent across the entire application and eliminates cross-auth contamination. Please let me know if there's anything else needed! |
- Use fast model's authType for retryWithBackoff instead of main model's authType. Added createRetryAuthTypeForModel() helper that resolves the target model's authType from the registry, so QWEN_OAUTH quota detection checks against the right provider. - Add debugLogger.debug() to the !resolvedModel early return path so model registry misses are diagnosable instead of silent. - Add test: model not in registry falls back to main generator without throwing. - Add test: fast model authType is used for retry, not main model authType.
Verifies buildAgentContentGeneratorConfig is NOT called when getResolvedModel returns undefined (model not in registry). Addresses PR QwenLM#3815 review round 3 feedback from @wenshao.
|
Thanks for catching that gap. The existing test for the registry-miss fallback verified the main generator was used, but didn't assert that `buildAgentContentGeneratorConfig` was never called. Added that assertion. All reviewer concerns across the three review rounds should now be covered:
Validation:
|
| ): Promise<ContentGenerator> { | ||
| try { | ||
| const authType = | ||
| this.config.getContentGeneratorConfig()?.authType ?? |
There was a problem hiding this comment.
[Suggestion] createContentGeneratorForModel 使用主模型的 authType 在模型注册表中查找目标模型(getResolvedModel(authType, model))。如果快速模型注册在不同的 auth type 下(如主模型用 QWEN_OAUTH,快模型用 Anthropic),查找会返回 undefined,静默回退到主 ContentGenerator——但请求仍携带快模型的 model ID。当前调用者均限制为同 provider 所以无实际影响,但值得面向未来加固。
| this.config.getContentGeneratorConfig()?.authType ?? | |
| // Search across all authTypes instead of only the main config's authType: | |
| private resolveModelAcrossProviders(modelId: string) { | |
| const modelsConfig = this.config.getModelsConfig(); | |
| const authTypes = [ | |
| AuthType.QWEN_OAUTH, AuthType.USE_OPENAI, | |
| AuthType.USE_VERTEX_AI, AuthType.USE_ANTHROPIC, | |
| ]; | |
| const mainAuthType = this.config.getContentGeneratorConfig()?.authType; | |
| if (mainAuthType) { | |
| const resolved = modelsConfig.getResolvedModel(mainAuthType as AuthType, modelId); | |
| if (resolved) return resolved; | |
| } | |
| for (const at of authTypes) { | |
| if (at === mainAuthType) continue; | |
| const resolved = modelsConfig.getResolvedModel(at, modelId); | |
| if (resolved) return resolved; | |
| } | |
| return undefined; | |
| } |
— deepseek-v4-pro via Qwen Code /review
There was a problem hiding this comment.
Addressed in commit 5fdc505. Added resolveModelAcrossAuthTypes() helper that searches across all authTypes to find the target model, handling the cross-provider case. The main authType is tried first for early exit, then remaining authTypes are iterated.
A cleaner data-layer solution (adding getModelAcrossAuthTypes to ModelRegistry) is noted for a follow-up PR.
Validation:
npx vitest run packages/core/src/core/client.test.ts— 68 tests passednpm run build— 0 errors
There was a problem hiding this comment.
Addressed in commit 09b44d2. Changes:
-
Cross-provider lookup — Added
resolveModelAcrossAuthTypes()that searches all authTypes (QWEN_OAUTH, USE_OPENAI, USE_VERTEX_AI, USE_ANTHROPIC, USE_GEMINI). Main authType tried first for early exit.createContentGeneratorForModelandcreateRetryAuthTypeForModelboth use it. Same-provider and cross-provider cases now handled. -
Caching — Added
perModelGeneratorCache(Map<string, ContentGenerator>) so side queries (recap, title, tool summary) reuse the generator after first instantiation. -
Orphan JSDoc fixed — Moved to directly above
createContentGeneratorForModel. -
Fallback visibility — Promoted missing-model log from
debugtowarnso misconfigured model settings are visible in production.
A cleaner data-layer solution (adding getModelAcrossAuthTypes to ModelRegistry) is TODO'd in code for a follow-up PR.
Validation: npx vitest run packages/core/src/core/client.test.ts — 68/68 passed. npm run build — 0 errors.
| AuthType.USE_OPENAI; | ||
| const resolvedModel = this.config | ||
| .getModelsConfig() | ||
| .getResolvedModel(authType, model); |
There was a problem hiding this comment.
[Critical] 跨 provider 的 authType 不匹配导致 fast model 静默解析失败
createContentGeneratorForModel(以及 createRetryAuthTypeForModel L1180)使用主 model 的 authType 来查找目标 model:
const authType = this.config.getContentGeneratorConfig()?.authType ?? AuthType.USE_OPENAI;
const resolvedModel = this.config.getModelsConfig().getResolvedModel(authType, model);但 ModelRegistry.getModel(authType, modelId) 是按 Map<AuthType, Map<string, ResolvedModelConfig>> 索引的。如果主 model 使用 QWEN_OAUTH,而 fast model 注册在 openai 下,getResolvedModel 返回 undefined,代码静默回退到主 content generator(带着错误凭证)。
测试 mock 了 getResolvedModel 使其在任何 authType 下都返回结果,掩盖了这个问题。
| .getResolvedModel(authType, model); | |
| // 修复方向:遍历所有已注册的 authType 来解析 model,或在 ModelRegistry 中添加无 authType 范围的方法 | |
| // 同时修复 createRetryAuthTypeForModel(L1180)中的相同问题 | |
| // 测试 mock 应强制执行 authType 匹配以反映真实行为 |
— deepseek-v4-pro via Qwen Code /review
|
|
||
| return await createContentGenerator(targetConfig, this.config); | ||
| } catch (err: unknown) { | ||
| debugLogger.warn( |
There was a problem hiding this comment.
[Suggestion] 回退路径无可观测性——仅有 debug 日志
catch 块仅调用 debugLogger.warn,不会输出到 stderr 或结构化遥测。生产环境中 fast model 静默降级时(包括因 Finding 1 导致的降级),仪表盘上无任何信号。
| debugLogger.warn( | |
| // 在 catch 和 !resolvedModel(L1196)分支中: | |
| // 1. 调用 logApiError 写入结构化遥测 | |
| // 2. 递增 fast_model_fallback_total 计数器 | |
| // 3. 至少输出一条 stderr warning(与 retry heartbeat 风格一致) |
— deepseek-v4-pro via Qwen Code /review
| AuthType.USE_OPENAI) | ||
| : this.config.getContentGeneratorConfig()?.authType; | ||
|
|
||
| const contentGenerator = isPerModel |
There was a problem hiding this comment.
[Suggestion] ContentGenerator created outside retry loop — transient failures become permanent degradation
createContentGeneratorForModel executes before retryWithBackoff. If it fails for transient reasons (OAuth token refresh race, DNS timeout), it permanently falls back to the main generator with no retry opportunity.
| const contentGenerator = isPerModel | |
| // Wrap createContentGeneratorForModel in retry logic, or at least locally retry transient errors before degrading |
— deepseek-v4-pro via Qwen Code /review
| // the target model's provider, not the main session's provider. This | ||
| // ensures QWEN_OAUTH quota detection checks against the right provider. | ||
| const retryAuthType = isPerModel | ||
| ? (this.createRetryAuthTypeForModel(model) ?? |
There was a problem hiding this comment.
[Suggestion] retryAuthType mismatched with actual generator on fallback
When createRetryAuthTypeForModel successfully resolves the fast model's authType (e.g. openai), but createContentGeneratorForModel fails and falls back to the main generator (e.g. QWEN_OAUTH), retryAuthType remains openai — while the actual API call uses QWEN_OAUTH.
| ? (this.createRetryAuthTypeForModel(model) ?? | |
| // On the fallback path in createContentGeneratorForModel, recompute retryAuthType to match the generator actually used |
— deepseek-v4-pro via Qwen Code /review
| return resolvedModel?.authType; | ||
| } | ||
|
|
||
| private async createContentGeneratorForModel( |
There was a problem hiding this comment.
[Suggestion] ContentGenerator recreated on every side query (no caching)
Each call to createContentGeneratorForModel creates a ContentGenerator from scratch (including new OpenAI(...) HTTP client and connection pool). Session recap and tool summaries may be triggered multiple times within the same conversation turn.
| private async createContentGeneratorForModel( | |
| // Cache ContentGenerator keyed by model string in GeminiClient, invalidate on setModel | |
| private perModelGenerators = new Map<string, ContentGenerator>(); |
— deepseek-v4-pro via Qwen Code /review
| expect(buildAgentContentGeneratorConfig).not.toHaveBeenCalled(); | ||
| }); | ||
|
|
||
| it('should use fast model authType for retry, not main model authType', async () => { |
There was a problem hiding this comment.
[Suggestion] retryAuthType test is a false positive — actual retryAuthType never verified
The "should use fast model authType for retry" test only asserts that getResolvedModel was called, but never verifies the actual authType value passed to retryWithBackoff. retryWithBackoff isn't even mocked in this test file.
| it('should use fast model authType for retry, not main model authType', async () => { | |
| // Mock retryWithBackoff and assert the authType it receives: | |
| // expect(mockRetryWithBackoff).toHaveBeenCalledWith( | |
| // expect.any(Function), | |
| // expect.objectContaining({ authType: 'openai' }), | |
| // ); |
— deepseek-v4-pro via Qwen Code /review
ReviewBuild + 68 tests pass. The fix is correct for the documented scenario (same provider, fast model in registry). A few concerns about scope and missing coverage. 🟡 Substantive1. The cross-provider scenario from the issue is not actually fixed. Issue #3765 explicitly calls out: "If the fast model lives on a different provider/base URL than the main model, side queries today would silently go to the wrong endpoint with the wrong key. It only 'works' if both models happen to share the same base URL and API key." But the PR's lookup uses the main authType: const authType = this.config.getContentGeneratorConfig()?.authType ?? AuthType.USE_OPENAI;
const resolvedModel = this.config.getModelsConfig().getResolvedModel(authType, model);If the fast model is registered under a different authType in The same-provider case (the OP's setup, both on Dashscope) is fixed. The cross-provider case is not. Suggestions:
If cross-provider is intentionally out of scope, please update the PR description and issue thread so #3765 isn't closed under a broader claim than what's delivered. 2. No caching — every side query rebuilds the ContentGenerator.
private perModelGeneratorCache = new Map<string, ContentGenerator>();
private async createContentGeneratorForModel(model: string) {
const cached = this.perModelGeneratorCache.get(model);
if (cached) return cached;
// ... existing build logic ...
this.perModelGeneratorCache.set(model, generator);
return generator;
}Cache invalidation needed when fast-model settings change ( 3. Tests verify routing, not the actual behavior change. All 5 new tests assert " That's the user-visible behavior in #3765. If someone later breaks Suggested: a test that mocks 4. The assertion only verifies 🟢 Smaller items5. Orphan JSDoc.
6. Once in 7. Fallback log is too quiet. When Consider promoting the fallback to a one-time visible warning (via Verification
Priority
Direction is right and the per-model hook lands in a sensible place. The main thing is making sure scope is explicit so the issue reporter's secondary complaint isn't silently dropped. |
Address PR QwenLM#3815 review round 3: createContentGeneratorForModel and createRetryAuthTypeForModel now search across all authTypes when resolving a model, preventing silent fallback when the fast model is registered under a different authType than the main model. Added resolveModelAcrossAuthTypes() helper in client.ts that tries the main authType first for early exit, then iterates remaining authTypes. A cleaner data-layer solution (adding getModelAcrossAuthTypes to ModelRegistry) is noted for a follow-up PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…kup, caching, JSDoc, visible fallback Address review feedback from wenshao (deepseek-v4-pro) on PR QwenLM#3815: 1. Cross-provider model resolution for fast model side queries - Added resolveModelAcrossAuthTypes() that searches across all authTypes - Handles case where fast model is registered under a different authType than the main model (e.g. main=QWEN_OAUTH, fast=USE_ANTHROPIC) - createContentGeneratorForModel and createRetryAuthTypeForModel now use it - Main authType tried first for early exit, then remaining authTypes iterated - TODO in code to move to ModelRegistry in a follow-up PR 2. Per-model ContentGenerator caching - Added perModelGeneratorCache (Map<string, ContentGenerator>) - Avoids rebuilding generator on every side query (recap, title, tool summary) - Typical session triggers 3-5 side queries; now only first instantiates 3. Fixed orphan JSDoc block - Moved "Return a ContentGenerator..." JSDoc from between functions to directly above createContentGeneratorForModel 4. Promoted fallback logging from debug to warn - When model not found across all authTypes, logs at warn level - Makes misconfigured model settings visible in production - Prevents silent degradation (the QwenLM#3765 symptom) Validation: - npx vitest run packages/core/src/core/client.test.ts — 68/68 passed - npm run build — 0 errors Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
wenshao
left a comment
There was a problem hiding this comment.
Round 4 review
Real progress this round — resolveModelAcrossAuthTypes lands the cross-provider fix from round 3 #1 (the must-fix), the orphan JSDoc is moved to the right function, and the fallback log is promoted from debug to warn. CI is green and 68/68 tests still pass. That's the bulk of round 3 addressed correctly.
But three things from the previous review are either undone or partially done in a way that creates a new gap, and the new ~80 lines have zero test coverage.
🔴 Cache claims invalidation it doesn't have
/**
* Cache of per-model ContentGenerators keyed by model ID.
* Avoids rebuilding the generator (SDK instantiation, config resolution)
* on every side query (recap, title, tool summary).
* Cleared on config changes that could affect model settings. ← never happens
*/
private perModelGeneratorCache = new Map<string, ContentGenerator>();I grep'd the whole file — nothing ever clears this Map. The closing line of the JSDoc is a promise the code does not keep, which is worse than no comment at all because future readers (and reviewers) will trust it.
Concrete failure mode: user edits modelProviders.fast.extra_body.enable_thinking from true to false mid-session. Same model name → cache hit → returns the pre-edit ContentGenerator → the new setting is silently ignored. That's literally the #3765 symptom. The cache fix to one perf problem reintroduces the original correctness bug under config edits.
Pick one:
- Implement what the JSDoc claims — wire a hook on the config-change path (or expose
clearPerModelGeneratorCache()and call it from the same place that swapscontentGeneratorConfig). - Drop the comment line and change scope — make the cache short-lived (e.g. cleared on
resetChat()or scoped per side-query batch). Then the JSDoc should describe the actual lifetime.
Either is fine. The current state — caching + JSDoc lying about invalidation — is not.
🔴 ~80 lines of new code with 0 tests
packages/core/src/core/client.test.ts diff is empty in this commit. The new logic is non-trivial:
resolveModelAcrossAuthTypescross-authType iteration — main authType hits / main misses but secondary hits / all miss. None covered.perModelGeneratorCachehit path — second call with same model should not invokecreateContentGenerator. Not covered.
Round 3 #1 was a must-fix; merging the implementation without tests means a future change to getResolvedModel ordering, the hardcoded allAuthTypes array, or the cache key strategy will silently regress. Suggested skeletons:
it('resolveModelAcrossAuthTypes: falls through to a non-main authType when main lookup misses', async () => {
// mock getModelsConfig().getResolvedModel:
// (mainAuthType, 'fast-id') → undefined
// (USE_ANTHROPIC, 'fast-id') → { authType: USE_ANTHROPIC, ... }
// expect createContentGeneratorForModel('fast-id') to use the USE_ANTHROPIC config
});
it('perModelGeneratorCache: second call for same model reuses generator', async () => {
const createSpy = vi.mocked(createContentGenerator);
await client['createContentGeneratorForModel']('fast-id'); // build
await client['createContentGeneratorForModel']('fast-id'); // hit
expect(createSpy).toHaveBeenCalledTimes(1);
});🟡 Round 3 #3 (actual-behavior regression test) and #4 (test rename) untouched
These were should-fix last round and are unchanged in this commit:
- #3 — still no test asserting the fast model's
extra_bodyactually appears in the outgoing request, only that the right config-resolution function was invoked. The user-visible behavior in #3765 is still not pinned. - #4 —
'should use fast model authType for retry, not main model authType'still asserts only thatgetResolvedModelwas called with the main authType, not that the retry path actually receives the fast model's authType.
Carry forward, please.
🟢 Smaller items (non-blocking)
-
Hardcoded
allAuthTypesarray is acknowledged by the in-code TODO ("Move cross-authType resolution to ModelRegistry"). Fine as a stop-gap, butObject.values(AuthType)is a one-line change that would auto-cover any future authType added to the enum:for (const authType of Object.values(AuthType)) { if (authType === mainAuthType) continue; ... }
Same correctness, lower maintenance footprint.
-
debugLogger.warnvisibility — depends on whether the qwen-codedebugLoggergateswarnbehind a debug flag. If yes, this is still effectively silent in production and round 3 #7 isn't really fixed; if no, it's good. Quick way to check: grep for whatever sets the underlyingdebugenable bit. If it gateswarn, surface viaeventEmitter/ stderr instead.
Priority
- Must fix before approve: cache invalidation (🔴 above) + tests for the new code (🔴 above).
- Should fix: carry forward round 3 #3 + #4.
- Optional:
Object.values(AuthType)swap,debugLogger.warnvisibility check.
Net: this is genuinely close. The cross-provider fix is correct, the cache is the right idea — just needs invalidation wired up and a couple of tests on the new code, plus the two carry-overs.
…r generator cache and fix false-positive retryAuthType test
|
Quick FYI before you spin another round: Lint CI is red on
Also one substantive thing — I think round 4 #1 got crossed wires. The "cache claims invalidation it doesn't have" item wasn't about a race condition; it's about stale cache after settings edits: user edits Switching Also still outstanding from round 4: a test for Happy to take another careful pass once lint is green and those two are addressed. |
- Fix lint error: pass AbortSignal instead of undefined in cache test - Fix cache JSDoc: remove false claim about config-change invalidation, add actual cache clear in resetChat() - Add test: resolveModelAcrossAuthTypes cross-authType fallback (main miss → secondary hit) - Add test: per-model generator cache cleared on resetChat
c25c0c0 to
1d033db
Compare
|
Commit: Thanks for the thorough round 4 review. All three items addressed: 1. Cache JSDoc lying about invalidation (🔴): Fixed. Removed the false "Cleared on config changes" line from the JSDoc. Added 2. ~80 lines with 0 tests (🔴): Added two new tests:
3. Lint CI red: Fixed the Validation: |
|
Thanks —
Only thing left: branch is currently |
…wenLM#3765) When side queries (session recap, title generation, tool summary) run on the fast model, they previously used the main model's ContentGeneratorConfig, causing extra_body/samplingParams/reasoning settings to leak from the main model into fast model requests. Changes: - client.ts: In generateContent(), when the requested model differs from the main model, resolve the target model's own ContentGeneratorConfig via buildAgentContentGeneratorConfig() and create a dedicated ContentGenerator. Falls back to the main content generator if the model is not in the registry or if creation fails (e.g. test environments). - client.test.ts: Add 2 tests verifying per-model config resolution is attempted for different models and skipped when model matches main. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ueries Three changes from the review: 1. Test coverage for createContentGenerator success path — Added a test that mocks createContentGenerator to return a dedicated content generator and verifies it is used instead of the main content generator. Includes a module-level vi.mock for createContentGenerator with beforeEach override. 2. Debug logging in createContentGeneratorForModel catch block — Added debugLogger.warn with model name and error details so production failures (missing API key, unsupported authType) are diagnosable instead of silently falling back to the main generator. 3. Import grouping — Moved buildAgentContentGeneratorConfig import from under // Services to a new // Models section. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use fast model's authType for retryWithBackoff instead of main model's authType. Added createRetryAuthTypeForModel() helper that resolves the target model's authType from the registry, so QWEN_OAUTH quota detection checks against the right provider. - Add debugLogger.debug() to the !resolvedModel early return path so model registry misses are diagnosable instead of silent. - Add test: model not in registry falls back to main generator without throwing. - Add test: fast model authType is used for retry, not main model authType.
Verifies buildAgentContentGeneratorConfig is NOT called when getResolvedModel returns undefined (model not in registry). Addresses PR QwenLM#3815 review round 3 feedback from @wenshao.
Address PR QwenLM#3815 review round 3: createContentGeneratorForModel and createRetryAuthTypeForModel now search across all authTypes when resolving a model, preventing silent fallback when the fast model is registered under a different authType than the main model. Added resolveModelAcrossAuthTypes() helper in client.ts that tries the main authType first for early exit, then iterates remaining authTypes. A cleaner data-layer solution (adding getModelAcrossAuthTypes to ModelRegistry) is noted for a follow-up PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…kup, caching, JSDoc, visible fallback Address review feedback from wenshao (deepseek-v4-pro) on PR QwenLM#3815: 1. Cross-provider model resolution for fast model side queries - Added resolveModelAcrossAuthTypes() that searches across all authTypes - Handles case where fast model is registered under a different authType than the main model (e.g. main=QWEN_OAUTH, fast=USE_ANTHROPIC) - createContentGeneratorForModel and createRetryAuthTypeForModel now use it - Main authType tried first for early exit, then remaining authTypes iterated - TODO in code to move to ModelRegistry in a follow-up PR 2. Per-model ContentGenerator caching - Added perModelGeneratorCache (Map<string, ContentGenerator>) - Avoids rebuilding generator on every side query (recap, title, tool summary) - Typical session triggers 3-5 side queries; now only first instantiates 3. Fixed orphan JSDoc block - Moved "Return a ContentGenerator..." JSDoc from between functions to directly above createContentGeneratorForModel 4. Promoted fallback logging from debug to warn - When model not found across all authTypes, logs at warn level - Makes misconfigured model settings visible in production - Prevents silent degradation (the QwenLM#3765 symptom) Validation: - npx vitest run packages/core/src/core/client.test.ts — 68/68 passed - npm run build — 0 errors Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…r generator cache and fix false-positive retryAuthType test
- Fix lint error: pass AbortSignal instead of undefined in cache test - Fix cache JSDoc: remove false claim about config-change invalidation, add actual cache clear in resetChat() - Add test: resolveModelAcrossAuthTypes cross-authType fallback (main miss → secondary hit) - Add test: per-model generator cache cleared on resetChat
1d033db to
1a5e2eb
Compare
|
Rebased onto latest main (commit |
Local verification
The revert step is the meaningful one — it confirms the new tests actually exercise the fix and aren't tautological. The single case that still passes is Failures map cleanly to the symptoms of #3765:
Minor: PR description lists |
wenshao
left a comment
There was a problem hiding this comment.
Verified locally per comment above; regression check confirms the new tests catch the bug. LGTM.
- Fix lint error: pass AbortSignal instead of undefined in cache test - Fix cache JSDoc: remove false claim about config-change invalidation, add actual cache clear in resetChat() - Add test: resolveModelAcrossAuthTypes cross-authType fallback (main miss → secondary hit) - Add test: per-model generator cache cleared on resetChat
Fixes #3765. Side queries (session recap, title generation, tool-use summary) running on the fast model previously inherited the main model's ContentGeneratorConfig, leaking extra_body / samplingParams / reasoning between models. GeminiClient.generateContent() now resolves the target model's own config via buildAgentContentGeneratorConfig() when the requested model differs from the main model, and creates a dedicated ContentGenerator (cached, cleared on resetChat). Cross-authType resolution lets the fast model live under a different provider than the main model. Uses the target model's authType for retry logic so provider-specific checks (e.g. QWEN_OAUTH quota detection) reference the correct provider. Falls back to the main generator if the model is not in the registry.
…#3815) Fixes QwenLM#3765. Side queries (session recap, title generation, tool-use summary) running on the fast model previously inherited the main model's ContentGeneratorConfig, leaking extra_body / samplingParams / reasoning between models. GeminiClient.generateContent() now resolves the target model's own config via buildAgentContentGeneratorConfig() when the requested model differs from the main model, and creates a dedicated ContentGenerator (cached, cleared on resetChat). Cross-authType resolution lets the fast model live under a different provider than the main model. Uses the target model's authType for retry logic so provider-specific checks (e.g. QWEN_OAUTH quota detection) reference the correct provider. Falls back to the main generator if the model is not in the registry.
Summary
Fixes #3765. When side queries (session recap, title generation, tool-use summary) run on the fast model, they previously used the main model's
ContentGeneratorConfig, causingextra_body,samplingParams, andreasoningsettings to leak from the main model into fast model requests.For example, if the main model has
extra_body.enable_thinking: trueand the fast model hasextra_body.enable_thinking: false, side queries would still sendenable_thinking: trueto the fast model because the main model's config was used.Changes
`client.ts`: In
GeminiClient.generateContent(), when the requested model differs from the main model, resolve the target model's ownContentGeneratorConfigviabuildAgentContentGeneratorConfig()(same pattern used for subagents) and create a dedicatedContentGeneratorwith the correct per-model settings.Falls back to the main content generator if the model is not in the registry or if creation fails (e.g. test environments without full auth).
`client.test.ts`: Add 2 tests:
getResolvedModelis called when model differs from maingetResolvedModelis NOT called when model matches mainValidation
npm run build— 0 errorsnpx vitest run packages/core/src/core/client.test.ts— 67 tests passnpx vitest run packages/core/src/services/sessionRecap.test.ts— passesnpx vitest run packages/core/src/services/sessionTitle.test.ts— 15 tests passnpx vitest run packages/core/src/services/toolUseSummary.test.ts— 35 tests passRisk
Low. Only affects the code path where
model !== mainModelingenerateContent(). The main model path is unchanged. The fallback to the main content generator ensures graceful degradation if the target model is not in the registry.