Skip to content

feat(cli): add Anthropic model listing support (Option A)#3863

Closed
B-A-M-N wants to merge 12 commits into
QwenLM:mainfrom
B-A-M-N:feature/anthropic-model-listing
Closed

feat(cli): add Anthropic model listing support (Option A)#3863
B-A-M-N wants to merge 12 commits into
QwenLM:mainfrom
B-A-M-N:feature/anthropic-model-listing

Conversation

@B-A-M-N

@B-A-M-N B-A-M-N commented May 6, 2026

Copy link
Copy Markdown
Contributor

Summary

Implement Option A from PR #3799 suggestion — support USE_ANTHROPIC in /model list by fetching from the official Anthropic /v1/models endpoint.

Changes

  • Add fetchAnthropicModels() in packages/cli/src/commands/auth/anthropic.ts
  • Uses x-api-key + anthropic-version: 2023-06-01 headers per API docs
  • Returns same { data: [...] } response shape as OpenAI-compatible endpoints
  • Update manageModels.ts to support 'anthropic' source
  • Add Anthropic tab to ManageModelsDialog.tsx
  • Pass authType into fetch logic so headers are constructed correctly
  • Fallback to default models on API error

Behavior

  • Gateways without /models support (MiniMax, DeepSeek, etc.) surface proper errors (404/empty) instead of the current blanket rejection
  • Official api.anthropic.com works out of the box with correct headers
  • Third-party gateways with /models support work; those without show clear error messages

Test plan

  • Verify fetchAnthropicModels() with valid API key returns models
  • Verify header construction (x-api-key, anthropic-version)
  • Verify ManageModelsDialog shows Anthropic tab
  • Verify error handling for gateways without /models support

🤖 Generated with Claude Code

B-A-M-N and others added 12 commits May 2, 2026 17:08
…ailures

Add classifyError() to categorize errors without retrying deterministic
request errors (400, 401, 403, 404, 422). Only retry transport/
provider failures: 429, 408, 409 (transient), 500-599, and network
errors (ECONNRESET, ETIMEDOUT, socket closed, etc.).

Also export isRetryableNetworkError() for use in retry logic.
…lict

The message.includes('409') fallback made ALL 409 errors appear
transient, preventing deterministic conflict classification. Only actual
transient indicators (lock/contention/conflict) should trigger it.
- geminiChat.ts: delegate shouldRetryOnError to classifyError for 408/409/network errors
- retry.ts: deduplicate defaultShouldRetry → classifyError (single source of truth)
- retry.ts: update isTransientCapacityError for persistent mode (408 + network errors)
- retry.test.ts: add 15 integration tests for retryWithBackoff (408/409/network paths)
- retry.test.ts: add 6 tests for updated isTransientCapacityError
- geminiChat.test.ts: add 4 tests (408 retry, 409 transient retry, 409 deterministic no-retry, network retry)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove `await` from `expect(promise).rejects.toThrow()` calls that are
followed by `vi.runAllTimersAsync()`. In vitest 3.2.4, awaiting the
assertion promise before advancing fake timers creates a deadlock: the
inner promise can't reject until timers advance, but timers can't advance
while the microtask queue is blocked by the await.

Also add explicit `initialDelayMs: 10` to tests using default options
(7 attempts × 1500ms default = 76s of fake time, triggering the 5s
test timeout), and reduce abort-signal test delay from 10s to 100ms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove await from expect(promise).rejects.toThrow() calls that are
followed by vi.runAllTimersAsync(). In vitest 3.2.4, awaiting the
assertion promise before advancing fake timers creates a deadlock:
the inner promise can't reject until timers advance, but timers
can't advance while the microtask queue is blocked.

Also add explicit initialDelayMs: 10 to tests using default options
(7 attempts x 1500ms default = 76s of fake time, triggering the 5s
test timeout), and reduce abort-signal test delay from 10s to 100ms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove 'conflict' from isTransientConflict keywords (too broad: matches
  standard HTTP 409 reason phrase "Conflict"). Keep only 'lock' and 'contention'.
- Fix 409 test to use proper Error instance instead of plain object
  (getErrorMessage returns "[object Object]" for plain objects, masking the bug).
- Scope isTransientCapacityError to 429/529 only. 408 and network errors are
  no longer persistent-retryable since they can indicate permanent config issues
  (wrong endpoint, firewall) that would hang unattended jobs for hours.
- Update isTransientCapacityError tests to match new scope.
- Document accepted retryable categories in geminiChat.ts shouldRetryOnError.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds an in-flight discovery guard to McpClientManager.discoverMcpToolsForServer()
that prevents concurrent re-discovery of the same server. Without this guard,
overlapping calls (e.g. from health-check reconnect + incremental discovery)
could spawn duplicate MCP child processes because the disconnect→connect
sequence is not atomic.

Changes:
- Add inFlightDiscoveries Set to track per-server discovery in progress
- Early-return if discovery is already in flight for the same server
- Clean up failed clients in catch block so subsequent retries can proceed
- Stop health check timer during re-discovery disconnect path
- Clear inFlightDiscoveries in stop() for completeness
- Add 2 tests: concurrent discovery deduplication + in-flight cleanup

Closes QwenLM#3817

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
retry.test.ts:
- Remove erroneous await before expect().rejects.toThrow() in ~15 fake-timer
  test locations that caused deadlocks (await blocked before runAllTimersAsync)
- Update 408/409/network-error tests to use explicit classifyError-based
  shouldRetryOnError instead of relying on defaultShouldRetry (see below)

retry.ts:
- Narrow defaultShouldRetry to only retry 429/5xx, matching original behavior.
  Previously it delegated to classifyError which also retried 408, transient
  409, and network errors — silently expanding retry behavior for all callers
  using the default (baseLlmClient.generateJson, client.generateContent).
  Callers that want broader retry (geminiChat) now use classifyError directly
  via a custom shouldRetryOnError.

mcp-client-manager.ts:
- Fix TOCTOU race: move inFlightDiscoveries.add() to immediately after the
  has() guard and before await disconnect(), preventing concurrent calls from
  slipping through during the async gap.
- Wrap new McpClient() + connect/discover in try/finally so discoveryDone()
  runs even if the constructor throws, preventing permanent entry leaks.
- Clean inFlightDiscoveries in disconnectServer() to prevent stale entries
  from blocking future discoveries.
- Guard reconnectServer() against running when a discovery is already in
  flight, preventing false "Successfully reconnected" logs.

geminiChat.ts:
- Update comment to clarify isSchemaDepthError/isInvalidArgumentError are
  independent safety nets not covered by classifyError.
- Remove redundant if (status === 400) return false (classifyError handles it).
- Remove unused getErrorStatus import.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… tests

Remove await from const assertionPromise = expect(promise).rejects.toThrow()
in ~15 locations where the assertion is followed by vi.runAllTimersAsync().
The await blocks the microtask queue before timers can advance, causing
deadlocks in vitest 3.2.4.

Add eslint-disable-next-line vitest/valid-expect to prevent the pre-commit
hook from re-adding the await (the rule doesn't account for this fake-timer
pattern where await must come after runAllTimersAsync).

All 105 retry tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- removeServer() now cleans inFlightDiscoveries and isReconnecting
- reconnectServer() no longer has racy in-flight check; verifies
  server is actually connected before reporting success
- Add test for cleanup state after server removal

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- retry.ts: remove overly-broad ENOTFOUND/EHOSTUNREACH/"network error"
  substring from isRetryableNetworkError; fix isTransientConflict to use
  word-boundary regex preventing false positives on "blocked"/"clock"/"flock"
- retry.test.ts: update tests to reflect removed retryable codes
- geminiChat.test.ts: add test confirming invalid argument errors are not retried
- mcp-client-manager.ts: add inFlightDiscoveries guard in reconnectServer()
  after reconnect delay to prevent unnecessary reconnects

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implement Option A from PR QwenLM#3799 suggestion to support Anthropic
model listing via the official /v1/models endpoint.

Changes:
- Add fetchAnthropicModels() in commands/auth/anthropic.ts
  - Uses x-api-key + anthropic-version headers per API docs
  - Returns same { data: [...] } shape as OpenAI-compatible
- Update manageModels.ts to support 'anthropic' source
- Add Anthropic tab to ManageModelsDialog.tsx
- Fix type-safety by removing dangerous casts

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@tanzhenxin

Copy link
Copy Markdown
Collaborator

Thanks for the contribution. Closing this one as not planned, and I want to be transparent about why so future work doesn't run into the same dead end.

Direction: We've decided not to ship /model list as a feature. The space of OpenAI-compatible providers is too fragmented for a single command to behave consistently:

  • Authentication shape varies — OpenAI uses Authorization: Bearer, Anthropic uses x-api-key + anthropic-version, Azure injects api-key, others use query params.
  • Many production gateways (MiniMax, some DeepSeek deployments, custom proxies) don't implement /v1/models at all and return 404 or HTML error pages.
  • Response shapes differ — even among providers that do support listing, fields beyond id are inconsistent and meaningful (e.g. context windows, deprecation dates) but not portable.

Each new provider would push us toward a per-provider branch in fetchModels, and the user-facing UX is unreliable by construction. We'd rather direct users to provider documentation than ship a command that works for some endpoints and silently misleads on others. That decision also applies to PR #3799, which we're closing for the same reason.

Issues with this PR specifically, separate from the direction call — flagging these because they'd block any PR from this branch regardless of feature:

  • Three binary database files committed under .gemini_security/ (graphiti.db, pulse.db, second_brain.db). These look like local agent-tool state and shouldn't be in the repo.
  • .serena/.gitignore and .serena/project.yml — personal codebase-indexer config, not project tooling. Belongs in your ~/.gitignore_global or local-only.
  • RUN2.md (599 lines) is a captured Claude Code session transcript, including hook errors, internal .prforge/state.json snapshots, and step-by-step agent commentary. This shouldn't be committed.
  • The PR title says "Anthropic model listing", but the diff also rewrites packages/core/src/utils/retry.ts (+171 lines), packages/core/src/tools/mcp-client-manager.ts (+63), and packages/core/src/core/geminiChat.ts (+14). These changes are unrelated to model listing and need their own PRs with their own motivation if you'd like to propose them — bundling makes review impossible.

If you'd like to contribute in other areas of the codebase, we're happy to review focused PRs. Thanks again for your interest in the project.

@tanzhenxin tanzhenxin closed this May 6, 2026
yiliang114 added a commit that referenced this pull request May 18, 2026
Add docs/design/code-review/ covering the PR review automation
system this branch is building toward:

- code-review-design.md (569 lines): problem statement, design
  principles, 4-stage workflow pipeline, Design Gate spec with
  PR shape generation and fail modes, Feature PR Readiness Gate,
  author/maintainer feedback loop with override, historical
  PR/issue detection, incremental cache wiring, App integration
  plan, testing strategy, risks.
- roadmap.md (179 lines): 7-phase rollout, each phase scoped to
  an independent PR with acceptance criteria.
- compare.md (86 lines): capability comparison vs claude-code,
  coderabbit, copilot review, cursor bugbot, greptile.

The design treats Direction/Scope/History/Validation as workflow
preflight gates separate from bundled /review, so direction issues
are decided in the first 30s rather than after a full deep pass.
Anchors are existing repo artifacts (roadmap.md, architecture.md,
docs/design/*) and historical close comments (#3863, #3627),
not new team-policy docs.
xaelistic pushed a commit to xaelistic/qwen-code that referenced this pull request Jun 7, 2026
QwenLM#3863)

Co-authored-by: Ioannis Papapanagiotou <iduckhd@hotmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants