Skip to content

fix: alibaba provider default endpoint and model list#3484

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-5a68ad9d
Mar 28, 2026
Merged

fix: alibaba provider default endpoint and model list#3484
teknium1 merged 1 commit into
mainfrom
hermes/hermes-5a68ad9d

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Summary

Fixes the Alibaba/DashScope provider configuration which was using an Anthropic-format endpoint (dashscope-intl.aliyuncs.com/apps/anthropic) as the default base URL, but routing through the OpenAI SDK (chat_completions api_mode). The OpenAI SDK appends /chat/completions to the base URL, producing .../apps/anthropic/chat/completions which 404s.

Changes

auth.py — Default inference_base_url changed to https://coding-intl.dashscope.aliyuncs.com/v1 (OpenAI-compatible endpoint on the DashScope Coding platform).

models.py — Updated curated model list:

  • Removed models unavailable on coding-intl: qwen3-max, qwen-plus-latest, qwen3.5-flash, qwen-vl-max
  • Added third-party models available on the platform: glm-5, glm-4.7, kimi-k2.5, MiniMax-M2.5
  • Added comments documenting the two DashScope services and how to override for classic DashScope keys

main.py — Updated provider picker description from "Anthropic-compatible" to "Qwen + multi-provider".

config.py — Updated env var descriptions to reflect the new default endpoint.

tests — Updated runtime provider resolution tests to match new default URL. Tests now cover:

  • Default: coding-intl /v1 → chat_completions mode
  • Override: /apps/anthropic → auto-detects anthropic_messages mode

How routing works

The existing auto-detection in runtime_provider.py (line 408) handles both endpoints correctly:

  • URLs ending in /anthropicanthropic_messages api_mode → Anthropic SDK
  • URLs ending in /v1chat_completions api_mode → OpenAI SDK

Users with classic DashScope keys can override DASHSCOPE_BASE_URL to:

  • https://dashscope-intl.aliyuncs.com/compatible-mode/v1 (OpenAI-compat)
  • https://dashscope-intl.aliyuncs.com/apps/anthropic (Anthropic-compat)

Verified

All 8 models tested with both chat completions and tool calling on the coding-intl /v1 endpoint:

✔ qwen3.5-plus       ✔ tools
✔ qwen3-coder-plus   ✔ tools
✔ qwen3-coder-next   ✔ tools
✔ glm-5              ✔ tools
✔ glm-4.7            ✔ tools
✔ kimi-k2.5          ✔ tools
✔ MiniMax-M2.5       ✔ tools

6530 tests passed (1 pre-existing flaky failure unrelated to this PR).

- Change default inference_base_url from dashscope-intl Anthropic-compat
  endpoint to coding-intl OpenAI-compat /v1 endpoint. The old Anthropic
  endpoint 404'd when used with the OpenAI SDK (which appends
  /chat/completions to a /apps/anthropic base URL).

- Update curated model list: remove models unavailable on coding-intl
  (qwen3-max, qwen-plus-latest, qwen3.5-flash, qwen-vl-max), add
  third-party models available on the platform (glm-5, glm-4.7,
  kimi-k2.5, MiniMax-M2.5).

- URL-based api_mode auto-detection still works: overriding
  DASHSCOPE_BASE_URL to an /apps/anthropic endpoint automatically
  switches to anthropic_messages mode.

- Update provider description and env var descriptions to reflect the
  coding-intl multi-provider platform.

- Update tests to match new default URL and test the anthropic override
  path instead.
@teknium1 teknium1 merged commit 09796b1 into main Mar 28, 2026
4 checks passed
teknium1 added a commit that referenced this pull request Mar 28, 2026
… pages

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from #3551/#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from #3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  #3572/#3573/#3576/#3580/#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (#3583).
- skills.md: List default GitHub taps including garrytan/gstack (#3605).
teknium1 added a commit that referenced this pull request Mar 28, 2026
… pages (#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from #3551/#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from #3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  #3572/#3573/#3576/#3580/#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (#3583).
- skills.md: List default GitHub taps including garrytan/gstack (#3605).
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
)

- Change default inference_base_url from dashscope-intl Anthropic-compat
  endpoint to coding-intl OpenAI-compat /v1 endpoint. The old Anthropic
  endpoint 404'd when used with the OpenAI SDK (which appends
  /chat/completions to a /apps/anthropic base URL).

- Update curated model list: remove models unavailable on coding-intl
  (qwen3-max, qwen-plus-latest, qwen3.5-flash, qwen-vl-max), add
  third-party models available on the platform (glm-5, glm-4.7,
  kimi-k2.5, MiniMax-M2.5).

- URL-based api_mode auto-detection still works: overriding
  DASHSCOPE_BASE_URL to an /apps/anthropic endpoint automatically
  switches to anthropic_messages mode.

- Update provider description and env var descriptions to reflect the
  coding-intl multi-provider platform.

- Update tests to match new default URL and test the anthropic override
  path instead.
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
)

- Change default inference_base_url from dashscope-intl Anthropic-compat
  endpoint to coding-intl OpenAI-compat /v1 endpoint. The old Anthropic
  endpoint 404'd when used with the OpenAI SDK (which appends
  /chat/completions to a /apps/anthropic base URL).

- Update curated model list: remove models unavailable on coding-intl
  (qwen3-max, qwen-plus-latest, qwen3.5-flash, qwen-vl-max), add
  third-party models available on the platform (glm-5, glm-4.7,
  kimi-k2.5, MiniMax-M2.5).

- URL-based api_mode auto-detection still works: overriding
  DASHSCOPE_BASE_URL to an /apps/anthropic endpoint automatically
  switches to anthropic_messages mode.

- Update provider description and env var descriptions to reflect the
  coding-intl multi-provider platform.

- Update tests to match new default URL and test the anthropic override
  path instead.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
)

- Change default inference_base_url from dashscope-intl Anthropic-compat
  endpoint to coding-intl OpenAI-compat /v1 endpoint. The old Anthropic
  endpoint 404'd when used with the OpenAI SDK (which appends
  /chat/completions to a /apps/anthropic base URL).

- Update curated model list: remove models unavailable on coding-intl
  (qwen3-max, qwen-plus-latest, qwen3.5-flash, qwen-vl-max), add
  third-party models available on the platform (glm-5, glm-4.7,
  kimi-k2.5, MiniMax-M2.5).

- URL-based api_mode auto-detection still works: overriding
  DASHSCOPE_BASE_URL to an /apps/anthropic endpoint automatically
  switches to anthropic_messages mode.

- Update provider description and env var descriptions to reflect the
  coding-intl multi-provider platform.

- Update tests to match new default URL and test the anthropic override
  path instead.
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
)

- Change default inference_base_url from dashscope-intl Anthropic-compat
  endpoint to coding-intl OpenAI-compat /v1 endpoint. The old Anthropic
  endpoint 404'd when used with the OpenAI SDK (which appends
  /chat/completions to a /apps/anthropic base URL).

- Update curated model list: remove models unavailable on coding-intl
  (qwen3-max, qwen-plus-latest, qwen3.5-flash, qwen-vl-max), add
  third-party models available on the platform (glm-5, glm-4.7,
  kimi-k2.5, MiniMax-M2.5).

- URL-based api_mode auto-detection still works: overriding
  DASHSCOPE_BASE_URL to an /apps/anthropic endpoint automatically
  switches to anthropic_messages mode.

- Update provider description and env var descriptions to reflect the
  coding-intl multi-provider platform.

- Update tests to match new default URL and test the anthropic override
  path instead.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
)

- Change default inference_base_url from dashscope-intl Anthropic-compat
  endpoint to coding-intl OpenAI-compat /v1 endpoint. The old Anthropic
  endpoint 404'd when used with the OpenAI SDK (which appends
  /chat/completions to a /apps/anthropic base URL).

- Update curated model list: remove models unavailable on coding-intl
  (qwen3-max, qwen-plus-latest, qwen3.5-flash, qwen-vl-max), add
  third-party models available on the platform (glm-5, glm-4.7,
  kimi-k2.5, MiniMax-M2.5).

- URL-based api_mode auto-detection still works: overriding
  DASHSCOPE_BASE_URL to an /apps/anthropic endpoint automatically
  switches to anthropic_messages mode.

- Update provider description and env var descriptions to reflect the
  coding-intl multi-provider platform.

- Update tests to match new default URL and test the anthropic override
  path instead.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant