feat(provider): support provider-native TTS / image / vision backends #9954

Closed
kapelame wants to merge 1 commit into NousResearch:main from kapelame:feat/minimax-provider-defaults

kapelame commented Apr 15, 2026

Summary

Adds a provider-agnostic hook that wires TTS / image / vision tool defaults to the active chat provider when it serves those capabilities natively on the same credential. Mirrors apply_nous_provider_defaults.

MiniMax is the first registered consumer (international + CN). Adding another provider is one row in NATIVE_TOOLS_BY_PROVIDER plus a set of request helpers.
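
As a rough illustration of the "one row per provider" idea, here is a minimal sketch of what such a registry and apply hook could look like. The registry contents, the `PERSISTED_CAPABILITIES` tuple, the `apply_native_defaults` name and the `config[capability]["provider"]` layout are all assumptions made for this example; the real shapes live in `hermes_cli/provider_native_tools.py`.

```python
from typing import Dict, FrozenSet, List

# Assumed registry shape: canonical provider id -> capabilities served natively.
NATIVE_TOOLS_BY_PROVIDER: Dict[str, FrozenSet[str]] = {
    "minimax": frozenset({"tts", "image", "vision"}),
    "minimax-cn": frozenset({"tts", "image", "vision"}),
}

# Assumption: only these capabilities are written to config.
# Vision is dispatched at runtime and never persisted.
PERSISTED_CAPABILITIES = ("tts", "image")


def apply_native_defaults(config: dict, provider: str) -> List[str]:
    """Point tool backends at the chat provider; return the capabilities changed."""
    changed: List[str] = []
    native = NATIVE_TOOLS_BY_PROVIDER.get(provider, frozenset())
    for capability in PERSISTED_CAPABILITIES:
        if capability not in native:
            continue
        section = config.setdefault(capability, {})
        if section.get("provider"):
            # An explicit choice (or an earlier run) is already there: keep it.
            continue
        section["provider"] = provider
        changed.append(capability)
    return changed
```

Calling the hook a second time returns an empty list, which is the idempotency property the test suite is described as covering; a provider missing from the registry leaves the config untouched.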

Files Changed

| File | Changes |
|---|---|
| `hermes_cli/provider_native_tools.py` | new (+466): registry, `apply_provider_native_tool_defaults`, runtime dispatchers (`generate_image`, `analyze_image`), shared helpers (`active_provider_api_root`, `native_credential_present`, `minimax_endpoint_and_key`), MiniMax request bindings |
| `hermes_cli/setup.py` | +26 / −1: call the generic hook beside the existing Nous branch; skip the interactive TTS prompt when the active provider already owns `tts` |
| `tools/image_generation_tool.py` | +24: early-call the native dispatcher; fall through on `None`. Zero MiniMax mentions |
| `tools/vision_tools.py` | +28: early-call the native dispatcher when the user hasn't pinned `auxiliary.vision.provider`. Zero MiniMax mentions |
| `tools/tts_tool.py` | +13 / −1: existing `_generate_minimax_tts` derives URL + key from `minimax_endpoint_and_key`, so `minimax-cn` users hit `api.minimaxi.com` with the CN key automatically |
| `tests/hermes_cli/test_provider_native_tools.py` | new (+286): 47 cases |
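
The "early-call, fall through on `None`" pattern used by the image and vision tools can be sketched roughly as follows. The function names and the callable-based signature are hypothetical and chosen only to keep the example self-contained; the config path `auxiliary.vision.provider` comes from the description above, but the nested-dict layout is an assumption.

```python
from typing import Callable, Optional


def generate_with_fallback(
    prompt: str,
    native: Callable[[str], Optional[dict]],
    generic: Callable[[str], dict],
) -> dict:
    """Try the provider-native backend first; use the generic path on None."""
    result = native(prompt)
    if result is not None:
        return result
    # None means "no native backend applies", so the old path runs unchanged.
    return generic(prompt)


def should_try_native_vision(config: dict) -> bool:
    """Skip native vision when the user pinned auxiliary.vision.provider."""
    pinned = config.get("auxiliary", {}).get("vision", {}).get("provider")
    return not pinned
```

Because the native dispatcher signals "not applicable" with `None` rather than raising, the tool files need no provider-specific branches, which is consistent with the "Zero MiniMax mentions" notes in the table.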

Design Notes

  • URL derivation reuses model.base_url, stripping the /anthropic chat-compat suffix. CN users who pick minimax-cn get api.minimaxi.com automatically — no separate env var.
  • Credential selection flips on region: minimax-cn tries MINIMAX_CN_API_KEY first, then falls back to MINIMAX_API_KEY; minimax does the reverse.
  • Provider detection uses hermes_cli.providers.normalize_provider so hand-edited aliases (minimax-china, minimax_cn) resolve to the canonical id.
  • Vision uses runtime dispatch (no persisted config) because chat models on /anthropic/v1/messages don't accept multimodal content; vision is served by /v1/coding_plan/vlm, the same endpoint MiniMax's official mmx vision CLI targets.
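
The URL derivation and region-aware credential selection described in the notes above might look something like the sketch below. The helper names `api_root_from_base_url` and `pick_api_key` are hypothetical, and the example assumes `model.base_url` ends in `/anthropic` (for instance `https://api.minimaxi.com/anthropic`); the PR's own helpers (`active_provider_api_root`, `minimax_endpoint_and_key`) may differ in detail.

```python
from typing import Mapping, Optional


def api_root_from_base_url(base_url: str) -> str:
    """Strip a trailing /anthropic chat-compat suffix to get the API root."""
    root = base_url.rstrip("/")
    suffix = "/anthropic"
    if root.endswith(suffix):
        root = root[: -len(suffix)]
    return root


def pick_api_key(provider: str, env: Mapping[str, str]) -> Optional[str]:
    """Prefer the key for the active region, then fall back to the other one."""
    if provider == "minimax-cn":
        order = ("MINIMAX_CN_API_KEY", "MINIMAX_API_KEY")
    else:
        order = ("MINIMAX_API_KEY", "MINIMAX_CN_API_KEY")
    for name in order:
        value = env.get(name)
        if value:
            return value
    return None
```

In real use `env` would typically be `os.environ`; taking it as a parameter just keeps the sketch easy to test without touching the process environment.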

Test Plan

python -m pytest tests/hermes_cli/test_provider_native_tools.py -q
# 47 passed

Live end-to-end on both api.minimax.io and api.minimaxi.com with Token Plan keys: TTS, image, vision all return correctly.

Zero new pip dependencies (stdlib only). Zero behaviour change for chat providers not in NATIVE_TOOLS_BY_PROVIDER.

kapelame force-pushed the feat/minimax-provider-defaults branch 3 times, most recently from 7bb5274 to 06d8e3a on April 15, 2026 01:18
When a chat provider exposes TTS / image / vision capabilities on the
same API base and credential, the setup wizard wires them as the
default tool backends and the tool dispatchers route through this
module instead of the generic FAL / auxiliary-client paths.

Mirrors the shape of `hermes_cli/nous_subscription.py`: one apply hook
for the wizard, plus runtime helpers consumed by the tool files.
MiniMax is the first registered consumer (international + CN); adding
another provider is one row in `NATIVE_TOOLS_BY_PROVIDER` plus a set
of request helpers.

Source:
- `hermes_cli/provider_native_tools.py` (new): registry,
  `apply_provider_native_tool_defaults` (mirror of
  `apply_nous_provider_defaults`), `generate_image` / `analyze_image`
  runtime dispatchers, `active_provider_api_root` / `native_credential_present`
  / `minimax_endpoint_and_key` helpers, plus the MiniMax request
  bindings inline in the same file — same shape as `nous_subscription.py`.
- `hermes_cli/setup.py` (+26 / −1): call the generic hook beside the
  existing Nous branch; skip the interactive TTS prompt when the
  active provider already owns `tts`.
- `tools/image_generation_tool.py` (+24): early-call the native
  dispatcher, fall through on `None`.  `check_fn` accepts the native
  credential as meeting the requirement.  Zero MiniMax mentions.
- `tools/vision_tools.py` (+28): early-call the native dispatcher
  when the user hasn't pinned `auxiliary.vision.provider`.  Zero
  MiniMax mentions.
- `tools/tts_tool.py` (+13 / −1): existing `_generate_minimax_tts`
  derives its URL + key from `minimax_endpoint_and_key` so CN users
  (`minimax-cn` → `api.minimaxi.com`) and the correct regional key are
  used automatically; falls back to the existing international
  defaults otherwise.

Tests (`tests/hermes_cli/test_provider_native_tools.py`, new): 47
cases — registry, alias normalisation, URL derivation, apply_defaults
idempotency / override preservation, CN parity, describe_changes
phrasing, credential + endpoint helpers.

Design notes:
- URL derivation reuses `model.base_url`, stripping the `/anthropic`
  chat-compat suffix.  CN users who pick `minimax-cn` automatically
  get `api.minimaxi.com` because their `base_url` already points
  there.  No separate `MINIMAX_API_HOST` env var.
- Provider detection uses `hermes_cli.providers.normalize_provider`
  so hand-edited aliases (`minimax-china`, `minimax_cn`) resolve to
  the canonical id.
- Credential selection flips on region: `minimax-cn` prefers
  `MINIMAX_CN_API_KEY`; `minimax` prefers `MINIMAX_API_KEY`.  Either
  falls back to the other on absence.
- Vision uses runtime dispatch (no persisted config value) because
  chat models on `/anthropic/v1/messages` don't accept multimodal
  content; vision is served by `/v1/coding_plan/vlm` (the same
  endpoint MiniMax's official `mmx vision` CLI targets), which takes
  a non-chat request shape.

Zero new pip dependencies (stdlib only).  Zero behaviour change for
users on chat providers not registered in `NATIVE_TOOLS_BY_PROVIDER`.

kapelame force-pushed the feat/minimax-provider-defaults branch from 48ca8a3 to db126c5 on April 15, 2026 19:12

alt-glitch added labels on Apr 26, 2026:
- type/feature: New feature or request
- P3: Low (cosmetic, nice to have)
- comp/tools: Tool registry, model_tools, toolsets
- provider/minimax: MiniMax (Anthropic transport)
- tool/vision: Vision analysis and image generation
- tool/tts: Text-to-speech and transcription

kapelame closed this by deleting the head repository on May 30, 2026
