Skip to content

feat: provider modules — ProviderProfile ABC, 29 providers, fetch_models, transport single-path#14424

Closed
kshitijk4poor wants to merge 1 commit into
mainfrom
feat/provider-modules
Closed

feat: provider modules — ProviderProfile ABC, 29 providers, fetch_models, transport single-path#14424
kshitijk4poor wants to merge 1 commit into
mainfrom
feat/provider-modules

Conversation

@kshitijk4poor

@kshitijk4poor kshitijk4poor commented Apr 23, 2026

Copy link
Copy Markdown
Collaborator

Summary

This PR introduces providers/ as the single source of truth for every inference
provider Hermes supports — all 30 profiles declared, all per-provider quirks
centralised, and the transport reduced to a single profile-driven code path. Adding
a new simple API-key provider now requires one file with zero edits to any other file in the
repo.


What this PR does

providers/ package — 30 profiles, fully wired

Every field a provider needs is declared in one providers/<name>.py file:

Field Type Description
name str Canonical ID — matches keys in PROVIDER_REGISTRY, --provider choices
api_mode str chat_completions | codex_responses | anthropic_messages | bedrock_converse | copilot_acp
aliases tuple[str, ...] Alternative names resolved by get_provider_profile()
display_name str Human label shown in picker / setup wizard (e.g. "GMI Cloud")
description str Picker subtitle (e.g. "GMI Cloud — multi-model direct API")
signup_url str URL shown during first-run setup ("get an API key here")
env_vars tuple[str, ...] API key env var names in priority order; last entry may be *_BASE_URL
base_url str Inference endpoint
models_url str Explicit model catalog URL (falls back to {base_url}/models)
auth_type str api_key | oauth_device_code | oauth_external | copilot | aws_sdk | external_process
fallback_models tuple[str, ...] Curated tool-capable models shown in picker when live fetch fails
hostname str Base hostname for URL→provider reverse-mapping (derived from base_url when empty)
default_headers dict[str, str] Headers sent on every request to this provider
fixed_temperature Any None = use caller's value; OMIT_TEMPERATURE = don't send temperature at all
default_max_tokens int | None Provider-level max_tokens cap (e.g. NVIDIA: 16384)
default_aux_model str Cheap model for auxiliary tasks (compression, vision, summarisation)

4 overridable hooks

Hook Signature Purpose
prepare_messages(messages) list[dict] -> list[dict] Message preprocessing before the API call — Qwen: normalise to list-of-parts, inject cache_control
build_extra_body(**ctx) -> dict Provider-specific extra_body fields — Nous: attribution tags; OpenRouter: provider preferences
build_api_kwargs_extras(**ctx) -> (dict, dict) Returns (extra_body_additions, top_level_kwargs) — handles the Kimi/Qwen split where some fields go top-level, others to extra_body
fetch_models(*, api_key, timeout) -> list[str] | None Live model listing — default hits {models_url or base_url}/models with Bearer auth; override for custom auth (Anthropic), no REST endpoint (Bedrock), or public catalog (OpenRouter)

Auto-wiring table

Adding a providers/<name>.py file automatically extends all of the following —
no edits to any other file required:

What auto-extends Where How
PROVIDER_REGISTRY in auth.py hermes_cli/auth.py:373 Loop over list_providers() at module load; skips names already declared
CANONICAL_PROVIDERS list hermes_cli/models.py:755 Appends ProviderEntry(name, display_name, description) for api_key providers
--provider CLI choices hermes_cli/main.py:7221 _build_provider_choices() derives from CANONICAL_PROVIDERS
_is_profile_api_key_provider() catch-all hermes_cli/main.py:1450 get_provider_profile() lookup — new providers route to _model_flow_api_key_provider without a new elif
provider_model_ids() profile path hermes_cli/models.py:1892 Calls profile.fetch_models() then profile.fallback_models for any api_key provider
Doctor health checks hermes_cli/doctor.py:958 Appends (display_name, env_vars, models_url, base_env, True) for api_key providers not already listed
OPTIONAL_ENV_VARS config registry hermes_cli/config.py:4170 _inject_profile_env_vars() adds each env_vars entry with description, prompt, URL, category
_URL_TO_PROVIDER domain map agent/model_metadata.py:316 get_hostname() auto-derived from base_url; maps domain → provider name for URL-based detection
Transport single-path agent/transports/chat_completions.py:97 _build_kwargs_from_profile() — profile present → all 4 hooks run, legacy flags bypassed
runtime_provider.py api_mode hermes_cli/runtime_provider.py:237 profile.api_mode read directly; no new elif branch in api_mode resolution
auxiliary_client.py aux model agent/auxiliary_client.py:159 profile.default_aux_model read first; falls back to legacy hardcoded dict
run_agent.py profile path run_agent.py:7521 get_provider_profile(self.provider) — every registered provider gets profile-driven kwargs

Provider count: 30 profiles across all api_modes

api_mode Providers
chat_completions (24) ai-gateway, alibaba, arcee, copilot, copilot-acp, custom, deepseek, gemini, gmi, google-gemini-cli, huggingface, kilocode, kimi-coding, kimi-coding-cn, nous, nvidia, ollama-cloud, opencode-go, opencode-zen, openrouter, qwen-oauth, stepfun, xiaomi, zai
codex_responses (2) openai-codex, xai
anthropic_messages (3) anthropic, minimax, minimax-cn
bedrock_converse (1) bedrock

Key code changes

transport — legacy flags gone, profile path is THE path

ChatCompletionsTransport.build_kwargs now has two branches:

  • Profile path (all 30 registered providers): _build_kwargs_from_profile() runs
    the 4 hooks and returns. No flag params needed.
  • Legacy fallback (~25 lines): reached only when get_provider_profile() returns
    None — i.e. a completely unknown custom endpoint not in providers/. Sends a
    clean chat_completions request with no provider-specific quirks.

The 20+ boolean flags that were threaded through build_kwargs (supports_reasoning,
is_openrouter, is_nous, qwen_vl_high_resolution, etc.) are gone from the
function's effective code path for all known providers.

Before:

def build_kwargs(self, model, messages, tools=None,
                 is_openrouter=False, is_nous=False, is_kimi=False,
                 is_qwen=False, supports_reasoning=False,
                 fixed_temperature=None, default_max_tokens=None,
                 extra_body_additions=None, qwen_session_metadata=None,
                 anthropic_max_output=None, ...):  # 20+ params
    # 200 lines of if/elif chains per provider

After:

def build_kwargs(self, model, messages, tools=None, **params):
    _profile = params.get("provider_profile")
    if _profile:
        return self._build_kwargs_from_profile(_profile, model, ...)  # hooks handle it
    # legacy 25-line fallback for unknown providers

run_agent.py — all 30 providers live via profile path

run_agent.py calls get_provider_profile(self.provider). When a profile is found
(all 30 registered providers), it assembles per-call context and passes
provider_profile=_profile to the transport. No per-provider if branches remain
in the kwargs-building path.

runtime_provider.py — reads profile.api_mode

_resolve_api_mode() now calls get_provider_profile(provider) and returns
profile.api_mode for all registered providers. No new elif branch needed for
new providers.

auxiliary_client.py — reads profile.default_aux_model

_get_aux_model_for_provider() tries profile.default_aux_model first, then falls
back to the legacy hardcoded dict (kept for Anthropic only, which predates profiles).

auth.py — auto-extends PROVIDER_REGISTRY

At module load, loops over list_providers() and adds a ProviderConfig entry for
every api_key provider not already declared. New providers need zero changes here.

models.py — auto-extends CANONICAL_PROVIDERS + profile path in provider_model_ids

  • CANONICAL_PROVIDERS list extended at import from list_providers().
  • provider_model_ids(provider) calls profile.fetch_models() then
    profile.fallback_models for any api_key provider with a profile, replacing
    per-provider copy-paste fetch blocks.

doctor.py — auto-extends health checks

The _apikey_providers_static list is extended at runtime from list_providers().
New providers automatically get a /models health check in hermes doctor.

config.py — _inject_profile_env_vars()

Eagerly populates OPTIONAL_ENV_VARS from every profile's env_vars tuple at
import time. Descriptions, prompts, URLs, and categories are derived from the
profile — no manual entries needed.

model_metadata.py — auto-extends _URL_TO_PROVIDER

At module load, calls profile.get_hostname() for every profile and inserts any
hostname not already in _URL_TO_PROVIDER. URL-based provider detection works
automatically for new providers.

main.py — _build_provider_choices() + _is_profile_api_key_provider()

  • _build_provider_choices() derives the --provider argparse choices directly
    from CANONICAL_PROVIDERS, which is already auto-extended by profiles.
  • _is_profile_api_key_provider() is a catch-all in select_provider_and_model()
    that routes any api_key profile to _model_flow_api_key_provider without
    requiring an explicit elif for each provider.

Proof: one file, zero other changes

providers/gmi.py is 26 lines and required zero edits to any other file (for api_key auth_type):

"""GMI Cloud provider profile."""

from providers import register_provider
from providers.base import ProviderProfile

gmi = ProviderProfile(
    name="gmi",
    aliases=("gmi-cloud", "gmicloud"),
    display_name="GMI Cloud",
    description="GMI Cloud — multi-model direct API (slash-form model IDs)",
    signup_url="https://www.gmicloud.ai/",
    env_vars=("GMI_API_KEY", "GMI_BASE_URL"),
    base_url="https://api.gmi-serving.com/v1",
    auth_type="api_key",
    default_aux_model="google/gemini-3.1-flash-lite-preview",
    fallback_models=(
        "zai-org/GLM-5.1-FP8",
        "deepseek-ai/DeepSeek-V3.2",
        "moonshotai/Kimi-K2.5",
        "google/gemini-3.1-flash-lite-preview",
        "anthropic/claude-sonnet-4.6",
        "openai/gpt-5.4",
    ),
)

register_provider(gmi)

After adding this file: GMI appears in --provider choices, hermes doctor checks
its key, hermes model lists its models, the setup wizard prompts for GMI_API_KEY,
and run_agent.py routes requests through the profile path — no other files touched.


Tests

3114 passing (full test suite)
Pre-existing failures: same set as main branch

Provider-specific test breakdown:

203  tests/providers/test_provider_profiles.py   profile declarations, auth_type, field validation
258  tests/providers/test_transport_parity.py    pins legacy flag behavior per provider
290  tests/providers/test_profile_wiring.py      profile path == legacy path output (+ override parity)
118  tests/providers/test_e2e_wiring.py          end-to-end kwargs via profile path
---
869  total provider tests

Files changed (summary)

  • New: providers/ — 28 provider files + __init__.py + base.py + README.md
  • New: acp_adapter/copilot_client.py (ACP client relocated from agent/)
  • New: tests/providers/ — 4 test files, 869 lines
  • Modified: agent/transports/chat_completions.py_build_kwargs_from_profile(), legacy flags removed from known-provider path
  • Modified: run_agent.py — all 30 providers go through profile path
  • Modified: hermes_cli/runtime_provider.py — reads profile.api_mode
  • Modified: agent/auxiliary_client.py — reads profile.default_aux_model
  • Modified: hermes_cli/auth.py — auto-extends PROVIDER_REGISTRY from profiles
  • Modified: hermes_cli/models.py — auto-extends CANONICAL_PROVIDERS, profile path in provider_model_ids()
  • Modified: hermes_cli/doctor.py — auto-extends health checks from profiles
  • Modified: hermes_cli/config.py_inject_profile_env_vars()
  • Modified: agent/model_metadata.py — auto-extends _URL_TO_PROVIDER
  • Modified: hermes_cli/main.py_build_provider_choices() + _is_profile_api_key_provider()

58 files changed, 3811 insertions, 369 deletions


Closes #14418.


def test_base_url(self):
p = get_provider_profile("nvidia")
assert "nvidia.com" in p.base_url
@alt-glitch alt-glitch added type/refactor Code restructuring, no behavior change P3 Low — cosmetic, nice to have comp/agent Core agent loop, run_agent.py, prompt builder labels Apr 23, 2026
@kshitijk4poor kshitijk4poor force-pushed the feat/provider-modules branch 7 times, most recently from f1b0ed4 to 93b00f4 Compare April 23, 2026 07:59
@kshitijk4poor kshitijk4poor force-pushed the feat/provider-modules branch 3 times, most recently from 7c4950c to cecfc88 Compare April 23, 2026 10:33
@kshitijk4poor kshitijk4poor force-pushed the feat/provider-modules branch from cecfc88 to b61e6c6 Compare April 23, 2026 11:04
@alt-glitch

Copy link
Copy Markdown
Collaborator

added ruff and ty rules as well as cleaned up the PR for the same.

@kshitijk4poor kshitijk4poor force-pushed the feat/provider-modules branch 2 times, most recently from dee0cab to 370fa41 Compare April 26, 2026 10:14
@kshitijk4poor kshitijk4poor changed the title feat: add provider modules — ProviderProfile ABC + 7 providers feat: provider modules — ProviderProfile ABC + 8 providers + transport wiring + registry fix Apr 26, 2026
@kshitijk4poor kshitijk4poor force-pushed the feat/provider-modules branch 9 times, most recently from 06398a5 to ecf579e Compare April 26, 2026 14:33
@kshitijk4poor kshitijk4poor force-pushed the feat/provider-modules branch 2 times, most recently from 0e25739 to c4afe52 Compare April 27, 2026 02:27
@github-actions

Copy link
Copy Markdown
Contributor

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

@github-actions

Copy link
Copy Markdown
Contributor

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py
skills/productivity/google-workspace/scripts/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

@kshitijk4poor kshitijk4poor force-pushed the feat/provider-modules branch from 03a71c2 to b398ce2 Compare April 27, 2026 10:55
@github-actions

Copy link
Copy Markdown
Contributor

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py
skills/productivity/google-workspace/scripts/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

@kshitijk4poor kshitijk4poor force-pushed the feat/provider-modules branch from b398ce2 to 27d8a67 Compare April 27, 2026 11:32
@github-actions

Copy link
Copy Markdown
Contributor

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py
skills/productivity/google-workspace/scripts/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

@myth05

myth05 commented Apr 27, 2026

Copy link
Copy Markdown

Reviewed the flagged files.

I don’t see supply-chain payload indicators in these changes:

  • no .pth file
  • no base64 + exec/eval
  • no obfuscated subprocess payload
  • no setup() / cmdclass packaging install hook

hermes_cli/setup.py appears to be the interactive Hermes setup wizard, not Python package installation machinery.
skills/productivity/google-workspace/scripts/setup.py is a manually-invoked Google OAuth helper script.

That said, the filename-based scanner is right to force maintainer review here. These filenames are high-signal in package-install contexts, and both should be eyeballed before merge.

One non-supply-chain issue worth checking: the Google OAuth scope extraction change appears to discard the raw callback URL before trying to read scope, so granted-scope persistence may have regressed.

Recommendation:

  1. Maintainer review/acknowledge these two files.
  2. Consider renaming helper scripts away from setup.py where possible, e.g. google_oauth_setup.py, to avoid future install-hook false positives.

@kshitijk4poor kshitijk4poor force-pushed the feat/provider-modules branch from 27d8a67 to ad0ecc6 Compare April 27, 2026 19:38
Comment thread hermes_cli/doctor.py Outdated
continue
if _pp.auth_type != "api_key" or not _pp.env_vars:
continue
_label = _pp.display_name or _pp.name
Comment thread hermes_cli/doctor.py Outdated
continue
_label = _pp.display_name or _pp.name
if _label in _known_doctor_names:
continue
@kshitijk4poor kshitijk4poor force-pushed the feat/provider-modules branch 3 times, most recently from 040a7d6 to bc262ff Compare April 28, 2026 03:26
Comment thread agent/auxiliary_client.py Dismissed
Comment thread agent/auxiliary_client.py Dismissed
Comment thread agent/auxiliary_client.py Dismissed
Comment thread agent/auxiliary_client.py Dismissed
@perlowja

perlowja commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

Concrete implementation offer

The design judgment in #14424 that I would build on is the move to providers/<name>.py profiles as the declaration point for identity, endpoints, fetch_models(), fallback models, headers, and provider quirks. The api_mode taxonomy (chat_completions, codex_responses, anthropic_messages, bedrock_converse, copilot_acp) and auth_type taxonomy (API key, OAuth, Copilot, AWS SDK, external process) mean an operator-facing inventory command can ask the profile registry what exists instead of rediscovering it through transport/config branches.

After #14424 lands, I would be happy to take a follow-up PR for three non-interactive commands, each roughly 50-150 LOC and built entirely on that substrate:

  • hermes models list [--provider=NAME] [--json] — enumerate models from provider.fetch_models() for one or all configured providers, then fall back to fallback_models.
  • hermes models status [--provider=NAME] — per model: configured, auth state, and last test result if Hermes has one recorded.
  • hermes providers list [--json] — list the 30 profiles with auth state, redacted key/env state, base_url, and models_url.

Sketch of the shape I have in mind:

def build_models_payload(provider: str | None = None) -> dict[str, object]:
    profiles = [get_provider_profile(provider)] if provider else list_providers()
    rows = []
    for profile in filter(None, profiles):
        state = resolve_provider_state(profile)
        if not state.configured:
            continue
        model_ids = profile.fetch_models(api_key=state.api_key) or list(profile.fallback_models)
        rows.append({
            "provider": profile.name,
            "auth_state": state.auth_state,
            "models": [{"id": m, "configured": is_configured(profile.name, m)} for m in model_ids],
        })
    return {"schema_version": 1, "providers": rows}

@pytest.fixture
def demo_provider(monkeypatch):
    profile = ProviderProfile(name="demo", base_url="https://demo.test/v1")
    monkeypatch.setattr(profile, "fetch_models", lambda **_: ["demo/model-a"])
    return profile

Issue close-out map I would propose:

Want me to open the follow-up PR against master once #14424 merges, or against this branch as a stacked PR?


Context: I maintain the nclawzero/* distro stack and contribute upstream across the claw-family ecosystem — openclaw/openclaw (PR #70224 merged), zeroclaw-labs/zeroclaw (multiple PRs in flight, several APPROVED). Happy to coordinate the follow-up so it lands in the order/shape that fits your maintainer plan.

@kshitijk4poor kshitijk4poor force-pushed the feat/provider-modules branch 3 times, most recently from 735bf2f to 8d0710a Compare April 29, 2026 14:02
…els, transport single-path

feat: provider modules — ProviderProfile ABC, 29 providers, fetch_models, transport single-path

Introduces providers/ as the single source of truth for every inference
provider. All 29 providers declared with correct data cross-checked against
auth.py, runtime_provider.py and auxiliary_client.py.

Rebased onto main (30307a9). Incorporates post-salvage fixes from
5672414 (gmi aux model google/gemini-3.1-flash-lite-preview, already set in providers/gmi.py).
@kshitijk4poor kshitijk4poor force-pushed the feat/provider-modules branch from 8d0710a to 84d1673 Compare April 29, 2026 14:40
@kshitijk4poor

Copy link
Copy Markdown
Collaborator Author

Concrete implementation offer

The design judgment in #14424 that I would build on is the move to providers/<name>.py profiles as the declaration point for identity, endpoints, fetch_models(), fallback models, headers, and provider quirks. The api_mode taxonomy (chat_completions, codex_responses, anthropic_messages, bedrock_converse, copilot_acp) and auth_type taxonomy (API key, OAuth, Copilot, AWS SDK, external process) mean an operator-facing inventory command can ask the profile registry what exists instead of rediscovering it through transport/config branches.

After #14424 lands, I would be happy to take a follow-up PR for three non-interactive commands, each roughly 50-150 LOC and built entirely on that substrate:

  • hermes models list [--provider=NAME] [--json] — enumerate models from provider.fetch_models() for one or all configured providers, then fall back to fallback_models.
  • hermes models status [--provider=NAME] — per model: configured, auth state, and last test result if Hermes has one recorded.
  • hermes providers list [--json] — list the 30 profiles with auth state, redacted key/env state, base_url, and models_url.

Sketch of the shape I have in mind:

def build_models_payload(provider: str | None = None) -> dict[str, object]:
    profiles = [get_provider_profile(provider)] if provider else list_providers()
    rows = []
    for profile in filter(None, profiles):
        state = resolve_provider_state(profile)
        if not state.configured:
            continue
        model_ids = profile.fetch_models(api_key=state.api_key) or list(profile.fallback_models)
        rows.append({
            "provider": profile.name,
            "auth_state": state.auth_state,
            "models": [{"id": m, "configured": is_configured(profile.name, m)} for m in model_ids],
        })
    return {"schema_version": 1, "providers": rows}

@pytest.fixture
def demo_provider(monkeypatch):
    profile = ProviderProfile(name="demo", base_url="https://demo.test/v1")
    monkeypatch.setattr(profile, "fetch_models", lambda **_: ["demo/model-a"])
    return profile

Issue close-out map I would propose:

Want me to open the follow-up PR against master once #14424 merges, or against this branch as a stacked PR?


Context: I maintain the nclawzero/* distro stack and contribute upstream across the claw-family ecosystem — openclaw/openclaw (PR #70224 merged), zeroclaw-labs/zeroclaw (multiple PRs in flight, several APPROVED). Happy to coordinate the follow-up so it lands in the order/shape that fits your maintainer plan.

Hey @perlowja, that sounds awesome! I’d love to discuss this further to ensure it doesn’t get overlooked. Can we move to Discord? I’m kxee there.

@perlowja

Copy link
Copy Markdown
Contributor

Hey @kxee — happy to coordinate but let's do it over email rather than Discord. Drop a desired-plan note to jperlow@gmail.com (stacking-vs-master-after, target merge order, scope-creep to avoid, anything you want to gate the follow-up on) and we'll go from there.

Reference for the docs surface I have in flight: #15214docs(providers): Together/Groq/Perplexity cookbook via custom_providers on my fork. Once #14424 lands, that cookbook re-roots against your providers/<name>.py profile structure rather than the current custom_providers shape. Useful as the doc pattern that wraps your implementation surface, and a sanity check that the operator-facing story stays coherent end-to-end.

To be clear: your #14424 is the architectural basis. The CLI follow-ups (hermes models list/status, hermes providers list) sit on top of ProviderProfile, the registry, and fetch_models() exactly as you've defined them — I'm not proposing any reshape to your work, just consuming it. Same posture on the issue close-out map; I'd want your read on the close-out scope before claiming any of those issues.

Talk soon.

@perlowja

perlowja commented May 10, 2026 via email

Copy link
Copy Markdown
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P3 Low — cosmetic, nice to have type/refactor Code restructuring, no behavior change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

tracking: provider modules refactor — Cycle 2 of transport/provider infrastructure

5 participants