API server ignores per-platform model config (no way to run api_server on a different model than the global default)

## Summary

The API server platform (`gateway/platforms/api_server.py`) always uses the global `model.default`, so there is no way to run the API server on a different (e.g. cheaper/faster) model than the rest of the gateway. `_resolve_gateway_model()` has no notion of per-platform configuration.

This is a feature gap rather than a crash: operators who want, say, the HTTP API server on Sonnet while CLI/Discord stay on Opus have no supported knob.

## Current behaviour

`gateway/platforms/api_server.py::APIServerAdapter._create_agent` resolves the model with:

```python
model = _resolve_gateway_model()
```

`_resolve_gateway_model(config=None)` (in `gateway/run.py`) only ever reads `model.default` / `model.model`. There is no platform dimension, so every gateway platform that constructs a temporary agent shares one model.

## Proposed fix

Add an **opt-in** `platform` parameter to `_resolve_gateway_model()`. When it is supplied *and* `platform_models.<platform>` in `config.yaml` names a non-empty model, that model wins over `model.default`. Every other existing call site omits the argument and stays byte-for-byte unchanged; only `APIServerAdapter._create_agent` opts in.

Config shape (additive, optional):

```yaml
model:
  default: claude-opus-4-8
platform_models:
  api_server:
    default: claude-sonnet-4-6   # or a bare string: api_server: claude-sonnet-4-6
```

Note: provider/credentials still come from the global runtime config, so the override must name a model compatible with the active provider. (A future enhancement could thread a per-platform provider too.)
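
To make the lookup order concrete, here is a minimal standalone sketch of the proposed resolution. `resolve_model` is a hypothetical stand-in, not the gateway function itself. The platform branch mirrors the diff below. The global fallback (reading `default`, then `model`, then returning an empty string) is an assumption based on the description above, since the full body of `_resolve_gateway_model()` is not shown in this issue.

```python
from typing import Optional


def resolve_model(cfg: dict, platform: Optional[str] = None) -> str:
    """Hypothetical stand-in illustrating the proposed lookup order."""
    if platform:
        platform_models = cfg.get("platform_models")
        if isinstance(platform_models, dict):
            override = platform_models.get(platform)
            # Shape 1: bare model string.
            if isinstance(override, str) and override:
                return override
            # Shape 2: mapping with a `default` (or `model`) key.
            if isinstance(override, dict):
                model = override.get("default") or override.get("model")
                if model:
                    return model

    # Global fallback. Assumption: the real function reads
    # model.default / model.model roughly like this.
    model_cfg = cfg.get("model", {})
    if isinstance(model_cfg, str):
        return model_cfg
    if isinstance(model_cfg, dict):
        return model_cfg.get("default") or model_cfg.get("model") or ""
    return ""


config = {
    "model": {"default": "claude-opus-4-8"},
    "platform_models": {"api_server": {"default": "claude-sonnet-4-6"}},
}

global_model = resolve_model(config)                          # no platform: global default
api_model = resolve_model(config, platform="api_server")      # override applies
discord_model = resolve_model(config, platform="discord")     # no entry: global default
```

With this config, only the `api_server` lookup returns the Sonnet model; a missing, empty, or malformed `platform_models` entry falls through to the global default.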

## Diff

```diff
diff --git a/gateway/platforms/api_server.py b/gateway/platforms/api_server.py
index a18630f85..92585e6bd 100644
--- a/gateway/platforms/api_server.py
+++ b/gateway/platforms/api_server.py
@@ -963,9 +963,9 @@ class APIServerAdapter(BasePlatformAdapter):
 
         runtime_kwargs = _resolve_runtime_agent_kwargs()
         reasoning_config = GatewayRunner._load_reasoning_config()
-        model = _resolve_gateway_model()
 
         user_config = _load_gateway_config()
+        model = _resolve_gateway_model(user_config, platform="api_server")
         enabled_toolsets = sorted(_get_platform_tools(user_config, "api_server"))
 
         max_iterations = int(os.getenv("HERMES_MAX_ITERATIONS", "90"))
diff --git a/gateway/run.py b/gateway/run.py
index a2e41c609..f82a276a1 100644
--- a/gateway/run.py
+++ b/gateway/run.py
@@ -1443,14 +1443,39 @@ def _load_gateway_runtime_config() -> dict:
     return expanded if isinstance(expanded, dict) else {}
 
 
-def _resolve_gateway_model(config: dict | None = None) -> str:
+def _resolve_gateway_model(config: dict | None = None, platform: str | None = None) -> str:
     """Read model from config.yaml — single source of truth.
 
     Without this, temporary AIAgent instances (e.g. /compress) fall
     back to the hardcoded default which fails when the active provider is
     openai-codex.
+
+    Per-platform override (opt-in): when ``platform`` is supplied AND
+    ``platform_models.<platform>`` is set in config.yaml, that model wins
+    over the global ``model.default``. This lets a single platform (e.g.
+    the API server) run a cheaper/faster model without affecting any other
+    platform. Callers that omit ``platform`` — every existing call site —
+    are completely unaffected and resolve the global default as before.
+
+    The override value may be a bare model string, or a mapping with a
+    ``default`` (or ``model``) key. Any ``provider`` key in the mapping is
+    NOT consumed here — provider/credentials still come from the global
+    runtime config, so a platform override must name a model that works
+    with the active provider.
     """
     cfg = config if config is not None else _load_gateway_config()
+
+    if platform:
+        platform_models = cfg.get("platform_models")
+        if isinstance(platform_models, dict):
+            override = platform_models.get(platform)
+            if isinstance(override, str) and override:
+                return override
+            if isinstance(override, dict):
+                model = override.get("default") or override.get("model")
+                if model:
+                    return model
+
     model_cfg = cfg.get("model", {})
     if isinstance(model_cfg, str):
         return model_cfg
```

## Tests

8 new regression tests in `tests/test_empty_model_fallback.py::TestResolveGatewayModelPlatformOverride` covering opt-in isolation (no `platform=` → global default), matching/non-matching platforms, bare-string and dict override shapes, and empty/missing/malformed `platform_models`. One existing monkeypatch in `tests/gateway/test_api_server.py` was widened from `lambda:` to `lambda *a, **k:` to accept the new optional arg.
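
The monkeypatch change is needed because the patched call site now passes arguments. A small sketch of why the zero-argument stub breaks; `call_like_patched_adapter` and the stub names are hypothetical and only imitate the shape of the new call, not the actual test code.

```python
def call_like_patched_adapter(resolver):
    """Imitate the new call site: positional config plus a platform keyword."""
    return resolver({}, platform="api_server")


# Old-style stub: accepts no arguments at all.
narrow_stub = lambda: "stub-model"
# Widened stub: swallows any positional and keyword arguments.
wide_stub = lambda *a, **k: "stub-model"

try:
    call_like_patched_adapter(narrow_stub)
    narrow_ok = True
except TypeError:
    # The zero-argument lambda rejects the config and platform arguments.
    narrow_ok = False

wide_result = call_like_patched_adapter(wide_stub)
```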

Result with the fix: full `tests/gateway/test_api_server.py`, `tests/gateway/test_api_server_toolset.py`, and `tests/test_empty_model_fallback.py` pass (182 passed), and the other `_resolve_gateway_model` consumers (compress/fast/discord/session_info — 31 tests) are unaffected.

(Heads up: that test suite leaks file descriptors via aiohttp test apps and hits `OSError: [Errno 24] Too many open files` under a low `ulimit -n`. Raising the limit with `ulimit -n 4096` makes it green. This is unrelated to this change but worth a separate look.)
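
For anyone hitting the descriptor limit while reproducing, one possible workaround is to raise the soft limit from inside the test process with the standard-library `resource` module (Unix only). This is a sketch, not something the repo does today; `ensure_nofile_limit` is a hypothetical helper name and 4096 is simply the value that worked above.

```python
import resource


def ensure_nofile_limit(target: int = 4096) -> int:
    """Raise the soft RLIMIT_NOFILE to `target` if it is lower; return the soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Nothing to do when the soft limit is already unlimited or high enough.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft
    # The soft limit may never exceed the hard limit.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft


effective_soft = ensure_nofile_limit(4096)
```

This only masks the leak; the aiohttp test apps would still need proper cleanup.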

## Environment

- Hermes Agent, local checkout of `main`
- Python 3.11
- macOS 26.5


API server ignores per-platform model config (no way to run api_server on a different model than the global default) #34612
