Summary
Auxiliary text-model resolution does not appear to stay consistent with the active runtime when custom endpoints are involved.
In particular, auxiliary provider=auto can resolve through a generic custom path instead of staying aligned with the active custom runtime that the main agent is already using.
When that happens, the auxiliary path can use a different credential source for the same endpoint, which may cause auxiliary-only failures even though the main runtime works.
Reported by Hermes Agent.
Problem statement
There seem to be two different resolution paths for custom endpoints:
- a named custom-provider style path
- a generic custom runtime path
Those paths do not appear to share a single source of truth for credential resolution.
As a result, the same endpoint can be reached through different runtime-resolution branches with different API key sources.
Why this is logically problematic
If the main agent is already running successfully against a custom endpoint, auxiliary auto should prefer runtime consistency over re-deriving another generic custom runtime from scratch.
Otherwise, the system can end up in a split state where:
- the main runtime is valid
- the auxiliary runtime resolves to the same endpoint URL
- but the auxiliary path uses different credentials or a different config source
- and auxiliary-only features fail
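The split state above can be illustrated with a minimal, self-contained sketch. It is not taken from the Hermes codebase: the config layout, the CUSTOM_BASE_URL / CUSTOM_API_KEY names, and both resolver functions are hypothetical stand-ins for the named and generic custom paths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResolvedRuntime:
    base_url: str
    api_key: str
    key_source: str  # label describing where the key came from, never the key itself


# Hypothetical config: a named custom provider that carries its own key.
NAMED_PROVIDERS = {
    "my-endpoint": {
        "base_url": "https://llm.example.test/v1",
        "api_key": "key-from-named-provider",
    },
}

# Hypothetical environment consulted by the generic custom path.
ENV = {
    "CUSTOM_BASE_URL": "https://llm.example.test/v1",
    "CUSTOM_API_KEY": "stale-key-from-env",
}


def resolve_named(name: str) -> ResolvedRuntime:
    """Stand-in for the named custom-provider path (main runtime)."""
    entry = NAMED_PROVIDERS[name]
    return ResolvedRuntime(entry["base_url"], entry["api_key"], f"named:{name}")


def resolve_generic_custom() -> ResolvedRuntime:
    """Stand-in for the generic custom path (auxiliary auto)."""
    return ResolvedRuntime(
        ENV["CUSTOM_BASE_URL"],
        ENV.get("CUSTOM_API_KEY", ""),
        "env:CUSTOM_API_KEY",
    )


main_runtime = resolve_named("my-endpoint")
aux_runtime = resolve_generic_custom()

# Same endpoint URL, but the two paths picked different credentials.
same_endpoint = main_runtime.base_url == aux_runtime.base_url
same_credentials = main_runtime.api_key == aux_runtime.api_key
print(same_endpoint, same_credentials)  # prints: True False
```

If the real resolution branches behave like this, the main runtime would authenticate while auxiliary requests to the same URL would be rejected.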
Verified behavior
- the main runtime could resolve and use a custom endpoint successfully
- auxiliary task resolution with provider=auto resolved to custom
- that auxiliary custom path did not reliably reuse the same credential source as the active runtime
- the same endpoint could therefore be associated with different credential sources depending on whether resolution happened through the main runtime path or the auxiliary auto/custom path
- this inconsistency contributed to auxiliary request failures
Expected behavior
When auxiliary resolution is set to auto, and the active runtime is already a working custom runtime, the auxiliary path should prefer reusing that resolved runtime identity and credentials rather than independently reconstructing a generic custom runtime.
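One way the expected behavior could look is sketched below. The ResolvedRuntime type, the resolve_aux_auto function, and the fallback callable are hypothetical and do not mirror the actual signatures in agent/auxiliary_client.py. The sketch only shows the ordering: check the active runtime first, then fall back to whatever auto does today.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ResolvedRuntime:
    provider: str
    base_url: str
    api_key: str
    key_source: str  # label only, never the secret value


def resolve_aux_auto(
    active: Optional[ResolvedRuntime],
    fallback: Callable[[], Optional[ResolvedRuntime]],
) -> Optional[ResolvedRuntime]:
    """Prefer the active custom runtime; otherwise defer to the existing auto chain."""
    if active is not None and active.provider == "custom":
        # Reuse the identity and credentials the main agent already resolved.
        return active
    return fallback()


# Usage: the main agent runs on a named custom endpoint, and the generic path
# (hypothetical here) would otherwise have picked a different key.
active = ResolvedRuntime(
    "custom", "https://llm.example.test/v1", "key-a", "named:my-endpoint"
)


def generic_custom() -> ResolvedRuntime:
    return ResolvedRuntime(
        "custom", "https://llm.example.test/v1", "key-b", "env:CUSTOM_API_KEY"
    )


aux = resolve_aux_auto(active, generic_custom)
no_active = resolve_aux_auto(None, generic_custom)
```

With this ordering, aux is the same object as active, and the generic path is consulted only when no custom runtime is active.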
Actual behavior
Auxiliary auto/custom resolution can diverge from the active runtime's credential source, even when both end up targeting the same endpoint.
Suggested fix
Most logical option:
1. Reuse the already-resolved active runtime for auxiliary auto when the active runtime is a custom endpoint
Good fallback option:
2. Unify generic custom and named custom credential resolution so that the same effective endpoint does not resolve different credential sources unless the user explicitly overrides them
Additional safeguard:
3. Add debug-level reporting of which credential source was selected for auxiliary resolution (without printing secret values), so mismatches are diagnosable
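Options 2 and 3 could be combined roughly as follows. This is a sketch under assumptions: resolve_credentials, the precedence order (explicit override, then named provider matching the URL, then environment), and the CUSTOM_API_KEY name are all hypothetical and not the project's actual behavior. It shows a single function that both paths would call, and a debug log line that reports the source label and whether a key is present, without the key itself.

```python
import logging
from typing import Mapping, Optional, Tuple

logger = logging.getLogger("auxiliary_resolution_sketch")


def resolve_credentials(
    base_url: str,
    explicit_key: Optional[str],
    named_providers: Mapping[str, Mapping[str, str]],
    env: Mapping[str, str],
) -> Tuple[str, str]:
    """Return (api_key, source) using one precedence order for every caller."""
    if explicit_key:
        result = (explicit_key, "explicit-override")
    else:
        result = ("", "none")
        wanted = base_url.rstrip("/")
        for name, entry in named_providers.items():
            # A named provider wins when it targets the same effective endpoint.
            if entry.get("base_url", "").rstrip("/") == wanted and entry.get("api_key"):
                result = (entry["api_key"], f"named:{name}")
                break
        else:
            # No named provider matched: fall back to the (hypothetical) env var.
            if env.get("CUSTOM_API_KEY"):
                result = (env["CUSTOM_API_KEY"], "env:CUSTOM_API_KEY")

    # Log the source label and key presence only; the secret is never formatted.
    logger.debug(
        "credential source=%s has_key=%s base_url=%s",
        result[1],
        bool(result[0]),
        base_url,
    )
    return result


named = {"my-endpoint": {"base_url": "https://llm.example.test/v1/", "api_key": "k1"}}
env = {"CUSTOM_API_KEY": "k2"}

matched = resolve_credentials("https://llm.example.test/v1", None, named, env)
unmatched = resolve_credentials("https://other.example.test/v1", None, named, env)
```

Because both the main and auxiliary paths would go through the same function, the same effective endpoint yields the same source unless an explicit override is passed.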
Why this matters beyond one tool
This can affect any auxiliary consumer that relies on agent/auxiliary_client.py, not just one specific feature.
A mismatch in runtime resolution can surface as tool-specific failures even though the primary chat runtime is healthy.
Likely relevant code paths
agent/auxiliary_client.py
- _resolve_task_provider_model()
- _resolve_auto()
- _try_custom_endpoint()
- _resolve_custom_runtime()
- resolve_provider_client()
hermes_cli/runtime_provider.py
- resolve_runtime_provider(requested='custom')
  - generic custom runtime resolution
  - named custom-provider runtime resolution