[Bug]: auxiliary async model path can return invalid payloads and crash session_search with misleading AttributeError #7264

@woaim65

Bug Description

session_search / auxiliary async model calls can silently accept a non-OpenAI response object, then fail later with a misleading 'str' object has no attribute 'choices' traceback.

In the same custom-provider setup, auxiliary failures also surface as HTTP 400: No models provided and region-mismatch errors, even though the configured main model/provider resolve correctly.

This looks like an auxiliary-client contract / fallback bug, not a user config bug.

Steps to Reproduce

  1. Configure Hermes with a working custom main model in ~/.hermes/config.yaml, for example:
model:
  provider: custom
  default: gpt-5
  base_url: https://httpsgood.abrdns.com/v1
  api_key: ...
  2. Trigger a session_search path (or directly call async_call_llm(task="session_search", ...)).

  3. Force the async auxiliary client path to return a non-OpenAI payload (for example, a bare string from a mocked adapter / malformed fallback path).

  4. Observe that async_call_llm() returns the invalid payload unchanged, and tools/session_search_tool.py crashes later when it assumes a normal OpenAI response object.

Minimal repro used locally:

import asyncio
from agent import auxiliary_client as ac
from tools import session_search_tool as sst

class FakeAsyncCompletions:
    async def create(self, **kwargs):
        return 'not-a-response-object'

class FakeAsyncClient:
    def __init__(self):
        self.chat = type('Chat', (), {'completions': FakeAsyncCompletions()})()
        self.base_url = 'https://example.invalid/v1'

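# Keep the real helpers so they can be restored after the repro.
orig_get = ac._get_cached_client
orig_resolve = ac._resolve_task_provider_model
ac._resolve_task_provider_model = lambda *a, **k: ('custom', 'gpt-5', 'https://example.invalid/v1', 'x')
ac._get_cached_client = lambda *a, **k: (FakeAsyncClient(), 'gpt-5')

async def main():
    out = await sst._summarize_session('hello world', 'hello', {'source':'test','started_at':'now'})
    print(repr(out))

try:
    asyncio.run(main())
finally:
    # Undo the monkeypatches so nothing else in the process sees the fake client.
    ac._get_cached_client = orig_get
    ac._resolve_task_provider_model = orig_resolve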

Observed traceback:

WARNING:root:Session summarization failed after 3 attempts: 'str' object has no attribute 'choices'
Traceback (most recent call last):
  File "/home/oz/hermes-agent/tools/session_search_tool.py", line 164, in _summarize_session
    content = extract_content_or_reasoning(response)
  File "/home/oz/hermes-agent/agent/auxiliary_client.py", line 2114, in extract_content_or_reasoning
    msg = response.choices[0].message
AttributeError: 'str' object has no attribute 'choices'

Expected Behavior

  • async_call_llm() should enforce a response contract and reject malformed payloads immediately with a clear error.
  • Auxiliary task failures should not degrade into misleading downstream AttributeErrors.
  • A valid model.provider=custom + model.default=gpt-5 config should not surface as No models provided in auxiliary paths when the main config resolves correctly.

Actual Behavior

  • Main config resolves correctly.
  • Auxiliary async path can propagate an invalid payload unchanged.
  • session_search then fails later in extract_content_or_reasoning() with 'str' object has no attribute 'choices'.
  • In real gateway logs, related auxiliary failures showed up as:
    • HTTP 400: No models provided
    • Session summarization failed after 3 attempts
    • 403 This model is not available in your region.

Evidence Collected

I verified directly in the current codebase that the main config resolves correctly:

from hermes_cli.config import load_config
from agent.auxiliary_client import _read_main_model, _read_main_provider, get_text_auxiliary_client, resolve_provider_client

print(load_config().get('model'))
print(_read_main_model())      # 'gpt-5'
print(_read_main_provider())   # 'custom'
print(get_text_auxiliary_client('session_search'))
print(resolve_provider_client('custom'))

Observed locally:

CONFIG_MODEL {'provider': 'custom', 'default': 'gpt-5', 'base_url': 'https://httpsgood.abrdns.com/v1', ...}
MAIN_MODEL 'gpt-5'
MAIN_PROVIDER 'custom'
AUX session_search MODEL 'gpt-5' CLIENT OpenAI BASE https://httpsgood.abrdns.com/v1/
RESOLVE_CUSTOM 'gpt-5' OpenAI https://httpsgood.abrdns.com/v1/

So the primary config path is fine; the failure happens later in the async auxiliary execution path.

Suspected Root Cause

agent/auxiliary_client.py does not validate the return shape of client.chat.completions.create(...) in async_call_llm() before passing the result upward.

Relevant flow:

  • tools/session_search_tool.py:155 calls await async_call_llm(task="session_search", ...)
  • agent/auxiliary_client.py:2234 returns await client.chat.completions.create(**kwargs) with no contract check
  • tools/session_search_tool.py:164 calls extract_content_or_reasoning(response)
  • agent/auxiliary_client.py:2114 assumes response.choices[0].message

If any async adapter / fallback / malformed client returns a bare string or some other non-OpenAI object, the real failure is delayed and misreported.
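
The delayed failure can be illustrated without Hermes at all. The following is a minimal, self-contained sketch: fake_create, call_without_check, and extract_content are hypothetical stand-ins for the adapter, async_call_llm(), and extract_content_or_reasoning(), not the actual implementations.

```python
import asyncio


async def fake_create(**kwargs):
    # Hypothetical malformed adapter: returns a bare string, not a response object.
    return "not-a-response-object"


async def call_without_check(**kwargs):
    # Stand-in for async_call_llm(): forwards whatever the client returned.
    return await fake_create(**kwargs)


def extract_content(response):
    # Stand-in for extract_content_or_reasoning(): assumes an OpenAI-style shape.
    return response.choices[0].message.content


def demo():
    # The call itself succeeds; the error only appears at extraction time.
    response = asyncio.run(call_without_check(model="gpt-5"))
    try:
        extract_content(response)
    except AttributeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(demo())
```

The error is raised at the extraction site, so the traceback points at the consumer rather than at the call that produced the bad payload.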

Separately, the No models provided log line is likely a sibling symptom from auxiliary fallback / config-failure handling, not proof that model.default is missing in the user's config.

Suggested Fix

  1. Add response-shape validation in both call_llm() and async_call_llm() before returning.
    • Fail fast with a clear TypeError / RuntimeError if the payload lacks choices[0].message.
  2. Add a regression test covering malformed async auxiliary responses.
  3. Audit auxiliary fallback/config-error paths that can produce HTTP 400: No models provided despite a valid custom main model.
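
A rough sketch of what the validation in item 1 could look like, with a check that could seed the regression test in item 2. All names here (MalformedLLMResponseError, validate_llm_response, checked_create, make_fake_client) are hypothetical and do not exist in agent/auxiliary_client.py; the real fix may want a different error type or message.

```python
import asyncio
from types import SimpleNamespace


class MalformedLLMResponseError(RuntimeError):
    """Hypothetical error for payloads that break the response contract."""


def validate_llm_response(response, *, task="unknown"):
    # Probe exactly the shape the extraction code relies on: choices[0].message.
    try:
        message = response.choices[0].message
    except (AttributeError, IndexError, TypeError) as exc:
        raise MalformedLLMResponseError(
            f"auxiliary task {task!r}: expected a response with "
            f"choices[0].message, got {type(response).__name__}"
        ) from exc
    if message is None:
        raise MalformedLLMResponseError(
            f"auxiliary task {task!r}: choices[0].message is None"
        )
    return response


async def checked_create(client, *, task, **kwargs):
    # Validate at the call site, before the payload travels upward.
    response = await client.chat.completions.create(**kwargs)
    return validate_llm_response(response, task=task)


def make_fake_client(payload):
    # Test double: a client whose create() returns the given payload as-is.
    async def create(**kwargs):
        return payload

    completions = SimpleNamespace(create=create)
    return SimpleNamespace(chat=SimpleNamespace(completions=completions))


if __name__ == "__main__":
    bad = make_fake_client("not-a-response-object")
    try:
        asyncio.run(checked_create(bad, task="session_search", model="gpt-5"))
    except MalformedLLMResponseError as exc:
        print(exc)
```

Probing choices[0].message directly, rather than checking for a specific SDK class, keeps the check compatible with adapters that return OpenAI-shaped objects of their own types.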

Affected Component

  • agent/auxiliary_client.py
  • tools/session_search_tool.py
  • auxiliary async task routing / fallback paths

Messaging Platform (if gateway-related)

Telegram (symptom observed there), but root cause appears platform-agnostic.

Operating System

Linux

Python Version

3.13

Hermes Version

main branch (local checkout as of 2026-04-10)

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR
