Describe the bug
While looking into a self-hosted/OpenAI-compatible provider setup, I noticed the OpenAI chat backend is pretty trusting about the shape of responses coming back from the SDK/proxy.
There are a few places in src/llm/backends/openai.py where Honcho assumes the response is a normal OpenAI ChatCompletion with at least one choice and a usage attribute. If an OpenAI-compatible gateway returns an empty or odd response shape, Honcho can crash with a raw AttributeError, TypeError, or IndexError instead of turning that into a controlled provider error or trying the next fallback.
The main spots I noticed:
# complete(), structured parse path
parsed = response.choices[0].message.parsed
raw_content = response.choices[0].message.content or ""
# _normalize_response()
usage = response.usage
finish_reason = response.choices[0].finish_reason
message = response.choices[0].message
# _parse_or_repair_structured_content()
raw_content = response.choices[0].message.content or ""
refusal = getattr(response.choices[0].message, "refusal", None)
This seems especially easy to hit with OpenAI-compatible proxies/gateways, where the HTTP call may technically succeed but the object coming back is missing usage, has choices=[], has choices=None, or is not quite the SDK object Honcho expected.
To Reproduce
I don't have a neat public reproduction server for this, but this should be easy to cover with unit tests by mocking chat.completions.create() or chat.completions.parse() to return one of these shapes:
- SimpleNamespace(choices=[], usage=None)
- SimpleNamespace(choices=None, usage=None)
- SimpleNamespace(choices=[SimpleNamespace(message=None)], usage=None)
- SimpleNamespace(choices=[...]) with no usage attribute at all
- a plain dict or other non-ChatCompletion object from an OpenAI-compatible adapter
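As a rough illustration of how these shapes fail today, here is a minimal standard-library sketch. first_choice_content is a hypothetical stand-in that mirrors the unguarded access pattern quoted above; it is not Honcho's actual code, and the dict shape is just one example of a non-SDK object.

```python
from types import SimpleNamespace


def first_choice_content(response):
    # Hypothetical stand-in mirroring the unguarded access pattern quoted above.
    return response.choices[0].message.content or ""


malformed_shapes = {
    "empty choices": SimpleNamespace(choices=[], usage=None),
    "choices is None": SimpleNamespace(choices=None, usage=None),
    "message is None": SimpleNamespace(
        choices=[SimpleNamespace(message=None)], usage=None
    ),
    "plain dict": {"choices": []},
}

# Record which raw exception type each malformed shape triggers.
raised = {}
for label, response in malformed_shapes.items():
    try:
        first_choice_content(response)
    except (AttributeError, TypeError, IndexError) as exc:
        raised[label] = type(exc).__name__

print(raised)
```

Each shape ends in a different raw exception (IndexError, TypeError, or AttributeError), which is why a single guard at the point of access seems easier than handling each case at every call site.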
For structured output, another useful repro is:
- parse(response_model) fails or returns a malformed shape
- fallback create(response_format=json_schema) also fails or returns malformed content
- Honcho currently does not appear to fall back further to json_object or plain create()
Expected behaviour
I would expect the OpenAI backend to treat malformed/empty provider responses as a provider failure with a clear ValidationException or similar controlled error, not a raw attribute/index crash.
For structured output, it would also be nice if the fallback chain were a bit more defensive:
parse(response_model)
-> create(json_schema)
-> create(json_object)
-> create(no response_format)
Maybe not every provider should get every fallback, but right now the code only seems to cover part of that path.
Why this matters
This is mostly a robustness issue. The current code is fine when the provider behaves exactly like OpenAI, but self-hosted setups often go through OpenRouter, vLLM, new-api, DashScope/Qwen, Groq-compatible layers, etc. Those paths are close enough to work most of the time, but they are not always faithful about response shape.
When they misbehave, the backend should fail in a way that points at the provider response instead of crashing somewhere like response.choices[0] or response.usage.
Suggested fix
A small helper to extract the first choice/message safely would probably clean this up:
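    from typing import Any

    def _first_message(response: Any) -> tuple[Any, Any | None]:
        """Return (first choice, usage), raising if there is no usable message."""
        # usage is optional: compatible gateways sometimes omit it entirely
        usage = getattr(response, "usage", None)
        # covers choices=[], choices=None, and a missing choices attribute
        choices = getattr(response, "choices", None) or []
        if not choices:
            raise ValidationException("OpenAI response did not include any choices")
        choice = choices[0]
        message = getattr(choice, "message", None)
        if message is None:
            raise ValidationException("OpenAI response choice did not include a message")
        return choice, usage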
Then _normalize_response() and the structured repair path could use getattr(..., 0) for token counts and catch AttributeError, TypeError, and IndexError around structured-response parsing/fallbacks.
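For the token counts, a defensive read might look like the sketch below. token_counts is a hypothetical helper name. The attribute names prompt_tokens and completion_tokens follow the OpenAI usage object; whether _normalize_response() reads exactly these fields is an assumption.

```python
from types import SimpleNamespace
from typing import Any


def token_counts(usage: Any) -> tuple[int, int]:
    """Return (prompt_tokens, completion_tokens), treating missing or None values as 0."""
    # getattr with a default handles usage=None and objects lacking the attribute;
    # the trailing "or 0" handles attributes that exist but are set to None.
    prompt = getattr(usage, "prompt_tokens", 0) or 0
    completion = getattr(usage, "completion_tokens", 0) or 0
    return prompt, completion


print(token_counts(None))
print(token_counts(SimpleNamespace(prompt_tokens=12, completion_tokens=None)))
```

Note that a plain dict usage would also come back as zeros here, since getattr does not look at dict keys. If dict-shaped responses should be supported rather than rejected, that would need a separate branch.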
Tests that would be worth adding:
- empty choices
- missing usage
- choices=None
- message=None
- non-ChatCompletion object
- json_schema structured fallback fails, then json_object or plain create is attempted
Related issues
This is adjacent to, but not exactly the same as:
- json_mode=True fails on Alibaba/Qwen providers — "messages must contain the word 'json'" #663
- AsyncOpenAI client lacks base_url → 401 against api.openai.com #641
Those are more provider-specific compatibility problems. This one is about the OpenAI backend being defensive when an OpenAI-compatible provider returns a weird but possible response shape.