
[Bug]: model-fallback path raises 'NoneType' object is not iterable, masking the original error #35848

@fesalfayed


Bug Description

When a primary-model API error triggers the model-fallback path, the fallback activation itself can crash with:

'NoneType' object is not iterable

This was observed downstream of the extended-thinking HTTP 400 crash-loop (see companion issue / PR #35846): the non-retryable 400 falls through to the fallback path, and the fallback path then raises this second, distinct error. It turns a single recoverable failure into a hard crash and masks the original error.
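The masking effect can be illustrated in isolation. This is a minimal sketch, not the Hermes code: `PrimaryModelError`, `call_primary`, `activate_fallback`, and `run_turn` are hypothetical names. It shows that when the fallback crashes inside the handler for the primary error, the TypeError is what propagates, while Python keeps the original error on `__context__`.

```python
class PrimaryModelError(Exception):
    """Hypothetical stand-in for a non-retryable provider error (e.g. HTTP 400)."""


def call_primary():
    # Simulates the primary model rejecting the request.
    raise PrimaryModelError("HTTP 400: invalid thinking signature")


def activate_fallback(chain):
    # Unguarded iteration: raises TypeError when chain is None.
    for provider in chain:
        return provider
    return None


def run_turn(chain):
    try:
        return call_primary()
    except PrimaryModelError:
        # The fallback crash happens while the primary error is being handled,
        # so the TypeError is what reaches the caller.
        return activate_fallback(chain)


try:
    run_turn(None)
except TypeError as exc:
    surfaced = exc
    original = exc.__context__  # the primary error is still attached here

print(type(surfaced).__name__, "masks", type(original).__name__)
```

Until the guard is fixed, logging `__context__` in the outer handler would at least keep the upstream cause visible.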

Steps to Reproduce

  1. Configure a profile without an explicit fallback chain (or with a partially-populated one).
  2. Trigger a non-retryable primary-model error. The extended-thinking 400 tracked in #35846 ("fix(anthropic): demote dead thinking signature when orphan-strip mutates the latest turn") is one reliable trigger; any provider error that routes into try_activate_fallback works.
  3. Observe the fallback path raise 'NoneType' object is not iterable instead of cleanly reporting "no fallback available" / surfacing the original error.

Note: the full traceback was not captured at the time (the profile was mid-crash-loop). This issue is filed to track root-causing the exact frame; the candidate locus is below.

Expected Behavior

The fallback path should fail gracefully: when no fallback provider is available, surface the original primary error (or a clear "no fallback configured" message), never raise a secondary TypeError. A None chain / None provider_data / None reasoning_details should be treated as empty, not iterated.
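A rough sketch of the expected behavior, under assumptions: `NoFallbackAvailable`, `pick_fallback`, and `handle_primary_error` are hypothetical names, and the chain is modeled as a plain list of provider names, which may not match the real data shape. A None chain is treated as empty, and when nothing can be activated the original error is attached as the cause rather than lost.

```python
from typing import Optional, Sequence


class NoFallbackAvailable(Exception):
    """Hypothetical error raised when no fallback provider can be activated."""


def pick_fallback(chain: Optional[Sequence[Optional[str]]]) -> Optional[str]:
    # Treat None the same as an empty chain.
    for provider in chain or []:
        if provider:  # skip partially-populated entries such as None or ""
            return provider
    return None


def handle_primary_error(
    original: Exception, chain: Optional[Sequence[Optional[str]]]
) -> str:
    provider = pick_fallback(chain)
    if provider is None:
        # Keep the primary error reachable via __cause__ instead of masking it.
        raise NoFallbackAvailable("no fallback configured") from original
    return provider
```

Using `raise ... from original` keeps both messages in the traceback, so the "no fallback configured" report and the upstream 400 are visible together.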

Actual Behavior

The fallback activation raises 'NoneType' object is not iterable, crashing the turn and obscuring the real upstream cause.

Affected Component

Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

Discord

Operating System

macOS 26.4 (Apple Silicon)

Python Version

3.11.15

Hermes Version

v0.15.1 (2026.5.29)

Root Cause Analysis (candidate)

The crash most likely originates in the fallback activation / response-normalization path; the exact frame is unconfirmed until a traceback is captured. Candidate frames to audit (agent/chat_completion_helpers.py, run_agent.py):

  • try_activate_fallback() (agent/chat_completion_helpers.py:1005) iterates agent._fallback_chain; if the chain is None (rather than []) at any unguarded access, for ... in chain raises 'NoneType' object is not iterable. _has_pending_fallback() guards with or [], but the activation path should be audited for the same defensive default.
  • reasoning_details handling (agent/chat_completion_helpers.py:910 / :925) and provider_data assembly — if a normalized response carries reasoning_details=None/provider_data=None and a downstream consumer iterates it without a guard, same failure mode.

Recommend: normalize all of _fallback_chain, provider_data, and reasoning_details to []/{} at their boundaries, and add a test that drives try_activate_fallback with an unset/None chain.
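For the response side, boundary normalization could look roughly like this. It is a sketch that assumes the normalized response is a plain dict; `normalize_response` is a hypothetical helper, not an existing function in the codebase.

```python
from typing import Any, Dict


def normalize_response(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy whose iterable fields are never None."""
    normalized = dict(raw)
    # raw.get(key, []) is not enough: a key that is present with the value
    # None would still come back as None, so use `or` instead of a default.
    normalized["reasoning_details"] = list(raw.get("reasoning_details") or [])
    normalized["provider_data"] = dict(raw.get("provider_data") or {})
    return normalized


response = normalize_response({"content": "hi", "reasoning_details": None})
for detail in response["reasoning_details"]:  # safe: iterates zero times
    print(detail)
```

Normalizing once where the response enters the agent means downstream consumers do not each need their own guard.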

Proposed Fix

Defensive or [] / or {} defaults at the iteration boundaries above, plus a regression test that a primary error with no configured fallback surfaces the original error rather than a TypeError. (Filing for maintainer triage — happy to follow up with a PR once the exact frame is confirmed from a captured traceback.)
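The regression test could take roughly this shape. Since the real signature of try_activate_fallback is not confirmed here, the sketch exercises hypothetical stand-ins (`try_activate_fallback_stub`, `run_turn_stub`) against a bare `SimpleNamespace` agent; a real test would call the actual functions.

```python
import unittest
from types import SimpleNamespace


def try_activate_fallback_stub(agent) -> bool:
    """Hypothetical stand-in; the real function's signature may differ."""
    chain = getattr(agent, "_fallback_chain", None) or []
    for entry in chain:
        agent.active_provider = entry
        return True
    return False


def run_turn_stub(agent, primary_call):
    try:
        return primary_call()
    except RuntimeError:
        if try_activate_fallback_stub(agent):
            return f"retried on {agent.active_provider}"
        raise  # no fallback: re-raise the original error unchanged


class FallbackRegressionTest(unittest.TestCase):
    def _failing_primary(self):
        raise RuntimeError("HTTP 400 from primary")

    def test_none_chain_surfaces_original_error(self):
        agent = SimpleNamespace(_fallback_chain=None)
        with self.assertRaises(RuntimeError) as ctx:
            run_turn_stub(agent, self._failing_primary)
        self.assertIn("HTTP 400", str(ctx.exception))

    def test_unset_chain_is_treated_as_empty(self):
        agent = SimpleNamespace()  # attribute never set
        self.assertFalse(try_activate_fallback_stub(agent))

    def test_populated_chain_activates_first_entry(self):
        agent = SimpleNamespace(_fallback_chain=["backup-model"])
        result = run_turn_stub(agent, self._failing_primary)
        self.assertEqual(result, "retried on backup-model")
```

Saved as a module, this runs with `python -m unittest`. The first case is the one that matters for this issue: a None chain must yield the original RuntimeError, not a TypeError.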

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Metadata

Labels

  • P1: High (major feature broken, no workaround)
  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • provider/anthropic: Anthropic native Messages API
  • type/bug: Something isn't working
