Three bugs in v0.9.0: context_length override, thinking-block sessions, config.yaml vs env precedence #11096

@c0nSpIc0uS7uRk3r

Hi team — surfacing three bugs we hit running v0.9.0 (v2026.4.13, commit 1af2e18d) in production. All three are reproducible on a standard install; all three have local workarounds we've shipped, but the root cause lives upstream.

Environment

  • Hermes v0.9.0 / v2026.4.13, commit 1af2e18d
  • Running on Ubuntu 24.04, Python 3.11 venv
  • Five Hermes instances (prime on Anthropic direct, three Horsemen on Gemini / GPT-5.4 proxy / Grok, one JiMMY on local llama-server)
  • Platforms: Telegram + Signal via signal-cli HTTP bridge

Bug 1 — MINIMUM_CONTEXT_LENGTH gate ignores the documented model.context_length config override

Commit that introduced it: c8aff7463 ("prevent agent from stopping mid-task", 2026-04-11) added the 64K minimum context guard at run_agent.py:1355-1366.

Symptom: Agents running against models with a native context window below 64K (e.g. Qwen 3.5 35B-A3B at 32K) fail at agent init with ValueError: context_length below MINIMUM_CONTEXT_LENGTH. Every hermes chat invocation fails in ~1s.

Why it's a bug: The error message explicitly documents an escape hatch:

"...or set model.context_length in config.yaml to override"

But the guard fires unconditionally — it doesn't check whether the operator set model.context_length in config.yaml. The escape hatch is cosmetic.

Minimal repro:

  1. Set model.context_length: 32000 in config.yaml
  2. Run hermes chat -q "hello" -Q
  3. Error: agent init fails despite the override

Our fix: at run_agent.py:1358-1366, append "and _config_context_length is None" to the gate condition, so the guard only fires when the operator has not set an override. _config_context_length is already a local in the same __init__.

from agent.model_metadata import MINIMUM_CONTEXT_LENGTH
_ctx = getattr(self.context_compressor, "context_length", 0)
# Enforce the minimum only when config.yaml did not set model.context_length.
if _ctx and _ctx < MINIMUM_CONTEXT_LENGTH and _config_context_length is None:
    raise ValueError(...)

Impact on us: Every agent run against a sub-64K model broke at init. With the guard change, all Tier 1 evals pass against that model.
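
For anyone who wants to check the intended behaviour outside the agent, the gate can be sketched as a standalone function. This is an illustration rather than the upstream code: enforce_minimum_context is a hypothetical helper, and 64_000 is an assumed value for MINIMUM_CONTEXT_LENGTH (the issue only establishes "64K").

```python
from typing import Optional

# Assumed value; the real constant lives in agent.model_metadata.
MINIMUM_CONTEXT_LENGTH = 64_000


def enforce_minimum_context(
    detected: int,
    config_override: Optional[int] = None,
    minimum: int = MINIMUM_CONTEXT_LENGTH,
) -> None:
    """Raise only when the window is too small AND no override was configured.

    `detected` mirrors the compressor's context_length; `config_override`
    mirrors model.context_length from config.yaml (None when unset).
    """
    if detected and detected < minimum and config_override is None:
        raise ValueError(
            f"context_length {detected} is below the minimum {minimum}; "
            "set model.context_length in config.yaml to override"
        )
```

With this shape, a 32K model passes when the override is set and still fails fast when it is not, which is what the error message promises.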


Bug 2 — Trailing thinking block in assistant turn poisons the session → every subsequent API call returns HTTP 400

Symptom: Once per session, typically after context compression or session truncation, an assistant turn is left with a thinking (or redacted_thinking) block as its final content block — no text or tool_use block after it. Anthropic's API then rejects every subsequent request with:

HTTP 400 - messages.N: The final block in an assistant message cannot be `thinking`.

The session is effectively poisoned until manually purged (move the .jsonl out of sessions/).

Relevant call site: agent/anthropic_adapter.py, convert_messages_to_anthropic(). The function already has extensive thinking-block signature management (strip/downgrade/merge), but doesn't enforce that the final block in an assistant message is non-thinking.

Reproduction: Hard to trigger deterministically; it seems to correlate with thinking-mode interrupts and context compression. We have two captured request dumps where messages[5] and messages[9] each end on a lone thinking block that Anthropic rejects. An anonymized minimal repro skeleton (one of the dumps with all user text, system prompt, tools, and auth stripped) is included in the appendix below; full raw dumps are available privately on request.

How to detect in the wild: Any afflicted session .jsonl will have an assistant turn whose final content block is thinking or redacted_thinking. Grep each session file: if the last assistant content[-1].type is one of those, the session is poisoned.
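
The detection rule above can be sketched as a small scanner. It assumes each line of a session .jsonl is one JSON object with a "role" field and a "content" list of typed blocks; that layout is a guess on our part, so adjust the field names to the real session schema. find_trailing_thinking is a hypothetical helper name.

```python
import json
from typing import Iterable, List

THINKING_TYPES = {"thinking", "redacted_thinking"}


def find_trailing_thinking(lines: Iterable[str]) -> List[int]:
    """Return 1-based line numbers of assistant turns ending on a thinking block."""
    hits: List[int] = []
    for lineno, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict) or record.get("role") != "assistant":
            continue
        content = record.get("content")
        if not isinstance(content, list) or not content:
            continue  # string content or empty list cannot end on a block
        last = content[-1]
        if isinstance(last, dict) and last.get("type") in THINKING_TYPES:
            hits.append(lineno)
    return hits
```

Usage would be along the lines of opening each file under sessions/ and passing the file handle to find_trailing_thinking; a non-empty result marks the session as poisoned.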

Our fix: Added a guard at the end of convert_messages_to_anthropic(), after the existing thinking-block processing loop. If any assistant message's final block is still thinking/redacted_thinking, append a minimal text block:

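for m in result:
    # Only assistant turns with a non-empty list of content blocks can be affected.
    if m.get("role") != "assistant" or not isinstance(m.get("content"), list) or not m["content"]:
        continue
    last_block = m["content"][-1]
    # Anthropic rejects a turn that ends on thinking, so close it with a sentinel text block.
    if isinstance(last_block, dict) and last_block.get("type") in _THINKING_TYPES:
        m["content"].append({"type": "text", "text": "(continuing)"})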

Validated by replay: our captured 400-causing request now converts to a valid payload with messages[9] ending in the sentinel text block. Reasoning content above is preserved.
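
The replay check can be approximated without the adapter. The sketch below is a self-contained version of the same guard, with THINKING_TYPES standing in for the adapter's _THINKING_TYPES and close_trailing_thinking being a hypothetical name; it returns how many turns it patched so a replay script can assert on the result.

```python
from typing import Any, Dict, List

# Stand-in for the adapter's _THINKING_TYPES (assumed to hold these two values).
THINKING_TYPES = {"thinking", "redacted_thinking"}


def close_trailing_thinking(
    messages: List[Dict[str, Any]], sentinel: str = "(continuing)"
) -> int:
    """Append a text block to assistant turns that end on a thinking block.

    Mutates `messages` in place and returns the number of turns patched.
    """
    patched = 0
    for m in messages:
        content = m.get("content")
        if m.get("role") != "assistant" or not isinstance(content, list) or not content:
            continue
        last = content[-1]
        if isinstance(last, dict) and last.get("type") in THINKING_TYPES:
            content.append({"type": "text", "text": sentinel})
            patched += 1
    return patched
```

Running it twice on the same payload patches nothing the second time, since the sentinel text block is then the final block.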

Root cause guess: Probably in agent/context_engine.py or the compressor — upstream mutation snips the trailing text/tool_use while leaving the thinking block in place. A proper fix would prevent creating the state, not just paper over it; our guard is a safety net.

Impact on us: Session corruption causes user-visible "Error code: 400" fallback messages (Prime emits them to the chat as a last-ditch reply). Every operator restart cost us a full conversation.


Bug 3 — config.yaml signal.enabled: false is silently ignored when SIGNAL_* env vars are set

Relevant code: gateway/config.py:820-830:

signal_url = os.getenv("SIGNAL_HTTP_URL")
signal_account = os.getenv("SIGNAL_ACCOUNT")
if signal_url and signal_account:
    if Platform.SIGNAL not in config.platforms:
        config.platforms[Platform.SIGNAL] = PlatformConfig()
    config.platforms[Platform.SIGNAL].enabled = True  # ← unconditional
    ...

Symptom: Setting signal.enabled: false in both the top-level signal: block and the platforms.signal: block of config.yaml has no effect when both SIGNAL_HTTP_URL and SIGNAL_ACCOUNT env vars are present. The enabled = True assignment on line 827 clobbers the YAML flag with no log line or warning.

Impact on us: We wanted to disable Prime's Signal integration (to give ownership of the Signal reply path to a separate service). We set enabled: false in YAML and restarted — but Prime silently re-enabled Signal from the env vars. Diagnosing this took ~30 minutes of confusion. The fix was to comment out the env vars in .env; the YAML flag was never honored.

Suggested fix: Env vars should supply defaults when YAML is absent, not override explicit YAML values. Either:

  • Check config.platforms[Platform.SIGNAL].enabled before overwriting, OR
  • Keep the YAML read as authoritative, and only use env vars to fill in missing fields (url/account) when the platform is already enabled in YAML

Same pattern likely exists for other platforms (mattermost, matrix, etc.) — any operator who uses config.yaml to disable a platform while env vars happen to be set will hit this silently.
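
A minimal sketch of the suggested precedence, written against stand-in types rather than the real gateway classes. PlatformConfig here is a simplified stand-in, apply_signal_env is a hypothetical function, the "extra" field and its keys are invented for illustration, and it assumes the loader can represent "not set in YAML" as enabled=None.

```python
import logging
from dataclasses import dataclass, field
from typing import Dict, Mapping, Optional

log = logging.getLogger(__name__)


@dataclass
class PlatformConfig:
    # None means "not set in config.yaml" (an assumption about the loader).
    enabled: Optional[bool] = None
    extra: Dict[str, str] = field(default_factory=dict)


def apply_signal_env(platforms: Dict[str, PlatformConfig], env: Mapping[str, str]) -> None:
    """Use SIGNAL_* env vars as defaults; never override an explicit YAML value."""
    url = env.get("SIGNAL_HTTP_URL")
    account = env.get("SIGNAL_ACCOUNT")
    if not (url and account):
        return
    cfg = platforms.get("signal")
    if cfg is not None and cfg.enabled is False:
        # Explicit YAML opt-out wins; say so instead of failing silently.
        log.warning("SIGNAL_* env vars set but signal.enabled is false in config.yaml; leaving Signal disabled")
        return
    if cfg is None:
        cfg = platforms["signal"] = PlatformConfig()
    if cfg.enabled is None:
        cfg.enabled = True  # env vars supply the default only when YAML is silent
    # Fill connection details only where YAML left them unset.
    cfg.extra.setdefault("http_url", url)
    cfg.extra.setdefault("account", account)
```

The warning on the opt-out path is the part that would have saved us the debugging time: the operator sees why the env vars were ignored.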


Meta

All three bugs have local patches applied to our shared-codebase install. I'll open three separate PRs (one per bug) for atomic review unless you'd prefer them bundled — just say the word. Full raw request dumps for Bug 2 available privately on request.

Thanks for the great work on v0.9. It's been a meaningful improvement over v0.8 across the board; these are the few rough edges we hit.


Appendix — Bug 2 minimal repro skeleton (anonymized)

2.8 KB JSON, shape-only, no user data:
{
  "__note__": "Anonymized minimal repro skeleton for Hermes v0.9.0 Bug 2 (trailing thinking block \u2192 HTTP 400). Derived from a real production request dump; all user content, system prompt, tool definitions, phone numbers, and auth headers stripped. Shows only the message-block *shape* that Anthropic rejects.",
  "hermes_version": "v0.9.0 (v2026.4.13)",
  "hermes_commit": "1af2e18d",
  "session_id": "<redacted>",
  "anthropic_error": {
    "status_code": 400,
    "type": "invalid_request_error",
    "message": "messages.9: The final block in an assistant message cannot be `thinking`.",
    "request_id": "req_011Ca7V8wAqYKB785YXoudTE"
  },
  "request_skeleton": {
    "model": "claude-sonnet-4-6",
    "max_tokens": 64000,
    "thinking_enabled": true,
    "system_blocks": 2,
    "tools_count": 41,
    "messages": [
      {
        "role": "user",
        "content_type": "string",
        "content_length_chars": 150
      },
      {
        "role": "assistant",
        "content_blocks": [
          {
            "type": "text",
            "text_length_chars": 67
          }
        ]
      },
      {
        "role": "user",
        "content_type": "string",
        "content_length_chars": 156
      },
      {
        "role": "assistant",
        "content_blocks": [
          {
            "type": "text",
            "text_length_chars": 49
          }
        ]
      },
      {
        "role": "user",
        "content_type": "string",
        "content_length_chars": 158
      },
      {
        "role": "assistant",
        "content_blocks": [
          {
            "type": "text",
            "text_length_chars": 78
          },
          {
            "type": "text",
            "text_length_chars": 74
          }
        ]
      },
      {
        "role": "user",
        "content_type": "string",
        "content_length_chars": 157
      },
      {
        "role": "assistant",
        "content_blocks": [
          {
            "type": "text",
            "text_length_chars": 93
          }
        ]
      },
      {
        "role": "user",
        "content_type": "string",
        "content_length_chars": 15
      },
      {
        "role": "assistant",
        "content_blocks": [
          {
            "type": "thinking",
            "signature_present": true,
            "thinking_length_chars": 38
          }
        ]
      }
    ]
  },
  "diagnosis": "messages[9] is an assistant turn whose content is a single `thinking` block with no trailing `text` or `tool_use` block. Anthropic's API rejects the whole request. Every subsequent call in the session fails identically until the .jsonl is manually removed from sessions/. Detect in the wild: grep each session .jsonl for any assistant turn whose final content block is `thinking` or `redacted_thinking`."
}

Labels

  • P1: High (major feature broken, no workaround)
  • area/config: Config system, migrations, profiles
  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • provider/anthropic: Anthropic native Messages API
  • type/bug: Something isn't working
