Skip to content

feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro#11406

Merged
teknium1 merged 2 commits into
mainfrom
hermes/hermes-c34e758c
Apr 17, 2026
Merged

feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro#11406
teknium1 merged 2 commits into
mainfrom
hermes/hermes-c34e758c

Conversation

@teknium1

@teknium1 teknium1 commented Apr 17, 2026

Copy link
Copy Markdown
Contributor

Summary

Upgrades two models in the Hermes image-gen catalog to their newer, higher-quality variants. After this merges, the full supported catalog is 8 FAL.ai models, switchable via hermes tools → Image Generation:

Full Model Catalog (post-merge)

Model Speed Strengths Price
fal-ai/flux-2/klein/9b (default) <1s Fast, crisp text $0.006/MP
fal-ai/flux-2-pro ~6s Studio photorealism $0.03/MP
fal-ai/z-image/turbo ~2s Bilingual EN/CN, 6B params $0.005/MP
fal-ai/nano-banana-pro ~8s Gemini 3 Pro, reasoning depth, text rendering $0.15/image (1K)
fal-ai/gpt-image-1.5 ~15s Prompt adherence $0.034/image
fal-ai/ideogram/v3 ~5s Best typography $0.03–0.09/image
fal-ai/recraft/v4/pro/text-to-image ~8s Design, brand systems, production-ready $0.25/image
fal-ai/qwen-image ~12s LLM-based, complex text $0.02/MP

All selectable via arrow-key picker. Agent sees only prompt + aspect_ratio (landscape/square/portrait); size translation, per-model parameter filtering, and quality tier pinning (GPT-Image) happen internally.

What Changed in This PR

Two models upgraded to their newer variants. Everything else in the catalog stays as-is.

Recraft V3 → Recraft V4 Pro

V3 V4 Pro
ID fal-ai/recraft-v3 fal-ai/recraft/v4/pro/text-to-image
Price $0.04/image $0.25/image (6× premium tier)
Required params style enum (none — V4 dropped style entirely)
Optional control colors, background_color (brand palette)
Seed support

V4 Pro is marketed as "designed with designers" — visual taste, brand systems, production-ready. Significant quality jump.

Nano Banana → Nano Banana Pro

Original Pro
ID fal-ai/nano-banana fal-ai/nano-banana-pro
Architecture Gemini 2.5 Flash Image Gemini 3 Pro Image
Price (1K) $0.08/image $0.15/image
Price (4K) $0.30/image
Web search enable_web_search (+$0.015)
Resolution tiers 1K / 2K / 4K
Generation cap limit_generations (force exactly 1)
Speed ~6s ~8s (reasoning depth tradeoff)

Defaults to resolution: "1K" to keep per-image cost predictable for Nous Subscription. Users who want 4K can pass it through the supports whitelist.

Migration

Users with the old IDs in image_gen.model fall through the existing _resolve_fal_model() warning path ("Unknown FAL model 'X' in config; falling back to default") and land on Klein 9B. Re-running hermes tools → Image Generation picks the new version.

No silent alias from old → new IDs. The 2-6× price jumps on these upgrades warrant explicit user re-selection rather than stealth cost escalation.

Nous Portal / Backend-Dev Action

The previous image-gen PR added 7 new IDs that need allowlist verification on fal-queue-gateway.nousresearch.com. This PR swaps two of those for newer variants, so the updated allowlist items are:

Replace:

  • fal-ai/nano-bananafal-ai/nano-banana-pro
  • fal-ai/recraft-v3fal-ai/recraft/v4/pro/text-to-image

Full current list on Hermes's side:

fal-ai/flux-2/klein/9b       (default)
fal-ai/flux-2-pro
fal-ai/z-image/turbo
fal-ai/nano-banana-pro       ← new
fal-ai/gpt-image-1.5
fal-ai/ideogram/v3
fal-ai/recraft/v4/pro/text-to-image   ← new
fal-ai/qwen-image

Portal billing note: Nano Banana Pro's resolution param can multiply per-image cost (2× at 4K). We default to 1K for Nous Subscription users. If the gateway wants to enforce that, strip resolution from request bodies for subscription accounts and rely on the server-side default.

Client-side, the existing 4xx translator still surfaces clear remediation messages if the portal rejects either new ID.

Test Plan

python -m pytest tests/tools/test_image_generation.py \
                 tests/tools/test_managed_media_gateways.py \
                 tests/hermes_cli/test_tools_config.py -o "addopts=" -q
# 85 passed in 0.41s

All existing coverage continues to work against the new IDs:

  • Catalog integrity (required keys, supports whitelist)
  • Size-family translation (nano-banana-pro still uses aspect_ratio)
  • Per-model supports filter (recraft V4 Pro drops style, doesn't get seed)
  • Model resolution fallback
  • Managed gateway 4xx translation still fires cleanly for the new IDs

Updated test_recraft_has_minimal_payload to reflect V4's new supports set (colors, background_color, enable_safety_checker replacing V3's style).

Docs

  • image-generation.md: model table updated, aspect-ratio mapping reference updated
  • overview.md: features list updated
  • tool-gateway.md: 8-model summary updated

Upstream asked for these two upgrades ASAP — the old entries show
stale models when newer, higher-quality versions are available on FAL.

Recraft V3 → Recraft V4 Pro
  ID:    fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image
  Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier)
  Schema: V4 dropped the required `style` enum entirely; defaults
          handle taste now. Added `colors` and `background_color`
          to supports for brand-palette control. `seed` is not
          supported by V4 per the API docs.

Nano Banana → Nano Banana Pro
  ID:    fal-ai/nano-banana → fal-ai/nano-banana-pro
  Price: $0.08/image → $0.15/image (1K); $0.30 at 4K
  Schema: Aspect ratio family unchanged. Added `resolution`
          (1K/2K/4K, default 1K for billing predictability),
          `enable_web_search` (real-time info grounding, +$0.015),
          and `limit_generations` (force exactly 1 image).
  Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality
                and reasoning depth improved; slower (~6s → ~8s).

Migration: users who had the old IDs in `image_gen.model` will
fall through the existing 'unknown model → default' warning path
in `_resolve_fal_model()` and get the Klein 9B default on the next
run. Re-run `hermes tools` → Image Generation to pick the new
version. No silent cost-upgrade aliasing — the 2-6x price jump
on these tiers warrants explicit user re-selection.

Portal note: both new model IDs need to be allowlisted on the
Nous fal-queue-gateway alongside the previous 7 additions, or
users on Nous Subscription will see the 'managed gateway rejected
model' error we added previously (which is clear and
self-remediating, just noisy).
Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and
'<1s' fails because '1' isn't a valid tag-name start character. This
was broken on main since PR #11265 (never noticed because
docs-site-checks was failing on OTHER issues at the time and we
admin-merged through it).

Wrapping in backticks also gives the cell monospace styling which
reads more cleanly alongside the inline-code model ID in the same row.

The other '<1s' occurrence (line 52) is inside a fenced code block
and is already safe — code fences bypass MDX parsing.
@teknium1 teknium1 merged commit 220fa7d into main Apr 17, 2026
6 checks passed
@teknium1 teknium1 deleted the hermes/hermes-c34e758c branch April 17, 2026 05:05
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…arch#2070)

* fix(codex): treat reasoning-only responses as incomplete, not stop

When a Codex Responses API response contains only reasoning items
(encrypted thinking state) with no message text or tool calls, the
_normalize_codex_response method was setting finish_reason='stop'.
This sent the response into the empty-content retry loop, which
burned 3 retries and then failed — exactly the pattern Nester
reported in Discord.

Two fixes:
1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw
   non-empty but no final_text) now get finish_reason='incomplete', routing
   them to the Codex continuation path instead of the retry loop.
2. Incomplete handling: also checks for codex_reasoning_items when deciding
   whether to preserve an interim message, so encrypted reasoning state is
   not silently dropped when there is no visible reasoning text.

Adds 4 regression tests covering:
- Unit: reasoning-only → incomplete, reasoning+content → stop
- E2E: reasoning-only → continuation → final answer succeeds
- E2E: encrypted reasoning items preserved in interim messages

* fix(codex): ensure reasoning items have required following item in API input

Follow-up to the reasoning-only response fix. Three additional issues
found by tracing the full replay path:

1. _chat_messages_to_responses_input: when a reasoning-only interim
   message was converted to Responses API input, the reasoning items
   were emitted as the last items with no following item. The Responses
   API requires a following item after each reasoning item (otherwise:
   'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now
   emits an empty assistant message as the required following item when
   content is empty but reasoning items were added.

2. Duplicate detection: two consecutive reasoning-only incomplete
   messages with identical empty content/reasoning but different
   encrypted codex_reasoning_items were incorrectly treated as
   duplicates, silently dropping the second response's reasoning state.
   Now includes codex_reasoning_items in the duplicate comparison.

3. Added tests for both the API input conversion path and the duplicate
   detection edge case.

Research context: verified against OpenCode (uses Vercel AI SDK, no
retry loop so avoids the issue), Clawdbot (drops orphaned reasoning
blocks entirely), and OpenHands (hit the missing_following_item error).
Our approach preserves reasoning continuity while satisfying the API
constraint.

---------

Co-authored-by: Test <test@test.com>
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 28, 2026
…I input

Follow-up to the reasoning-only response fix. Three additional issues
found by tracing the full replay path:

1. _chat_messages_to_responses_input: when a reasoning-only interim
   message was converted to Responses API input, the reasoning items
   were emitted as the last items with no following item. The Responses
   API requires a following item after each reasoning item (otherwise:
   'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now
   emits an empty assistant message as the required following item when
   content is empty but reasoning items were added.

2. Duplicate detection: two consecutive reasoning-only incomplete
   messages with identical empty content/reasoning but different
   encrypted codex_reasoning_items were incorrectly treated as
   duplicates, silently dropping the second response's reasoning state.
   Now includes codex_reasoning_items in the duplicate comparison.

3. Added tests for both the API input conversion path and the duplicate
   detection edge case.

Research context: verified against OpenCode (uses Vercel AI SDK, no
retry loop so avoids the issue), Clawdbot (drops orphaned reasoning
blocks entirely), and OpenHands (hit the missing_following_item error).
Our approach preserves reasoning continuity while satisfying the API
constraint.
ulasbilgen pushed a commit to ulasbilgen/hermes-adhd-agent that referenced this pull request May 1, 2026
…Research#11406)

* feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro

Upstream asked for these two upgrades ASAP — the old entries show
stale models when newer, higher-quality versions are available on FAL.

Recraft V3 → Recraft V4 Pro
  ID:    fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image
  Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier)
  Schema: V4 dropped the required `style` enum entirely; defaults
          handle taste now. Added `colors` and `background_color`
          to supports for brand-palette control. `seed` is not
          supported by V4 per the API docs.

Nano Banana → Nano Banana Pro
  ID:    fal-ai/nano-banana → fal-ai/nano-banana-pro
  Price: $0.08/image → $0.15/image (1K); $0.30 at 4K
  Schema: Aspect ratio family unchanged. Added `resolution`
          (1K/2K/4K, default 1K for billing predictability),
          `enable_web_search` (real-time info grounding, +$0.015),
          and `limit_generations` (force exactly 1 image).
  Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality
                and reasoning depth improved; slower (~6s → ~8s).

Migration: users who had the old IDs in `image_gen.model` will
fall through the existing 'unknown model → default' warning path
in `_resolve_fal_model()` and get the Klein 9B default on the next
run. Re-run `hermes tools` → Image Generation to pick the new
version. No silent cost-upgrade aliasing — the 2-6x price jump
on these tiers warrants explicit user re-selection.

Portal note: both new model IDs need to be allowlisted on the
Nous fal-queue-gateway alongside the previous 7 additions, or
users on Nous Subscription will see the 'managed gateway rejected
model' error we added previously (which is clear and
self-remediating, just noisy).

* docs: wrap '<1s' in backticks to unblock MDX compilation

Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and
'<1s' fails because '1' isn't a valid tag-name start character. This
was broken on main since PR NousResearch#11265 (never noticed because
docs-site-checks was failing on OTHER issues at the time and we
admin-merged through it).

Wrapping in backticks also gives the cell monospace styling which
reads more cleanly alongside the inline-code model ID in the same row.

The other '<1s' occurrence (line 52) is inside a fenced code block
and is already safe — code fences bypass MDX parsing.
aj-nt pushed a commit to aj-nt/hermes-agent that referenced this pull request May 1, 2026
…Research#11406)

* feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro

Upstream asked for these two upgrades ASAP — the old entries show
stale models when newer, higher-quality versions are available on FAL.

Recraft V3 → Recraft V4 Pro
  ID:    fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image
  Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier)
  Schema: V4 dropped the required `style` enum entirely; defaults
          handle taste now. Added `colors` and `background_color`
          to supports for brand-palette control. `seed` is not
          supported by V4 per the API docs.

Nano Banana → Nano Banana Pro
  ID:    fal-ai/nano-banana → fal-ai/nano-banana-pro
  Price: $0.08/image → $0.15/image (1K); $0.30 at 4K
  Schema: Aspect ratio family unchanged. Added `resolution`
          (1K/2K/4K, default 1K for billing predictability),
          `enable_web_search` (real-time info grounding, +$0.015),
          and `limit_generations` (force exactly 1 image).
  Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality
                and reasoning depth improved; slower (~6s → ~8s).

Migration: users who had the old IDs in `image_gen.model` will
fall through the existing 'unknown model → default' warning path
in `_resolve_fal_model()` and get the Klein 9B default on the next
run. Re-run `hermes tools` → Image Generation to pick the new
version. No silent cost-upgrade aliasing — the 2-6x price jump
on these tiers warrants explicit user re-selection.

Portal note: both new model IDs need to be allowlisted on the
Nous fal-queue-gateway alongside the previous 7 additions, or
users on Nous Subscription will see the 'managed gateway rejected
model' error we added previously (which is clear and
self-remediating, just noisy).

* docs: wrap '<1s' in backticks to unblock MDX compilation

Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and
'<1s' fails because '1' isn't a valid tag-name start character. This
was broken on main since PR NousResearch#11265 (never noticed because
docs-site-checks was failing on OTHER issues at the time and we
admin-merged through it).

Wrapping in backticks also gives the cell monospace styling which
reads more cleanly alongside the inline-code model ID in the same row.

The other '<1s' occurrence (line 52) is inside a fenced code block
and is already safe — code fences bypass MDX parsing.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…arch#2070)

* fix(codex): treat reasoning-only responses as incomplete, not stop

When a Codex Responses API response contains only reasoning items
(encrypted thinking state) with no message text or tool calls, the
_normalize_codex_response method was setting finish_reason='stop'.
This sent the response into the empty-content retry loop, which
burned 3 retries and then failed — exactly the pattern Nester
reported in Discord.

Two fixes:
1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw
   non-empty but no final_text) now get finish_reason='incomplete', routing
   them to the Codex continuation path instead of the retry loop.
2. Incomplete handling: also checks for codex_reasoning_items when deciding
   whether to preserve an interim message, so encrypted reasoning state is
   not silently dropped when there is no visible reasoning text.

Adds 4 regression tests covering:
- Unit: reasoning-only → incomplete, reasoning+content → stop
- E2E: reasoning-only → continuation → final answer succeeds
- E2E: encrypted reasoning items preserved in interim messages

* fix(codex): ensure reasoning items have required following item in API input

Follow-up to the reasoning-only response fix. Three additional issues
found by tracing the full replay path:

1. _chat_messages_to_responses_input: when a reasoning-only interim
   message was converted to Responses API input, the reasoning items
   were emitted as the last items with no following item. The Responses
   API requires a following item after each reasoning item (otherwise:
   'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now
   emits an empty assistant message as the required following item when
   content is empty but reasoning items were added.

2. Duplicate detection: two consecutive reasoning-only incomplete
   messages with identical empty content/reasoning but different
   encrypted codex_reasoning_items were incorrectly treated as
   duplicates, silently dropping the second response's reasoning state.
   Now includes codex_reasoning_items in the duplicate comparison.

3. Added tests for both the API input conversion path and the duplicate
   detection edge case.

Research context: verified against OpenCode (uses Vercel AI SDK, no
retry loop so avoids the issue), Clawdbot (drops orphaned reasoning
blocks entirely), and OpenHands (hit the missing_following_item error).
Our approach preserves reasoning continuity while satisfying the API
constraint.

---------

Co-authored-by: Test <test@test.com>
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…Research#11406)

* feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro

Upstream asked for these two upgrades ASAP — the old entries show
stale models when newer, higher-quality versions are available on FAL.

Recraft V3 → Recraft V4 Pro
  ID:    fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image
  Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier)
  Schema: V4 dropped the required `style` enum entirely; defaults
          handle taste now. Added `colors` and `background_color`
          to supports for brand-palette control. `seed` is not
          supported by V4 per the API docs.

Nano Banana → Nano Banana Pro
  ID:    fal-ai/nano-banana → fal-ai/nano-banana-pro
  Price: $0.08/image → $0.15/image (1K); $0.30 at 4K
  Schema: Aspect ratio family unchanged. Added `resolution`
          (1K/2K/4K, default 1K for billing predictability),
          `enable_web_search` (real-time info grounding, +$0.015),
          and `limit_generations` (force exactly 1 image).
  Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality
                and reasoning depth improved; slower (~6s → ~8s).

Migration: users who had the old IDs in `image_gen.model` will
fall through the existing 'unknown model → default' warning path
in `_resolve_fal_model()` and get the Klein 9B default on the next
run. Re-run `hermes tools` → Image Generation to pick the new
version. No silent cost-upgrade aliasing — the 2-6x price jump
on these tiers warrants explicit user re-selection.

Portal note: both new model IDs need to be allowlisted on the
Nous fal-queue-gateway alongside the previous 7 additions, or
users on Nous Subscription will see the 'managed gateway rejected
model' error we added previously (which is clear and
self-remediating, just noisy).

* docs: wrap '<1s' in backticks to unblock MDX compilation

Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and
'<1s' fails because '1' isn't a valid tag-name start character. This
was broken on main since PR NousResearch#11265 (never noticed because
docs-site-checks was failing on OTHER issues at the time and we
admin-merged through it).

Wrapping in backticks also gives the cell monospace styling which
reads more cleanly alongside the inline-code model ID in the same row.

The other '<1s' occurrence (line 52) is inside a fenced code block
and is already safe — code fences bypass MDX parsing.
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…arch#2070)

* fix(codex): treat reasoning-only responses as incomplete, not stop

When a Codex Responses API response contains only reasoning items
(encrypted thinking state) with no message text or tool calls, the
_normalize_codex_response method was setting finish_reason='stop'.
This sent the response into the empty-content retry loop, which
burned 3 retries and then failed — exactly the pattern Nester
reported in Discord.

Two fixes:
1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw
   non-empty but no final_text) now get finish_reason='incomplete', routing
   them to the Codex continuation path instead of the retry loop.
2. Incomplete handling: also checks for codex_reasoning_items when deciding
   whether to preserve an interim message, so encrypted reasoning state is
   not silently dropped when there is no visible reasoning text.

Adds 4 regression tests covering:
- Unit: reasoning-only → incomplete, reasoning+content → stop
- E2E: reasoning-only → continuation → final answer succeeds
- E2E: encrypted reasoning items preserved in interim messages

* fix(codex): ensure reasoning items have required following item in API input

Follow-up to the reasoning-only response fix. Three additional issues
found by tracing the full replay path:

1. _chat_messages_to_responses_input: when a reasoning-only interim
   message was converted to Responses API input, the reasoning items
   were emitted as the last items with no following item. The Responses
   API requires a following item after each reasoning item (otherwise:
   'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now
   emits an empty assistant message as the required following item when
   content is empty but reasoning items were added.

2. Duplicate detection: two consecutive reasoning-only incomplete
   messages with identical empty content/reasoning but different
   encrypted codex_reasoning_items were incorrectly treated as
   duplicates, silently dropping the second response's reasoning state.
   Now includes codex_reasoning_items in the duplicate comparison.

3. Added tests for both the API input conversion path and the duplicate
   detection edge case.

Research context: verified against OpenCode (uses Vercel AI SDK, no
retry loop so avoids the issue), Clawdbot (drops orphaned reasoning
blocks entirely), and OpenHands (hit the missing_following_item error).
Our approach preserves reasoning continuity while satisfying the API
constraint.

---------

Co-authored-by: Test <test@test.com>
MickaelV0 pushed a commit to Roxabi/hermes-agent that referenced this pull request May 24, 2026
…arch#2070)

* fix(codex): treat reasoning-only responses as incomplete, not stop

When a Codex Responses API response contains only reasoning items
(encrypted thinking state) with no message text or tool calls, the
_normalize_codex_response method was setting finish_reason='stop'.
This sent the response into the empty-content retry loop, which
burned 3 retries and then failed — exactly the pattern Nester
reported in Discord.

Two fixes:
1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw
   non-empty but no final_text) now get finish_reason='incomplete', routing
   them to the Codex continuation path instead of the retry loop.
2. Incomplete handling: also checks for codex_reasoning_items when deciding
   whether to preserve an interim message, so encrypted reasoning state is
   not silently dropped when there is no visible reasoning text.

Adds 4 regression tests covering:
- Unit: reasoning-only → incomplete, reasoning+content → stop
- E2E: reasoning-only → continuation → final answer succeeds
- E2E: encrypted reasoning items preserved in interim messages

* fix(codex): ensure reasoning items have required following item in API input

Follow-up to the reasoning-only response fix. Three additional issues
found by tracing the full replay path:

1. _chat_messages_to_responses_input: when a reasoning-only interim
   message was converted to Responses API input, the reasoning items
   were emitted as the last items with no following item. The Responses
   API requires a following item after each reasoning item (otherwise:
   'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now
   emits an empty assistant message as the required following item when
   content is empty but reasoning items were added.

2. Duplicate detection: two consecutive reasoning-only incomplete
   messages with identical empty content/reasoning but different
   encrypted codex_reasoning_items were incorrectly treated as
   duplicates, silently dropping the second response's reasoning state.
   Now includes codex_reasoning_items in the duplicate comparison.

3. Added tests for both the API input conversion path and the duplicate
   detection edge case.

Research context: verified against OpenCode (uses Vercel AI SDK, no
retry loop so avoids the issue), Clawdbot (drops orphaned reasoning
blocks entirely), and OpenHands (hit the missing_following_item error).
Our approach preserves reasoning continuity while satisfying the API
constraint.

---------

Co-authored-by: Test <test@test.com>
CumulusService pushed a commit to Cumulus-Service-GmbH/hermes-agent that referenced this pull request May 30, 2026
…I input

Follow-up to the reasoning-only response fix. Three additional issues
found by tracing the full replay path:

1. _chat_messages_to_responses_input: when a reasoning-only interim
   message was converted to Responses API input, the reasoning items
   were emitted as the last items with no following item. The Responses
   API requires a following item after each reasoning item (otherwise:
   'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now
   emits an empty assistant message as the required following item when
   content is empty but reasoning items were added.

2. Duplicate detection: two consecutive reasoning-only incomplete
   messages with identical empty content/reasoning but different
   encrypted codex_reasoning_items were incorrectly treated as
   duplicates, silently dropping the second response's reasoning state.
   Now includes codex_reasoning_items in the duplicate comparison.

3. Added tests for both the API input conversion path and the duplicate
   detection edge case.

Research context: verified against OpenCode (uses Vercel AI SDK, no
retry loop so avoids the issue), Clawdbot (drops orphaned reasoning
blocks entirely), and OpenHands (hit the missing_following_item error).
Our approach preserves reasoning continuity while satisfying the API
constraint.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…arch#2070)

* fix(codex): treat reasoning-only responses as incomplete, not stop

When a Codex Responses API response contains only reasoning items
(encrypted thinking state) with no message text or tool calls, the
_normalize_codex_response method was setting finish_reason='stop'.
This sent the response into the empty-content retry loop, which
burned 3 retries and then failed — exactly the pattern Nester
reported in Discord.

Two fixes:
1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw
   non-empty but no final_text) now get finish_reason='incomplete', routing
   them to the Codex continuation path instead of the retry loop.
2. Incomplete handling: also checks for codex_reasoning_items when deciding
   whether to preserve an interim message, so encrypted reasoning state is
   not silently dropped when there is no visible reasoning text.

Adds 4 regression tests covering:
- Unit: reasoning-only → incomplete, reasoning+content → stop
- E2E: reasoning-only → continuation → final answer succeeds
- E2E: encrypted reasoning items preserved in interim messages

* fix(codex): ensure reasoning items have required following item in API input

Follow-up to the reasoning-only response fix. Three additional issues
found by tracing the full replay path:

1. _chat_messages_to_responses_input: when a reasoning-only interim
   message was converted to Responses API input, the reasoning items
   were emitted as the last items with no following item. The Responses
   API requires a following item after each reasoning item (otherwise:
   'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now
   emits an empty assistant message as the required following item when
   content is empty but reasoning items were added.

2. Duplicate detection: two consecutive reasoning-only incomplete
   messages with identical empty content/reasoning but different
   encrypted codex_reasoning_items were incorrectly treated as
   duplicates, silently dropping the second response's reasoning state.
   Now includes codex_reasoning_items in the duplicate comparison.

3. Added tests for both the API input conversion path and the duplicate
   detection edge case.

Research context: verified against OpenCode (uses Vercel AI SDK, no
retry loop so avoids the issue), Clawdbot (drops orphaned reasoning
blocks entirely), and OpenHands (hit the missing_following_item error).
Our approach preserves reasoning continuity while satisfying the API
constraint.

---------

Co-authored-by: Test <test@test.com>
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…Research#11406)

* feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro

Upstream asked for these two upgrades ASAP — the old entries show
stale models when newer, higher-quality versions are available on FAL.

Recraft V3 → Recraft V4 Pro
  ID:    fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image
  Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier)
  Schema: V4 dropped the required `style` enum entirely; defaults
          handle taste now. Added `colors` and `background_color`
          to supports for brand-palette control. `seed` is not
          supported by V4 per the API docs.

Nano Banana → Nano Banana Pro
  ID:    fal-ai/nano-banana → fal-ai/nano-banana-pro
  Price: $0.08/image → $0.15/image (1K); $0.30 at 4K
  Schema: Aspect ratio family unchanged. Added `resolution`
          (1K/2K/4K, default 1K for billing predictability),
          `enable_web_search` (real-time info grounding, +$0.015),
          and `limit_generations` (force exactly 1 image).
  Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality
                and reasoning depth improved; slower (~6s → ~8s).

Migration: users who had the old IDs in `image_gen.model` will
fall through the existing 'unknown model → default' warning path
in `_resolve_fal_model()` and get the Klein 9B default on the next
run. Re-run `hermes tools` → Image Generation to pick the new
version. No silent cost-upgrade aliasing — the 2-6x price jump
on these tiers warrants explicit user re-selection.

Portal note: both new model IDs need to be allowlisted on the
Nous fal-queue-gateway alongside the previous 7 additions, or
users on Nous Subscription will see the 'managed gateway rejected
model' error we added previously (which is clear and
self-remediating, just noisy).

* docs: wrap '<1s' in backticks to unblock MDX compilation

Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and
'<1s' fails because '1' isn't a valid tag-name start character. This
was broken on main since PR NousResearch#11265 (never noticed because
docs-site-checks was failing on OTHER issues at the time and we
admin-merged through it).

Wrapping in backticks also gives the cell monospace styling which
reads more cleanly alongside the inline-code model ID in the same row.

The other '<1s' occurrence (line 52) is inside a fenced code block
and is already safe — code fences bypass MDX parsing.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…arch#2070)

* fix(codex): treat reasoning-only responses as incomplete, not stop

When a Codex Responses API response contains only reasoning items
(encrypted thinking state) with no message text or tool calls, the
_normalize_codex_response method was setting finish_reason='stop'.
This sent the response into the empty-content retry loop, which
burned 3 retries and then failed — exactly the pattern Nester
reported in Discord.

Two fixes:
1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw
   non-empty but no final_text) now get finish_reason='incomplete', routing
   them to the Codex continuation path instead of the retry loop.
2. Incomplete handling: also checks for codex_reasoning_items when deciding
   whether to preserve an interim message, so encrypted reasoning state is
   not silently dropped when there is no visible reasoning text.

Adds 4 regression tests covering:
- Unit: reasoning-only → incomplete, reasoning+content → stop
- E2E: reasoning-only → continuation → final answer succeeds
- E2E: encrypted reasoning items preserved in interim messages

* fix(codex): ensure reasoning items have required following item in API input

Follow-up to the reasoning-only response fix. Three additional issues
found by tracing the full replay path:

1. _chat_messages_to_responses_input: when a reasoning-only interim
   message was converted to Responses API input, the reasoning items
   were emitted as the last items with no following item. The Responses
   API requires a following item after each reasoning item (otherwise:
   'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now
   emits an empty assistant message as the required following item when
   content is empty but reasoning items were added.

2. Duplicate detection: two consecutive reasoning-only incomplete
   messages with identical empty content/reasoning but different
   encrypted codex_reasoning_items were incorrectly treated as
   duplicates, silently dropping the second response's reasoning state.
   Now includes codex_reasoning_items in the duplicate comparison.

3. Added tests for both the API input conversion path and the duplicate
   detection edge case.

Research context: verified against OpenCode (uses Vercel AI SDK, no
retry loop so avoids the issue), Clawdbot (drops orphaned reasoning
blocks entirely), and OpenHands (hit the missing_following_item error).
Our approach preserves reasoning continuity while satisfying the API
constraint.

---------

Co-authored-by: Test <test@test.com>
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…Research#11406)

* feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro

Upstream asked for these two upgrades ASAP — the old entries show
stale models when newer, higher-quality versions are available on FAL.

Recraft V3 → Recraft V4 Pro
  ID:    fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image
  Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier)
  Schema: V4 dropped the required `style` enum entirely; defaults
          handle taste now. Added `colors` and `background_color`
          to supports for brand-palette control. `seed` is not
          supported by V4 per the API docs.

Nano Banana → Nano Banana Pro
  ID:    fal-ai/nano-banana → fal-ai/nano-banana-pro
  Price: $0.08/image → $0.15/image (1K); $0.30 at 4K
  Schema: Aspect ratio family unchanged. Added `resolution`
          (1K/2K/4K, default 1K for billing predictability),
          `enable_web_search` (real-time info grounding, +$0.015),
          and `limit_generations` (force exactly 1 image).
  Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality
                and reasoning depth improved; slower (~6s → ~8s).

Migration: users who had the old IDs in `image_gen.model` will
fall through the existing 'unknown model → default' warning path
in `_resolve_fal_model()` and get the Klein 9B default on the next
run. Re-run `hermes tools` → Image Generation to pick the new
version. No silent cost-upgrade aliasing — the 2-6x price jump
on these tiers warrants explicit user re-selection.

Portal note: both new model IDs need to be allowlisted on the
Nous fal-queue-gateway alongside the previous 7 additions, or
users on Nous Subscription will see the 'managed gateway rejected
model' error we added previously (which is clear and
self-remediating, just noisy).

* docs: wrap '<1s' in backticks to unblock MDX compilation

Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and
'<1s' fails because '1' isn't a valid tag-name start character. This
was broken on main since PR NousResearch#11265 (never noticed because
docs-site-checks was failing on OTHER issues at the time and we
admin-merged through it).

Wrapping in backticks also gives the cell monospace styling which
reads more cleanly alongside the inline-code model ID in the same row.

The other '<1s' occurrence (line 52) is inside a fenced code block
and is already safe — code fences bypass MDX parsing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant