feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro by teknium1 · Pull Request #11406 · NousResearch/hermes-agent

teknium1 · 2026-04-17T04:53:12Z

Summary

Upgrades two models in the Hermes image-gen catalog to their newer, higher-quality variants. After this merges, the full supported catalog is 8 FAL.ai models, switchable via `hermes tools` → Image Generation.

Full Model Catalog (post-merge)

Model	Speed	Strengths	Price
`fal-ai/flux-2/klein/9b` (default)	<1s	Fast, crisp text	$0.006/MP
`fal-ai/flux-2-pro`	~6s	Studio photorealism	$0.03/MP
`fal-ai/z-image/turbo`	~2s	Bilingual EN/CN, 6B params	$0.005/MP
`fal-ai/nano-banana-pro`	~8s	Gemini 3 Pro, reasoning depth, text rendering	$0.15/image (1K)
`fal-ai/gpt-image-1.5`	~15s	Prompt adherence	$0.034/image
`fal-ai/ideogram/v3`	~5s	Best typography	$0.03–0.09/image
`fal-ai/recraft/v4/pro/text-to-image`	~8s	Design, brand systems, production-ready	$0.25/image
`fal-ai/qwen-image`	~12s	LLM-based, complex text	$0.02/MP

All models are selectable via the arrow-key picker. The agent sees only `prompt` and `aspect_ratio` (landscape/square/portrait); size translation, per-model parameter filtering, and quality-tier pinning (GPT-Image) happen internally.
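A minimal sketch of the per-model parameter filtering, assuming a dict-based catalog. Only the model IDs and parameter names come from this PR; the `CATALOG` layout, the `defaults` key, and `build_payload` are hypothetical names used for illustration, and size translation is left out.

```python
from typing import Any, Dict, Optional

# Hypothetical catalog layout. The model IDs and parameter names are from the
# PR; the "supports"/"defaults" structure is an assumption for this sketch.
CATALOG: Dict[str, Dict[str, Any]] = {
    "fal-ai/recraft/v4/pro/text-to-image": {
        # V4 Pro has no `style` and no `seed` in its whitelist.
        "supports": {"prompt", "colors", "background_color", "enable_safety_checker"},
        "defaults": {},
    },
    "fal-ai/nano-banana-pro": {
        "supports": {"prompt", "aspect_ratio", "resolution",
                     "enable_web_search", "limit_generations"},
        # Pin 1K so per-image cost stays predictable.
        "defaults": {"resolution": "1K"},
    },
}


def build_payload(model_id: str, prompt: str,
                  extra: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Merge defaults with caller params, then drop keys the model does not support."""
    entry = CATALOG[model_id]
    candidate = {**entry["defaults"], **(extra or {}), "prompt": prompt}
    return {k: v for k, v in candidate.items() if k in entry["supports"]}
```

With this shape, passing `style` or `seed` to Recraft V4 Pro is silently dropped rather than forwarded, which matches the "drops style, doesn't get seed" behaviour listed in the test plan.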

What Changed in This PR

Two models upgraded to their newer variants. Everything else in the catalog stays as-is.

Recraft V3 → Recraft V4 Pro

	V3	V4 Pro
ID	`fal-ai/recraft-v3`	`fal-ai/recraft/v4/pro/text-to-image`
Price	$0.04/image	$0.25/image (6× premium tier)
Required params	`style` enum	(none — V4 dropped `style` entirely)
Optional control	—	`colors`, `background_color` (brand palette)
Seed support	✓	✗

V4 Pro is marketed as "designed with designers" — visual taste, brand systems, production-ready. Significant quality jump.

Nano Banana → Nano Banana Pro

	Original	Pro
ID	`fal-ai/nano-banana`	`fal-ai/nano-banana-pro`
Architecture	Gemini 2.5 Flash Image	Gemini 3 Pro Image
Price (1K)	$0.08/image	$0.15/image
Price (4K)	—	$0.30/image
Web search	—	`enable_web_search` (+$0.015)
Resolution tiers	—	`1K` / `2K` / `4K`
Generation cap	—	`limit_generations` (force exactly 1)
Speed	~6s	~8s (reasoning depth tradeoff)

Defaults to `resolution: "1K"` to keep per-image cost predictable for Nous Subscription. Users who want 4K can pass `resolution` explicitly, since it is in the model's supports whitelist.
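For a rough sense of how the optional parameters affect cost, here is a small estimator built only from the prices in the table above. It is a sketch, not billing logic: `estimate_cost` is a hypothetical helper, the 2K price is not stated in this PR so it is left unknown, and the web-search surcharge is assumed to apply once per image.

```python
from typing import Dict, Optional

# USD per image, taken from the comparison table. The 2K price is not given
# in the PR, so it is left as None instead of being guessed.
PRICE_BY_RESOLUTION: Dict[str, Optional[float]] = {"1K": 0.15, "2K": None, "4K": 0.30}
WEB_SEARCH_SURCHARGE = 0.015  # assumed to be charged once per image


def estimate_cost(resolution: str = "1K",
                  enable_web_search: bool = False) -> Optional[float]:
    """Return the estimated per-image price, or None when the tier price is unknown."""
    if resolution not in PRICE_BY_RESOLUTION:
        raise ValueError(f"unsupported resolution tier: {resolution!r}")
    base = PRICE_BY_RESOLUTION[resolution]
    if base is None:
        return None
    surcharge = WEB_SEARCH_SURCHARGE if enable_web_search else 0.0
    return round(base + surcharge, 3)
```

The default call reflects the pinned 1K tier; a 4K request with web search enabled comes out at a little over twice that.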

Migration

Users with the old IDs in `image_gen.model` fall through the existing `_resolve_fal_model()` warning path ("Unknown FAL model 'X' in config; falling back to default") and land on Klein 9B. Re-running `hermes tools` → Image Generation lets them pick the new version.

Old IDs are not silently aliased to the new ones. These upgrades raise per-image prices by roughly 2× (Nano Banana, $0.08 → $0.15) and 6× (Recraft, $0.04 → $0.25), which warrants explicit user re-selection rather than stealth cost escalation.
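The fallback behaviour can be sketched as follows. This is an illustration of the described path, not the actual `_resolve_fal_model()` source: the function name `resolve_fal_model_sketch`, the constant names, and the use of `logging` are assumptions; the model IDs and the warning text are from this PR.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

DEFAULT_MODEL = "fal-ai/flux-2/klein/9b"

# The 8-model catalog listed in this PR.
SUPPORTED_MODELS = frozenset({
    "fal-ai/flux-2/klein/9b",
    "fal-ai/flux-2-pro",
    "fal-ai/z-image/turbo",
    "fal-ai/nano-banana-pro",
    "fal-ai/gpt-image-1.5",
    "fal-ai/ideogram/v3",
    "fal-ai/recraft/v4/pro/text-to-image",
    "fal-ai/qwen-image",
})


def resolve_fal_model_sketch(configured: Optional[str]) -> str:
    """Return the configured model if it is in the catalog, else warn and use the default."""
    if configured in SUPPORTED_MODELS:
        return configured
    if configured:
        # Deliberately no old -> new alias table: retired IDs take this path too.
        logger.warning(
            "Unknown FAL model %r in config; falling back to default", configured
        )
    return DEFAULT_MODEL
```

Because there is no alias table, a config that still says `fal-ai/recraft-v3` ends up on the cheapest model rather than the most expensive one until the user re-selects.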

Nous Portal / Backend-Dev Action

The previous image-gen PR added 7 new IDs that need allowlist verification on fal-queue-gateway.nousresearch.com. This PR swaps two of those for newer variants, so the updated allowlist items are:

Replace:

fal-ai/nano-banana → fal-ai/nano-banana-pro
fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image

Full current list on Hermes's side:

fal-ai/flux-2/klein/9b       (default)
fal-ai/flux-2-pro
fal-ai/z-image/turbo
fal-ai/nano-banana-pro       ← new
fal-ai/gpt-image-1.5
fal-ai/ideogram/v3
fal-ai/recraft/v4/pro/text-to-image   ← new
fal-ai/qwen-image

Portal billing note: Nano Banana Pro's resolution param can multiply per-image cost (2× at 4K). We default to 1K for Nous Subscription users. If the gateway wants to enforce that, strip resolution from request bodies for subscription accounts and rely on the server-side default.
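If the gateway does want to enforce the 1K default, the stripping step might look like the sketch below. Everything here is hypothetical (the function name, the `is_subscription` flag, and the idea that the body is a plain dict); it only illustrates the suggestion above and is not the gateway's actual code.

```python
from typing import Any, Dict

RESOLUTION_PINNED_MODELS = frozenset({"fal-ai/nano-banana-pro"})


def sanitize_request_body(model_id: str, body: Dict[str, Any],
                          is_subscription: bool) -> Dict[str, Any]:
    """Drop `resolution` for subscription accounts so the server-side default applies."""
    if is_subscription and model_id in RESOLUTION_PINNED_MODELS:
        return {k: v for k, v in body.items() if k != "resolution"}
    # Return a copy so callers never mutate the original request body.
    return dict(body)
```

Stripping the key, rather than overwriting it with `"1K"`, keeps the pinned value in one place (the server-side default) instead of two.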

Client-side, the existing 4xx translator still surfaces clear remediation messages if the portal rejects either new ID.

Test Plan

python -m pytest tests/tools/test_image_generation.py \
                 tests/tools/test_managed_media_gateways.py \
                 tests/hermes_cli/test_tools_config.py -o "addopts=" -q
# 85 passed in 0.41s

All existing coverage continues to work against the new IDs:

Catalog integrity (required keys, supports whitelist)
Size-family translation (nano-banana-pro still uses aspect_ratio)
Per-model supports filter (recraft V4 Pro drops style, doesn't get seed)
Model resolution fallback
Managed gateway 4xx translation still fires cleanly for the new IDs

Updated test_recraft_has_minimal_payload to reflect V4's new supports set (colors, background_color, enable_safety_checker replacing V3's style).

Docs

image-generation.md: model table updated, aspect-ratio mapping reference updated
overview.md: features list updated
tool-gateway.md: 8-model summary updated

Upstream asked for these two upgrades ASAP: the old entries show stale models when newer, higher-quality versions are available on FAL.

Recraft V3 → Recraft V4 Pro
ID: fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image
Price: $0.04/image → $0.25/image (6x; V4 Pro is premium tier)
Schema: V4 dropped the required `style` enum entirely; defaults handle taste now. Added `colors` and `background_color` to supports for brand-palette control. `seed` is not supported by V4 per the API docs.

Nano Banana → Nano Banana Pro
ID: fal-ai/nano-banana → fal-ai/nano-banana-pro
Price: $0.08/image → $0.15/image (1K); $0.30 at 4K
Schema: Aspect ratio family unchanged. Added `resolution` (1K/2K/4K, default 1K for billing predictability), `enable_web_search` (real-time info grounding, +$0.015), and `limit_generations` (force exactly 1 image).
Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality and reasoning depth improved; slower (~6s → ~8s).

Migration: users who had the old IDs in `image_gen.model` will fall through the existing 'unknown model → default' warning path in `_resolve_fal_model()` and get the Klein 9B default on the next run. Re-run `hermes tools` → Image Generation to pick the new version. No silent cost-upgrade aliasing: the 2-6x price jump on these tiers warrants explicit user re-selection.

Portal note: both new model IDs need to be allowlisted on the Nous fal-queue-gateway alongside the previous 7 additions, or users on Nous Subscription will see the 'managed gateway rejected model' error we added previously (which is clear and self-remediating, just noisy).

docs: wrap '<1s' in backticks to unblock MDX compilation

Docusaurus's MDX parser treats an unquoted '<' as the start of JSX, and '<1s' fails because '1' isn't a valid tag-name start character. This has been broken on main since PR #11265; it went unnoticed because docs-site-checks was failing on other issues at the time and we admin-merged through it.

Wrapping the value in backticks also gives the cell monospace styling, which reads more cleanly alongside the inline-code model ID in the same row. The other '<1s' occurrence (line 52) is inside a fenced code block and is already safe, since code fences bypass MDX parsing.
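A quick way to catch this class of problem before the docs build is a line scan like the one below. It is a heuristic sketch, not part of docs-site-checks and not a real MDX parser: `find_risky_lines` is a hypothetical helper that flags '<' followed by a digit outside fenced blocks and inline code spans.

```python
import re
from typing import List

# Opening or closing code fence: three or more backticks or tildes.
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")
# Single-backtick inline code spans, which MDX does not parse as JSX.
INLINE_CODE = re.compile(r"`[^`]*`")
# '<' immediately followed by a digit, e.g. '<1s'.
RISKY = re.compile(r"<\d")


def find_risky_lines(markdown: str) -> List[int]:
    """Return 1-based line numbers where '<digit' appears outside code."""
    hits: List[int] = []
    in_fence = False
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        if FENCE.match(line):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        # Remove inline code first so a backtick-wrapped '<1s' is not flagged.
        if RISKY.search(INLINE_CODE.sub("", line)):
            hits.append(lineno)
    return hits
```

Run against a docs page, this would flag a bare `<1s` table cell but not the backtick-wrapped version or the occurrence inside a fenced block.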


