Skip to content

fix(auxiliary): prefer fallback_providers model for OpenRouter aux tasks#25180

Open
Kewe63 wants to merge 1 commit into
NousResearch:mainfrom
Kewe63:fix/auxiliary-fallback-model
Open

fix(auxiliary): prefer fallback_providers model for OpenRouter aux tasks#25180
Kewe63 wants to merge 1 commit into
NousResearch:mainfrom
Kewe63:fix/auxiliary-fallback-model

Conversation

@Kewe63

@Kewe63 Kewe63 commented May 13, 2026

Copy link
Copy Markdown
Contributor

Problem

Users who configure only OpenRouter :free models in fallback_providers still get billed because auxiliary tasks (title_generation, compression, vision) hardcode google/gemini-3-flash-preview (a paid model).

Root Cause

_try_openrouter() always returns _OPENROUTER_MODEL regardless of the user's fallback_providers configuration.

Fix

Added _resolve_openrouter_aux_model() which checks the user's fallback_providers for an OpenRouter entry with an explicit model. If found, that model is used for auxiliary tasks. Otherwise falls back to the hardcoded default.

Fixes #24029

When the user has configured fallback_providers with an OpenRouter entry
that specifies a model (e.g. :free variants), auxiliary tasks now use
that model instead of the hardcoded paid default.

Previously _try_openrouter() always returned google/gemini-3-flash-preview
for auxiliary tasks, causing unexpected billing for free-tier users who
configured only :free models in their fallback chain.

Fixes NousResearch#24029
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder provider/openrouter OpenRouter aggregator labels May 13, 2026
@nujovich

nujovich commented Jun 6, 2026

Copy link
Copy Markdown

Constructive review — this closes the billing half of #24029, but I think it leaves a gap (and, for a common config, can relocate the silent-substitution problem rather than fix it).

What the PR does well

_resolve_openrouter_aux_model() makes auxiliary tasks defer to the user's fallback_providers OpenRouter entry instead of the hardcoded _OPENROUTER_MODEL. That correctly addresses the "don't silently use a hardcoded paid model" complaint.

Gap 1 — meta-routers aren't concrete models

The method returns the first fallback_providers entry where provider == "openrouter" and model is non-empty, verbatim. That assumes the configured model is a concrete slug (e.g. …:free). But openrouter/auto and openrouter/free are meta-routers, not models — openrouter/auto selects a model per-request via NotDiamond, non-deterministically by design.

A very common reliability config is:

fallback_providers:
  - provider: openrouter
    model: openrouter/auto
  - provider: anthropic
    model: anthropic/claude-sonnet-4-20250514

With this PR, _resolve_openrouter_aux_model() returns openrouter/auto, so auxiliary tasks (title generation, compression, vision, web summarization) now get routed through the auto-router. The aux model is no longer hardcoded — but it's now whatever the router picks, with no record of what that was.

Gap 2 — the disclosure half of #24029 is still open

The issue's expected behavior included: "at minimum, log a clear warning that a paid/substitute model is being used." This PR adds no such surfacing — the only log on this path is a generic logger.debug. So a substitution still happens invisibly; it's just sourced from config now instead of a constant.

Evidence that the substitution is invisible

I hit the same fallback_providers → openrouter/auto path on a (non-auxiliary) cron run, instrumented via a telemetry plugin that records each call's response model into SQLite. The run was launched on one model and executed entirely on another, with status: ok and no warning.

Run record (declared at session start):

session_id  cron_7fc40f2b1cef_20260606_154226
model       nous/hermes-4-70b-thinking
provider    openrouter
status      ok
api_calls   6
cost_usd    0.00445815

Per-call records (response_model) for that same session:

ts (UTC) model provider tok_in tok_out cost_usd estimated
13:42:41 google/gemini-2.5-flash-lite openrouter 4677 1273 0.0009769 0
13:42:54 google/gemini-2.5-flash-lite openrouter 4975 310 0.0006215 0
13:43:12 google/gemini-2.5-flash-lite openrouter 7099 1431 0.0013365 0
13:43:21 google/gemini-2.5-flash-lite openrouter 2166 1411 0.00089946 0
13:43:22 google/gemini-2.5-flash-lite openrouter 1695 0 0.00030728 0
13:43:24 google/gemini-2.5-flash-lite openrouter 808 222 0.00031651 0

All 6 calls ran on google/gemini-2.5-flash-lite, estimated=0 (real usage, not estimated); api_calls in the run record matches the 6 rows and the per-call costs sum exactly to the run's cost_usd; and nothing anywhere records the declared nous/hermes-4-70b-thinking as executed. The router's selection is returned in each response's model field — the data exists, it's just never surfaced. This PR would route aux tasks into exactly that opaque path for any config with openrouter/auto in fallback_providers.

Suggested adjustments

  1. In _resolve_openrouter_aux_model(), detect meta-router slugs (openrouter/auto, openrouter/free). Either skip them when picking the aux model, or accept them but capture and surface the resolved model from the response rather than logging the router slug.
  2. Add the disclosure Auxiliary tasks silently fall back to paid OpenRouter models, bypassing user's free-only configuration #24029 asked for — a single notice along the lines of aux task: requested X, ran Y — and make it cover this fallback path, not only the hardcoded one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P2 Medium — degraded but workaround exists provider/openrouter OpenRouter aggregator type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Auxiliary tasks silently fall back to paid OpenRouter models, bypassing user's free-only configuration

3 participants