fix: properly pass model.max_tokens config to AIAgent in gateway #19991

Open: chengoak wants to merge 2 commits into NousResearch:main from chengoak:fix/max_tokens-gateway-config

chengoak (Contributor) commented May 5, 2026

What does this PR do?

Fixes the issue where model.max_tokens from config.yaml was not being passed to AIAgent when running via the gateway (Feishu, QQBot, etc.), causing model responses to be truncated by conservative default output limits.

Changes:

  1. Import the config helper and use it to read model.max_tokens from config
  2. Include max_tokens in the runtime dict passed to AIAgent
  3. Include max_tokens in fallback provider resolution
  4. Add a max_tokens parameter with config priority: CLI args > config file > model default
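
The priority order in item 4 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `resolve_max_tokens` and the assumption that the parsed config.yaml is a plain dict with an optional `model` mapping are both hypothetical.

```python
from typing import Any, Dict, Optional


def resolve_max_tokens(cli_value: Optional[int], config: Dict[str, Any]) -> Optional[int]:
    """Resolve the output-token limit: CLI args > config file > model default.

    Returning None means "no override", so the caller keeps the model's
    own default limit. (Hypothetical helper, for illustration only.)
    """
    # 1. An explicit CLI argument always wins.
    if cli_value is not None:
        return cli_value

    # 2. Otherwise look for model.max_tokens in the parsed config.
    #    Assumption: `model` may be missing or may not be a mapping.
    model_cfg = config.get("model")
    if isinstance(model_cfg, dict) and model_cfg.get("max_tokens") is not None:
        return int(model_cfg["max_tokens"])

    # 3. Nothing configured: defer to the model default.
    return None
```

With this shape, `resolve_max_tokens(None, {"model": {"max_tokens": 8192}})` yields the config value, while an empty config yields `None`.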

Why is this needed?

For custom providers like ByteDance Ark, the model's default output token limit is quite conservative. When model.max_tokens is configured but not passed through, users see warnings in platforms like Feishu.

Testing

  • Verified that the config path correctly reads model.max_tokens
  • All changes are backward compatible (None is passed when config is not set)
  • Gateway routes correctly unpack the runtime dict including max_tokens
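
The backward-compatibility point above can be illustrated with a small sketch. It assumes the runtime dict is a plain dict and uses a stand-in class rather than the real AIAgent; `StubAgent` and `build_agent` are hypothetical names, not part of this PR.

```python
from typing import Any, Dict, Optional


class StubAgent:
    """Stand-in for AIAgent; only the max_tokens handling is modelled."""

    def __init__(self, model: str, max_tokens: Optional[int] = None) -> None:
        self.model = model
        # None means "use the model's default output limit".
        self.max_tokens = max_tokens


def build_agent(runtime: Dict[str, Any]) -> StubAgent:
    """Unpack a runtime dict into agent kwargs (hypothetical helper)."""
    # dict.get() returns None when max_tokens is absent, so configs that
    # never set the key behave exactly as they did before the change.
    return StubAgent(model=runtime["model"], max_tokens=runtime.get("max_tokens"))
```

A runtime dict without the key produces an agent whose `max_tokens` is `None`; one with the key carries the configured value through unchanged.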

chengoak added 2 commits May 5, 2026 10:12
- Add max_tokens to _resolve_runtime_agent_kwargs() return value
- Add max_tokens to _resolve_turn_agent_config() runtime dict
- Add max_tokens to _try_resolve_fallback_provider() return value
- Add max_tokens parameter to HermesCLI.__init__ with config support

This ensures that model.max_tokens from config.yaml is properly
passed to AIAgent, preventing response truncation for custom providers
like ark/bytedance models that have conservative default output limits.
- Add max_tokens to _KNOWN_KEYS in config.py
- Pass max_tokens through _normalize_custom_provider_entry
- Read max_tokens from runtime provider config in gateway
- Fallback to model.max_tokens if provider-specific config not present
- Propagate max_tokens through credential pool resolution path
alt-glitch added the type/bug, P2, comp/gateway, comp/cli, and area/config labels May 5, 2026
chengoak (Contributor, Author) commented May 5, 2026

Fixes #20004 - This PR ensures max_tokens from custom_providers is properly passed to AIAgent, with fallback to global model.max_tokens config.
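
A rough sketch of the fallback described here, under stated assumptions: the provider entry and the global config are plain dicts, and `pick_max_tokens` is a hypothetical name rather than a function from this PR.

```python
from typing import Any, Dict, Optional


def pick_max_tokens(
    provider_entry: Optional[Dict[str, Any]],
    config: Dict[str, Any],
) -> Optional[int]:
    """Prefer a provider-specific max_tokens, else the global model.max_tokens."""
    # Provider-specific value (e.g. from a custom_providers entry) comes first.
    provider_value = (provider_entry or {}).get("max_tokens")
    if provider_value is not None:
        return provider_value

    # Fall back to the global model.max_tokens setting if it exists.
    model_cfg = config.get("model")
    if isinstance(model_cfg, dict):
        return model_cfg.get("max_tokens")

    # Neither is set: leave the model default in place.
    return None
```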
