fix: properly pass model.max_tokens config to AIAgent in gateway #19991
Open
chengoak wants to merge 2 commits into
- Add max_tokens to _resolve_runtime_agent_kwargs() return value
- Add max_tokens to _resolve_turn_agent_config() runtime dict
- Add max_tokens to _try_resolve_fallback_provider() return value
- Add max_tokens parameter to HermesCLI.__init__ with config support

This ensures that model.max_tokens from config.yaml is properly passed to AIAgent, preventing response truncation for custom providers like ark/bytedance models that have conservative default output limits.
- Add max_tokens to _KNOWN_KEYS in config.py
- Pass max_tokens through _normalize_custom_provider_entry
- Read max_tokens from runtime provider config in gateway
- Fallback to model.max_tokens if provider-specific config not present
- Propagate max_tokens through credential pool resolution path
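The fallback order described in the second commit (provider-specific value first, then the global model.max_tokens) can be sketched as below. This is an illustration only, not the code in this PR: the function name resolve_max_tokens and the shape of the two dicts are assumptions made for the example.

```python
from typing import Any, Mapping, Optional


def resolve_max_tokens(
    provider_entry: Optional[Mapping[str, Any]],
    model_config: Optional[Mapping[str, Any]],
) -> Optional[int]:
    """Hypothetical helper: prefer the provider's max_tokens, then the global one.

    Returns None when neither source has a usable value, so the caller can
    leave the provider's own default in place.
    """
    # Sources are checked in priority order: custom provider entry, then model section.
    for source in (provider_entry, model_config):
        if not source:
            continue
        value = source.get("max_tokens")
        if value is None:
            continue
        try:
            parsed = int(value)
        except (TypeError, ValueError):
            continue  # skip malformed values instead of failing the turn
        if parsed > 0:
            return parsed
    return None
```

Returning None rather than a hard-coded number keeps the decision with the provider when nothing is configured, which matches the intent of only overriding the limit when the user asked for it.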
Fixes #20004. This PR ensures max_tokens from custom_providers is properly passed to AIAgent, with fallback to the global model.max_tokens config.
What does this PR do?
Fixes the issue where model.max_tokens from config.yaml was not being passed to AIAgent when running via the gateway (Feishu, QQBot, etc.), causing model responses to be truncated due to conservative default output limits.
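A minimal sketch of the kind of gap described here, assuming a config dict loaded from config.yaml with a model section. StubAgent is a hypothetical stand-in for AIAgent, and build_agent_kwargs and the "name" key are invented for the example. The point is only that the value has to be copied into the kwargs that construct the agent, otherwise it is silently dropped.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class StubAgent:
    """Hypothetical stand-in for AIAgent; only the fields needed here."""
    model: str
    max_tokens: Optional[int] = None  # None means "use the provider default"


def build_agent_kwargs(config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: collect constructor kwargs from the loaded config."""
    model_cfg = config.get("model") or {}
    kwargs: Dict[str, Any] = {"model": model_cfg.get("name", "")}
    # Forward max_tokens only when it is set, so an absent key
    # does not override the provider default with None or 0.
    if model_cfg.get("max_tokens") is not None:
        kwargs["max_tokens"] = model_cfg["max_tokens"]
    return kwargs


# Assumed config shape, as it might look after parsing config.yaml.
config = {"model": {"name": "example-model", "max_tokens": 8192}}
agent = StubAgent(**build_agent_kwargs(config))
```

If the max_tokens line in build_agent_kwargs were missing, the agent would still be constructed without error, which is why this kind of omission tends to surface only as truncated replies.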
Changes:
Why is this needed?
For custom providers like ByteDance Ark, the model's default output token limit is quite conservative. When max_tokens is configured but not passed through, users see warnings in platforms like Feishu.
Testing