fix(transport): apply default temperature and parallel_tool_calls for custom OpenAI-compat endpoints (#18470) #18492
Open
XwanwanX wants to merge 1 commit into
…ndpoints
- Default chat_completions sampling temperature for provider=custom stacks (Issue NousResearch#18470)
- Send parallel_tool_calls when tools are present; optional YAML overrides via custom_providers
- Thread options through CLI, gateway runtime, and delegate subagents
Likely duplicate of #18483
Summary
Hermes routed `provider=custom` through `chat_completions` without sending `temperature` or `parallel_tool_calls`. Many local OpenAI-compatible stacks (e.g. llama.cpp / vLLM) then keep server defaults (often `temperature=1.0`) and only batch tool rounds when `parallel_tool_calls` is explicit.

Fixes #18470.
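The intended behaviour can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `apply_custom_defaults` and the `options` dict shape are hypothetical, and the choice to leave a caller-supplied `temperature` untouched is an assumption. Only the `0.2` default, the `omit_temperature` opt-out, and the `parallel_tool_calls` default come from the description.

```python
from typing import Any, Dict, Optional

DEFAULT_CUSTOM_TEMPERATURE = 0.2  # default value named in this PR


def apply_custom_defaults(
    kwargs: Dict[str, Any],
    is_custom_provider: bool,
    options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return a copy of kwargs with custom-provider defaults."""
    out = dict(kwargs)
    if not is_custom_provider:
        # Non-custom providers are left exactly as they were.
        return out

    opts = options or {}

    # Send a temperature unless the user opted out via omit_temperature.
    # Assumption: a temperature already present in kwargs is not overwritten.
    if not opts.get("omit_temperature", False):
        out.setdefault(
            "temperature", opts.get("temperature", DEFAULT_CUSTOM_TEMPERATURE)
        )

    # Only send parallel_tool_calls when the request actually carries tools.
    if out.get("tools"):
        out.setdefault(
            "parallel_tool_calls", opts.get("parallel_tool_calls", True)
        )

    return out
```

With no overrides, a custom-provider request that carries tools gains both `temperature` and `parallel_tool_calls`; a request without tools gains only `temperature`.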
Changes
`ChatCompletionsTransport.build_kwargs`

When `is_custom_provider` is true:
- `temperature: 0.2` unless the user opts out (`omit_temperature` in compat options).
- When `tools` are present, send `parallel_tool_calls: true` by default unless overridden.

`custom_providers` (normalized in `hermes_cli/config.py`, surfaced via `resolve_runtime_provider`): `temperature` (number), `parallel_tool_calls` (bool), `omit_temperature: true`.

`AIAgent(custom_openai_request_options=...)`.

Delegate: subagents inherit parent compat options when `effective_provider == "custom"`.

Config example