fix(kimi): omit temperature entirely for Kimi/Moonshot models by teknium1 · Pull Request #13157 · NousResearch/hermes-agent

teknium1 · 2026-04-20T19:09:27Z

Summary

Kimi's gateway now selects the correct temperature server-side. Sending any temperature value — even the previously correct one — conflicts with gateway-managed defaults. This strips the temperature key from API kwargs entirely for all kimi-* models.

Inspired by PR #13137 (@kshitijk4poor) which removed the forced values but still passed the caller's temperature through. This goes further: no temperature sent at all.

Changes

agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel + _is_kimi_model() prefix check. _build_call_kwargs() strips temperature when sentinel returned.
run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths handle the sentinel.
trajectory_compressor.py: Direct client calls conditionally include temperature via kwargs dict.
mini_swe_runner.py: Same sentinel handling.
6 test files: All 'forces temperature X' assertions → 'temperature not in kwargs'.

Validation

	Before	After
kimi-for-coding temp	0.6 forced	omitted
kimi-k2.5 @ moonshot.ai	1.0 forced	omitted
kimi-k2-thinking	1.0 forced	omitted
gpt-5.4 temp=0.3	0.3 preserved	0.3 preserved

468 tests pass across all 6 affected test files. E2E verified with real imports.

@kshitijk4poor

Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR #13137 (@kshitijk4poor).

github-actions · 2026-04-20T19:09:43Z

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

PRs NousResearch#13148 and NousResearch#13152 landed kimi-k2.6 on OpenRouter, Nous, NVIDIA NIM, kimi-coding, kimi-coding-cn, and moonshot. This fills in the remaining static catalogs (opencode-zen, opencode-go, alibaba, huggingface) plus the 256K context-length entry for the HuggingFace variant. Temperature handling is already covered by PR NousResearch#13157's OMIT_TEMPERATURE sentinel + _is_kimi_model() prefix check, which matches all kimi-* IDs automatically — no per-model entries needed.

@kshitijk4poor

…search#13157) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR NousResearch#13137 (@kshitijk4poor).

@kshitijk4poor

…search#13157) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR NousResearch#13137 (@kshitijk4poor).

@kshitijk4poor

…search#13157) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR NousResearch#13137 (@kshitijk4poor).

@kshitijk4poor

…search#13157) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR NousResearch#13137 (@kshitijk4poor).

@kshitijk4poor

…search#13157) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR NousResearch#13137 (@kshitijk4poor).

@kshitijk4poor

…search#13157) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR NousResearch#13137 (@kshitijk4poor).

teknium1 merged commit 3cba81e into main Apr 20, 2026
3 of 6 checks passed

teknium1 deleted the hermes/hermes-d7fc4ba8 branch April 20, 2026 19:23

teknium1 mentioned this pull request Apr 20, 2026

fix(kimi): drop client-side temperature overrides for Kimi/Moonshot models #13137

Closed

pefontana mentioned this pull request Apr 20, 2026

feat(models): add kimi-k2.6 to opencode-zen/go, alibaba, HuggingFace #13156

Closed

4 tasks

mengjian-github mentioned this pull request Apr 20, 2026

fix(kimi): send max_tokens, reasoning_effort, and thinking for Kimi/Moonshot #13168

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(kimi): omit temperature entirely for Kimi/Moonshot models#13157

fix(kimi): omit temperature entirely for Kimi/Moonshot models#13157
teknium1 merged 1 commit into
mainfrom
hermes/hermes-d7fc4ba8

teknium1 commented Apr 20, 2026

Uh oh!

github-actions Bot commented Apr 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

teknium1 commented Apr 20, 2026

Summary

Changes

Validation

Uh oh!

github-actions Bot commented Apr 20, 2026

🚨 CRITICAL Supply Chain Risk Detected

🚨 CRITICAL: Install-hook file added or modified

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant