fix: sanitize internal fields from API messages#138
Conversation
Mistral and other strict API providers reject extra fields (like finish_reason, reasoning, _flush_sentinel) on chat messages with 422 Unprocessable Entity. Root cause: _build_assistant_message() adds internal bookkeeping fields (finish_reason, reasoning) to message dicts for trajectory saving and logging. When these messages are sent back to the API as conversation history, strict providers reject the unknown fields. Fix: Add a whitelist-based _sanitize_for_api() helper that keeps only standard OpenAI-compatible fields per message role. This replaces the previous fragile blacklist approach (which only removed 'reasoning' but forgot 'finish_reason') and is applied consistently across all 3 API call sites: 1. Main conversation loop 2. Memory flush 3. Max iterations summary (was completely unsanitized) The whitelist includes reasoning_content and reasoning_details for providers that support multi-turn reasoning (OpenRouter, Moonshot AI). Original messages list is unchanged — trajectory saving, session DB, and logging continue to work as before.
Bartok9
left a comment
There was a problem hiding this comment.
LGTM! The whitelist-based sanitization is the right approach.
Key benefits:
- Forward-compatible (new internal fields won't accidentally leak)
- Handles all message roles properly
- Fixed the edge case in
_handle_max_iterations()
The _API_FIELDS_BY_ROLE dict is clean and easy to extend if needed.
Tested locally with Mistral - the 422 errors are resolved. 🎉
Bartok9
left a comment
There was a problem hiding this comment.
LGTM! The whitelist-based sanitization is the right approach.
Key benefits:
- Forward-compatible (new internal fields won't accidentally leak)
- Handles all message roles properly
- Fixed the edge case in _handle_max_iterations()
Tested logic - the 422 errors should be resolved.
|
@cutepawss are we aware of they accept extra_headers kwarg? Thats needed to control a lot of models' reasoning effort, whether they think at all, etc. I'm worried this isn't strict OAI spec but strictly mistral spec? |
|
@teknium1
client.chat.completions.create(
model="...",
messages=api_messages, # ← only this is sanitized
extra_body={"reasoning": {"effort": "high"}}, # ← untouched
extra_headers={"X-Custom": "value"}, # ← untouched
)What we're stripping are internal bookkeeping fields like The whitelist is based on the OpenAI Chat Completion message spec, plus known provider extensions ( |
|
Okay but messages with OpenRouter require a reasoning_content or reasoning field to carry over reasoning content for instance. Scrubbing that would break interleaved reasoning in many models |
|
@teknium1 Thanks for flagging this I want to make sure I'm not breaking anything here. From what I can see, if role == "assistant" and msg.get("reasoning"):
api_msg["reasoning_content"] = msg["reasoning"]So This is also covered by the tests in the PR — That said; is there a provider or flow where |
Fixes #134
Problem
When using Mistral (or any strict OpenAI-compatible provider) as a direct API endpoint, the second message always fails with:
The first message works fine because there's no assistant message in history yet. On the second turn, the previous assistant response — carrying internal fields like
finish_reason— gets sent back to the API, which rejects it.Root Cause
_build_assistant_message()addsfinish_reasonandreasoningto every assistant message dict for trajectory saving and session logging. The API preparation code strippedreasoningbut forgotfinish_reason. Additionally,_handle_max_iterations()sent messages with no sanitization at all.This is a broader design issue: the code used a fragile blacklist (manually popping individual fields), which breaks whenever a new internal field is added.
Fix
Added a whitelist-based
_sanitize_for_api()helper that keeps only standard OpenAI-compatible fields per message role:Applied consistently across all 3 API call sites:
reasoning_contentandreasoning_detailsare kept in the whitelist for providers that support multi-turn reasoning (OpenRouter, Moonshot AI). The originalmessageslist is unchanged, so trajectory saving, session DB logging, and display continue to work.Testing
Added 8 tests in
tests/agent/test_sanitize_api.py:test_finish_reason_strippedfinish_reasonremovedtest_reasoning_becomes_reasoning_contentreasoning→reasoning_contentconversiontest_reasoning_content_not_added_when_emptytest_flush_sentinel_stripped_flush_sentinelon user msgs removedtest_standard_assistant_fields_preservedtool_calls,reasoning_detailspass throughtest_tool_message_preserves_tool_call_idtool_call_id, drop extrastest_system_message_minimalrole+contenttest_unknown_role_defaults_to_role_content