fix(api): strip Gemma special tokens from model output (fixes #1087) #1698
Merged
…1087) SPECIAL_TOKENS_PATTERN covered Llama/Qwen/Mistral/GPT-2 special tokens but omitted Gemma's (<eos>, <bos>, <end_of_turn>, <start_of_turn>), so they leaked into assistant content (notably structured outputs). Add them to the strip pattern — clean_output_text/clean_special_tokens now remove them on every engine output path. Adds test_clean_gemma_special_tokens (and the existing </s>/<s>/<pad> cases still pass, guarding the pattern).
Owner
Thanks for fixing this. The change is in the right central cleanup path, so it covers the structured output leak from #1087 without adding model-specific response handling. I verified the focused API utility and thinking tests locally, and this looks good to me. Merging.
Fixes #1087.
SPECIAL_TOKENS_PATTERN in omlx/api/utils.py stripped special tokens for the Llama / Qwen / Mistral / GPT-2 families but omitted Gemma's (<eos>, <bos>, <end_of_turn>, <start_of_turn>), so they leaked into assistant message.content, most visibly in structured outputs.

This adds those four to the strip pattern, so both clean_special_tokens() and clean_output_text() remove them on every engine output path.

Test

Adds test_clean_gemma_special_tokens (covers <eos>, <end_of_turn>, <start_of_turn>, and <bos>…<eos>). The existing </s> / <s> / <pad> / [PAD] / [SEP] / [CLS] cases still pass, guarding the pattern.

pytest tests/test_api_utils.py → 201 passed.
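For readers without the diff open, here is a minimal sketch of the kind of change described above. It is not the code from omlx/api/utils.py: the regex body, the function body, and the test name with the _sketch suffix are assumptions written for illustration, and the real pattern likely covers more tokens than the ones named in this PR.

```python
import re

# Hypothetical stand-in for SPECIAL_TOKENS_PATTERN. The actual pattern in
# omlx/api/utils.py is not shown in this PR text and may differ.
SPECIAL_TOKENS_PATTERN = re.compile(
    r"</s>|<s>|<pad>|\[PAD\]|\[SEP\]|\[CLS\]"      # cases named in the test notes
    r"|<eos>|<bos>|<end_of_turn>|<start_of_turn>"  # Gemma tokens added by this fix
)


def clean_special_tokens(text: str) -> str:
    """Remove every special-token literal from text (illustrative sketch)."""
    return SPECIAL_TOKENS_PATTERN.sub("", text)


def test_clean_gemma_special_tokens_sketch() -> None:
    # Structured output followed by a leaked end-of-turn marker.
    assert clean_special_tokens('{"a": 1}<end_of_turn>') == '{"a": 1}'
    # A <bos>...<eos> wrapper around ordinary content.
    assert clean_special_tokens("<bos>hello<eos>") == "hello"
    assert clean_special_tokens("<start_of_turn>hi") == "hi"
    # A previously covered token is still stripped.
    assert clean_special_tokens("plain</s>") == "plain"


if __name__ == "__main__":
    test_clean_gemma_special_tokens_sketch()
```

Keeping the tokens in one shared pattern, rather than handling Gemma in a separate branch, matches the reviewer's point that the fix belongs in the central cleanup path.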