fix(api): strip Gemma special tokens from model output (fixes #1087) by JimStenstrom · Pull Request #1698 · jundot/omlx

JimStenstrom · 2026-06-06T03:22:02Z

SPECIAL_TOKENS_PATTERN in omlx/api/utils.py stripped special tokens for the Llama / Qwen / Mistral / GPT-2 families but omitted Gemma's (<eos>, <bos>, <end_of_turn>, <start_of_turn>), so those tokens leaked into the assistant's message.content, most visibly in structured outputs.

This adds those four to the strip pattern, so both clean_special_tokens() and clean_output_text() remove them on every engine output path.
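For readers without the diff at hand, the idea can be sketched roughly as follows. This is a minimal illustration, not the code in omlx/api/utils.py: the token list, the way the pattern is built, and the names `SKETCH_TOKENS`, `SKETCH_PATTERN`, and `strip_special_tokens_sketch` are all assumptions made for the example.

```python
import re

# Hypothetical token list for illustration only; the real
# SPECIAL_TOKENS_PATTERN in omlx/api/utils.py may be written differently
# and may cover more tokens than shown here.
SKETCH_TOKENS = [
    # Tokens the PR description says were already handled
    "</s>", "<s>", "<pad>", "[PAD]", "[SEP]", "[CLS]",
    # The four Gemma tokens this PR adds
    "<eos>", "<bos>", "<end_of_turn>", "<start_of_turn>",
]

# re.escape keeps "[" and "]" literal instead of opening a character class.
SKETCH_PATTERN = re.compile("|".join(re.escape(t) for t in SKETCH_TOKENS))


def strip_special_tokens_sketch(text: str) -> str:
    """Remove every literal occurrence of a listed special token."""
    return SKETCH_PATTERN.sub("", text)


if __name__ == "__main__":
    leaked = '<bos>{"answer": 42}<end_of_turn>'
    print(strip_special_tokens_sketch(leaked))
```

Because the tokens are matched as exact literals, ordinary angle-bracket text such as `<div>` passes through untouched in this sketch.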

Test

Adds test_clean_gemma_special_tokens (covers <eos>, <end_of_turn>, <start_of_turn>, and <bos>…<eos>). The existing </s> / <s> / <pad> / [PAD] / [SEP] / [CLS] cases still pass, which guards the pattern against regressions for the other model families.

pytest tests/test_api_utils.py → 201 passed.

…1087) SPECIAL_TOKENS_PATTERN covered Llama/Qwen/Mistral/GPT-2 special tokens but omitted Gemma's (<eos>, <bos>, <end_of_turn>, <start_of_turn>), so they leaked into assistant content (notably structured outputs). Add them to the strip pattern — clean_output_text/clean_special_tokens now remove them on every engine output path. Adds test_clean_gemma_special_tokens (and the existing </s>/<s>/<pad> cases still pass, guarding the pattern).

jundot · 2026-06-06T15:59:16Z

Thanks for fixing this. The change is in the right central cleanup path, so it covers the structured output leak from #1087 without adding model-specific response handling. I verified the focused API utility and thinking tests locally, and this looks good to me. Merging.

jundot merged commit a8e15ac into jundot:main Jun 6, 2026

JimStenstrom deleted the fix/1087-strip-gemma-tokens branch June 7, 2026 14:00

