fix: add generic fallback to detect trailing <think> tags in Jinja templates and handle forced-open reasoning blocks #16426
(llama-server --jinja)
This also works with "stream": true, together with the streaming-aware parser (#16394).
The CI seems to be failing?
0f63b5b to 0869085
Reproduced; the problem is on my side. I'm checking...
89ae131 to e1f526c
I'm getting "Errors while running CTest" on the CI, and I need to check whether there's a regression somewhere else. test-chat and test-chat-template are OK compared to master.
Hi @ServeurpersoCom, I can run an automated high-severity-only LLM review on this PR and post a single focused inline comment. Reply with "approve" or add a comment saying "@tommarques56 approve" to proceed.
@tommarques56 approve Hey tommarques56, I really like what you’re doing with these automated LLM reviews! That’s a great idea!
@ServeurpersoCom I just blocked this user for spamming. This is not a good way to run such experiments because it introduces a lot of noise into the discussions.
Ah, that explains the false XSS detection from the bot earlier! Perfect, thanks for clarifying!
…mplates and handle forced-open reasoning blocks
- Detect trailing <think> tags in generic chat templates, trim whitespace, and either append the closing tag or mark the reasoning block as forced-open based on enable_thinking
- Added a regression test covering a fallback template that opens the reasoning block in the prompt and verifies prompt differences, forced-open behaviour, and reasoning parsing
- Now compatible with models using the default Jinja chat template, such as https://huggingface.co/unsloth/GLM-Z1-32B-0414-GGUF
…t through common_chat_params for consistent <think> handling
- Added a supports_enable_thinking field to common_chat_params, populate it during template rendering, and reuse it when deciding whether the generic <think> fallback should run
- Updated common_chat_templates_support_enable_thinking to consult the tracked capability and expanded the chat template tests to assert the flag for templates that do and do not react to enable_thinking
- Updated chat template tests to assert the guarded fallback behaviour and to cover templates that conditionally open <think> blocks.
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
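The fallback described in the two commit messages above can be sketched roughly as follows. This is a minimal Python illustration, not the PR's actual code: `RenderedPrompt`, `forced_open`, and `apply_think_fallback` are hypothetical names, only `supports_enable_thinking` and `enable_thinking` come from the commit text, and the guard direction (skip the fallback when the template already reacts to `enable_thinking`) is an assumption.

```python
from dataclasses import dataclass

THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


@dataclass
class RenderedPrompt:
    """Hypothetical stand-in for the state tracked after template rendering."""
    prompt: str
    forced_open: bool = False               # reasoning block left open in the prompt
    supports_enable_thinking: bool = False  # template reacts to enable_thinking itself


def apply_think_fallback(rendered: RenderedPrompt, enable_thinking: bool) -> RenderedPrompt:
    # Assumed guard: a template that already handles enable_thinking is left untouched.
    if rendered.supports_enable_thinking:
        return rendered

    # Trim trailing whitespace so "<think>\n" and "<think>" are treated alike.
    trimmed = rendered.prompt.rstrip()
    if not trimmed.endswith(THINK_OPEN):
        return rendered  # no trailing <think>, nothing to do

    if enable_thinking:
        # The model continues inside the reasoning block opened by the prompt.
        return RenderedPrompt(trimmed, forced_open=True)

    # Thinking disabled: close the block right away so the model answers directly.
    return RenderedPrompt(trimmed + THINK_CLOSE, forced_open=False)
```

With a prompt ending in `<think>\n`, `enable_thinking=True` yields a forced-open block, while `enable_thinking=False` yields a prompt ending in `<think></think>`. A parser consuming the output would then need the forced-open flag to know that the response starts inside a reasoning block.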
756e6ec to 6041e25
I need to find another approach for detecting open tags in Jinja, because right now I either break the CI or I break "Qwen3 30B A3B Thinking" just to fix GLM-Z1-32B-0414-GGUF, which isn’t really worth it (far more people use Qwen3). It might work if I replace the hardcoded Jinja part (which I don’t really like) with the proper logic instead.
I switched this one to draft. It needs another approach.
Add generic fallback to detect trailing <think> tags in Jinja templates and handle forced-open reasoning blocks