
feat(autoparser): prefer chat deltas from backends when emitted#9224

Merged
mudler merged 1 commit into master from feat/autoparser-reasoning on Apr 4, 2026


@mudler mudler commented Apr 4, 2026

Owner

Description

This PR fixes #

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler mudler force-pushed the feat/autoparser-reasoning branch from 5fa7525 to 1ad5855 April 4, 2026 09:02
@mudler mudler merged commit 716ddd6 into master Apr 4, 2026
35 checks passed
@mudler mudler deleted the feat/autoparser-reasoning branch April 4, 2026 10:12
mudler added a commit that referenced this pull request May 25, 2026
…in pure-content mode (#9991)

When LocalAI templates a thinking model outside of jinja (the default for
the qwen3 gallery family), llama.cpp's chat parser falls back to a
"pure content" PEG parser that dumps the entire raw response into
ChatDelta.Content with an empty ReasoningContent. The Go side then
trusted that content verbatim and overrode tokenCallback's
correctly-split reasoning, so <think>...</think> blocks ended up in the
OpenAI `content` field. Regression from v4.0.0 introduced when the
autoparser ChatDeltas path was added (#9224).

The override now runs Go-side reasoning extraction defensively when the
autoparser delivered content but no reasoning. The streaming worker
gains a sticky preferAutoparser flag that flips on the first chunk
where the autoparser classified reasoning_content; until then we use
the streaming Go-side extractor. Realtime mirrors the non-streaming
fallback. When the autoparser already populated ReasoningContent we
trust it untouched, so jinja-enabled installs are not regressed.
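
The override rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `chatDelta`, `splitThink`, `resolve`, and `streamState` are hypothetical names, and the `<think>` extraction is a simplified stand-in for the real Go-side extractor. Only `preferAutoparser`, `Content`, and `ReasoningContent` come from the message itself.

```go
package main

import (
	"fmt"
	"strings"
)

// chatDelta is a hypothetical stand-in for the backend's ChatDelta message.
type chatDelta struct {
	Content          string
	ReasoningContent string
}

// splitThink is a simplified, hypothetical Go-side extractor. It pulls the
// first <think>...</think> block out of text and returns (reasoning, content).
func splitThink(text string) (string, string) {
	const openTag, closeTag = "<think>", "</think>"
	start := strings.Index(text, openTag)
	if start < 0 {
		return "", text // no reasoning block: everything is content
	}
	rest := text[start+len(openTag):]
	end := strings.Index(rest, closeTag)
	if end < 0 {
		// Unterminated block: treat everything after the tag as reasoning.
		return strings.TrimSpace(rest), strings.TrimSpace(text[:start])
	}
	reasoning := strings.TrimSpace(rest[:end])
	content := strings.TrimSpace(text[:start] + rest[end+len(closeTag):])
	return reasoning, content
}

// resolve applies the non-streaming rule: trust the autoparser when it
// classified reasoning, otherwise extract defensively on the Go side.
func resolve(d chatDelta) (reasoning, content string) {
	if d.ReasoningContent != "" {
		return d.ReasoningContent, d.Content
	}
	return splitThink(d.Content)
}

// streamState carries the sticky flag used by the streaming worker.
type streamState struct {
	preferAutoparser bool
}

// observe flips the flag on the first chunk that carries autoparser
// reasoning and never flips it back. It reports whether the autoparser's
// split should be used for this and all later chunks.
func (s *streamState) observe(d chatDelta) bool {
	if d.ReasoningContent != "" {
		s.preferAutoparser = true
	}
	return s.preferAutoparser
}

func main() {
	// Pure-content mode: the raw response arrives entirely in Content.
	r, c := resolve(chatDelta{Content: "<think>plan</think>answer"})
	fmt.Printf("reasoning=%q content=%q\n", r, c)

	// Jinja-enabled mode: the autoparser already split the response.
	r, c = resolve(chatDelta{Content: "answer", ReasoningContent: "plan"})
	fmt.Printf("reasoning=%q content=%q\n", r, c)

	var s streamState
	fmt.Println(s.observe(chatDelta{Content: "<think>pl"}))   // Go-side extractor still in charge
	fmt.Println(s.observe(chatDelta{ReasoningContent: "an"})) // autoparser takes over
	fmt.Println(s.observe(chatDelta{Content: "answer"}))      // flag stays set
}
```

Making the flag sticky means a stream never switches back to the Go-side extractor once the autoparser has shown it can classify reasoning, which avoids splitting the same response with two different parsers mid-stream.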

gallery/qwen3.yaml now enables use_jinja, letting the autoparser
classify <think> natively for all 20+ qwen3 family entries that share
this template.

Fixes #9985

Assisted-by: Claude:opus-4-7 [Read] [Edit] [Bash] [Write]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>