grammar: optimize big grammar matching by using left-factorization / optimize grammar cloning by pwilkin · Pull Request #20961 · ggml-org/llama.cpp

pwilkin · 2026-03-24T19:34:01Z

Overview

Reduce the complexity of parsing huge grammars by left-factoring rules, and refactor cloning to use pointer offsets so that the four-level loop is no longer needed.
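To illustrate the cloning part, here is a minimal sketch of remapping stack pointers by offset instead of scanning every element of every rule for each pointer. It is not the code from this PR: `Element`, `ToyGrammar`, and `clone_grammar` are hypothetical names, and the lookup via a `std::map` of rule start addresses is just one assumed way to find the owning rule.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the grammar structures.
struct Element { int type; int value; };
using Rule  = std::vector<Element>;
using Stack = std::vector<const Element *>;

struct ToyGrammar {
    std::vector<Rule>  rules;
    std::vector<Stack> stacks; // every pointer refers to an element inside `rules`
};

static ToyGrammar clone_grammar(const ToyGrammar & src) {
    ToyGrammar dst;
    dst.rules = src.rules; // deep copy, so element addresses change

    // Index the start address of each non-empty source rule.
    // std::map orders keys with std::less, which is well-defined for pointers.
    std::map<const Element *, std::size_t> starts;
    for (std::size_t ir = 0; ir < src.rules.size(); ir++) {
        if (!src.rules[ir].empty()) {
            starts[src.rules[ir].data()] = ir;
        }
    }

    dst.stacks.reserve(src.stacks.size());
    for (const Stack & stack : src.stacks) {
        Stack out;
        out.reserve(stack.size());
        for (const Element * p : stack) {
            // The owning rule is the one with the greatest start address <= p.
            auto it = starts.upper_bound(p);
            assert(it != starts.begin());
            const std::size_t ir  = std::prev(it)->second;
            const std::size_t off = static_cast<std::size_t>(p - src.rules[ir].data());
            assert(off < src.rules[ir].size());
            // Same rule index, same offset, but inside the copied rules.
            out.push_back(dst.rules[ir].data() + off);
        }
        dst.stacks.push_back(std::move(out));
    }
    return dst;
}
```

With this approach each pointer costs one ordered lookup plus a subtraction, rather than a scan over all rules and all of their elements.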

Additional information

Pre-factorizes grammar branches to reduce the number of alternate routes. On big and ambiguous grammars, the speed gain is up to 50x. Adds a performance test for grammar parsing.
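For readers unfamiliar with left-factorization, the following is a toy sketch of the idea on a symbolic grammar. It is not the transformation function from this PR: the `Seq`/`Grammar` representation, the `left_factor_rule` name, and the `_lf` helper-rule suffix are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Seq     = std::vector<std::string>;                 // one alternative: a sequence of symbols
using Grammar = std::map<std::string, std::vector<Seq>>;  // rule name -> alternatives

// Length of the longest prefix shared by every sequence in `alts`.
static std::size_t common_prefix_len(const std::vector<Seq> & alts) {
    assert(!alts.empty());
    std::size_t n = alts[0].size();
    for (const Seq & a : alts) {
        std::size_t i = 0;
        while (i < n && i < a.size() && a[i] == alts[0][i]) {
            i++;
        }
        n = i;
    }
    return n;
}

// Left-factor a single rule: alternatives that begin with the same symbol are
// replaced by one alternative (shared prefix + helper rule), and the helper
// rule holds the differing tails. Alternatives end up ordered by first symbol.
static void left_factor_rule(Grammar & g, const std::string & name) {
    std::map<std::string, std::vector<Seq>> groups; // first symbol -> alternatives
    std::vector<Seq> result;
    for (const Seq & alt : g[name]) {
        if (alt.empty()) {
            result.push_back(alt); // an empty alternative has nothing to factor
        } else {
            groups[alt[0]].push_back(alt);
        }
    }
    int counter = 0;
    for (const auto & kv : groups) {
        const std::vector<Seq> & alts = kv.second;
        if (alts.size() == 1) {
            result.push_back(alts[0]); // unique first symbol: keep as is
            continue;
        }
        const std::size_t k      = common_prefix_len(alts);
        const std::string helper = name + "_lf" + std::to_string(counter++);
        Seq head(alts[0].begin(), alts[0].begin() + k);
        head.push_back(helper);
        result.push_back(head);
        std::vector<Seq> tails;
        for (const Seq & a : alts) {
            tails.emplace_back(a.begin() + k, a.end());
        }
        g[helper] = tails;
    }
    g[name] = result;
}
```

Under these assumptions, `root ::= "a" "b" "c" | "a" "b" "d" | "e"` becomes `root ::= "a" "b" root_lf0 | "e"` with `root_lf0 ::= "c" | "d"`, so the matcher follows the shared `"a" "b"` prefix once instead of tracking it along two parallel routes.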

Hopefully helps alleviate cases such as #20879, to be paired with a PR parametrizing MAX_REPETITION_THRESHOLD.

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: YES, told Claude to try various optimization passes at grammar, picked the ones with minimum code change + maximum effect.

aldehir · 2026-03-24T20:58:48Z

~~This is a bit much.~~ If we want to do this, we should parse to an AST and left-factor on that instead.

Then again, that would take more lines but would offer more opportunities to optimize the AST.

I'll defer to @ggerganov if he is OK with this PR. I can run a few test scenarios to check for regressions.

Edit: On second look, it's probably close to the minimal implementation for left factoring. However, I do prefer these optimizations be done against an AST. In general, I think many of the outstanding security issues related to grammars can be handled by analyzing an AST.

pwilkin · 2026-03-24T21:04:54Z

But why? This is a 120-line-of-code optimization, mostly enclosed in one transformation function, with a huge performance gain. Sure, we can go full AST mode, but that would mean a complete rewrite of the grammar engine and meanwhile, people are having real problems with grammar efficiency and OpenClaw (see the attached issue). I don't see a problem with reworking the grammar overall, but I also don't see why we should take the "all-or-nothing" approach.

aldehir · 2026-03-24T21:08:10Z

It's not a rewrite; it is changing the parser to emit an AST and then visiting that AST to emit the PDA rules.

Like I said, I can offer testing for regressions if this is desirable as-is.

pwilkin · 2026-03-24T21:09:47Z

Sounds like a rewrite to me but if you want to do it then I don't see why not :) but let's see what @ggerganov has to say.

TangLaoya · 2026-03-25T09:19:01Z

Dear @pwilkin ,

I noticed your update. I rebuilt b8508 with your modified source file and ran llama-server again, but the grammar error is still displayed.

Thanks,
Tang Laoya

grammar: optimize big grammar matching by using left-factorization

d12c8f3

pwilkin requested a review from ggerganov as a code owner March 24, 2026 19:34

github-actions Bot added the testing Everything test related label Mar 24, 2026

pwilkin mentioned this pull request Mar 25, 2026

grammar: increase MAX_REPETITION_THRESHOLD + make it configurable via envvar #21003

Open

pwilkin wants to merge 1 commit into ggml-org:master from pwilkin:grammar-optimize
