grammar: optimize big grammar matching by using left-factorization / optimize grammar cloning#20961
grammar: optimize big grammar matching by using left-factorization / optimize grammar cloning#20961pwilkin wants to merge 1 commit into
Conversation
|
Then again, that would take more lines but would offer more opportunities to optimize the AST. I'll defer to @ggerganov if he is OK with this PR. I can run a few test scenarios to check for regressions. Edit: On second look, it's probably close to the minimal implementation for left factoring. However, I do prefer these optimizations be done against an AST. In general, I think many of the outstanding security issues related to grammars can be handled by analyzing an AST. |
|
But why? This is a 120-line-of-code optimization, mostly enclosed in one transformation function, with a huge performance gain. Sure, we can go full AST mode, but that would mean a complete rewrite of the grammar engine and meanwhile, people are having real problems with grammar efficiency and OpenClaw (see the attached issue). I don't see a problem with reworking the grammar overall, but I also don't see why we should take the "all-or-nothing" approach. |
|
It's not a rewrite, it is changing the parser to emit an AST and then visit to emit the PDA rules. Like I said, I can offer testing for regressions if this is desirable as-is. |
|
Sounds like a rewrite to me but if you want to do it then I don't see why not :) but let's see what @ggerganov has to say. |
|
Dear @pwilkin , I noticed your update, and I tried to use b8508 along with your modified source file and rebuild the code, then run the llama-server again, the gramma error still displayed. Thanks, |
Overview
Reduce complexity of parsing for huge grammars by using left-factorization of rules + refactor cloning to use pointer offsets to avoid the four-level loop.
Additional information
Pre-factorizes grammar branches to reduce number of alternate routes. On big and ambiguous grammars, the speed gain is up to 50x. Uses Adds a performance test for grammar parsing.
Hopefully helps alleviate cases such as #20879, to be paired with a PR parametrizing
MAX_REPETITION_THRESHOLD.Requirements