
tool parser: add GigaChatV3/3.1 models support in PEG format#19931

Merged
pwilkin merged 1 commit into ggml-org:master from Mishusha:feature/gigachatv3-v3.1_PEG_parser_support on Mar 12, 2026
Conversation

@Mishusha
Contributor

I have recreated #17924 as this PR to get cleaner commits and avoid merge conflicts.

@github-actions github-actions bot added the testing Everything test related label Feb 26, 2026
@Mishusha Mishusha force-pushed the feature/gigachatv3-v3.1_PEG_parser_support branch from f774481 to 2d12cf9 Compare February 26, 2026 16:29
@Mishusha
Contributor Author

Mishusha commented Mar 6, 2026

@pwilkin @ggerganov

Could you please review this PR? Thanks)

@pwilkin
Contributor

pwilkin commented Mar 6, 2026

@Mishusha sorry, wanted to get the autoparser in first. Could you rebase on latest master? Nothing should change except the positioning of the functions and the detection code in chat.cpp.

@Mishusha
Contributor Author

@pwilkin Hello! I tried setting up an autoparser with the GigaChatv3 model.

The tests themselves pass, but the autoparser doesn't store the <|message_sep|>\n\n and <|role_sep|>\n tokens I need as separate preserved tokens; instead it keeps only their combined stripped prefix from the chat template, <|message_sep|>\n\nfunction call<|role_sep|>.

Therefore, when I run my server without the -sp flag, tool calls aren't parsed automatically, because the special tokens are discarded.

What's the best way to handle this: keep my own custom parser for GigaChatv3 (the one currently committed), or would you recommend something else?

P.S. For GigaChatv3.1 there is a single <|function_call|> token, so the autoparser has no such problem.
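The failure mode described above can be sketched in a few lines: if the server detokenizes without rendering special tokens (the effect of omitting -sp), the markers the parser anchors on never appear in the output text, so the tool-call payload can't be found. This is a simplified, hypothetical illustration of the problem, not llama.cpp's actual detokenizer or parser code:

```python
import re

def detokenize(tokens, render_special):
    # Hypothetical detokenizer: special tokens look like <|...|> and are
    # dropped when render_special is False (the behavior without -sp).
    out = []
    for t in tokens:
        if t.startswith("<|") and t.endswith("|>") and not render_special:
            continue
        out.append(t)
    return "".join(out)

def find_tool_call(text):
    # A parser anchored on the GigaChatV3 markers from the discussion.
    m = re.search(r"<\|message_sep\|>\n\nfunction call<\|role_sep\|>\n(.*)",
                  text, re.S)
    return m.group(1) if m else None

tokens = ["Answer", "<|message_sep|>", "\n\n", "function call",
          "<|role_sep|>", "\n", '{"name": "get_weather"}']

with_special = detokenize(tokens, render_special=True)
without_special = detokenize(tokens, render_special=False)

print(find_tool_call(with_special))     # the JSON payload is found
print(find_tool_call(without_special))  # None: the markers were stripped
```

This is also why a format whose sole marker is one dedicated token (GigaChatv3.1's <|function_call|>) is easier for an autoparser to preserve than one spread across separate role/message separator tokens.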

@pwilkin
Contributor

pwilkin commented Mar 10, 2026

@Mishusha Yeah, I checked it with the autoparser and determined that it needs a separate parser because of the role handling (similar to Functionary); that's why I suggested keeping the custom parser instead of just closing this as deprecated :) Sorry, I should have made myself clearer.

The parser can stay as it is since it's already PEG-based; you just have to move the detection code to the right place in chat.cpp and that's it.
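The detection step mentioned here amounts to probing the chat template source for markers distinctive of the format, checking more specific formats first. A hedged sketch of that idea (the function name, return values, and ordering are illustrative assumptions, not the actual chat.cpp code):

```python
def detect_chat_format(template_src: str) -> str:
    # Illustrative only: pick a chat format by looking for distinctive
    # marker tokens in the Jinja template source.
    if "<|function_call|>" in template_src:
        return "gigachat-v3.1"   # single dedicated tool-call token
    if "<|message_sep|>" in template_src and "<|role_sep|>" in template_src:
        return "gigachat-v3"     # separate role/message separator tokens
    return "generic"
```

Checking the more specific marker first keeps a template that happens to contain both kinds of tokens from being misclassified.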

@Mishusha Mishusha force-pushed the feature/gigachatv3-v3.1_PEG_parser_support branch from 2d12cf9 to 6ea01ad Compare March 11, 2026 01:00
@Mishusha
Contributor Author

@pwilkin Ok, thank you very much)
All the tests are passing now and everything looks correct during inference.
Please take a look)

@pwilkin pwilkin left a comment (Contributor)

Waiting for CI and then will merge.

@Mishusha
Contributor Author

@pwilkin Some checks failed, but they don't seem to be related to this PR.

@pwilkin
Contributor

pwilkin commented Mar 12, 2026

Yup.

@pwilkin pwilkin merged commit a8304b4 into ggml-org:master Mar 12, 2026
72 of 78 checks passed
ProgenyAlpha pushed a commit to ProgenyAlpha/llama.cpp that referenced this pull request Mar 12, 2026
Co-authored-by: Mishusha <pmv26021975@gmail.com>
tekintian added a commit to tekintian/llama.cpp that referenced this pull request Mar 12, 2026
* 'master' of github.com:ggml-org/llama.cpp: (33 commits)
  convert : better mtp check and fix return [no ci] (ggml-org#20419)
  vulkan: fix SSM_CONV PP scaling with large ubatch sizes (ggml-org#20379)
  New conversations now auto-select the first loaded model (ggml-org#20403)
  ggml-virtgpu: Fix some build commands (ggml-org#20341)
  metal : avoid divisions in bin kernel (ggml-org#20426)
  ci: Setup self-hosted CI for Intel Linux Vulkan backend (ggml-org#20154)
  vulkan: fix l2_norm epsilon handling (ggml-org#20350)
  vulkan: fix OOB check in flash_attn_mask_opt (ggml-org#20296)
  vulkan: Fix ErrorOutOfHostMemory on Intel GPU when loading large models with --no-mmap (ggml-org#20059)
  opencl: use larger workgroup size for get_rows (ggml-org#20316)
  opencl: add cumsum op (ggml-org#18981)
  hip: compile debug builds with -O2 on hip to avoid a compiler bug (ggml-org#20392)
  common/parser: add GigaChatV3/3.1 models support (ggml-org#19931)
  model : add support for Phi4ForCausalLMV (ggml-org#20168)
  graph : add optional scale parameter to build_lora_mm [no ci] (ggml-org#20427)
  common : fix --n-cpu-moe, --cpu-moe for models with fused gate + up (ggml-org#20416)
  ggml-webgpu: Add supports for `GGML_OP_REPEAT` (ggml-org#20230)
  llama : enable chunked fused GDN path (ggml-org#20340)
  llama : whitespace cleanup (ggml-org#20422)
  ggml : add NVFP4 quantization type support (ggml-org#19769)
  ...