convert: text-only support for GLM-4.1V-9B-Thinking by jacekpoplawski · Pull Request #14823 · ggml-org/llama.cpp

jacekpoplawski · 2025-07-22T23:14:14Z

This is my first attempt to contribute to llama.cpp.

I used Transformers to compare the layers with those of GLM-4-9B-0414; the text structure appears identical.
The config for GLM-4.1V-9B-Thinking is missing the head_dim field.
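
A minimal sketch of how a converter could cope with the missing head_dim field, not the actual diff from this PR. The function name `rope_dimension_count` is hypothetical, and the config keys `hidden_size`, `num_attention_heads`, and `partial_rotary_factor` (with a default of 0.5) are assumptions about the model's config.json.

```python
def rope_dimension_count(hparams: dict, default_rotary_factor: float = 0.5) -> int:
    """Return the number of rotary dimensions for a model config.

    Hypothetical helper: if "head_dim" is absent, derive it from the
    hidden size and the number of attention heads.
    """
    head_dim = hparams.get("head_dim")
    if head_dim is None:
        # Assumption: the hidden size is split evenly across attention heads.
        head_dim = hparams["hidden_size"] // hparams["num_attention_heads"]
    # Assumption: only a fraction of each head is rotated (partial rotary).
    factor = hparams.get("partial_rotary_factor", default_rotary_factor)
    return int(head_dim * factor)
```

With illustrative values `{"hidden_size": 4096, "num_attention_heads": 32}` the derived head size is 128, so the helper returns 64 under the assumed factor of 0.5.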


llama-cli -ngl 99 -n 1024 -m /mnt/models3/git/GLM-4.1V-9B-Thinking/GLM-9B-4.1V-Thinking-F16.gguf -sys "speak in english" 2>/dev/null

speak in english
> write one sentence about llama.cpp
<think>Got it, let's think about what to write about llama.cpp. It's a popular open-source project for running large language models efficiently, maybe on GPUs or specialized hardware. So a sentence
(...)

convert_hf_to_gguf.py

* use language_model part only, ignore visual layers
* fix rope_dim calculation
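
The first item above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: the helper `text_only_tensor_name` is hypothetical, and the prefixes `model.visual.` and `model.language_model.` are assumed names for the vision and text parts of the checkpoint.

```python
from typing import Optional

# Assumed checkpoint prefixes; verify against the actual tensor names.
VISUAL_PREFIX = "model.visual."
LANGUAGE_PREFIX = "model.language_model."


def text_only_tensor_name(name: str) -> Optional[str]:
    """Map a checkpoint tensor name to its text-only equivalent.

    Returns None for tensors of the visual part, which a text-only
    conversion skips.
    """
    if name.startswith(VISUAL_PREFIX):
        return None  # ignore visual layers
    if name.startswith(LANGUAGE_PREFIX):
        # Drop the "language_model." component so the name matches the
        # layout of a text-only model.
        return "model." + name[len(LANGUAGE_PREFIX):]
    return name  # other tensors keep their names
```

A conversion loop would call this for every tensor and skip those for which it returns None.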

ddpasa · 2025-07-24T09:24:33Z

Please do the mmproj next! This VLM is supposed to be really good.

* origin/master:
  docs : update HOWTO-add-model.md for ModelBase and new model classes (ggml-org#14874)
  ggml : remove invalid portPos specifiers from dot files (ggml-org#14838)
  context : restore preemptive sched reset when LLAMA_SET_ROWS=0 (ggml-org#14870)
  mtmd : fix 32-bit narrowing issue in export-lora and mtmd clip (ggml-org#14503)
  rpc : check for null buffers in get/set/copy tensor endpoints (ggml-org#14868)
  sched : fix multiple evaluations of the same graph with pipeline parallelism (ggml-org#14855)
  musa: upgrade musa sdk to rc4.2.0 (ggml-org#14498)
  sync : ggml
  cmake : fix usage issues (ggml/1257)
  ggml-cpu : remove stdlib include from repack.cpp (ggml/1276)
  context : perform output reorder lazily upon access after sync (ggml-org#14853)
  chat : fix kimi-k2 chat template (ggml-org#14852)
  sycl: fixed semantics of block offset calculation (ggml-org#14814)
  llama : fix MiniCPM inference after Granite Four changes (ggml-org#14850)
  docs: add libcurl-dev install hint for Linux distros (ggml-org#14801)
  metal : fix fusion across different encoders (ggml-org#14849)
  sycl: fix undefined variable in work group size check (ggml-org#14843)
  convert : text-only support for GLM-4.1V-9B-Thinking (ggml-org#14823)
  CUDA: fix overflow in FA, tune performance (ggml-org#14840)
  CUDA: fix compilation with GGML_CUDA_F16 (ggml-org#14837)

danielhanchen · 2025-07-25T21:44:20Z

I made some quants! https://huggingface.co/unsloth/GLM-4.1V-9B-Thinking-GGUF

github-actions bot added the "python" (python script changes) label Jul 22, 2025

jacekpoplawski changed the title ~~convert: text-only support for GLM-4.1V-9B-Thinking (#14495)~~ convert: text-only support for GLM-4.1V-9B-Thinking Jul 22, 2025

CISC reviewed Jul 23, 2025

convert_hf_to_gguf.py: two review comments, both outdated and resolved

convert: text-only support for GLM-4.1V-9B-Thinking (ggml-org#14495)

d959644

* use language_model part only, ignore visual layers
* fix rope_dim calculation

jacekpoplawski force-pushed the glm4_thinking_support branch from ad66a8f to d959644 Compare July 23, 2025 20:01

CISC approved these changes Jul 23, 2025

CISC merged commit a12363b into ggml-org:master Jul 23, 2025
5 checks passed

rujialiu mentioned this pull request Jul 24, 2025

Feature Request: Support GLM-4.1V-9B-Thinking #14495

Closed

4 tasks

taronaeo pushed a commit to taronaeo/llama.cpp-s390x that referenced this pull request Jul 25, 2025

convert : text-only support for GLM-4.1V-9B-Thinking (ggml-org#14823)

bd060d6

* use language_model part only, ignore visual layers
* fix rope_dim calculation

blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026

convert : text-only support for GLM-4.1V-9B-Thinking (#14823)

dca6cd9

* use language_model part only, ignore visual layers
* fix rope_dim calculation

CISC merged 1 commit into ggml-org:master from jacekpoplawski:glm4_thinking_support

4 participants
