metal : skip loading all-zero mask #19337

Merged
ggerganov merged 2 commits into master from gg/metal-fa-mask-zero-opt on Feb 6, 2026
ggerganov commented Feb 4, 2026

Same kind of optimization as in #19281: skip loading mask blocks that are all zero.

| Model | Test | t/s master | t/s gg/metal-fa-mask-zero-opt | Speedup |
| --- | --- | ---: | ---: | ---: |
| qwen3moe 30B.A3B Q8_0 | pp512 | 2177.84 | 2180.08 | 1.00 |
| qwen3moe 30B.A3B Q8_0 | pp8192 | 1731.35 | 1737.17 | 1.00 |
| qwen3moe 30B.A3B Q8_0 | pp16384 | 1232.67 | 1241.81 | 1.01 |
| qwen3moe 30B.A3B Q8_0 | pp32768 | 775.87 | 787.72 | 1.02 |
| qwen3moe 30B.A3B Q8_0 | pp65536 | 442.06 | 453.65 | 1.03 |
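
For readers unfamiliar with the idea, here is a minimal CPU-side sketch in C++ of what skipping all-zero mask blocks can look like. It is not the Metal kernel from this PR, whose code is not shown in this thread. The names `MaskBlock`, `make_causal_mask`, `classify_block` and `apply_mask_block` are hypothetical, and the sketch assumes an additive attention mask where `0` means "visible" and `-inf` means "masked". Each block of the mask is classified once, and the step that adds the mask to the attention scores reads the mask only for mixed blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical block classification; these names are illustrative, not from ggml.
enum class MaskBlock { AllMasked, AllZero, Mixed };

// Build an n x n additive causal mask: 0 where column <= row, -inf elsewhere.
std::vector<float> make_causal_mask(std::size_t n) {
    const float neg_inf = -std::numeric_limits<float>::infinity();
    std::vector<float> mask(n * n, neg_inf);
    for (std::size_t r = 0; r < n; ++r)
        for (std::size_t c = 0; c <= r; ++c)
            mask[r * n + c] = 0.0f;
    return mask;
}

// Classify the block covering rows [r0, r0 + br) and columns [c0, c0 + bc).
MaskBlock classify_block(const std::vector<float>& mask, std::size_t n_cols,
                         std::size_t r0, std::size_t br,
                         std::size_t c0, std::size_t bc) {
    assert(br > 0 && bc > 0);
    assert(c0 + bc <= n_cols && (r0 + br) * n_cols <= mask.size());
    const float neg_inf = -std::numeric_limits<float>::infinity();
    bool all_zero = true;
    bool all_masked = true;
    for (std::size_t r = r0; r < r0 + br; ++r) {
        for (std::size_t c = c0; c < c0 + bc; ++c) {
            const float m = mask[r * n_cols + c];
            if (m != 0.0f)    all_zero = false;
            if (m != neg_inf) all_masked = false;
        }
    }
    if (all_masked) return MaskBlock::AllMasked;
    if (all_zero)   return MaskBlock::AllZero;
    return MaskBlock::Mixed;
}

// Apply the mask to one block of scores and return how many mask
// elements were actually read. All-zero blocks read nothing because
// adding zero leaves the scores unchanged; fully masked blocks are
// set to -inf without touching the mask.
std::size_t apply_mask_block(std::vector<float>& scores,
                             const std::vector<float>& mask, std::size_t n_cols,
                             std::size_t r0, std::size_t br,
                             std::size_t c0, std::size_t bc, MaskBlock kind) {
    if (kind == MaskBlock::AllZero) return 0;
    const float neg_inf = -std::numeric_limits<float>::infinity();
    std::size_t loads = 0;
    for (std::size_t r = r0; r < r0 + br; ++r) {
        for (std::size_t c = c0; c < c0 + bc; ++c) {
            const std::size_t i = r * n_cols + c;
            if (kind == MaskBlock::AllMasked) {
                scores[i] = neg_inf;
            } else {
                scores[i] += mask[i];
                ++loads;
            }
        }
    }
    return loads;
}
```

With a causal mask, every block strictly below the diagonal is all zero, and the share of such blocks grows with the sequence length. Under that assumption, the saved mask loads would matter more at long contexts, which is consistent with the speedup in the table rising from 1.00 at pp512 to 1.03 at pp65536.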

github-actions bot added the ggml and Apple Metal labels Feb 4, 2026
ggerganov merged commit 7fcf1ef into master Feb 6, 2026
71 of 73 checks passed
ggerganov deleted the gg/metal-fa-mask-zero-opt branch February 6, 2026 07:25
liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
* metal : skip loading all-zero mask

* cont : minor
bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 2, 2026
* metal : skip loading all-zero mask

* cont : minor