metal : improve concurrency #19555

Merged
ggerganov merged 1 commit into master from gg/metal-concurrency-opt on Feb 13, 2026

@ggerganov
Member

Continuation of #15929

  • Allow more ops to be reordered during graph optimization
  • Look further ahead in the graph when searching for ops to reorder
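
To illustrate why a wider look-ahead can help, here is a minimal Python sketch of greedy reordering for concurrency. It is not the ggml Metal implementation: the `Op` type, the `conflicts` and `reorder` helpers, the `window` parameter, and the tensor names are all hypothetical. The assumption is that ops in the same group can be dispatched concurrently and that each group boundary costs one barrier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical graph node: a name plus the buffers it reads and writes."""
    name: str
    reads: frozenset
    writes: frozenset


def op(name, reads, writes):
    return Op(name, frozenset(reads), frozenset(writes))


def conflicts(a, b):
    """True if a and b share a read-after-write, write-after-read,
    or write-after-write hazard, in either order."""
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & a.reads)


def reorder(ops, window):
    """Greedily pack ops into groups that could run concurrently.

    For each group, inspect at most `window` following ops and pull one
    forward only if it conflicts neither with the group nor with any op
    it would jump over (so the original dependencies are preserved).
    """
    pending = list(ops)
    groups = []
    while pending:
        group = [pending.pop(0)]
        skipped = []
        i = 0
        looked = 0
        while i < len(pending) and looked < window:
            cand = pending[i]
            looked += 1
            blocked = any(conflicts(cand, g) for g in group) or \
                      any(conflicts(cand, s) for s in skipped)
            if blocked:
                skipped.append(cand)
                i += 1
            else:
                group.append(pending.pop(i))
        groups.append(group)
    return groups


# Illustrative attention-style graph (names are made up for this sketch).
ops = [
    op("q",      {"x", "wq"}, {"q"}),
    op("q_rope", {"q"},       {"q_rope"}),
    op("k",      {"x", "wk"}, {"k"}),
    op("k_rope", {"k"},       {"k_rope"}),
    op("v",      {"x", "wv"}, {"v"}),
]

for window in (1, 4):
    groups = reorder(ops, window)
    print(window, [[o.name for o in g] for g in groups])
```

In this toy graph, a window of 1 leaves three groups, while a window of 4 pulls `k` and `v` up next to `q` and leaves two groups, which means one fewer barrier under the stated assumption.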

| Model | Test | t/s master | t/s gg/metal-concurrency-opt | Speedup |
| --- | --- | ---: | ---: | ---: |
| deepseek2 30B.A3B Q8_0 | pp512 | 1358.98 | 1406.33 | 1.03 |
| deepseek2 30B.A3B Q8_0 | tg32 | 64.34 | 65.92 | 1.02 |
| gemma3 1B Q4_0 | pp512 | 11154.61 | 10889.65 | 0.98 |
| gemma3 1B Q4_0 | tg32 | 231.27 | 239.56 | 1.04 |
| gemma3 4B Q4_0 | pp512 | 2816.49 | 2794.43 | 0.99 |
| gemma3 4B Q4_0 | tg32 | 141.79 | 144.30 | 1.02 |
| gpt-oss 120B MXFP4 MoE | pp512 | 1221.94 | 1215.31 | 0.99 |
| gpt-oss 120B MXFP4 MoE | tg32 | 89.09 | 88.71 | 1.00 |
| gpt-oss 20B MXFP4 MoE | pp512 | 2415.52 | 2422.57 | 1.00 |
| gpt-oss 20B MXFP4 MoE | tg32 | 133.24 | 132.79 | 1.00 |
| qwen3 0.6B Q4_0 | pp512 | 14416.27 | 14747.46 | 1.02 |
| qwen3 0.6B Q4_0 | tg32 | 345.90 | 342.63 | 0.99 |
| qwen3 0.6B Q8_0 | pp512 | 14326.00 | 14602.23 | 1.02 |
| qwen3 0.6B Q8_0 | tg32 | 281.99 | 279.62 | 0.99 |
| qwen3 4B Q8_0 | pp512 | 2479.97 | 2508.31 | 1.01 |
| qwen3 4B Q8_0 | tg32 | 114.66 | 114.30 | 1.00 |
| qwen3moe 30B.A3B Q4_0 | pp512 | 2158.07 | 2171.44 | 1.01 |
| qwen3moe 30B.A3B Q4_0 | tg32 | 110.83 | 113.80 | 1.03 |
| qwen3next 80B.A3B Q4_K_M | pp512 | 842.82 | 850.77 | 1.01 |
| qwen3next 80B.A3B Q4_K_M | tg32 | 37.37 | 41.46 | 1.11 |
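
For reading the Speedup column: the values appear to be the branch throughput divided by the master throughput, rounded to two decimals. A small Python check of that reading against three rows copied from the results above (the `speedup` helper and the `rows` list are illustrative, not part of any benchmark tool):

```python
def speedup(master_tps, branch_tps):
    """Ratio of branch to master throughput, rounded like the table."""
    return round(branch_tps / master_tps, 2)


# (model, test, t/s master, t/s gg/metal-concurrency-opt), copied from the results
rows = [
    ("deepseek2 30B.A3B Q8_0",   "pp512", 1358.98,  1406.33),
    ("gemma3 1B Q4_0",           "pp512", 11154.61, 10889.65),
    ("qwen3next 80B.A3B Q4_K_M", "tg32",  37.37,    41.46),
]

for model, test, master, branch in rows:
    print(f"{model} {test}: {speedup(master, branch):.2f}")
```

Values above 1.00 mean the branch is faster than master for that model and test; values below 1.00 mean it is slower.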

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)) labels on Feb 12, 2026
ggerganov merged commit 0644bae into master on Feb 13, 2026
75 of 78 checks passed
ggerganov deleted the gg/metal-concurrency-opt branch on February 13, 2026 at 05:36
ronaldmannak pushed a commit to PicoMLX/llama.cpp that referenced this pull request Feb 16, 2026
liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 2, 2026
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Mar 3, 2026