opencl: use larger workgroup size for get_rows by lhez · Pull Request #20316 · ggml-org/llama.cpp

lhez · 2026-03-09T21:03:28Z

Choose the largest possible workgroup size for get_rows for better performance, plus a small refactor.
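For readers unfamiliar with the idea, here is a minimal sketch of one way a host program might pick a large workgroup size. It is not the code from this PR. The helper name `pick_local_size`, the power-of-two rounding, and the cap at the row length are all assumptions made for illustration. In real OpenCL host code the upper bound would typically come from `clGetKernelWorkGroupInfo` with `CL_KERNEL_WORK_GROUP_SIZE`; here it is passed in as a plain number so the sketch runs without an OpenCL device.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not from the PR): choose a local workgroup size for a
// kernel that handles one row per workgroup.
//   max_wg_size: the kernel's workgroup limit. Real host code would query it
//                with clGetKernelWorkGroupInfo(..., CL_KERNEL_WORK_GROUP_SIZE, ...).
//   row_len:     number of elements in one row.
// Returns the largest power of two that exceeds neither value (assumption:
// a power-of-two size is wanted), or 1 if either input is zero.
static size_t pick_local_size(size_t max_wg_size, size_t row_len) {
    if (max_wg_size == 0 || row_len == 0) {
        return 1;
    }
    const size_t limit = std::min(max_wg_size, row_len);
    size_t lws = 1;
    // Compare against limit / 2 so that doubling can never overflow.
    while (lws <= limit / 2) {
        lws *= 2;
    }
    return lws;
}
```

With a kernel limit of 1024 and rows of 4096 elements this returns 1024. With a limit of 256 and rows of 100 elements it returns 64, so no work-item is left without an element.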

lhez · 2026-03-12T04:36:14Z

The failing checks are unrelated to this change. Will merge shortly.

* 'master' of github.com:ggml-org/llama.cpp: (33 commits)
  convert : better mtp check and fix return [no ci] (ggml-org#20419)
  vulkan: fix SSM_CONV PP scaling with large ubatch sizes (ggml-org#20379)
  New conversations now auto-select the first loaded model (ggml-org#20403)
  ggml-virtgpu: Fix some build commands (ggml-org#20341)
  metal : avoid divisions in bin kernel (ggml-org#20426)
  ci: Setup self-hosted CI for Intel Linux Vulkan backend (ggml-org#20154)
  vulkan: fix l2_norm epsilon handling (ggml-org#20350)
  vulkan: fix OOB check in flash_attn_mask_opt (ggml-org#20296)
  vulkan: Fix ErrorOutOfHostMemory on Intel GPU when loading large models with --no-mmap (ggml-org#20059)
  opencl: use larger workgroup size for get_rows (ggml-org#20316)
  opencl: add cumsum op (ggml-org#18981)
  hip: compile debug builds with -O2 on hip to avoid a compiler bug (ggml-org#20392)
  common/parser: add GigaChatV3/3.1 models support (ggml-org#19931)
  model : add support for Phi4ForCausalLMV (ggml-org#20168)
  graph : add optional scale parameter to build_lora_mm [no ci] (ggml-org#20427)
  common : fix --n-cpu-moe, --cpu-moe for models with fused gate + up (ggml-org#20416)
  ggml-webgpu: Add supports for `GGML_OP_REPEAT` (ggml-org#20230)
  llama : enable chunked fused GDN path (ggml-org#20340)
  llama : whitespace cleanup (ggml-org#20422)
  ggml : add NVFP4 quantization type support (ggml-org#19769)
  ...

opencl: use larger workgroup size for get_rows

7bd2cbd

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and OpenCL (issues specific to the OpenCL backend) labels Mar 9, 2026

lhez marked this pull request as ready for review March 10, 2026 17:05

lhez requested a review from max-krasnyansky as a code owner March 10, 2026 17:05

max-krasnyansky approved these changes Mar 11, 2026

lhez merged commit 0516e04 into ggml-org:master Mar 12, 2026
145 of 149 checks passed

ProgenyAlpha pushed a commit to ProgenyAlpha/llama.cpp that referenced this pull request Mar 12, 2026

opencl: use larger workgroup size for get_rows (ggml-org#20316)

116f0e0

lhez merged 1 commit into ggml-org:master from qualcomm:lh/get-rows-wgsize
