opencl: add MoE support for q4_k, q5_k, q6_k on Adreno by shaofeiqi · Pull Request #23303 · ggml-org/llama.cpp

shaofeiqi · 2026-05-18T23:11:44Z

Overview

Add Q4_K, Q5_K and Q6_K MoE OpenCL support for Adreno.

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: No

* opencl: add q4_k moe support
* opencl: add q5_k moe support
* opencl: add q6_k moe support
* opencl: adjust format

---------

Co-authored-by: Li He <lih@qti.qualcomm.com>

* upstream/HEAD: (25 commits)
  metal : optimize pad + cpy (ggml-org#23354)
  snapdragon: update toolchain to v0.6 (ggml-org#23369)
  ggml-cuda: tune RDNA3 Q6_K MMVQ nwarps (ggml-org#23349)
  opencl: add MoE support for q4_k, q5_k, q6_k on Adreno (ggml-org#23303)
  hexagon: add MROPE and IMROPE support in HTP rope op (ggml-org#23317)
  refactor: Chat Screen UI rendering (ggml-org#23333)
  github: mention --log-file in issue templates (ggml-org#23277)
  common: fix --help for --verbosity (ggml-org#23278)
  common: fix --fit verbosity with --verbosity 4 (ggml-org#23282)
  convert : update mtp related help (ggml-org#23334)
  hexagon: enable support for NORM op (ggml-org#23319)
  model : clarify MTP layer comment in qwen35.cpp [no ci] (ggml-org#23338)
  llama : MTP clean-up (ggml-org#23269)
  ui: Bump packages + address build warnings (ggml-org#23300)
  ci : install libssl-dev (ggml-org#23325)
  ci : install server kleidiai runner dependencies (ggml-org#23259)
  server-context: guarantee there is at least 1 token to decode (ggml-org#23280)
  server : print graphs reused in slot timings (ggml-org#23279)
  save-load-state : refactor tests and improve readability (ggml-org#23196)
  llama-eval : add per-task summary stats (ggml-org#23151)
  ...

shaofeiqi and others added 4 commits May 13, 2026 17:22

opencl: add q4_k moe support

955e7cc

opencl: add q5_k moe support

a753877

opencl: add q6_k moe support

bb43bc0

opencl: adjust format

0cc8ce7

shaofeiqi requested a review from a team as a code owner May 18, 2026 23:11

github-actions bot added the ggml and OpenCL labels May 18, 2026

lhez approved these changes May 19, 2026

max-krasnyansky approved these changes May 19, 2026

lhez merged commit b28a2f3 into ggml-org:master May 19, 2026
47 of 58 checks passed

a-ghorbani mentioned this pull request May 24, 2026

chore(deps): upgrade llama.rn to 0.12.3 a-ghorbani/pocketpal-ai#740

Merged

lhez merged 4 commits into ggml-org:master from qualcomm:sq/opencl-q4_k-q5_k-q6_k-moe
