Fix LoongArch test-quantize-fns f16 and q4_0 failed when use LSX#16958

Merged
ggerganov merged 2 commits intoggml-org:masterfrom
MQ-mengqing:a_fix
Nov 3, 2025
Conversation

@MQ-mengqing (Contributor) commented Nov 3, 2025

The LoongArch code mistakenly used {__lasx_xv,__lsx_v}replgr2vr_w(), which operate only on integers. When passed a float, they round it to an integer, causing errors.

@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Nov 3, 2025
HmnSn commented Nov 3, 2025

I tested it on a 3A6000 with -DGGML_LSX=ON -DGGML_LASX=OFF and with -DGGML_LASX=ON. It works well!

gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Nov 3, 2025
* origin/master: (169 commits)
opencl: support imrope (ggml-org#16914)
fix: Viewing multiple PDF attachments (ggml-org#16974)
model-conversion : pass config to from_pretrained (ggml-org#16963)
server : add props.model_alias (ggml-org#16943)
ggml: CUDA: add head size 72 for flash-attn (ggml-org#16962)
mtmd: add --image-min/max-tokens (ggml-org#16921)
mtmd: pad mask for qwen2.5vl (ggml-org#16954)
ggml : LoongArch fixes (ggml-org#16958)
sync: minja (glm 4.6 & minmax m2 templates) (ggml-org#16949)
SYCL: optimized repeat_back kernel (3× fewer asm instructions, 2× faster) (ggml-org#16869)
feat(webui): improve LaTeX rendering with currency detection (ggml-org#16508)
test-backend-ops : fix segfault in moe-expert-reduce test in support mode and coverage (ggml-org#16936)
ci : disable failing riscv cross build (ggml-org#16952)
model: add Janus Pro for image understanding (ggml-org#16906)
clip : use FA (ggml-org#16837)
server : support unified cache across slots (ggml-org#16736)
common : move gpt-oss reasoning processing to init params (ggml-org#16937)
docs: remove llama_sampler_accept reference in sampling sample usage (ggml-org#16920)
CUDA: add FLOOR, CEIL, ROUND, TRUNC unary ops (ggml-org#16917)
devops: fix failing s390x docker build (ggml-org#16918)
...
@MQ-mengqing MQ-mengqing deleted the a_fix branch November 4, 2025 00:42
Anico2 added a commit to Anico2/llama.cpp that referenced this pull request Jan 15, 2026
* Fix test-quantize-fns f16 and q4_0 failed when use LSX

* Fix LoongArch set float intrinsic when use LSX/LASX
blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
* Fix test-quantize-fns f16 and q4_0 failed when use LSX

* Fix LoongArch set float intrinsic when use LSX/LASX

Labels

ggml changes relating to the ggml tensor library for machine learning

3 participants