vulkan: Reduce temporary memory usage for TOP_K by jeffbolznv · Pull Request #17623 · ggml-org/llama.cpp

jeffbolznv · 2025-11-30T15:52:47Z

- Compute the row size for the temp buffer based on the output of the first pass.
- Update the shader addressing math to use the output row size.
- Pass the output row size as "ncols_output"; the value that used to be passed as "ncols_output" is now "k".

For the common case of K=40 and src0=(200000,1,1,1), this reduces the temporary buffer from about 3.2MB to 500KB.
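The arithmetic behind those figures can be sketched roughly as follows. This is a back-of-the-envelope model, not the code from this PR: the 8-byte candidate size, the two temporary buffers, the 256-element first-pass block, and all function names are assumptions chosen for illustration. They happen to land close to the numbers quoted above.

```python
# Hypothetical sizing model; none of these constants are taken from the PR.
PAIR_BYTES = 8         # assumed: one 32-bit value plus one 32-bit index per candidate
NUM_TEMP_BUFFERS = 2   # assumed: two temp buffers alternated between passes
BLOCK_ELEMS = 256      # assumed: input elements reduced by one first-pass workgroup


def temp_bytes_full_row(ncols: int, nrows: int = 1) -> int:
    """Old-style sizing: each temp row is as wide as the input row."""
    return ncols * nrows * PAIR_BYTES * NUM_TEMP_BUFFERS


def first_pass_row_size(ncols: int, k: int) -> int:
    """Row width after the first pass: each block keeps at most k candidates."""
    num_blocks = (ncols + BLOCK_ELEMS - 1) // BLOCK_ELEMS  # ceiling division
    return num_blocks * min(k, BLOCK_ELEMS)


def temp_bytes_output_row(ncols: int, k: int, nrows: int = 1) -> int:
    """New-style sizing: temp rows are only as wide as the first-pass output."""
    return first_pass_row_size(ncols, k) * nrows * PAIR_BYTES * NUM_TEMP_BUFFERS


def temp_offset(row: int, col: int, ncols_output: int) -> int:
    """Element offset into the temp buffer when rows use the output row size."""
    return row * ncols_output + col


if __name__ == "__main__":
    print(temp_bytes_full_row(200000))        # bytes with full-width temp rows
    print(temp_bytes_output_row(200000, 40))  # bytes with first-pass-width temp rows
```

Under these assumptions the temp row shrinks from 200000 candidates to 782 blocks of 40, which is why the addressing math has to use the output row size rather than the input column count.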


jeffbolznv requested a review from 0cc4m as a code owner November 30, 2025 15:52

github-actions bot added the Vulkan (issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels Nov 30, 2025

loci-dev mentioned this pull request Nov 30, 2025

UPSTREAM PR #17623: vulkan: Reduce temporary memory usage for TOP_K auroralabs-loci/llama.cpp#374

Open

jeffbolznv mentioned this pull request Dec 1, 2025

vulkan: fix top_k bug when there are ties in the input #17659

Merged

loci-dev mentioned this pull request Dec 1, 2025

UPSTREAM PR #17659: vulkan: fix top_k bug when there are ties in the input auroralabs-loci/llama.cpp#391

Open

0cc4m approved these changes Dec 2, 2025


0cc4m merged commit 61bde8e into ggml-org:master Dec 2, 2025
72 of 74 checks passed

gabe-l-hart mentioned this pull request Dec 10, 2025

feat: llama.cpp bump (17f7f4) for SSM performance improvements ollama/ollama#13408

Merged

0cc4m merged 1 commit into ggml-org:master from jeffbolznv:topk_memory