cuda : extend GGML_OP_PAD to work with non-cont src0 by ggerganov · Pull Request #19429 · ggml-org/llama.cpp

ggerganov · 2026-02-08T11:25:02Z

Extend CUDA support
Remove redundant assert in CPU implementation
Add permuted PAD tests
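
As a rough illustration of what padding a non-contiguous source involves, here is a minimal CPU-side sketch in C++. It is not the code from this PR. `View4` and `pad_strided` are hypothetical names, and strides are counted in elements for simplicity, whereas ggml stores byte strides in `nb`. The point is that each destination index is mapped back to the source through the source's strides, so a permuted view can be padded without first being copied into contiguous memory.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 4-D strided view over float data (not the ggml_tensor struct).
struct View4 {
    const float * data;
    std::array<int64_t, 4> ne; // number of elements per dimension
    std::array<int64_t, 4> nb; // stride per dimension, in elements (assumption: ggml uses bytes)
};

// Zero-pad `src` with lp[d] leading and rp[d] trailing elements in dimension d.
// The result is a dense buffer with dimension 0 varying fastest.
std::vector<float> pad_strided(const View4 & src,
                               const std::array<int64_t, 4> & lp,
                               const std::array<int64_t, 4> & rp) {
    std::array<int64_t, 4> ne_dst;
    for (int d = 0; d < 4; ++d) {
        ne_dst[d] = lp[d] + src.ne[d] + rp[d];
    }

    std::vector<float> dst(static_cast<size_t>(ne_dst[0]*ne_dst[1]*ne_dst[2]*ne_dst[3]), 0.0f);

    size_t idx = 0; // linear index into the dense destination
    for (int64_t i3 = 0; i3 < ne_dst[3]; ++i3) {
        for (int64_t i2 = 0; i2 < ne_dst[2]; ++i2) {
            for (int64_t i1 = 0; i1 < ne_dst[1]; ++i1) {
                for (int64_t i0 = 0; i0 < ne_dst[0]; ++i0, ++idx) {
                    const std::array<int64_t, 4> i = {i0, i1, i2, i3};
                    bool    inside = true;
                    int64_t off    = 0;
                    for (int d = 0; d < 4; ++d) {
                        const int64_t s = i[d] - lp[d]; // source coordinate in dimension d
                        if (s < 0 || s >= src.ne[d]) {
                            inside = false;             // this element lies in the padding
                            break;
                        }
                        off += s*src.nb[d];             // follow the source strides, not a dense layout
                    }
                    if (inside) {
                        dst[idx] = src.data[off];
                    }
                }
            }
        }
    }
    return dst;
}
```

For example, take the six values 0 to 5 stored as a row-major 2x3 matrix and view them transposed (`ne = {2, 3, 1, 1}`, `nb = {3, 1, 6, 6}`). Padding one trailing element in dimension 0 yields `0 3 0 1 4 0 2 5 0`, with no intermediate contiguous copy of the transposed data.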

JohannesGaessler · 2026-02-09T17:15:13Z

Does this fix the issue described in #19170 (comment) ?

ggerganov · 2026-02-10T06:06:47Z

> Does this fix the issue described in #19170 (comment) ?

No, this is needed to optimize out some of the copies in the Qwen3 Next graphs.

* cuda : extend GGML_OP_PAD to work with non-cont src0
* tests : add permuted pad

ggerganov requested a review from JohannesGaessler February 8, 2026 11:25

github-actions bot added the Nvidia GPU (issues specific to Nvidia GPUs) and ggml (changes relating to the ggml tensor library for machine learning) labels Feb 8, 2026

ggerganov force-pushed the gg/cuda-pad-non-cont branch from 23aa667 to b3507fa Compare February 8, 2026 15:56

CISC mentioned this pull request Feb 9, 2026

[Model] Qwen3.5 support w/o vision, WIP #19456

Closed

ggerganov added 2 commits February 9, 2026 15:43

cuda : extend GGML_OP_PAD to work with non-cont src0

b7c7c95

tests : add permuted pad

084b3d8

ggerganov force-pushed the gg/cuda-pad-non-cont branch from b3507fa to 084b3d8 Compare February 9, 2026 13:58

JohannesGaessler approved these changes Feb 9, 2026

github-actions bot added the testing (everything test related) label Feb 9, 2026

AesSedai mentioned this pull request Feb 9, 2026

Add Kimi-K2.5 support #19170

Merged

ggerganov merged commit a0d5855 into master Feb 10, 2026
68 of 79 checks passed

ggerganov deleted the gg/cuda-pad-non-cont branch February 10, 2026 06:07

ggerganov mentioned this pull request Feb 11, 2026

models : optimizing qwen3next graph #19375

Merged

3 tasks

liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026

cuda : extend GGML_OP_PAD to work with non-cont src0 (ggml-org#19429)

d858b4b

* cuda : extend GGML_OP_PAD to work with non-cont src0
* tests : add permuted pad

bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 2, 2026

cuda : extend GGML_OP_PAD to work with non-cont src0 (ggml-org#19429)

7d43b4d

* cuda : extend GGML_OP_PAD to work with non-cont src0
* tests : add permuted pad

ggerganov merged 2 commits into master from gg/cuda-pad-non-cont
