cuda : extend GGML_OP_PAD to work with non-cont src0 #19429

Merged: ggerganov merged 2 commits into master from gg/cuda-pad-non-cont on Feb 10, 2026

@ggerganov (Member) commented Feb 8, 2026

  • Extend the CUDA implementation of GGML_OP_PAD to support non-contiguous src0
  • Remove a redundant assert in the CPU implementation
  • Add permuted PAD tests
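
To illustrate what padding a non-contiguous source involves, here is a minimal CPU-side sketch in C++. It is not the code from this PR and not a ggml function: `pad_strided` and its parameters are hypothetical names chosen for illustration. It assumes float data, zero padding at the end of each dimension, and ggml-style conventions (dimension 0 varies fastest, strides given in bytes). The point is that every source element is located through its byte strides, so a permuted view can be padded without first copying it into a contiguous buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a ggml function): zero-pad a 4-D float tensor at the
// end of each dimension. The source is read through per-dimension byte strides
// (nb_src), so it may be a permuted, non-contiguous view. The result is contiguous.
std::vector<float> pad_strided(const void * src,
                               const int64_t ne_src[4],
                               const size_t  nb_src[4],
                               const int64_t ne_dst[4]) {
    // The destination must be at least as large as the source in every dimension.
    for (int d = 0; d < 4; ++d) {
        assert(ne_src[d] >= 0 && ne_src[d] <= ne_dst[d]);
    }

    // Start from an all-zero destination; only the source region is overwritten.
    const size_t n_dst = static_cast<size_t>(ne_dst[0] * ne_dst[1] * ne_dst[2] * ne_dst[3]);
    std::vector<float> dst(n_dst, 0.0f);

    const char * base = static_cast<const char *>(src);
    for (int64_t i3 = 0; i3 < ne_src[3]; ++i3) {
        for (int64_t i2 = 0; i2 < ne_src[2]; ++i2) {
            for (int64_t i1 = 0; i1 < ne_src[1]; ++i1) {
                for (int64_t i0 = 0; i0 < ne_src[0]; ++i0) {
                    // Strided read: do not assume nb_src[0] == sizeof(float).
                    const size_t off = static_cast<size_t>(i0) * nb_src[0]
                                     + static_cast<size_t>(i1) * nb_src[1]
                                     + static_cast<size_t>(i2) * nb_src[2]
                                     + static_cast<size_t>(i3) * nb_src[3];
                    const float v = *reinterpret_cast<const float *>(base + off);

                    // Contiguous write, dimension 0 varies fastest.
                    const size_t idx = static_cast<size_t>(
                        ((i3 * ne_dst[2] + i2) * ne_dst[1] + i1) * ne_dst[0] + i0);
                    dst[idx] = v;
                }
            }
        }
    }
    return dst;
}
```

For example, a contiguous 3x2 float buffer viewed as its transpose has shape {2, 3, 1, 1} and byte strides {3*sizeof(float), sizeof(float), 6*sizeof(float), 6*sizeof(float)}. Passing that view with a destination shape of {3, 4, 1, 1} yields the transposed values followed by one zero per row and one extra all-zero row. A GPU kernel would typically assign one thread per destination element instead of looping, but the index arithmetic is the same idea.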

@github-actions bot added the "Nvidia GPU" and "ggml" labels on Feb 8, 2026
@ggerganov force-pushed the gg/cuda-pad-non-cont branch from 23aa667 to b3507fa on February 8, 2026 15:56
@ggerganov force-pushed the gg/cuda-pad-non-cont branch from b3507fa to 084b3d8 on February 9, 2026 13:58
@JohannesGaessler (Contributor) commented

Does this fix the issue described in #19170 (comment)?

@github-actions bot added the "testing" label on Feb 9, 2026
@AesSedai mentioned this pull request on Feb 9, 2026
@ggerganov (Member, Author) commented

> Does this fix the issue described in #19170 (comment)?

No, this is needed to optimize out some of the copies in the Qwen3 Next graphs.

@ggerganov merged commit a0d5855 into master on Feb 10, 2026
68 of 79 checks passed
@ggerganov deleted the gg/cuda-pad-non-cont branch on February 10, 2026 06:07
liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
* cuda : extend GGML_OP_PAD to work with non-cont src0

* tests : add permuted pad
bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 2, 2026
* cuda : extend GGML_OP_PAD to work with non-cont src0

* tests : add permuted pad