vulkan : support conv-2d with large output size #17685

Merged
0cc4m merged 1 commit into ggml-org:master from Acly:vulkan-conv2d-workgroup-split
Dec 5, 2025

Conversation

Collaborator

@Acly Acly commented Dec 2, 2025

The main goal is to support convolutions with a large output spatial size. Currently the number of NPQ = N*OH*OW blocks is limited by maxComputeWorkGroupCount[1], which is typically 2^16. Depending on the block size this means a cap of ~2M / 8M / 16M output elements. 2M elements is not a lot and is exceeded by e.g. a 1536x1536 image.

Selecting a larger NPQ block size pushes the limit a bit, but even 16M elements doesn't feel comfortable, so I split the workgroups between y & z.
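The split itself could be sketched roughly like this (a minimal illustration of the idea; the names and exact rounding here are assumptions, not the actual ggml-vulkan code):

```cpp
#include <cassert>
#include <cstdint>

// Sketch: split a 1-D count of NPQ blocks across the y and z dispatch
// dimensions so that neither exceeds the device limit (typically 2^16).
struct Dispatch { uint32_t y, z; };

Dispatch split_npq(uint32_t npq_blocks, uint32_t max_count) {
    uint32_t z = (npq_blocks + max_count - 1) / max_count;  // slices needed
    uint32_t y = (npq_blocks + z - 1) / z;                  // blocks per slice
    assert((uint64_t)y * z >= npq_blocks && y <= max_count);
    return {y, z};
}
```

In the shader the linear block index would then be reconstructed from the two dimensions (e.g. something like `gl_WorkGroupID.z * gridY + gl_WorkGroupID.y`), with workgroups past the real block count returning early.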

I also reorganized the code a bit so that pipeline selection no longer depends on computing the workgroup count; that makes it easier to calculate the split without being affected by the block shape.

Also cleaned up some obsolete stuff:

  • removed the separate conv2d_transpose_push_constants, since it's now the same as the regular conv2d one
  • removed push constants that were changed to spec constants
  • removed checks for conv-transpose push constant size

No changes to actual logic apart from the y/z split. Performance looks unchanged.

@Acly Acly requested review from 0cc4m and ggerganov as code owners December 2, 2025 10:01
@github-actions github-actions bot added the testing, Vulkan, and ggml labels Dec 2, 2025
Collaborator

@jeffbolznv jeffbolznv left a comment

LGTM. I also tested for perf/correctness and it was fine.

Collaborator

@0cc4m 0cc4m left a comment

Thank you, LGTM

@0cc4m 0cc4m merged commit e15cd06 into ggml-org:master Dec 5, 2025
69 of 74 checks passed
JayZenith pushed a commit to JayZenith/llama.cpp that referenced this pull request Dec 7, 2025
0Marble pushed a commit to 0Marble/llama.cpp that referenced this pull request Dec 18, 2025
Anico2 added a commit to Anico2/llama.cpp that referenced this pull request Jan 15, 2026

Labels

  • ggml — changes relating to the ggml tensor library for machine learning
  • testing — everything test related
  • Vulkan — issues specific to the Vulkan backend



3 participants