Introduction of CUDA Programmatic Dependent Launch to Llama.cpp by agray3 · Pull Request #15480 · ggml-org/llama.cpp

agray3 · 2025-08-21T15:38:20Z

Make sure to read the contributing guidelines before submitting a PR

yeahdongcn · 2025-08-22T06:37:36Z

 static __global__ void acc_f32(const float * x, const float * y, float * dst, const int64_t ne,
        const int64_t ne10, const int64_t ne11, const int64_t ne12, const int64_t ne13,
        const int64_t s11, const int64_t s12, const int64_t s13, const int64_t offset) {
+#if !defined(GGML_USE_HIP) && __CUDA_ARCH__ >= GGML_CUDA_CC_HOPPER


It might be better to define a dedicated macro and use it wherever needed. For example:

#if !defined(GGML_USE_HIP) && __CUDA_ARCH__ >= GGML_CUDA_CC_HOPPER
#define XXX_AVAILABLE
#endif // !defined(GGML_USE_HIP) && __CUDA_ARCH__ >= GGML_CUDA_CC_HOPPER
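To make the suggestion concrete, here is a minimal sketch of how a single availability macro could replace the repeated condition. The macro name `GGML_CUDA_PDL_AVAILABLE_SKETCH` and the helpers `pdl_wait_for_upstream` and `pdl_path_compiled` are hypothetical (the comment above only says `XXX_AVAILABLE`), the value 900 is a stand-in so the snippet compiles on its own, and it assumes the guarded region calls a PDL intrinsic such as `cudaGridDependencySynchronize()`, which the truncated diff does not show.

```cpp
// Sketch only: all *_SKETCH macros and pdl_* helpers are hypothetical names.

// Assumption: stand-in value so this file compiles on its own; in ggml the
// constant is provided by the CUDA backend headers.
#ifndef GGML_CUDA_CC_HOPPER
#define GGML_CUDA_CC_HOPPER 900
#endif

// Evaluate the condition once. __CUDA_ARCH__ is only defined during device
// compilation, so host passes never enable the macro.
#if !defined(GGML_USE_HIP) && defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= GGML_CUDA_CC_HOPPER
#define GGML_CUDA_PDL_AVAILABLE_SKETCH
#endif

// Lets the same helpers build with nvcc or with a plain C++ compiler.
#if defined(__CUDACC__)
#define PDL_SKETCH_FN static __device__ __forceinline__
#else
#define PDL_SKETCH_FN static inline
#endif

// Waits for the upstream grid when PDL is compiled in; otherwise a no-op.
PDL_SKETCH_FN void pdl_wait_for_upstream() {
#ifdef GGML_CUDA_PDL_AVAILABLE_SKETCH
    cudaGridDependencySynchronize();
#endif
}

// Reports whether the guarded path was compiled into this translation pass.
PDL_SKETCH_FN bool pdl_path_compiled() {
#ifdef GGML_CUDA_PDL_AVAILABLE_SKETCH
    return true;
#else
    return false;
#endif
}
```

With something like this in a shared header, each kernel would only need `#ifdef GGML_CUDA_PDL_AVAILABLE_SKETCH` (or a call to the helper) instead of repeating the full architecture check.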

agray3 · 2025-09-26T12:03:32Z

Closing as per comments on #15479

Introduction of CUDA Programmatic Dependent Launch to Llama.cpp

614dee0

See ggml-org#15479

agray3 mentioned this pull request Aug 21, 2025

NVIDIA Programmatic Dependent Launch for Llama.cpp #15479

Closed

4 tasks

yeahdongcn reviewed Aug 22, 2025


agray3 closed this Sep 26, 2025

agray3 wants to merge 1 commit into ggml-org:master from agray3:ag_cuda_programmatic_dependent_launch
