llama : separate compute buffer reserve from fattn check by slaren · Pull Request #15696 · ggml-org/llama.cpp

slaren · 2025-08-31T13:24:49Z

Exposes ggml_backend_sched_split_graph() to allow splitting the graph without allocating compute buffers and uses it to split the graph for the automatic Flash Attention check.
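
To illustrate the idea, here is a minimal toy sketch of why splitting without reserving is useful for the Flash Attention check. It does not use the ggml API: `Node`, `Scheduler`, `split_graph`, `reserve`, and `flash_attn_ok` are hypothetical stand-ins invented for this example, and the buffer size is a placeholder. It assumes the check only needs to know which device each node was assigned to, not any allocated memory.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the ggml types.
enum class Device { CPU, GPU };

struct Node {
    std::string op;            // e.g. "FLASH_ATTN_EXT", "MUL_MAT"
    Device      wanted;        // device the layer is expected to run on
    bool        gpu_supported; // whether the toy GPU backend supports this op
    Device      assigned;      // filled in by split_graph()

    Node(std::string o, Device w, bool s)
        : op(std::move(o)), wanted(w), gpu_supported(s), assigned(Device::CPU) {}
};

struct Scheduler {
    std::size_t compute_buffer_bytes = 0; // stays 0 until reserve() is called
};

// Hypothetical analogue of a split-only step: assign a device to every node,
// falling back to the CPU when the GPU cannot run the op. Allocates nothing.
void split_graph(Scheduler & /*sched*/, std::vector<Node> & graph) {
    for (Node & n : graph) {
        n.assigned = (n.wanted == Device::GPU && n.gpu_supported) ? Device::GPU
                                                                  : Device::CPU;
    }
}

// Hypothetical analogue of a reserve step: split, then size the compute buffers.
void reserve(Scheduler & sched, std::vector<Node> & graph) {
    split_graph(sched, graph);
    sched.compute_buffer_bytes = graph.size() * 1024; // placeholder size
}

// The check: every Flash Attention node must stay on the device it was meant for.
bool flash_attn_ok(const std::vector<Node> & graph) {
    for (const Node & n : graph) {
        if (n.op == "FLASH_ATTN_EXT" && n.assigned != n.wanted) {
            return false;
        }
    }
    return true;
}
```

In this sketch, a caller would run `split_graph` and `flash_attn_ok` first, decide whether to keep Flash Attention enabled, and only then call `reserve` on the final graph, so no compute buffers are sized for a graph that may be discarded.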

llama : separate compute buffer reserve from fattn check

d6178ae

slaren requested a review from JohannesGaessler August 31, 2025 13:24

JohannesGaessler approved these changes Aug 31, 2025

slaren merged commit 9777032 into master Aug 31, 2025
42 of 48 checks passed

slaren deleted the sl/fix-fattn-reserve branch August 31, 2025 13:49

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) label Aug 31, 2025

theo77186 mentioned this pull request Oct 2, 2025

Eval bug: Jina embeddings v2 base code crashes with GGML_ASSERT(ggml_can_mul_mat(a, b)) failed #16392

Closed

Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Oct 7, 2025

Revert "llama : separate compute buffer reserve from fattn check (ggml-org#15696)"

feec643

Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Oct 26, 2025

Revert "llama : separate compute buffer reserve from fattn check (ggml-org#15696)"

bf3165e
