graph : fix nkvo offload with FA by ggerganov · Pull Request #19105 · ggml-org/llama.cpp

ggerganov · 2026-01-26T09:06:22Z

Fixes #19096

The ggml_flash_attn_ext op was not being offloaded to the CPU when -nkvo was specified.

Also remove the obsolete strcmp(name, "kqv_merged_cont") check in the graph callback.
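
For readers unfamiliar with the option, here is a toy model of the intended behavior, not the actual patch. It assumes that -nkvo corresponds to a boolean "offload_kqv = false" and that the flash attention node should then be pinned to the CPU while other nodes are left to the scheduler. Every name prefixed with Toy or toy_ is hypothetical and does not exist in llama.cpp or ggml.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT llama.cpp/ggml types.
enum class ToyBackend { Auto, CPU };

struct ToyNode {
    std::string op;
    ToyBackend  backend;

    explicit ToyNode(std::string op_) : op(std::move(op_)), backend(ToyBackend::Auto) {}
};

// Assumption: the fused attention op is identified by its op name.
bool toy_is_attention(const ToyNode & node) {
    return node.op == "flash_attn_ext";
}

// When KV offload is disabled (the -nkvo case), pin the attention node to the CPU.
// All other nodes keep whatever placement the scheduler would pick.
void toy_apply_nkvo(std::vector<ToyNode> & graph, bool offload_kqv) {
    if (offload_kqv) {
        return; // default behavior: nothing is pinned
    }
    for (ToyNode & node : graph) {
        if (toy_is_attention(node)) {
            node.backend = ToyBackend::CPU;
        }
    }
}

// A three-node toy graph: projection, attention, output projection.
std::vector<ToyNode> toy_build_graph() {
    std::vector<ToyNode> graph;
    graph.emplace_back("mul_mat");
    graph.emplace_back("flash_attn_ext");
    graph.emplace_back("mul_mat");
    return graph;
}
```

Calling toy_apply_nkvo(graph, false) on the result of toy_build_graph() pins only the middle node to the CPU; with true, no node is pinned. The point of the sketch is that the pinning decision is made for the attention op itself, not through a tensor-name comparison.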

graph : fix nkvo offload with FA

9878038

ggerganov requested a review from CISC as a code owner January 26, 2026 09:06

ggerganov mentioned this pull request Jan 26, 2026

Misc. bug: ggml\src\ggml-cuda\fattn.cu:453: fatal error #19096

Closed

JohannesGaessler approved these changes Jan 26, 2026

ggerganov merged commit 8f80d1b into master Jan 26, 2026
73 of 78 checks passed

ggerganov deleted the gg/graph-fix-nkvo branch January 26, 2026 18:18

LifesLight mentioned this pull request Jan 28, 2026

Eval bug: Performance regression: GLM 4.6 prefill with -nkvo #19158

Closed

ggerganov mentioned this pull request Jan 28, 2026

cuda : fix nkvo, offload and cuda graph node properties matching #19165

Merged

shaofeiqi pushed a commit to qualcomm/llama.cpp that referenced this pull request Feb 6, 2026

graph : fix nkvo offload with FA (ggml-org#19105)

59adcef

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026

graph : fix nkvo offload with FA (ggml-org#19105)

ace0663

ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026

graph : fix nkvo offload with FA (ggml-org#19105)

d0c62e4

my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request May 15, 2026

graph : fix nkvo offload with FA (ggml-org#19105)

b18dfdb

my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request May 15, 2026

graph : fix nkvo offload with FA (ggml-org#19105)

806c90b

fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026

graph : fix nkvo offload with FA (ggml-org#19105)

c9c20a4
