tests : add non-cont, inplace rope tests #19296
Conversation
Yeah, the new tests are failing in the Vulkan backend. I'll look into it.
Thank you. If it works, I will close #19292.
Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
Huh, adding the dim 3 somehow hid the CUDA failures from earlier: https://github.com/ggml-org/llama.cpp/actions/runs/21636838841/job/62364670686#step:3:48919 I guess I have to add both tests.
Hi all, I can confirm that Jeff's patch fixes #19292 for me on Vulkan! It works fine and fixes the incoherence-after-shifting regression on GLM-4-32B-0414 and GLM-4.7-Flash. Much appreciated, all. However, is anyone able to check whether this test case passes on CUDA too? I have heard that a similar issue affected a CUDA RTX 5xxx user on GLM-4.7-Flash, so it would be nice if someone could confirm the test passes on CUDA.
It's very likely broken with CUDA in a similar way - see the test failure from earlier: https://github.com/ggml-org/llama.cpp/actions/runs/21636838841/job/62364670686#step:3:48919 Once we stabilize the tests, we'll notify CUDA maintainers to take a look.
Thanks! Once there is a fix for CUDA, I'll let the CUDA user try it out again.
@JohannesGaessler Note that some of the newly added rope tests are currently failing with CUDA: https://github.com/ggml-org/llama.cpp/actions/runs/21664340312/job/62455946368 |
Should be fixed by #19338 |
The new tests are failing with Vulkan as well @jeffbolznv @0cc4m |
The Vulkan fix is at #19299.
* tests : add non-cont, inplace rope tests

* cont : exercise dim 3

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>

* cont : more dim3 exercises

---------

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
ref #18986 (comment)
ref #19128 (comment)
ref #19292
This type of rope is needed during rope shift:
llama.cpp/src/llama-kv-cache.cpp, lines 1626 to 1635 at cf36c21
We are updating the KV cache data, so the op is inplace. Since the changes in #18986, the tensor is now also non-contiguous.
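To illustrate the case the new tests exercise, here is a minimal NumPy sketch (not the ggml implementation): a neox-style rotary embedding applied in place to a strided, non-contiguous view of a larger buffer, as happens when shifting positions directly inside the KV cache. The function name, the pairing of dimensions, and the buffer layout are illustrative assumptions, not llama.cpp code.

```python
import numpy as np

def rope_inplace(x, pos, freq_base=10000.0):
    """Apply a neox-style rotary embedding to x in place.

    x may be a non-contiguous view (e.g. a strided slice of a
    KV-cache-like buffer); writes go through to the parent buffer.
    pos is the position (or position delta, for a KV-cache shift).
    """
    n_dims = x.shape[-1]
    half = n_dims // 2
    # Per-pair frequencies: base^(-i/half), rotating pairs (i, i + half).
    inv_freq = freq_base ** (-np.arange(half) / half)
    theta = pos * inv_freq
    cos, sin = np.cos(theta), np.sin(theta)
    x0 = x[..., :half].copy()
    x1 = x[..., half:].copy()
    x[..., :half] = x0 * cos - x1 * sin
    x[..., half:] = x0 * sin + x1 * cos

# KV-cache-like buffer; every other row forms a non-contiguous view.
buf = np.random.default_rng(0).standard_normal((4, 8)).astype(np.float32)
view = buf[::2]
ref = view.copy()  # contiguous copy as a reference

rope_inplace(view, pos=3)  # in place, through the non-contiguous view
rope_inplace(ref, pos=3)   # in place, on contiguous data

assert not view.flags["C_CONTIGUOUS"]
assert np.allclose(view, ref)  # both paths must agree
```

The tests added here check the same invariant at the backend level: applying rope in place through a non-contiguous view must produce the same result as the contiguous case.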
Based on the discussion in #18986 (comment), it appears that the Vulkan backend currently does not support this case. cc @jeffbolznv @0cc4m