tests : add non-cont, inplace rope tests #19296
Conversation
Yeah, the new tests are failing in the Vulkan backend. I'll look into it.
Thank you. If it works, I will close #19292.
Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
Huh, adding the dim 3 somehow hid the CUDA failures from earlier: https://github.com/ggml-org/llama.cpp/actions/runs/21636838841/job/62364670686#step:3:48919 I guess I have to add both tests.
Hi all, I can confirm that Jeff's patch fixes #19292 for me on Vulkan! It works fine and fixes the incoherence-after-shifting regression on GLM-4-32B-0414 and GLM-4.7-Flash. Much appreciated, all. However, is anyone able to check whether this test case passes on CUDA too? I have heard that a similar issue affected a CUDA RTX 5xxx user on GLM-4.7-Flash, so it would be nice if someone could confirm the test passes on CUDA.
It's very likely broken with CUDA in a similar way - see the test failure from earlier: https://github.com/ggml-org/llama.cpp/actions/runs/21636838841/job/62364670686#step:3:48919 Once we stabilize the tests, we'll notify CUDA maintainers to take a look.
Thanks! Once there is a fix for CUDA, I'll let the CUDA user try it out again.
@JohannesGaessler Note that some of the newly added rope tests are currently failing with CUDA: https://github.com/ggml-org/llama.cpp/actions/runs/21664340312/job/62455946368 |
Should be fixed by #19338 |
The new tests are failing with Vulkan as well @jeffbolznv @0cc4m |
The Vulkan fix is at #19299.
* tests : add non-cont, inplace rope tests

* cont : exercise dim 3

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>

* cont : more dim3 exercises

---------

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
ref #18986 (comment)
ref #19128 (comment)
ref #19292
This type of rope is needed during rope shift:
llama.cpp/src/llama-kv-cache.cpp, lines 1626 to 1635 at cf36c21
We are updating the KV cache data, so the op is inplace. Since the changes in #18986, the tensor is now also non-contiguous.
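To illustrate the case the new tests exercise, here is a minimal NumPy sketch (not the ggml implementation): a neox-style rotary embedding applied in place to a strided, non-contiguous view of a larger buffer, as happens when shifting positions directly inside the KV cache. The function name, the pairing of dimensions, and the buffer layout are illustrative assumptions, not llama.cpp code.

```python
import numpy as np

def rope_inplace(x, pos, freq_base=10000.0):
    """Apply a neox-style rotary embedding to x in place.

    x may be a non-contiguous view (e.g. a strided slice of a
    KV-cache-like buffer); writes go through to the parent buffer.
    pos is the position (or position delta, for a KV-cache shift).
    """
    n_dims = x.shape[-1]
    half = n_dims // 2
    # Per-pair frequencies: base^(-i/half), rotating pairs (i, i + half).
    inv_freq = freq_base ** (-np.arange(half) / half)
    theta = pos * inv_freq
    cos, sin = np.cos(theta), np.sin(theta)
    x0 = x[..., :half].copy()
    x1 = x[..., half:].copy()
    x[..., :half] = x0 * cos - x1 * sin
    x[..., half:] = x0 * sin + x1 * cos

# KV-cache-like buffer; every other row forms a non-contiguous view.
buf = np.random.default_rng(0).standard_normal((4, 8)).astype(np.float32)
view = buf[::2]
ref = view.copy()  # contiguous copy as a reference

rope_inplace(view, pos=3)  # in place, through the non-contiguous view
rope_inplace(ref, pos=3)   # in place, on contiguous data

assert not view.flags["C_CONTIGUOUS"]
assert np.allclose(view, ref)  # both paths must agree
```

The tests added here check the same invariant at the backend level: applying rope in place through a non-contiguous view must produce the same result as the contiguous case.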
Based on the discussion in #18986 (comment), it appears that the Vulkan backend currently does not support this case. cc @jeffbolznv @0cc4m