sycl : Fixes broken build and test-backend-ops#10257
Conversation
06cb3c6 to
f6ea8b7
Compare
|
could you cherry-pick the |
|
Added the changes |
Rbiessy
left a comment
There was a problem hiding this comment.
The oneMKL changes look good to me.
|
these commits negatively affect intel gpus. Is this expected ? For example : After: Reverting over master and performance returns |
|
I'm currently investigating another performance regression in the SYCL backend, though this seems a separate issue. Reverting the change would break the build for non-intel backends, but we also want to avoid this performance loss. Will look into it. Edit: Sorry for the inconvenience |
|
The issues is caused by the mul_mats marked as unsupported. I've found out a new issue for a specific test case:
The following is a temporary patch that fixes the regression and makes test-backend-ops not crash. I'm still investigating the test-case from above. diff --git a/ggml/src/ggml-sycl/ggml-sycl.cpp b/ggml/src/ggml-sycl/ggml-sycl.cpp
index 2dba15d2..72a94a50 100644
--- a/ggml/src/ggml-sycl/ggml-sycl.cpp
+++ b/ggml/src/ggml-sycl/ggml-sycl.cpp
@@ -4350,9 +4350,10 @@ static bool ggml_backend_sycl_device_supports_op(ggml_backend_dev_t dev, const g
if (op->op == GGML_OP_MUL_MAT) {
a = op->src[0];
b = op->src[1];
- if (ggml_is_permuted(a) || ggml_is_permuted(b)) {
+ if (ggml_is_permuted(a)) {
// TODO: fix like https://github.com/ggerganov/llama.cpp/pull/10021
- return false;
+ if (a->nb[0] <= a->nb[1] && a->nb[3] <= a->nb[2]) return false; // 0,1,3,2 Unsupported
+ if (b->type != GGML_TYPE_F32) return false;
}
} else {
a = op->src[2]; |
* Fixes broken build for the SYCL CUDA backend caused by non-explicit gemm call in outprod (merged in with RWKV6 in Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration ggml-org#10133) * Marks permuted MUL_MAT as unsupported to be able to run test-backend-ops * Fixes asserts in norm to fix debug builds.
|
@Alcpz How do you think? Thank you! |
|
Yes, let's revert until we get a proper fix for the test-backend-ops. Edit: I can't automatically do it. I will submit a new PR reverting just the problematic changes. CI is gonna fail for SYCL though, so we may need to have a convesation there. |
Tests confirmed passing in Nvidia A100 and Intel Data Center GPU Max 1100