CANN: supports out_prod operator for F32 and F16 #17406
hipudding merged 1 commit into ggml-org:master from
Conversation
noemotiovon
left a comment
LGTM, just a minor issue.
ggml/src/ggml-cann/aclnn_ops.cpp
Outdated
    const int64_t i12 = i2;
    const int64_t i13 = i3;
    aclTensor *accumulator = ggml_cann_create_tensor(
The result of `ggml_cann_create_tensor` should be `acl_tensor_ptr`, not `aclTensor*`.
ggml/src/ggml-cann/aclnn_ops.h
Outdated
 */
void ggml_cann_out_prod(ggml_backend_cann_context & ctx, ggml_tensor * dst);

void ggml_cann_out_prod_fp(ggml_backend_cann_context & ctx, ggml_tensor * dst);
ggml/src/ggml-cann/aclnn_ops.cpp
Outdated
@@ -72,6 +72,7 @@
 #include <aclnnop/aclnn_index_select.h>
 #include <aclnnop/aclnn_clamp.h>
 #include <aclnnop/aclnn_threshold.h>
+#include <aclnnop/aclnn_ger.h>
You should use

    find ggml/src/ggml-cann -iname "*.cpp" -o -iname "*.h" | xargs clang-format -i

to format the code.
ggml/src/ggml-cann/aclnn_ops.cpp
Outdated
    dst->nb,
    2);

GGML_CANN_CALL_ACLNN_OP(ctx, InplaceZero, accumulator);
Currently, `InplaceZero` is called on every iteration of the for loop. I believe we can call it once on `dst` before the loop.
815e770 to 5d9578a
Thank you for your contribution! :)
I think the current computation method is not optimal. The outer product is essentially a product of two 1-D vectors, so it can be computed by broadcasting and multiplying element-wise. The broadcast can be implemented by modifying the view (similar to how it is done in operators like Add), followed by an element-wise multiplication.
Ignore my previous comment: `aclnnGer` actually computes the outer product; the description in the CANN documentation is incorrect.
    ggml_type_size(dst->type), dst->ne, dst->nb, 2);

// The outer product needs to be accumulated in this dimension.
for (int64_t i1 = 0; i1 < ne11; i1++) {
There are three nested loops here, which will result in very poor performance. Optimization should be done once an appropriate operator is available.
acl_tensor_ptr acl_out = ggml_cann_create_tensor(output_buffer, ggml_cann_type_mapping(dst->type),
                                                 ggml_type_size(dst->type), dst->ne, dst->nb, 2);

GGML_CANN_CALL_ACLNN_OP(ctx, Ger, acl_input.get(), acl_weight.get(), acl_out.get());
This operator is used here to compute the outer product, but according to the documentation it computes the inner product, which would be unsuitable.
Ignore my previous comment: `aclnnGer` actually computes the outer product; the description in the CANN documentation is incorrect.
Does the 310P support `aclnnGer`? If not, please update the supported-ops function.
I will check this; if it is not supported, I will fix it in a follow-up PR.
Co-authored-by: tianhao <tianhao42@huawei.com>

The CANN backend supports floating-point outer-product calculations.