CANN: implement the SSM_CONV operator#17737
Conversation
Co-authored-by: Aleksei Lobanov, <zeromarblectm@gmail.com> Co-authored-by: Sujin Kang, <waterjin326@gmail.com>
tests/test-backend-ops.cpp
Outdated
| // so the inputs are converted from f32 | ||
| // and tests fail with NMSE = 0.000000114 > 0.000000100 | ||
| double max_nmse_err() override { | ||
| return 1e-6; |
There was a problem hiding this comment.
Do not modify test cases other than those for ggml-cann, because the precision issues come from the 310p device’s own computation. We just need to be aware that the 310p will have some degree of precision loss.
There was a problem hiding this comment.
Is there any official way for the test case to know what backed is running? Doesn't seem like any other tests do any backend-specific stuff. I could override test_case.eval method on test_ssm_conv to save the backend, or make it so the max_err method takes optional backed parameters.
There was a problem hiding this comment.
I think accuracy tests shouldn’t depend on which backend is used. If the required accuracy isn’t met, then it simply isn’t met — the standard should remain consistent across all backends.
tests/test-backend-ops.cpp
Outdated
| // and tests fail with NMSE = 0.000000114 > 0.000000100 | ||
| double max_nmse_err() override { | ||
| return 1e-6; | ||
| } |
There was a problem hiding this comment.
I just removed the custom error limits, now the test fails on 310P3
noemotiovon
left a comment
There was a problem hiding this comment.
LGTM, Thank you very much for your contribution — there’s just a small issue.
@hipudding, could you please help review it?
ggml/src/ggml-cann/aclnn_ops.cpp
Outdated
| int64_t w_ne[GGML_MAX_DIMS] = { 0 }; | ||
| size_t w_nb[GGML_MAX_DIMS] = { 0 }; | ||
|
|
||
| w_ne[0] = nc; // K |
There was a problem hiding this comment.
This part can be merged into one line, but please keep the comments.
int64_t w_ne[GGML_MAX_DIMS] = { nc, 1, nr, 1 }; // [K, 1, C, 1]
size_t w_nb[GGML_MAX_DIMS] = { src1->nb[0], src1->nb[1], src1->nb[1], src1->nb[3] }; // reuse src1 strides
ggml/src/ggml-cann/aclnn_ops.cpp
Outdated
| int64_t y_ne[GGML_MAX_DIMS] = { 0 }; | ||
| size_t y_nb[GGML_MAX_DIMS] = { 0 }; | ||
|
|
||
| y_ne[0] = n_t; // L_out |
|
@hipudding @noemotiovon hey whats the status of this PR? |
1a20717 to
a70e4c8
Compare
|
@0Marble, Is is PR ready for review now? If so, please click "Ready for review". |
* CANN: implement SSM_CONV operator Co-authored-by: Aleksei Lobanov, <zeromarblectm@gmail.com> Co-authored-by: Sujin Kang, <waterjin326@gmail.com> * CANN: remove custom error limit for SSM_CONV * CANN: merge SSM_CONV tensor shape/strides into one line --------- Co-authored-by: Sujin Kang, <waterjin326@gmail.com>
Description
We implement the
SSM_CONVoperator using depthwise 1D convolution.We use high-level builtin
aclnnConvolutionfunction.The goal is to compute the following:
where the shape of$y$ is $[dinner, nt, ns]$ , $x$ is $[dconv - 1 + nt, dinner, ns]$ and $w$ is $[dconv, dinner]$ .
In order to use
aclnnConvolutionto implement this formula, we reshape the tensors and set the groups parameter tod_innerto calculate the convolution for each channel independently.Testing
We ran test-backend-ops test suite for
SSM_CONVon two different cards: 310P3 and 910B3.For the 310P3 card, it requires setting the
cubeMathTypeparameter toALLOW_FP32_DOWN_PRECISION, and it seems that causes the computation to be done not in f32, which in turn causes the tests to not pass with a small error (NMSE 0.000000114, greater than the allowed 1e-7). We had to overridemax_nmse_err()method fortest_ssm_convto set the maximum error to 1e-6 which allows the tests to pass.On the 910B card, the operator runs in f32 natively, it passes the tests at the original 1e-7 precision.