CANN: implement the SSM_CONV operator by 0Marble · Pull Request #17737 · ggml-org/llama.cpp

0Marble · 2025-12-03T12:22:36Z

Description

We implement the SSM_CONV operator as a depthwise 1D convolution, using the high-level built-in aclnnConvolution function.

The goal is to compute the following:

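$$ y[i,j,k] = \sum_{l=0}^{d_{conv}-1} w[l,i] \, x[l+j, i, k] $$

where the shape of $y$ is $[d_{inner}, n_t, n_s]$, $x$ is $[d_{conv} - 1 + n_t, d_{inner}, n_s]$ and $w$ is $[d_{conv}, d_{inner}]$. The sum runs over the $d_{conv}$ kernel taps, so $l + j$ never exceeds the last valid index $d_{conv} - 2 + n_t$ of $x$.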

In order to use aclnnConvolution to implement this formula, we reshape the tensors and set the groups parameter to d_inner to calculate the convolution for each channel independently.
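For reference, the formula can be written as a naive CPU loop. The sketch below is not the code from this PR; it assumes contiguous f32 buffers with the first index varying fastest, and the function name `ssm_conv_ref` is hypothetical. It shows why the operation is depthwise: output channel `i` only ever reads channel `i` of `x` and `w`, which is what setting groups to d_inner expresses.

```cpp
#include <cstdint>
#include <vector>

// Naive reference: y[i,j,k] = sum over l in [0, d_conv) of w[l,i] * x[l+j,i,k].
// Assumed layout (illustrative): contiguous, first index fastest.
//   x: [d_conv - 1 + n_t, d_inner, n_s]
//   w: [d_conv, d_inner]
//   y: [d_inner, n_t, n_s]
std::vector<float> ssm_conv_ref(const std::vector<float> & x,
                                const std::vector<float> & w,
                                int64_t d_conv, int64_t d_inner,
                                int64_t n_t, int64_t n_s) {
    const int64_t n_x = d_conv - 1 + n_t;  // input length per channel
    std::vector<float> y(d_inner * n_t * n_s, 0.0f);
    for (int64_t k = 0; k < n_s; ++k) {              // sequence
        for (int64_t j = 0; j < n_t; ++j) {          // output position
            for (int64_t i = 0; i < d_inner; ++i) {  // channel (one group each)
                float acc = 0.0f;
                for (int64_t l = 0; l < d_conv; ++l) {  // kernel tap
                    acc += w[l + i * d_conv] * x[(l + j) + i * n_x + k * n_x * d_inner];
                }
                y[i + j * d_inner + k * d_inner * n_t] = acc;
            }
        }
    }
    return y;
}
```

With d_conv = 2, d_inner = 2, n_t = 2, n_s = 1, inputs `x = {1,2,3, 4,5,6}` and `w = {1,1, 1,-1}` give `y = {3,-1, 5,-1}`: channel 0 sums neighbouring samples and channel 1 takes their difference, with no mixing between channels.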

Testing

We ran the test-backend-ops test suite for SSM_CONV on two cards: 310P3 and 910B3.

The 310P3 card requires setting the cubeMathType parameter to ALLOW_FP32_DOWN_PRECISION, which appears to make the computation run at lower precision than f32. As a result, the tests fail by a small margin (NMSE 0.000000114, above the allowed 1e-7). We had to override the max_nmse_err() method for test_ssm_conv to raise the maximum error to 1e-6, which allows the tests to pass.

On the 910B card, the operator runs natively in f32 and passes the tests at the original 1e-7 precision.
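To make the numbers above concrete, here is a minimal sketch of a normalized mean squared error check. It uses the common definition (squared error divided by the squared magnitude of the reference) and assumes, without confirmation from this thread, that test-backend-ops computes NMSE the same way. The helper `passes` is hypothetical.

```cpp
#include <cstddef>

// NMSE = sum((a - b)^2) / sum(a^2), where a is the reference output
// and b is the backend output. Assumed definition, see the note above.
double nmse(const float * a, const float * b, size_t n) {
    double err = 0.0;
    double ref = 0.0;
    for (size_t i = 0; i < n; ++i) {
        const double d = (double) a[i] - (double) b[i];
        err += d * d;
        ref += (double) a[i] * (double) a[i];
    }
    return err / ref;
}

// Hypothetical helper: a result passes when its NMSE does not exceed the limit.
bool passes(double nmse_value, double max_nmse_err) {
    return nmse_value <= max_nmse_err;
}
```

Under this check, the reported 310P3 value of 1.14e-7 fails against a limit of 1e-7 and passes against 1e-6, which matches the behaviour described above.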

Co-authored-by: Aleksei Lobanov, <zeromarblectm@gmail.com>
Co-authored-by: Sujin Kang, <waterjin326@gmail.com>

noemotiovon · 2025-12-04T06:33:57Z

tests/test-backend-ops.cpp

+    // so the inputs are converted from f32
+    // and tests fail with NMSE = 0.000000114 > 0.000000100
+    double max_nmse_err() override {
+        return 1e-6;


Please do not modify test cases other than those for ggml-cann: the precision issue comes from the 310P device’s own computation. We just need to be aware that the 310P has some degree of precision loss.

Is there any official way for the test case to know which backend is running? It doesn't seem like any other tests do anything backend-specific. I could override the test_case.eval method on test_ssm_conv to save the backend, or make the max_err method take an optional backend parameter.

I think accuracy tests shouldn’t depend on which backend is used. If the required accuracy isn’t met, then it simply isn’t met — the standard should remain consistent across all backends.

0Marble · 2025-12-04T09:04:15Z

tests/test-backend-ops.cpp

-    // and tests fail with NMSE = 0.000000114 > 0.000000100
-    double max_nmse_err() override {
-        return 1e-6;
-    }


I just removed the custom error limits; the test now fails on 310P3.

No problem.

noemotiovon

LGTM, Thank you very much for your contribution — there’s just a small issue.
@hipudding, could you please help review it?

noemotiovon · 2025-12-04T09:30:44Z

ggml/src/ggml-cann/aclnn_ops.cpp

+    int64_t w_ne[GGML_MAX_DIMS] = { 0 };
+    size_t  w_nb[GGML_MAX_DIMS] = { 0 };
+
+    w_ne[0] = nc;  // K


This part can be merged into one line, but please keep the comments.

int64_t w_ne[GGML_MAX_DIMS] = { nc, 1, nr, 1 }; // [K, 1, C, 1]
size_t  w_nb[GGML_MAX_DIMS] = { src1->nb[0], src1->nb[1], src1->nb[1], src1->nb[3] }; // reuse src1 strides

noemotiovon · 2025-12-04T09:32:38Z

ggml/src/ggml-cann/aclnn_ops.cpp

+    int64_t y_ne[GGML_MAX_DIMS] = { 0 };
+    size_t  y_nb[GGML_MAX_DIMS] = { 0 };
+
+    y_ne[0] = n_t;  // L_out


Same as above.

0Marble · 2025-12-09T07:46:17Z

@hipudding @noemotiovon hey, what's the status of this PR?

hipudding · 2025-12-18T08:56:10Z

@0Marble, is this PR ready for review now? If so, please click "Ready for review".

* CANN: implement SSM_CONV operator

Co-authored-by: Aleksei Lobanov, <zeromarblectm@gmail.com>
Co-authored-by: Sujin Kang, <waterjin326@gmail.com>

* CANN: remove custom error limit for SSM_CONV

* CANN: merge SSM_CONV tensor shape/strides into one line

---------

Co-authored-by: Sujin Kang, <waterjin326@gmail.com>

CANN: implement SSM_CONV operator

df6a560

Co-authored-by: Aleksei Lobanov, <zeromarblectm@gmail.com>
Co-authored-by: Sujin Kang, <waterjin326@gmail.com>

0Marble requested a review from ggerganov as a code owner December 3, 2025 12:22

loci-dev mentioned this pull request Dec 3, 2025

UPSTREAM PR #17737: CANN: implement the SSM_CONV operator auroralabs-loci/llama.cpp#416

Open

github-actions bot added the testing, ggml, and Ascend NPU labels Dec 3, 2025

noemotiovon reviewed Dec 4, 2025


CANN: remove custom error limit for SSM_CONV

eb07456

0Marble commented Dec 4, 2025


noemotiovon approved these changes Dec 4, 2025


CANN: merge SSM_CONV tensor shape/strides into one line

a70e4c8

0Marble requested review from 0cc4m, CISC, JohannesGaessler, aldehir, allozaur, am17an, danbev, lhez, max-krasnyansky, ngxson, pwilkin, reeselevine and rgerganov as code owners December 18, 2025 03:17

0Marble force-pushed the squash-commits branch from 1a20717 to a70e4c8 Compare December 18, 2025 03:21

0Marble marked this pull request as draft December 18, 2025 03:24

CISC removed request for aldehir, danbev and rgerganov December 18, 2025 06:44

CISC requested review from hipudding and removed request for 0cc4m, CISC, JohannesGaessler, allozaur, am17an, ggerganov, lhez, max-krasnyansky, ngxson, pwilkin and reeselevine December 18, 2025 06:44

0Marble marked this pull request as ready for review December 18, 2025 08:58

hipudding approved these changes Dec 24, 2025


hipudding merged commit b07cda6 into ggml-org:master Dec 26, 2025
285 of 293 checks passed

noemotiovon mentioned this pull request Jan 13, 2026

CANN: SSM_CONV operator noemotiovon/llama.cpp#9

Closed

hipudding merged 3 commits into ggml-org:master from 0Marble:squash-commits
