Skip to content

Fix MSan use-of-uninitialized-value in SimSIMD SVE functions#101239

Open
alexey-milovidov wants to merge 4 commits intomasterfrom
fix-simsimd-sve-msan
Open

Fix MSan use-of-uninitialized-value in SimSIMD SVE functions#101239
alexey-milovidov wants to merge 4 commits intomasterfrom
fix-simsimd-sve-msan

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov commented Mar 30, 2026

Update SimSIMD submodule to include fix for svmla_*_x / svmls_*_x in SVE accumulator operations. The _x (don't-care) variant left inactive lanes undefined, but svaddv(svptrue, ...) summed all lanes including those undefined ones. Changed to _m (merge) which preserves the accumulator value for inactive lanes.

Fixes #101232

Contrib PR: ClickHouse/SimSIMD#18
Upstream PR: ashvardanian/NumKong#331

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100177&sha=a7031a78dd031fc0e90fbeebc0c95df386c5c1a7&name_0=PR&name_1=Stress%20test%20%28arm_msan%29

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

...

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Update SimSIMD submodule to include fix for `svmla_*_x` / `svmls_*_x`
in SVE accumulator operations. The `_x` (don't-care) variant left
inactive lanes undefined, but `svaddv(svptrue, ...)` summed all lanes
including those undefined ones. Changed to `_m` (merge) which preserves
the accumulator value for inactive lanes.

Fixes #101232

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100177&sha=a7031a78dd031fc0e90fbeebc0c95df386c5c1a7&name_0=PR&name_1=Stress%20test%20%28arm_msan%29

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Mar 30, 2026

Workflow [PR], commit [792e196]

Summary:

job_name test_name status info comment
Build (arm_tidy) failure
Checkout Submodules failure cidb
Build (amd_debug) dropped
Build (amd_asan_ubsan) dropped
Build (amd_tsan) dropped
Build (amd_msan) dropped
Build (amd_binary) dropped
Build (arm_debug) dropped
Build (arm_asan_ubsan) dropped
Build (arm_tsan) dropped
Build (arm_msan) dropped

AI Review

Summary

This PR updates the contrib/SimSIMD submodule from e37149571aba... to 324864d39356... to pull an upstream SVE/MSan fix for inactive-lane handling in accumulator operations. The change is narrowly scoped to a single submodule pointer update, and I did not find correctness, safety, performance, or rollout issues in the ClickHouse tree itself.

ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh bot added pr-ci submodule changed At least one submodule changed in this PR. labels Mar 30, 2026
@nikitamikhaylov
Copy link
Copy Markdown
Member

@groeneai It was right to update the submodule rather than disabling the MSAN in the SimSIMD like you did here #100862

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@groeneai
Copy link
Copy Markdown
Contributor

Thanks @nikitamikhaylov — agreed, the submodule update fixing the _x_m SVE accumulator variants is the proper approach since it addresses the root cause in SimSIMD directly. Our #100862 was a workaround (disable SVE under MSAN) that we already closed in favor of this.

@nikitamikhaylov
Copy link
Copy Markdown
Member

@groeneai The issue still reproduces. Do you know what's wrong here?

The previous approach (unpoisoning `*result` after the dispatch wrapper
returns in lib.c) did not help because MSan reports the error inside the
SVE function body before it returns to the dispatch layer.

Add `SIMSIMD_UNPOISON` calls after every `svaddv` scalar reduction inside
SVE function implementations. This prevents MSan-tainted values from
propagating to normalize functions and result stores.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=101239&sha=a43cadd697dc5f4dbe13fd8481164dfac8be525e&name_0=PR&name_1=Stress%20test%20%28arm_msan%29

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@groeneai
Copy link
Copy Markdown
Contributor

groeneai commented Apr 1, 2026

@nikitamikhaylov The PR has never actually built — every CI run fails at Checkout Submodules because the SimSIMD submodule pointer (324864d39356) references a commit that no longer exists on ClickHouse/SimSIMD:

gh api repos/ClickHouse/SimSIMD/commits/324864d393566df939e845a78a81902a25ee2a11
→ 422: "No commit found for SHA: 324864d393566df939e845a78a81902a25ee2a11"

The fix commit exists at a different hash — it was likely force-pushed. The correct commit is 5a0677af82f7e77afae43045ae7c01debf21625a on the ClickHouse/v6.5.15 branch:

gh api repos/ClickHouse/SimSIMD/commits/5a0677af
→ "Fix use-of-uninitialized-value in SVE accumulator operations (#18)"

That commit has the right fix: changes svmla_f32_xsvmla_f32_m (and similarly for f64, f16, bf16) in all SVE accumulator operations in spatial.h and dot.h. The _x (don't-care) suffix leaves inactive lanes undefined; the subsequent svaddv(svptrue, ...) sums all lanes including undefined ones, triggering MSan. The _m (merge) variant preserves the accumulator's initialized-zero value for inactive lanes, so the reduction is clean.

To fix: update the submodule pointer to 5a0677af82f7e77afae43045ae7c01debf21625a. That alone should resolve all the STID 1003-358c/326e/2410-543a/47e0 family (574 hits across 346 PRs in the last 30 days). The SIMSIMD_UNPOISON calls in the latest commit would then be unnecessary since the root cause is addressed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-ci submodule changed At least one submodule changed in this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MemorySanitizer: use-of-uninitialized-value (STID: 1003-358c)

3 participants