edgchen1 (Contributor) commented Jul 25, 2025

Description

In the KleidiAI-specific prepacking logic of DynamicQuantizeMatMul, handle the case where the B zero point input is provided but is not constant. In this case, we should not prepack.

Add unit tests that exercise the prepacking code path.

Add a check for ARM SME instruction support in DynamicQuantizeMatMul before calling `MlasDynamicQGemmBatch()` and associated functions.
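
The two conditions described above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: `BZeroPoint`, `ShouldPrepackB`, and `UseDynamicQGemm` are hypothetical names, the SME check is modeled as a plain `bool`, and any other conditions the real kernel applies before prepacking are not modeled.

```cpp
// How the optional B zero point input reaches the kernel.
// Illustrative enum only; this is not an ONNX Runtime type.
enum class BZeroPoint { kAbsent, kConstant, kNonConstant };

// Hypothetical prepack gate: returns false when B must not be prepacked.
// A zero point that is provided but not constant is only known at run
// time, so it rules out prepacking.
bool ShouldPrepackB(bool cpu_has_sme, BZeroPoint b_zero_point) {
  if (!cpu_has_sme) {
    return false;  // assumption: the prepacked layout is only used on SME hardware
  }
  if (b_zero_point == BZeroPoint::kNonConstant) {
    return false;  // provided but not constant: do not prepack
  }
  return true;  // absent or constant
}

// Hypothetical compute-time gate: the SME-specific GEMM is only chosen
// when SME is available and B was actually prepacked.
bool UseDynamicQGemm(bool cpu_has_sme, bool b_is_prepacked) {
  return cpu_has_sme && b_is_prepacked;
}
```

With these definitions, `ShouldPrepackB(true, BZeroPoint::kNonConstant)` returns `false`, which is the case this PR addresses, while an absent or constant zero point on SME hardware returns `true`.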

Motivation and Context

Follow up to #25187

hariharans29 (Member) previously approved these changes Jul 25, 2025

LGTM

github-actions bot (Contributor) left a comment

You can commit the suggested changes from lintrunner.

edgchen1 marked this pull request as ready for review July 26, 2025 02:01
jywu-msft merged commit 9aad21c into main Jul 26, 2025
89 of 94 checks passed
jywu-msft deleted the edgchen1/dynamic_quantize_matmul_b_zp_check branch July 26, 2025 04:39
damdoo01-arm (Contributor) commented

Thanks @edgchen1 for handling this case.

snnn pushed a commit that referenced this pull request Jul 28, 2025
…ded but not constant. (#25544)

@snnn snnn mentioned this pull request Jul 28, 2025
snnn pushed a commit that referenced this pull request Jul 28, 2025
- **DynamicQuantizeMatMul - handle case where B zero point input is
provided but not constant. (#25544)**
- **Refactor plugin EP support (#25541)**
- **Remove the python installation steps from
win-qnn-arm64-ci-pipeline.yml (#25552)**
snnn pushed a commit that referenced this pull request Jul 30, 2025
- DynamicQuantizeMatMul - handle case where B zero point input is
provided but not constant. (#25544)
- Refactor plugin EP support (#25541)
- Remove the python installation steps from
win-qnn-arm64-ci-pipeline.yml (#25552)
- [EP ABI] Node_GetAttrByName returns ORT_NOT_FOUND with non-existing
attr name (#25565)
- Fix C/C++ documentation generation (#25569)
- [build] fix multi-config for VCPKG (#25585)
sanketkaleoss pushed a commit to sanketkaleoss/onnxruntime that referenced this pull request Aug 11, 2025
…ded but not constant. (microsoft#25544)

sanketkaleoss pushed a commit to sanketkaleoss/onnxruntime that referenced this pull request Aug 11, 2025
- **DynamicQuantizeMatMul - handle case where B zero point input is
provided but not constant. (microsoft#25544)**
- **Refactor plugin EP support (microsoft#25541)**
- **Remove the python installation steps from
win-qnn-arm64-ci-pipeline.yml (microsoft#25552)**