edgchen1 commented Jan 5, 2024

Description

Add a MatMulNBits `accuracy_level` parameter to the quantization utilities.

Motivation and Context

Allow the MatMulNBits `accuracy_level` attribute (added in #17669) to be set to a specific value when the model is quantized.
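
For readers unfamiliar with the pattern, here is a minimal sketch of how an optional `accuracy_level` could be forwarded to the attributes of a 4-bit MatMulNBits node. It is not the code from this PR: the helper name `build_matmul_nbits_attrs` and its signature are hypothetical, and it assumes the attribute should only be written when the caller explicitly picks a value.

```python
from typing import Optional


def build_matmul_nbits_attrs(
    rows: int,
    cols: int,
    block_size: int,
    accuracy_level: Optional[int] = None,
) -> dict:
    """Hypothetical helper: collect attributes for a 4-bit MatMulNBits node."""
    attrs = {"K": rows, "N": cols, "bits": 4, "block_size": block_size}
    # Assumption: only emit accuracy_level when the caller chose a value,
    # so leaving it unset keeps whatever default the operator defines.
    if accuracy_level is not None:
        attrs["accuracy_level"] = accuracy_level
    return attrs


# Attribute is omitted when no level is requested, present otherwise.
default_attrs = build_matmul_nbits_attrs(rows=64, cols=32, block_size=32)
tuned_attrs = build_matmul_nbits_attrs(rows=64, cols=32, block_size=32, accuracy_level=4)
```

Treating `None` as "do not set" keeps models quantized without the new parameter unchanged in this sketch.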

edgchen1 commented Jan 5, 2024

I verified that an int4 model can be produced with the expected MatMulNBits accuracy_level attributes present.

edgchen1 merged commit 4190c29 into main on Jan 5, 2024
edgchen1 deleted the edgchen1/int4_quantization_accuracy_level branch on January 5, 2024 22:51
jslap-ubi pushed a commit to cgaudreau-ubisoft/onnxruntime that referenced this pull request Apr 5, 2024
…icrosoft#19015)

Allow MatMulNBits `accuracy_level` attribute (added in microsoft#17669) to be set to a particular value when the model is quantized.