Add support for float8 activation for Int4PreshuffledTensor #2437
Merged
jerryzh168 added a commit that referenced this pull request on Jun 24, 2025
Summary: Note: slicing is not working yet; the other ops are working.
Test Plan: python test/dtypes/test_float8_activation_int4_groupwise_preshuffle.py
stack-info: PR: #2437, branch: jerryzh168/stack/4
drisspg
reviewed
Jul 8, 2025
drisspg (Contributor) left a comment:
Can you add serialization tests?
jerryzh168 (Contributor, Author):
Do you mean serialization of the models? That test has moved to https://github.com/pytorch/ao/pull/2463/files#diff-9f6b6c4b39656e797cfda97536a4cf8a82004c64da518ad524637b471b716739. As for config serialization, I don't exactly remember the reason for it; we did a config refactor in the last PR, so I can add it after we are aligned on what the config should look like.
vkuzo
reviewed
Jul 8, 2025
andrewor14
approved these changes
Jul 9, 2025
Summary: Added basic op support such as linear and bmm. Both float8 and bf16 activations are handled by the same Tensor because they share the same dtype; the only difference is whether the activation is quantized or not. There are some differences in implementation, though:
bf16 activation:
* group_scale
* group_zero
fp8 activation:
* group_scale
* row_scale
Test Plan: python test/quantization/quantize_/workflows/int4/test_int4_preshuffled_tensor.py
stack-info: PR: #2437, branch: jerryzh168/stack/4
Stacked PRs:
Add support for float8 activation for Int4PreshuffledTensor
Summary:
Added basic op support such as linear and bmm. Both float8 and bf16 activations are handled by the same Tensor
because they share the same dtype; the only difference is whether the activation is quantized or not.
There are some differences in implementation, though:
bf16 activation:
* group_scale
* group_zero
fp8 activation:
* group_scale
* row_scale
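The summary distinguishes the two activation paths by the scales each one carries: group_scale plus group_zero for bf16 activation, and group_scale plus row_scale for fp8 activation. The pure-Python sketch below illustrates what such metadata could mean for a single weight row. It is only an assumption for illustration: the function names are hypothetical, the asymmetric and two-level symmetric formulas are a common int4 groupwise scheme rather than something confirmed by this PR, and the actual kernels and preshuffled layout are not modeled.

```python
from typing import List, Tuple


def quantize_group_asymmetric(
    row: List[float], group_size: int
) -> Tuple[List[int], List[float], List[float]]:
    """Hypothetical bf16-activation style: per-group scale and zero (codes 0..15)."""
    codes, group_scale, group_zero = [], [], []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / 15 or 1.0  # fall back to 1.0 for a constant group
        group_scale.append(scale)
        group_zero.append(lo)
        codes.extend(round((v - lo) / scale) for v in group)
    return codes, group_scale, group_zero


def dequantize_asymmetric(
    codes: List[int], group_scale: List[float], group_zero: List[float], group_size: int
) -> List[float]:
    return [
        group_zero[i // group_size] + c * group_scale[i // group_size]
        for i, c in enumerate(codes)
    ]


def quantize_row_then_group(
    row: List[float], group_size: int
) -> Tuple[List[int], List[float], float]:
    """Hypothetical fp8-activation style: one row scale, then per-group scales (codes -7..7)."""
    row_scale = max(abs(v) for v in row) or 1.0
    normalized = [v / row_scale for v in row]  # stand-in for a rowwise cast to fp8 range
    codes, group_scale = [], []
    for start in range(0, len(normalized), group_size):
        group = normalized[start:start + group_size]
        scale = max(abs(v) for v in group) / 7 or 1.0
        group_scale.append(scale)
        codes.extend(round(v / scale) for v in group)
    return codes, group_scale, row_scale


def dequantize_row_then_group(
    codes: List[int], group_scale: List[float], row_scale: float, group_size: int
) -> List[float]:
    return [c * group_scale[i // group_size] * row_scale for i, c in enumerate(codes)]


if __name__ == "__main__":
    weights = [0.1, -0.4, 0.25, 0.9, -1.2, 0.05, 0.6, -0.3]
    print(quantize_group_asymmetric(weights, group_size=4))
    print(quantize_row_then_group(weights, group_size=4))
```

In this sketch the first variant needs no row-level factor because each group stores its own offset, while the second drops the zero point and instead multiplies a per-group scale by a single per-row scale.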
Test Plan:
python test/quantization/quantize_/workflows/int4/test_int4_preshuffled_tensor.py