Add support for float8 activation for Int4PreshuffledTensor#2437

Merged
jerryzh168 merged 1 commit into main from jerryzh168/stack/4
Jul 12, 2025
@jerryzh168 jerryzh168 commented Jun 24, 2025

Contributor

Stacked PRs:


Add support for float8 activation for Int4PreshuffledTensor

Summary:
Added basic op support such as linear and bmm. Both float8 and bf16 activations live in the same Tensor
because it is the same dtype; the only difference is whether the activation is quantized or not.
There are, however, some differences in implementation:

bf16 activation:

  • group_scale
  • group_zero

fp8 activation:

  • group_scale
  • row_scale
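
To make the two scale layouts above concrete, here is a minimal pure-Python sketch that quantizes one weight row to 4 bits in each style. It is an illustration under assumptions, not the torchao or FBGEMM kernel: the function names, the unsigned [0, 15] code range with a float offset for the bf16 path, and the two-level symmetric scheme for the fp8 path (one per-row scale followed by per-group scales) are all hypothetical choices made only to show which values would be stored alongside the int4 codes. The intermediate rounding to fp8 and the preshuffled packing are omitted.

```python
from typing import List, Tuple

INT4_UNSIGNED_MAX = 15  # assumed unsigned 4-bit range [0, 15] (bf16 path)
INT4_SIGNED_MAX = 7     # assumed symmetric 4-bit range [-7, 7] (fp8 path)
FP8_E4M3_MAX = 448.0    # largest finite value of float8_e4m3fn


def quantize_with_group_zero(
    row: List[float], group_size: int
) -> Tuple[List[int], List[float], List[float]]:
    """Hypothetical bf16-activation layout: per-group scale and per-group zero."""
    codes, group_scale, group_zero = [], [], []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        lo, hi = min(group), max(group)
        # Fall back to 1.0 when the group is constant and the range is zero.
        scale = (hi - lo) / INT4_UNSIGNED_MAX or 1.0
        group_scale.append(scale)
        group_zero.append(lo)  # float offset that code 0 maps back to
        codes.extend(
            min(INT4_UNSIGNED_MAX, max(0, round((x - lo) / scale))) for x in group
        )
    return codes, group_scale, group_zero


def quantize_with_row_scale(
    row: List[float], group_size: int
) -> Tuple[List[int], List[float], float]:
    """Hypothetical fp8-activation layout: one row scale, per-group scales, no zero."""
    row_scale = max(abs(x) for x in row) / FP8_E4M3_MAX or 1.0
    scaled = [x / row_scale for x in row]  # values now fit the fp8 e4m3 range
    codes, group_scale = [], []
    for start in range(0, len(scaled), group_size):
        group = scaled[start:start + group_size]
        scale = max(abs(x) for x in group) / INT4_SIGNED_MAX or 1.0
        group_scale.append(scale)
        codes.extend(
            max(-INT4_SIGNED_MAX, min(INT4_SIGNED_MAX, round(x / scale)))
            for x in group
        )
    return codes, group_scale, row_scale


def dequantize_with_group_zero(codes, group_scale, group_zero, group_size):
    """Reconstruct values as code * group_scale + group_zero."""
    return [
        c * group_scale[i // group_size] + group_zero[i // group_size]
        for i, c in enumerate(codes)
    ]


def dequantize_with_row_scale(codes, group_scale, row_scale, group_size):
    """Reconstruct values as code * group_scale * row_scale."""
    return [c * group_scale[i // group_size] * row_scale for i, c in enumerate(codes)]


if __name__ == "__main__":
    row = [0.1, -0.4, 0.25, 0.9, -1.2, 0.05, 0.6, -0.3]
    print(quantize_with_group_zero(row, group_size=4))
    print(quantize_with_row_scale(row, group_size=4))
```

In this sketch the first layout carries two per-group values (scale and zero), while the second carries a per-group scale plus a single scale for the whole row, which mirrors the two bullet lists above.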

Test Plan:
python test/quantization/quantize_/workflows/int4/test_int4_preshuffled_tensor.py

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot Bot commented Jun 24, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2437

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 889dca3 with merge base e5ca515:

BROKEN TRUNK - The following job failed but was also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 added a commit that referenced this pull request Jun 24, 2025
Summary:
Note: slicing does not work yet; the other ops work.

Test Plan:
python test/dtypes/test_float8_activation_int4_groupwise_preshuffle.py

Reviewers:

Subscribers:

Tasks:

Tags:

stack-info: PR: #2437, branch: jerryzh168/stack/4
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/4 branch from d6d3477 to 26517e8 Compare June 24, 2025 22:25
@facebook-github-bot facebook-github-bot added the CLA Signed label Jun 24, 2025
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/2 to main June 24, 2025 22:26
jerryzh168 added a commit that referenced this pull request Jun 24, 2025
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/4 branch from 26517e8 to d187f78 Compare June 24, 2025 22:26
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/2 June 24, 2025 22:26
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/2 to main June 24, 2025 22:28
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/4 branch from d187f78 to 2fcff42 Compare June 24, 2025 22:28
jerryzh168 added a commit that referenced this pull request Jun 24, 2025
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/2 June 24, 2025 22:28
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/2 to main June 26, 2025 05:03
jerryzh168 added a commit that referenced this pull request Jun 26, 2025
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/4 branch from 2fcff42 to 95856ed Compare June 26, 2025 05:03
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/2 June 26, 2025 05:03
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/2 to main June 27, 2025 19:36
jerryzh168 added a commit that referenced this pull request Jun 27, 2025
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/4 branch from 95856ed to 1dec2cb Compare June 27, 2025 19:37
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/2 June 27, 2025 19:37
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/2 to main June 27, 2025 19:38
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/4 branch from 1dec2cb to 1645c79 Compare June 27, 2025 19:38
jerryzh168 added a commit that referenced this pull request Jun 27, 2025
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/2 June 27, 2025 19:38
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/2 to main June 27, 2025 19:48
jerryzh168 added a commit that referenced this pull request Jun 27, 2025
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/4 branch from 1645c79 to 5e9e869 Compare June 27, 2025 19:48
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/2 June 27, 2025 19:48
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/4 branch from cc359e6 to b86939c Compare July 2, 2025 23:44
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/2 July 2, 2025 23:44
Comment thread test/integration/test_serialization_bc.py Outdated
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/2 to main July 3, 2025 00:09
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/4 branch from b86939c to 9ff26dd Compare July 3, 2025 00:09
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/2 July 3, 2025 00:09
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/2 to main July 3, 2025 02:18
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/4 branch from 9ff26dd to b8466ab Compare July 3, 2025 02:18
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/2 July 3, 2025 02:18
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/2 to main July 3, 2025 02:23
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/4 branch from b8466ab to e666ef4 Compare July 3, 2025 02:23
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/2 July 3, 2025 02:23
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/2 to main July 3, 2025 02:36
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/4 branch from e666ef4 to c13489e Compare July 3, 2025 02:36
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/2 July 3, 2025 02:36
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/2 to main July 3, 2025 02:37
Comment thread torchao/quantization/quant_api.py Outdated
Comment thread torchao/quantization/quantize_/int4/int4_groupwise_preshuffle_tensor.py Outdated
Comment thread torchao/quantization/quantize_/int4/int4_groupwise_preshuffle_tensor.py Outdated
Comment thread torchao/quantization/quantize_/int4/int4_groupwise_preshuffle_tensor.py Outdated
Comment thread torchao/quantization/quantize_/int4/int4_groupwise_preshuffle_tensor.py Outdated

@drisspg drisspg left a comment

Contributor

Can you add serialization tests

jerryzh168 commented Jul 8, 2025

Contributor Author

> Can you add serialization tests

You mean serialization of the models? That test was moved to https://github.com/pytorch/ao/pull/2463/files#diff-9f6b6c4b39656e797cfda97536a4cf8a82004c64da518ad524637b471b716739; I don't remember exactly why.

For config serialization, we did a config refactor in the last PR. I can add tests once we are aligned on what the config should look like.

Comment thread torchao/quantization/quantize_/int4/__init__.py Outdated
Comment thread torchao/quantization/quant_api.py
Comment thread torchao/quantization/quant_api.py
Comment thread torchao/quantization/quant_api.py
Comment thread torchao/quantization/quantize_/int4/int4_groupwise_preshuffle_tensor.py Outdated
Comment thread torchao/quantization/quantize_/int4/int4_groupwise_preshuffle_tensor.py Outdated
Summary:
Added basic op support such as linear and bmm. Both float8 and bf16 activations live in the same Tensor
because it is the same dtype; the only difference is whether the activation is quantized or not.
There are, however, some differences in implementation:

bf16 activation:
* group_scale
* group_zero

fp8 activation:
* group_scale
* row_scale

Test Plan:
python test/quantization/quantize_/workflows/int4/test_int4_preshuffled_tensor.py

Reviewers:

Subscribers:

Tasks:

Tags:

stack-info: PR: #2437, branch: jerryzh168/stack/4
Labels

CLA Signed, topic: new feature

5 participants