Add Float8Tensor by jerryzh168 · Pull Request #2463 · pytorch/ao

jerryzh168 · 2025-06-30T23:01:24Z

Stacked PRs:

Add Float8Tensor

Summary:

Added Float8Tensor, which uses fbgemm kernels and torch._scaled_mm:
- per-row activation + per-row weight linear, calling the torch._scaled_mm op (for compatibility with SM 8.9)
- per-tensor activation + per-tensor weight quantized linear, calling the torch._scaled_mm op (for compatibility with SM 8.9)
- per-row activation + per-row weight bmm, calling the torch.ops.fbgemm.f8f8bf16_rowwise_batched kernel (only works on SM 9.0+); this can switch to batched scaled mm from torch once it is supported: [RFC]: PyTorch Low-Precision GEMMs Public API pytorch#157950
- dynamic quantization kwargs are added to the Float8Tensor directly
- Added QuantizeTensorKwargs and QuantizeTensorToFloat8Kwargs to store keyword args for Float8Tensor.to_float8
- Updated Float8DynamicActivationFloat8WeightConfig and Float8WeightOnlyConfig to use Float8Tensor
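
To make the per-row versus per-tensor distinction in the list above concrete, here is a minimal pure-Python sketch of how the two granularities pick their scales. It assumes the common convention scale = amax / 448 (448 is the largest finite float8_e4m3fn value) and leaves out the actual rounding to float8. The helper names are hypothetical and this is not the torchao implementation.

```python
# Illustrative sketch only: pure-Python scale computation, NOT torchao code.
E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3fn
EPS = 1e-12       # assumed guard so an all-zero input does not give a zero scale


def per_tensor_scale(matrix):
    """One scale shared by the whole tensor: amax(|x|) / E4M3_MAX."""
    amax = max(abs(v) for row in matrix for v in row)
    return max(amax, EPS) / E4M3_MAX


def per_row_scales(matrix):
    """One scale per row: amax(|row|) / E4M3_MAX."""
    return [max(max(abs(v) for v in row), EPS) / E4M3_MAX for row in matrix]


def scale_rows(matrix, scales):
    """Divide each row by its scale and clamp to the E4M3 range.

    Rounding to the float8 grid is intentionally omitted here.
    """
    return [
        [max(-E4M3_MAX, min(E4M3_MAX, v / s)) for v in row]
        for row, s in zip(matrix, scales)
    ]


# Two rows with very different magnitudes.
x = [[0.5, -2.0, 1.0], [100.0, -50.0, 25.0]]

tensor_scale = per_tensor_scale(x)
row_scales = per_row_scales(x)

# Per-tensor: every row uses the same scale, so the small row uses little of the range.
scaled_per_tensor = scale_rows(x, [tensor_scale] * len(x))
# Per-row: each row is stretched to the full E4M3 range independently.
scaled_per_row = scale_rows(x, row_scales)
```

With per-tensor scaling the first row only reaches a magnitude of about 9 out of 448, while per-row scaling lets it use the full range, which is the usual motivation for the per-row path.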

Test Plan:
python test/dtypes/test_affine_quantized_float.py
python test/quantization/quantize_/workflows/float8/test_float8_tensor.py

pytorch-bot · 2025-06-30T23:01:28Z

❌ 1 New Failure

As of commit b0c2cf3 with merge base b757fb9:

NEW FAILURE - The following job has failed:

Run TorchAO Experimental Tests / test-cpu-ops (linux.arm64.2xlarge)
test_replace_q_dq_patterns_with_quantized_linear_ops_pass

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Summary:
Splits out the float8 rowwise quantized path (both act and weight) of AQT to Float8RowwiseTensor

Next: could potentially incorporate the per tensor activation path there as well
Next: we can split the per tensor weight path to another Tensor as well, so we can deprecate AQT path for float8

Test Plan:
python test/dtypes/test_affine_quantized_float.py
python test/quantization/quantize_/test_float8_rowwise_tensor.py

stack-info: PR: #2463, branch: jerryzh168/stack/9

vkuzo · 2025-07-25T11:50:34Z

+from torchao.quantization.quantize_.common import QuantizeTensorKwargs
+
+
+def _choose_quant_func_and_quantize_tensor(


nit: colocate with QuantizeTensorKwargs and move out of utils, since this is more of a developer-facing function that is central to the design?

Hmm, I have thought about moving this to common/, but then common would import from the workflow-specific code, which seems weird, e.g.:

from torchao.quantization.quantize_.workflows import (
    Float8Tensor,
    QuantizeTensorToFloat8Kwargs,
)

Any ideas to resolve that? Or do you feel it's fine for functions in common to import from workflow?

huydhn · 2025-07-25T17:34:02Z

+        git submodule update --init --recursive
+        python use_existing_torch.py
+        pip install -r requirements/build.txt
+        pip install --no-build-isolation -e .


@jerryzh168 This setup, unfortunately, destroys our H100 fleet because:

- It builds vLLM from scratch without any cache, which could take hours, so it's a very inefficient use of H100 runners.

- The build somehow hangs the Docker daemon, maybe memory related. At the very least, you want to set MAX_JOBS here to match what vLLM does: https://github.com/vllm-project/vllm/blob/main/docker/Dockerfile#L198-L202

As it's not trivial to build vLLM from source locally, I think we should explore the option of using pre-built images from vLLM, which you can pull from public.ecr.aws/q9t5s3a7/vllm-ci-postmerge-repo:<COMMIT_HASH>. Here is an example: https://github.com/pytorch/pytorch-integration-testing/blob/main/.github/workflows/vllm-benchmark.yml#L149

Another way is to run this test on PyTorch CI, where @yangw-dev is working on a CI job to test vLLM main vs. PyTorch (we also need to build vLLM there, but with caching properly set up). We could also run pytest test/integration from AO there, I think.

I see, thanks. Will try out the suggestions in #2601.

clee2000 · 2025-08-06T17:04:57Z

/easycla

jerryzh168 force-pushed the jerryzh168/stack/9 branch from da79207 to 5cae4d0 Compare June 30, 2025 23:01

This was referenced Jun 30, 2025

Add support for Int4GroupwisePreshuffleTensor for fbgemm #2421

Merged

Remove transpose_input from fbgemm configs #2422

Merged

facebook-github-bot added the CLA Signed label Jun 30, 2025

jerryzh168 mentioned this pull request Jun 30, 2025

Add support for float8 activation for Int4PreshuffledTensor #2437

Merged

jerryzh168 added the topic: new feature label Jun 30, 2025

jerryzh168 changed the base branch from jerryzh168/stack/4 to main July 2, 2025 01:58

jerryzh168 force-pushed the jerryzh168/stack/9 branch from 5cae4d0 to 33ca58e Compare July 2, 2025 01:58

jerryzh168 changed the title ~~Add Float8RowwiseTensor~~ Add Float8Tensor Jul 2, 2025

jerryzh168 changed the base branch from main to jerryzh168/stack/4 July 2, 2025 01:58

jerryzh168 mentioned this pull request Jul 2, 2025

Add all fbgemm kernel Tensors into Int4WeightOnlyConfig and Float8DynamicActivationInt4WeightConfig #2474

Merged

vkuzo reviewed Jul 2, 2025

Comment thread torchao/quantization/quantize_/float8_tensor.py Outdated

vkuzo reviewed Jul 2, 2025

Comment thread torchao/quantization/quant_api.py Outdated

vkuzo reviewed Jul 2, 2025

Comment thread torchao/quantization/quantize_/float8_tensor.py Outdated

jerryzh168 changed the base branch from jerryzh168/stack/4 to main July 2, 2025 20:35

jerryzh168 force-pushed the jerryzh168/stack/9 branch from 33ca58e to 897ec7e Compare July 2, 2025 20:36

jerryzh168 changed the base branch from main to jerryzh168/stack/4 July 2, 2025 20:36

jerryzh168 changed the base branch from jerryzh168/stack/4 to main July 2, 2025 21:42

jerryzh168 force-pushed the jerryzh168/stack/9 branch 2 times, most recently from 7897dcf to 99a1bb1 Compare July 2, 2025 21:42

jerryzh168 mentioned this pull request Jul 2, 2025

Rename torchao.float8.Float8Tensor to torchao.float8.Float8TrainingTensor #2479

Merged

jerryzh168 changed the base branch from main to jerryzh168/stack/11 July 2, 2025 21:42

jerryzh168 changed the base branch from jerryzh168/stack/11 to main July 2, 2025 23:44

jerryzh168 force-pushed the jerryzh168/stack/9 branch from 99a1bb1 to 7e9f224 Compare July 2, 2025 23:44

jerryzh168 changed the base branch from main to jerryzh168/stack/11 July 2, 2025 23:44

jerryzh168 changed the base branch from jerryzh168/stack/11 to main July 3, 2025 00:09

jerryzh168 force-pushed the jerryzh168/stack/9 branch from 7e9f224 to 442bd6c Compare July 3, 2025 00:09

jerryzh168 changed the base branch from main to jerryzh168/stack/11 July 3, 2025 00:09

vkuzo reviewed Jul 25, 2025

Comment thread torchao/quantization/quantize_/common/general_kernel_preference.py Outdated

vkuzo reviewed Jul 25, 2025

huydhn reviewed Jul 25, 2025

huydhn mentioned this pull request Jul 26, 2025

Add daily lib integration test #2601

Closed

vkuzo reviewed Jul 28, 2025

Comment thread torchao/quantization/quantize_/common/general_kernel_preference.py Outdated

vkuzo reviewed Jul 28, 2025

Comment thread torchao/quantization/quantize_/common/general_kernel_preference.py Outdated

vkuzo reviewed Jul 28, 2025

Comment thread torchao/quantization/quantize_/common/general_kernel_preference.py Outdated