Remove support for quant_llm_linear by howardzhang-cv · Pull Request #3520 · pytorch/ao

howardzhang-cv · 2025-12-20T02:08:44Z

Stack from ghstack (oldest at bottom):

-> Remove support for quant_llm_linear #3520

Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls

Tasks: Related to issue #3516

Differential Revision: D89908990

[ghstack-poisoned]

pytorch-bot · 2025-12-20T02:08:47Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3520

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

B200 runners are down due to network issues

✅ No Failures

As of commit 80d8cd1 with merge base 27c5eb9:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls

Tasks: Related to issue #3516

ghstack-source-id: 69c1877
Pull-Request: #3520

jerryzh168 · 2025-12-20T02:11:37Z

We probably have to delete this and the related tests as well:

ao/torchao/prototype/quantization/quant_api.py

Line 620 in 7035fb7

class FPXWeightOnlyConfig(AOBaseConfig):

You can search for quant_llm_linear in the code base (https://github.com/search?q=repo%3Apytorch%2Fao%20quant_llm_linear&type=code) and delete all the related code.
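The same search can also be run against a local checkout. The following is a minimal sketch, not part of the PR: the helper name `find_references` and the list of file suffixes are assumptions chosen for illustration.

```python
from pathlib import Path


def find_references(root, symbol, suffixes=(".py", ".cu", ".cpp", ".h", ".md")):
    """Return (path, line_number, line) for each line under root that contains symbol.

    Hypothetical helper: the suffix list is an assumption, adjust it as needed.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only inspect regular files with one of the chosen suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            # Skip files that cannot be read as UTF-8 text.
            continue
        for number, line in enumerate(text.splitlines(), start=1):
            if symbol in line:
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_references(".", "quant_llm_linear")` from the repository root would list every remaining reference in the selected file types, which gives a checklist of code, tests, and docs to review before deleting.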

howardzhang-cv · 2025-12-20T02:14:26Z

This is my first time working with the torchao repo, so I'm not sure this is the right way to do it:
I deleted the entire fp6_llm folder and modified ops.py and test_ops.py to remove calls to quant_llm_linear. Is this what we wanted? Or did we want to delete fp6_llm, keep the calls to quant_llm_linear, and just raise an error or something?
Also, if we are deleting quant_llm_linear, should I keep floatx_tensor_core? I might be misunderstanding, but it seems the point of those functions was just to create the fp6 that could use quant_llm_linear. In any case, floatx_tensor_core_layout.py and the README in the same folder still reference quant_llm_linear, and I have not removed those references. I wanted confirmation that this is what I'm supposed to be doing before continuing.
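For illustration, the second option in the question above (keep the name but raise an error) could look roughly like the sketch below. It is hypothetical: the catch-all signature is a placeholder rather than the real torchao signature, and the PR summary describes removing the calls rather than stubbing them.

```python
def quant_llm_linear(*args, **kwargs):
    """Hypothetical stub: keeps the name importable but fails on any call."""
    # The message text is an assumption; it only points readers at this PR.
    raise NotImplementedError(
        "quant_llm_linear has been removed; see pytorch/ao pull request #3520."
    )
```

A stub like this lets existing imports keep working while any call fails with a clear message. Outright removal makes the break visible earlier, at import or attribute lookup.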

jerryzh168 · 2025-12-20T02:21:49Z

@howardzhang-cv I think it might be cleaner if you delete the floatx_tensor_core_layout and the FPXWeightOnlyConfig in a separate PR first, before doing this

Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls
Removed all tests/references to floatx_tensor_core_layout and FloatXTensorCoreLayout
Removed all tests/references to FPXWeightOnlyConfig

Tasks: Related to issue #3516

ghstack-source-id: fe8afeb
Pull-Request: #3520

howardzhang-cv · 2025-12-24T02:19:55Z

Since they seemed pretty intertwined (floatx seems to use only quant_llm_linear), it made more sense to me to add them to this PR as well. I removed all the references to and tests for floatx_tensor_core_layout and FPXWeightOnlyConfig. That involved quite a few more code changes and deletions, so please check that I didn't delete anything important.

The floatx folder also contained the float8 layouts and cutlass_semi_sparse layouts, which I kept. I believe a couple of references to floatx in the repo (from_hp_to_floatx, for example) actually refer to these. Please let me know if this is correct.

jerryzh168

Looks good, thanks! I think we can land as long as CI passes.

howardzhang-cv · 2025-12-30T01:33:31Z

@howardzhang-cv has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

howardzhang-cv added 2 commits December 19, 2025 18:08

Update

07076f0

Update (base update)

ebdf60f

meta-cla Bot added the CLA Signed label Dec 20, 2025

howardzhang-cv marked this pull request as draft December 20, 2025 02:09

howardzhang-cv requested a review from jerryzh168 December 20, 2025 02:09

Update

9e790a9

howardzhang-cv added the module: bc-breaking label Dec 24, 2025

Update

724f2ef

jerryzh168 approved these changes Dec 24, 2025

howardzhang-cv added the module: deprecation label Dec 24, 2025

Update

0904ee3

howardzhang-cv added 2 commits December 23, 2025 19:54

Update (base update)

9315458

Update

739c904

howardzhang-cv marked this pull request as ready for review December 24, 2025 08:30

howardzhang-cv added 2 commits December 29, 2025 17:32

Update on "Remove support for quant_llm_linear"

80d8cd1

howardzhang-cv changed the base branch from gh/howardzhang-cv/1/base to main January 5, 2026 20:28

howardzhang-cv merged commit 3c2030b into main Jan 5, 2026
44 of 45 checks passed

andrewor14 mentioned this pull request Jan 26, 2026

Making torchao ABI stable and moving closer to python only #3516

Closed

howardzhang-cv merged 9 commits into main from gh/howardzhang-cv/1/head
