Remove support for quant_llm_linear #3520

Merged
howardzhang-cv merged 9 commits into main from gh/howardzhang-cv/1/head on Jan 5, 2026

Conversation

@howardzhang-cv howardzhang-cv commented Dec 20, 2025

Contributor

Stack from ghstack (oldest at bottom):

Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls

Tasks: Related to issue #3516

Differential Revision: D89908990

[ghstack-poisoned]
[ghstack-poisoned]
pytorch-bot Bot commented Dec 20, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3520

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit 80d8cd1 with merge base 27c5eb9 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

howardzhang-cv added a commit that referenced this pull request Dec 20, 2025
Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls

Tasks: Related to issue [#3516](github.com//issues/3516)
ghstack-source-id: 69c1877
Pull-Request: #3520
@meta-cla meta-cla Bot added the CLA Signed label Dec 20, 2025
@howardzhang-cv howardzhang-cv marked this pull request as draft December 20, 2025 02:09
jerryzh168 commented Dec 20, 2025

Contributor

probably have to delete this and related tests etc. as well:

class FPXWeightOnlyConfig(AOBaseConfig):

you can search for quant_llm_linear in the code base (https://github.com/search?q=repo%3Apytorch%2Fao%20quant_llm_linear&type=code) and delete all the related code
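
The search suggested above can also be run locally against a checkout. The following is a minimal sketch, not part of the PR: `find_references` and its `suffixes` default are hypothetical names chosen for illustration, and the set of file extensions worth scanning is an assumption.

```python
from pathlib import Path


def find_references(root, symbol, suffixes=(".py", ".cu", ".cpp", ".h", ".md")):
    """Return (path, line_number, stripped_line) for each line under `root`
    that contains `symbol`. Only files whose suffix is in `suffixes` are read.

    Hypothetical helper for illustration; the suffix list is an assumption.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            # Skip files that cannot be read as UTF-8 text.
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            if symbol in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_references(".", "quant_llm_linear")` from the repository root would list every remaining mention, which is one way to confirm that nothing was missed before landing.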

@howardzhang-cv

Contributor Author

This is my first time working with the torchao repo, so I'm not sure this is the right way to do it:
I deleted the entire fp6_llm folder and modified ops.py and test_ops.py to remove the calls to quant_llm_linear. Is this what we wanted? Or did we want to delete fp6_llm, keep the calls to quant_llm_linear, and have them raise an error instead?
Also, if we are deleting quant_llm_linear, should I keep floatx_tensor_core? I might be misunderstanding, but it seems the only point of those functions was to create the fp6 layout used with quant_llm_linear. In any case, there are still references to quant_llm_linear in floatx_tensor_core_layout.py and in the README in that same folder that I have not removed. I just want confirmation that this is what I'm supposed to be doing before continuing.
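
For reference, the alternative raised in this comment (keeping the entry point but making it raise) could look roughly like the sketch below. This is an illustration only and not what the PR ended up doing: the `*args, **kwargs` signature and the error message are assumptions, and the real function's parameters are not reproduced here.

```python
def quant_llm_linear(*args, **kwargs):
    """Hypothetical stub: keeps the name importable but fails loudly on use.

    The signature and message are assumptions made for illustration.
    """
    raise NotImplementedError(
        "quant_llm_linear has been removed; see issue #3516 for context."
    )
```

A stub like this trades a smaller API break (imports keep working) for dead code that still has to be maintained, which is the trade-off the question is about.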

jerryzh168 commented Dec 20, 2025

Contributor

@howardzhang-cv I think it might be cleaner if you delete the floatx_tensor_core_layout and the FPXWeightOnlyConfig in a separate PR first, before doing this

[ghstack-poisoned]
howardzhang-cv added a commit that referenced this pull request Dec 24, 2025
Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls
Removed all tests/references to floatx_tensor_core_layout and FloatXTensorCoreLayout
Removed all tests/references to FPXWeightOnlyConfig

Tasks: Related to issue [#3516](github.com//issues/3516)
ghstack-source-id: fe8afeb
Pull-Request: #3520
@howardzhang-cv

Contributor Author

Since the two seemed pretty intertwined (floatx seems to use only quant_llm_linear), it made more sense to me to add them to this PR as well. I removed all references to and tests for floatx_tensor_core_layout and FPXWeightOnlyConfig. That involved quite a few more code changes and deletions, so please check that I didn't delete anything important.

The floatx folder also contained the float8 layouts and cutlass_semi_sparse layouts, which I kept. I believe a couple of references to floatx elsewhere in the repo (from_hp_to_floatx, for example) actually refer to these. Please let me know if this is correct.
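
One way to sanity-check a removal like this is to confirm which names a module still exposes after the change. The sketch below is a generic, hypothetical helper (`split_by_availability` is not a torchao function) built only on the standard library's `importlib`.

```python
import importlib


def split_by_availability(module_name, names):
    """Import `module_name` and split `names` into (present, missing)
    according to whether the module exposes each one as an attribute.

    Hypothetical helper for illustration, not part of torchao.
    """
    module = importlib.import_module(module_name)
    present = [name for name in names if hasattr(module, name)]
    missing = [name for name in names if not hasattr(module, name)]
    return present, missing
```

With torchao installed, something like `split_by_availability("torchao.quantization", ["FPXWeightOnlyConfig"])` could be used to check that the removed config is reported as missing, assuming that module is where the name was exported.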

@howardzhang-cv howardzhang-cv added the module: bc-breaking label Dec 24, 2025
[ghstack-poisoned]
howardzhang-cv added a commit that referenced this pull request Dec 24, 2025
Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls
Removed all tests/references to floatx_tensor_core_layout and FloatXTensorCoreLayout
Removed all tests/references to FPXWeightOnlyConfig

Tasks: Related to issue [#3516](github.com//issues/3516)
ghstack-source-id: c124f6d
Pull-Request: #3520

@jerryzh168 jerryzh168 left a comment

Contributor

Looks good, thanks! I think we can land this as long as CI passes.

@howardzhang-cv howardzhang-cv added the module: deprecation label Dec 24, 2025
[ghstack-poisoned]
howardzhang-cv added a commit that referenced this pull request Dec 24, 2025
Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls
Removed all tests/references to floatx_tensor_core_layout and FloatXTensorCoreLayout
Removed all tests/references to FPXWeightOnlyConfig

Tasks: Related to issue [#3516](github.com//issues/3516)
ghstack-source-id: 750701f
Pull-Request: #3520
[ghstack-poisoned]
[ghstack-poisoned]
howardzhang-cv added a commit that referenced this pull request Dec 24, 2025
Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls
Removed all tests/references to floatx_tensor_core_layout and FloatXTensorCoreLayout
Removed all tests/references to FPXWeightOnlyConfig

Tasks: Related to issue [#3516](github.com//issues/3516)
ghstack-source-id: c4d9c5a
Pull-Request: #3520
@howardzhang-cv howardzhang-cv marked this pull request as ready for review December 24, 2025 08:30
Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls

Tasks: Related to issue [#3516](github.com//issues/3516)

[ghstack-poisoned]
Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls

Tasks: Related to issue [#3516](github.com//issues/3516)

[ghstack-poisoned]
howardzhang-cv added a commit that referenced this pull request Dec 30, 2025
Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls
Removed all tests/references to floatx_tensor_core_layout and FloatXTensorCoreLayout
Removed all tests/references to FPXWeightOnlyConfig

Tasks: Related to issue [#3516](github.com//issues/3516)
ghstack-source-id: 991d72b
Pull-Request: #3520
@howardzhang-cv

Contributor Author

@howardzhang-cv has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@howardzhang-cv howardzhang-cv changed the base branch from gh/howardzhang-cv/1/base to main January 5, 2026 20:28
@howardzhang-cv howardzhang-cv merged commit 3c2030b into main Jan 5, 2026
44 of 45 checks passed

Labels

CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
module: bc-breaking (Use this tag if this PR breaks backward compatibility)
module: deprecation (Use this tag if this PR deprecates a feature)

Projects

None yet

2 participants