Move QAT out of prototype #1091
Merged
Summary: Move QAT out of prototype so we can provide stronger
BC guarantees moving forward.
**BC-breaking notes**
Before:
```
from torchao.quantization.prototype.qat import (
    disable_4w_fake_quant,
    disable_8da4w_fake_quant,
    enable_4w_fake_quant,
    enable_8da4w_fake_quant,
    ComposableQATQuantizer,
    Int4WeightOnlyQATQuantizer,
    Int4WeightOnlyEmbeddingQATQuantizer,
    Int8DynActInt4WeightQATQuantizer,
    Int8DynActInt4WeightQATLinear,
)
from torchao.quantization.prototype.qat.api import (
    FakeQuantizeConfig,
)
from torchao.quantization.prototype.qat.fake_quantizer import (
    FakeQuantizer,
)
```
After:
```
from torchao.quantization.qat import (
    ComposableQATQuantizer,
    Int4WeightOnlyQATQuantizer,
    Int4WeightOnlyEmbeddingQATQuantizer,
    Int8DynActInt4WeightQATQuantizer,
)
from torchao.quantization.qat.linear import (
    disable_4w_fake_quant,
    disable_8da4w_fake_quant,
    enable_4w_fake_quant,
    enable_8da4w_fake_quant,
    Int8DynActInt4WeightQATLinear,
)
from torchao.quantization.qat.api import (
    FakeQuantizeConfig,
)
from torchao.quantization.qat.fake_quantizer import (
    FakeQuantizer,
)
```
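For downstream code that may need to run against torchao versions on either side of this move, one possible approach is to try the new module path first and fall back to the prototype path. The sketch below is only an illustration: `import_first_available` and `load_qat_module` are hypothetical helper names, not part of torchao, and the two module paths are taken from the Before/After lists above.

```python
import importlib

# Candidate locations of the QAT package, newest first. Both paths come from
# the Before/After notes in this PR; the helpers themselves are hypothetical.
QAT_MODULE_CANDIDATES = (
    "torchao.quantization.qat",
    "torchao.quantization.prototype.qat",
)


def import_first_available(module_paths):
    """Return the first module in module_paths that can be imported."""
    errors = []
    for path in module_paths:
        try:
            return importlib.import_module(path)
        except ImportError as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{path}: {exc}")
    raise ImportError(
        "none of the candidate modules could be imported: " + "; ".join(errors)
    )


def load_qat_module():
    """Import the QAT package from whichever location this torchao provides."""
    return import_first_available(QAT_MODULE_CANDIDATES)
```

A caller could then use `getattr(load_qat_module(), "Int8DynActInt4WeightQATQuantizer")`. Note that, per the After list, the `enable_*`/`disable_*` helpers and `Int8DynActInt4WeightQATLinear` are imported from the `qat.linear` submodule, so they would need their own candidate list.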
Test Plan:
python test/quantization/test_qat.py
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1091
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit fc9c1a9 with merge base 7aaf0ff.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
andrewor14
added a commit
that referenced
this pull request
Oct 16, 2024
ghstack-source-id: 9ebc45d
Pull Request resolved: #1091
jainapurva
approved these changes
Oct 16, 2024
Contributor
Author
(Don't merge this yet! Still need to ghimport once we figure out the diff train issue)
jerryzh168
approved these changes
Oct 16, 2024
andrewor14
added a commit
that referenced
this pull request
Oct 17, 2024
ghstack-source-id: cb72a8b
Pull Request resolved: #1091
Contributor
Author
@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
andrewor14
added a commit
that referenced
this pull request
Oct 17, 2024
Summary: Move QAT out of prototype so we can provide stronger
BC guarantees moving forward.
**(Future) BC-breaking notes**
Note: This commit itself doesn't break BC yet. A future PR
will do that. The following is just to save this BC breaking
note somewhere.
Before:
```
from torchao.quantization.prototype.qat import (
    disable_4w_fake_quant,
    disable_8da4w_fake_quant,
    enable_4w_fake_quant,
    enable_8da4w_fake_quant,
    ComposableQATQuantizer,
    Int4WeightOnlyQATQuantizer,
    Int4WeightOnlyEmbeddingQATQuantizer,
    Int8DynActInt4WeightQATQuantizer,
    Int8DynActInt4WeightQATLinear,
)
from torchao.quantization.prototype.qat.api import (
    FakeQuantizeConfig,
)
from torchao.quantization.prototype.qat.fake_quantizer import (
    FakeQuantizer,
)
```
After:
```
from torchao.quantization.qat import (
    ComposableQATQuantizer,
    Int4WeightOnlyQATQuantizer,
    Int4WeightOnlyEmbeddingQATQuantizer,
    Int8DynActInt4WeightQATQuantizer,
)
from torchao.quantization.qat.linear import (
    disable_4w_fake_quant,
    disable_8da4w_fake_quant,
    enable_4w_fake_quant,
    enable_8da4w_fake_quant,
    Int8DynActInt4WeightQATLinear,
)
from torchao.quantization.qat.api import (
    FakeQuantizeConfig,
)
from torchao.quantization.qat.fake_quantizer import (
    FakeQuantizer,
)
```
Test Plan:
python test/quantization/test_qat.py
ghstack-source-id: 926354f
Pull Request resolved: #1091
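The commit message above says the move itself does not break BC yet. The thread does not show how the old path is kept working, so the following is only a generic sketch of one common technique: registering a thin alias module that forwards attribute access to the new location and emits a `DeprecationWarning`. `make_deprecated_alias` is a hypothetical helper, and the usage example aliases the standard-library `json` module purely for demonstration.

```python
import sys
import types
import warnings


def make_deprecated_alias(old_name, new_module):
    """Register old_name in sys.modules as an alias that forwards to new_module."""
    alias = types.ModuleType(old_name)

    def _forward(attr):
        # Module-level __getattr__ (PEP 562) runs only for names that are not
        # already defined on the alias, i.e. for every forwarded symbol.
        warnings.warn(
            f"{old_name} is deprecated; import {attr} from "
            f"{new_module.__name__} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(new_module, attr)

    alias.__getattr__ = _forward
    sys.modules[old_name] = alias
    return alias


if __name__ == "__main__":
    import json

    # Hypothetical old name; real code would use the old package path.
    make_deprecated_alias("legacy_json_alias", json)
    from legacy_json_alias import dumps  # triggers a DeprecationWarning

    print(dumps({"moved": True}))
```

With a dotted old name, the parent packages also have to be importable, so a real package would more likely keep a small re-exporting module at the old location instead.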
andrewor14
added a commit
that referenced
this pull request
Oct 17, 2024
ghstack-source-id: add9dca
Pull Request resolved: #1091
Contributor
Author
@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Closed
andrewor14
added a commit
to andrewor14/torchtune
that referenced
this pull request
Oct 22, 2024
Summary: pytorch/ao#1091 moved QAT out of prototype in torchao. This is a BC-breaking change, so torchtune also needs to update its QAT imports. Additionally, after pytorch/ao#987 we decided that QAT in torchao will use module swaps to insert fake quantizes, so a separate module swap quantizer is no longer needed. This commit therefore removes the `*ModuleSwapQuantizer` option.
Test Plan: pytest -m integration_test tests/recipes/test_qat_distributed.py should work
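For a downstream import update like the one this commit describes, the Before/After lists from pytorch/ao#1091 can be encoded as a lookup table that maps each symbol to its new module. The sketch below is illustrative only: the table is transcribed from the After list in that PR, and `migrated_imports` is a hypothetical helper, not something torchao or torchtune provides.

```python
# Symbol -> new module, transcribed from the "After" list in pytorch/ao#1091.
NEW_MODULE_BY_SYMBOL = {
    "ComposableQATQuantizer": "torchao.quantization.qat",
    "Int4WeightOnlyQATQuantizer": "torchao.quantization.qat",
    "Int4WeightOnlyEmbeddingQATQuantizer": "torchao.quantization.qat",
    "Int8DynActInt4WeightQATQuantizer": "torchao.quantization.qat",
    "disable_4w_fake_quant": "torchao.quantization.qat.linear",
    "disable_8da4w_fake_quant": "torchao.quantization.qat.linear",
    "enable_4w_fake_quant": "torchao.quantization.qat.linear",
    "enable_8da4w_fake_quant": "torchao.quantization.qat.linear",
    "Int8DynActInt4WeightQATLinear": "torchao.quantization.qat.linear",
    "FakeQuantizeConfig": "torchao.quantization.qat.api",
    "FakeQuantizer": "torchao.quantization.qat.fake_quantizer",
}


def migrated_imports(symbols):
    """Return import statements for symbols, grouped by their new module."""
    grouped = {}
    for symbol in symbols:
        # A KeyError here means the symbol is not covered by the PR's notes.
        module = NEW_MODULE_BY_SYMBOL[symbol]
        grouped.setdefault(module, []).append(symbol)
    return [
        f"from {module} import {', '.join(sorted(names))}"
        for module, names in sorted(grouped.items())
    ]


if __name__ == "__main__":
    for line in migrated_imports(
        ["FakeQuantizer", "enable_8da4w_fake_quant", "disable_8da4w_fake_quant"]
    ):
        print(line)
```

Raising `KeyError` for unknown symbols is deliberate: it makes it obvious when a codebase uses a QAT name that the BC-breaking notes do not mention.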
Contributor
Author
Oh... this was merged into the wrong branch. I need to reopen this
Contributor
oh I see, that's why there is a fix up diff
Contributor
Author
yeah exactly, opened #1152 to reland
andrewor14
added a commit
that referenced
this pull request
Oct 24, 2024
Move QAT out of prototype
ghstack-source-id: add9dca
Pull Request resolved: #1091
jainapurva
pushed a commit
that referenced
this pull request
Oct 24, 2024
Move QAT out of prototype
ghstack-source-id: add9dca
Pull Request resolved: #1091
andrewor14
added a commit
to andrewor14/torchtune
that referenced
this pull request
Oct 25, 2024
Summary: pytorch/ao#1091 moved QAT out of prototype in torchao. This is a BC-breaking change, so torchtune also needs to update its QAT imports. Additionally, after pytorch/ao#987 we decided that QAT in torchao will use module swaps to insert fake quantizes, so a separate module swap quantizer is no longer needed. This commit therefore removes the `*ModuleSwapQuantizer` option.
Test Plan: pytest -m integration_test tests/recipes/test_qat_distributed.py should work
yanbing-j
pushed a commit
to yanbing-j/ao
that referenced
this pull request
Dec 9, 2024
…at/ folder (pytorch#1076)
* [Hackability Refactor] Move known_model_params under torchchat (pytorch#1073)
* [Hackability Refactor] Migrate CLI call sites to explicitly go through torchchat.py (pytorch#1075)
* [Hackability Refactor] Move model.py underneath torchchat/ (pytorch#1077)
* Move model.py
* Clear out init to avoid package circular import
* [Hackability Refactor] Move select top level docs into folders within torchchat (pytorch#1080)
* [Hackability Refactor] Move the top level util folder into torchchat/utils (pytorch#1079)
* [Hackability Refactor] Move the top level util file into torchchat/utils/
* Cleared out init to avoid packing
* [Hackability Refactor] Collapse gguf_util into gguf_loader (pytorch#1078)
* [Hackability Refactor] Collapse gguf_util into gguf_loader
* Update bad import
* [Hackability Refactor] Move model_config into torchchat/model_config (pytorch#1082)
* [Hackability Refactor] Move cli related files under torchchat/cli (pytorch#1083)
* [Hackability Refactor] Move build/util into torchchat/utils (pytorch#1084)
* [Hackability Refactor] Easy Moves: eval, gguf_loader, quantize, model_dist (pytorch#1085)
* [Hackability Refactor] Easy Cheap Moves: eval, gguf_loader, quantize, model_dist
* Update eval.py call sites that slipped through the initial pass
* [Hackability Refactor] Update missed direct file calls to use torchchat.py (pytorch#1088)
* [Hackability Refactor] Move export and generate under torchchat/ (pytorch#1089)
* [Hackability Refactor] Move scripts under torchchat/utils (pytorch#1090)
* [Hackability Refactor] Move scripts under torchchat/utils
* Fix install script for AOTI
* Update referenced path in build_android
* Adding missing utils path
* Add another layer for torchchat
* Move the source command depending on if TC root is defined
* [Hackability Refactor] Move installation related files into install/ (pytorch#1081)
* [Hackability Refactor] Move installation related files into install/
* Fix install req path
* Test fix with install path for bash
* Debug messages
* Remove changes to install in et_python_libs
* Remove debug echo
* Fix pin path for et
* [Hackability Refactor] Restricted Lint (pytorch#1091)
* [Hackability Refactor] Removing __main__ from export/generate/eval (pytorch#1092)
Differential Revision: D64555609