Implement QAT for APoT #83282

Closed
asl3 wants to merge 3 commits into gh/asl3/50/base from gh/asl3/50/head

Conversation

@asl3 (Contributor) commented Aug 11, 2022

Stack from ghstack (oldest at bottom):

### Summary:

This PR implements QAT for APoT FakeQuant. It runs QAT on FX graph mode quantized models (pre-trained ResNet-18, full ImageNet dataset) to compare accuracy across qconfig settings of uniform vs. APoT quantized activations and weights. It also refactors the APoT PTQ module `apot_fx_graph_mode_ptq.py` (previously `fx_graph_mode_apot.py`) so that helper functions shared between PTQ and QAT live in a separate file, `quantization_util.py`.

Model #2 (uniformly quantized activation, APoT quantized weight) shows accuracy comparable to model #1 (uniformly quantized activation, uniformly quantized weight) at 8-bit, and a significant accuracy improvement at 4-bit (see the "Accuracy Stats" section below).
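
For context, a minimal sketch of the kind of FX graph mode QAT flow being compared here. The uniform pieces (`QConfig`, `get_default_qat_qconfig`, `prepare_qat_fx`, `convert_fx`) are stock PyTorch APIs; the `APoTFakeQuantize` import path and its `b`/`k` arguments are assumptions for illustration, not the exact wiring in `apot_qat.py`:

```python
import torch
import torchvision
from torch.ao.quantization import QConfig, get_default_qat_qconfig
from torch.ao.quantization.quantize_fx import prepare_qat_fx, convert_fx

# Assumed location/signature of the experimental APoT fake quant module.
from torch.ao.quantization.experimental.fake_quantize import APoTFakeQuantize

# Keep the uniform activation fake quant from the stock QAT qconfig,
# but swap the weight side for APoT (b = total bits, k = base bits).
uniform_qconfig = get_default_qat_qconfig("fbgemm")
apot_qconfig = QConfig(
    activation=uniform_qconfig.activation,        # uniform activation
    weight=APoTFakeQuantize.with_args(b=8, k=2),  # APoT weight (illustrative args)
)

model = torchvision.models.resnet18(pretrained=True).train()
example_inputs = (torch.randn(1, 3, 224, 224),)

# Insert fake-quant modules, fine-tune with them in the loop, then convert.
prepared = prepare_qat_fx(model, {"": apot_qconfig}, example_inputs)
# ... fine-tune `prepared` on ImageNet here ...
quantized = convert_fx(prepared.eval())
```

Model #1 in the comparison uses `uniform_qconfig` throughout; model #2 swaps in the APoT weight fake quant as above.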

### Test Plan:

Run QAT models with: `python test/quantization/core/experimental/apot_qat.py`
Run PTQ models with: `python test/quantization/core/experimental/apot_ptq.py`

### Accuracy Stats

8-bit (Uniform int8; APoT b = 8, k = 2)

Model #1: Uniform activation, uniform weight (FX Graph Mode quantized)
Evaluation accuracy on test dataset: 69.67% (Top-1), 89.04% (Top-5)

Model #2: Uniform activation, APoT weight (FX Graph Mode quantized)
Evaluation accuracy on test dataset: 69.72% (Top-1), 89.06% (Top-5)

4-bit (Uniform int4; APoT b = 4, k = 2)

Model #1: Uniform activation, uniform weight (FX Graph Mode quantized)
Evaluation accuracy on test dataset: 46.85% (Top-1), 72.85% (Top-5)

Model #2: Uniform activation, APoT weight (FX Graph Mode quantized)
Evaluation accuracy on test dataset: 66.45% (Top-1), 86.23% (Top-5)
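
For readers unfamiliar with the b/k notation: in APoT (additive powers-of-two) quantization, b is the total bitwidth and k the base bitwidth, and each quantization level is a sum of n = b/k power-of-two terms (Li et al., "Additive Powers-of-Two Quantization", ICLR 2020). A small self-contained sketch of level generation under that paper's formulation, not PyTorch's experimental implementation:

```python
from itertools import product

def apot_levels(b: int, k: int, alpha: float = 1.0):
    """Enumerate APoT quantization levels for total bitwidth b and base
    bitwidth k: each level is a sum of n = b // k terms, where term i is
    either 0 or a power of two 2^-(i + j*n) for j = 0 .. 2^k - 2."""
    n = b // k  # number of additive power-of-two terms
    term_choices = [
        [0.0] + [2.0 ** -(i + j * n) for j in range(2 ** k - 1)]
        for i in range(n)
    ]
    levels = sorted({sum(combo) for combo in product(*term_choices)})
    gamma = alpha / max(levels)  # normalize so the largest level equals alpha
    return [gamma * level for level in levels]

# b=4, k=2 gives two terms with four choices each: 2^4 = 16 distinct levels.
print(len(apot_levels(4, 2)))
```

The nonuniform spacing concentrates levels near zero, matching the bell-shaped weight distributions that motivate APoT and consistent with the 4-bit weight-quantization gap reported above.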

[ghstack-poisoned]
@facebook-github-bot (Contributor) commented Aug 11, 2022

✅ No Failures (0 Pending)

As of commit ef5f5c6 (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

@jerryzh168 (Contributor) left a comment

looks good, thanks!

asl3 added a commit that referenced this pull request Aug 12, 2022
ghstack-source-id: 563f765
Pull Request resolved: #83282
@asl3 (Contributor, Author) commented Aug 12, 2022

@pytorchbot merge -g

@pytorchmergebot (Collaborator) commented

@pytorchbot successfully started a merge job. Check the current status here.
The merge job was triggered with the green (-g) flag. This means that your change will be merged once all checks on your PR have passed (ETA: 0-4 Hours). If this is not the intended behavior, feel free to use some of the other merge options in the wiki.
Please reach out to the PyTorch DevX Team with feedback or questions!

@github-actions (Contributor) commented

Hey @asl3.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

@asl3 added the 'release notes: quantization' and 'topic: not user facing' labels on Aug 12, 2022
facebook-github-bot pushed a commit that referenced this pull request Aug 12, 2022

Pull Request resolved: #83282
Approved by: https://github.com/jerryzh168

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/0e0f8fd03e0967059456d4755c47e8fd4407796d

Reviewed By: atalman

Differential Revision: D38658591

Pulled By: asl3

fbshipit-source-id: 92d8e64ff41c5b3903516758381539113d0e432b
@facebook-github-bot deleted the gh/asl3/50/head branch August 15, 2022 14:19