
Fold weight permutation inside quantized conv operator #26241

Closed

dzhulgakov wants to merge 8 commits into gh/dzhulgakov/2/base from gh/dzhulgakov/2/head

Conversation

dzhulgakov (Collaborator) commented Sep 14, 2019

Stack from ghstack:

According to #19092, we always keep NCHW order and do the layout handling inside the kernels. This PR fixes that for the weights of qconv by using the MemoryFormat mechanism.

Differential Revision: D17443219
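For readers unfamiliar with the mechanism, here is a minimal sketch of what "folding the permutation inside the operator" means, using the torch.channels_last API; prepack_weight_nchw is a hypothetical stand-in for the layout handling inside quantized.conv_prepack, not the actual implementation:

```python
import torch

def prepack_weight_nchw(w_nchw: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for the prepack's layout handling: accept an
    NCHW weight and reorder the underlying data to NHWC internally."""
    # Logical sizes stay (K, C, R, S); only the strides switch to NHWC order.
    return w_nchw.contiguous(memory_format=torch.channels_last)

w = torch.randn(16, 3, 3, 3)  # K, C, R, S -- plain NCHW, as callers now pass it
packed = prepack_weight_nchw(w)
print(packed.shape)           # torch.Size([16, 3, 3, 3]) -- still NCHW sizes
print(packed.stride())        # (27, 1, 9, 3) -- physically NHWC
```

With this shape of API, callers no longer write qw.permute([0, 2, 3, 1]) at the call site; the kernel sees NCHW metadata and manages the physical layout itself.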

@dzhulgakov dzhulgakov requested a review from apaszke as a code owner September 14, 2019 23:44
@pytorchbot pytorchbot added the module: nn, module: operators, and oncall: quantization labels Sep 14, 2019
@dzhulgakov dzhulgakov changed the title permute for weights Fold weight permutation inside quantized fold operator Sep 14, 2019
@dzhulgakov dzhulgakov changed the title Fold weight permutation inside quantized fold operator Fold weight permutation inside quantized conv operator Sep 14, 2019

jerryzh168 (Contributor) left a comment

LGTM

jerryzh168 (Contributor) commented

What about quantize linear ops?

dzhulgakov (Collaborator, Author) commented

What do you mean about linear? Linear ops don't have a memory order.

Also cc @VitalyFedyunin to verify that memory format stuff looks good
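A minimal sketch of the point above, assuming nothing beyond stock PyTorch: memory format only distinguishes layouts for 4D tensors (NCHW vs. NHWC), so a 2D linear weight has only one dense layout and nothing to fold:

```python
import torch

# 4D conv weight: two dense layouts exist (NCHW vs. NHWC / channels_last).
conv_w = torch.randn(8, 3, 3, 3).contiguous(memory_format=torch.channels_last)
print(conv_w.shape)  # torch.Size([8, 3, 3, 3]) -- logical order is still NCHW

# 2D linear weight: only one dense layout exists, so channels_last
# does not apply; PyTorch rejects it outright.
lin_w = torch.randn(8, 16)
try:
    lin_w.contiguous(memory_format=torch.channels_last)
except RuntimeError as err:
    print("2D weights have no alternate memory format:", err)
```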


b = torch.randn(oC, dtype=torch.float32) if use_bias else None
q_bias = torch.quantize_linear(b, scale=1.0 / 1024, zero_point=0, dtype=torch.qint32) if use_bias else None
q_filters_ref = torch.ops.quantized.conv_prepack(qw.permute([0, 2, 3, 1]),
Contributor

Don't you also need similar changes to the tests in test_quantized.py?

Collaborator Author

Oh, I think I accidentally put it in #26242 - let me just squash them together.

Dmytro Dzhulgakov added 2 commits September 17, 2019 20:04
VitalyFedyunin (Contributor) commented

Please don't land, I need to read it carefully.

VitalyFedyunin (Contributor) left a comment

Memory Format manipulations look good!

Dmytro Dzhulgakov added 3 commits September 18, 2019 18:00
Comment thread: test/test_quantized.py

      W_KRSC = W.permute([0, 2, 3, 1]).contiguous()
      if channelwise:
-         W_q = torch.quantize_linear_per_channel(W_KRSC,
+         W_q = torch.quantize_linear_per_channel(W,
Contributor

Thanks for changing this too. Did you test it locally by setting PYTORCH_TEST_WITH_QNNPACK?

Collaborator Author

Yes, I did :)

      scales.toType(kDouble),
      zero_points.toType(kLong),
-     {output_channels, kernel_h, kernel_w, C_per_G},
+     {output_channels, C_per_G, kernel_h, kernel_w},
Contributor

Do we need to change it similarly for QNNPACK as well below, since we store it as MemoryFormat::ChannelsLast in the packed struct?

Collaborator Author

No, you store the original Tensor, and it's fine to return a Tensor in ChannelsLast format (it'd just happen to be non-contiguous, but still semantically correct).
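A small illustration of that claim, again assuming only stock PyTorch: a channels_last tensor compares equal element-for-element to its default-layout original and indexes identically; it merely reports False for plain (NCHW) contiguity:

```python
import torch

w = torch.randn(4, 3, 5, 5)
w_cl = w.contiguous(memory_format=torch.channels_last)

# Same logical tensor: values and indexing are identical.
print(torch.equal(w, w_cl))  # True -- semantically the same weights
# Only the physical layout differs, so default contiguity is False.
print(w_cl.is_contiguous())  # False under the default (NCHW) meaning
print(w_cl.is_contiguous(memory_format=torch.channels_last))  # True
```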

zdevito pushed a commit to zdevito/ATen that referenced this pull request Sep 19, 2019
Summary:
Pull Request resolved: pytorch/pytorch#26241

According to pytorch/pytorch#19092 we always keep NCHW order and do handling inside the kernels. This PR fixes it for weights of the qconv by using MemoryLayout mechanism.

Test Plan: Imported from OSS

Differential Revision: D17443219

Pulled By: dzhulgakov

fbshipit-source-id: ce0eb92034a9977b3303dafab8b0414575171062
facebook-github-bot (Contributor) commented

@dzhulgakov merged this pull request in d5daac7.

@facebook-github-bot facebook-github-bot deleted the gh/dzhulgakov/2/head branch October 28, 2019 22:08
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026

Labels

Merged, module: nn, oncall: quantization


8 participants