
[quant][graphmode] Different rule for handling aten::cat#38570

Closed
jerryzh168 wants to merge 5 commits into gh/jerryzh168/320/base from gh/jerryzh168/320/head

Conversation

@jerryzh168
Contributor

@jerryzh168 jerryzh168 commented May 15, 2020

Stack from ghstack:

Summary:
We changed the rule for quantizing `aten::cat`. Previously, `aten::cat` was treated as an op
that should always be quantized, like `aten::conv2d`, but this is not ideal. A better approach
is to quantize the output of `aten::cat` based on whether its inputs are quantized: if they are,
we quantize the output; if not, we leave the output unquantized, since `aten::cat` works on both
quantized and non-quantized tensors.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D21600160
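
As a quick illustration of why this rule makes sense (not part of the PR itself), here is a minimal sketch, assuming per-tensor quantized `cat` support on the QuantizedCPU backend: `torch.cat` accepts both float and quantized tensors, so the output's quantization can simply follow the inputs.

```python
import torch

x = torch.randn(2, 3)
y = torch.randn(2, 3)

# Float inputs: cat produces a float output.
print(torch.cat([x, y]).dtype)  # torch.float32

# Quantized inputs: cat produces a quantized output
# (inputs share scale/zero_point here).
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)
qy = torch.quantize_per_tensor(y, scale=0.1, zero_point=0, dtype=torch.quint8)
print(torch.cat([qx, qy]).dtype)  # torch.quint8
```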

@dr-ci

dr-ci Bot commented May 15, 2020

💊 CI failures summary and remediations

As of commit 7236764 (more details on the Dr. CI page):


  • 2/2 failures possibly* introduced in this PR
    • 2/2 non-CircleCI failure(s)

Extra GitHub checks: 1 failed


ci.pytorch.org: 1 failed



```python
        .run(m.graph)

# non quantized cat
m = torch.jit.script(NonQuantizedCat()).eval()
```
Contributor


In this case, what is the expected behavior? Would you quantize all the inputs to the non-quantized cat and quantize the output too? In general, don't you quantize the input to a model?

Contributor Author

@jerryzh168 jerryzh168 May 15, 2020


No, we don't quantize the inputs to cat in this case; the output of cat is only quantized when all the inputs are quantized.
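
For reference, a hypothetical sketch of what the `NonQuantizedCat` test module quoted above could look like (not the PR's exact test code): cat consumes the raw float model inputs, so under the new rule none of its inputs are observed and its output stays unquantized.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the NonQuantizedCat module referenced in the test:
# cat is applied directly to the float inputs, so no observer is inserted
# for its output under the new rule.
class NonQuantizedCat(nn.Module):
    def forward(self, x, y):
        return torch.cat([x, y], dim=0)

m = torch.jit.script(NonQuantizedCat()).eval()
```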

```diff
     !isObserved(v, block_observed_values)) {
-  if (auto observer_opt = getObserverFor(v)) {
+  auto observer_opt = getObserverFor(v);
+  // If the node is one of the propagate quant node, e.g.
```
Contributor


Isn't this logic also true for add or conv? Why special-case concat?

Contributor Author


This is only true for cat; for conv we'll always quantize its input and output.

Contributor


We need a consistent set of rules. Consider the following two models:

```python
import torch
import torch.nn as nn

class testM(nn.Module):
    def __init__(self):
        super().__init__()
        self.c = nn.Conv2d(3, 5, 1)
        self.d = nn.Conv2d(3, 5, 1)

    def forward(self, x):
        y = self.c(x)
        z = self.d(x)
        w = torch.cat((y, z))
        return w

# Second one:
class testM2(nn.Module):
    def __init__(self):
        super().__init__()
        self.c = nn.Conv2d(3, 5, 1)
        self.d = nn.Conv2d(3, 5, 1)

    def forward(self, x):
        w = torch.cat((x, x))
        y = self.c(w)
        z = self.d(w)
        return y, z
```

In the first case, the inputs to cat will be quantized. In the second case they will not be.
What if in the first case I have an nn.Identity prior to the conv, or a reshape? In that case do we quantize the inputs?

Contributor Author


We don't quantize identity or reshape. We also don't accept input that's already quantized outside of the model.

Contributor Author


In terms of user interface, the user will always provide a floating-point Tensor, regardless of how the model is quantized.

Contributor


Got it: the input is always in fp, and in certain cases the input is quantized (conv) while in certain cases it is not (cat).
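
To summarize the rule this thread converged on, here is a hypothetical, simplified model of the observer-insertion decision (illustrative only, not the actual `insert_observers` pass): always-quantized ops get observers unconditionally, while propagate-quant ops like `aten::cat` are observed only when every input is.

```python
# Hypothetical, simplified model of the rule discussed above; the op sets
# and function name are illustrative, not the pass's real data structures.
ALWAYS_QUANTIZED = {"aten::conv2d"}  # input/output always observed
PROPAGATE_QUANT = {"aten::cat"}      # output follows the inputs

def should_observe_output(op_kind, inputs_observed):
    """op_kind: operator name; inputs_observed: one bool per input."""
    if op_kind in ALWAYS_QUANTIZED:
        return True
    if op_kind in PROPAGATE_QUANT:
        return all(inputs_observed)
    return False

assert should_observe_output("aten::conv2d", [False])         # conv: always observed
assert should_observe_output("aten::cat", [True, True])       # all inputs quantized
assert not should_observe_output("aten::cat", [False, True])  # mixed: stays float
```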

@jerryzh168 jerryzh168 requested a review from raghuramank100 May 15, 2020 20:38
@facebook-github-bot
Contributor

This pull request has been merged in 1ef77f9.

@facebook-github-bot facebook-github-bot deleted the gh/jerryzh168/320/head branch May 23, 2020 14:16
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
…8570)

Summary:
Pull Request resolved: pytorch#38570

We changed the rule for quantizing `aten::cat`. Previously, `aten::cat` was treated as an op
that should always be quantized, like `aten::conv2d`, but this is not ideal. A better approach
is to quantize the output of `aten::cat` based on whether its inputs are quantized: if they are,
we quantize the output; if not, we leave the output unquantized, since `aten::cat` works on both
quantized and non-quantized tensors.

Test Plan: Imported from OSS

Differential Revision: D21600160

fbshipit-source-id: efa957e0eaa608fffefcdfefa7f442fab45605eb

Labels

Merged · oncall: jit


4 participants