quant bench: update observer configs #42956

Closed
vkuzo wants to merge 2 commits into gh/vkuzo/122/base from gh/vkuzo/122/head

Contributor

@vkuzo vkuzo commented Aug 13, 2020

Stack from ghstack:

Summary:

In preparation for observer perf improvement, cleans up the
micro benchmarks:

  • disable CUDA for histogram observers (it's too slow)
  • add larger shapes for better representation of real workloads
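
As a rough illustration of the two changes listed above, the following sketch builds a benchmark config grid in plain Python. The observer names are used only as labels, and the shapes and the `build_configs` helper are hypothetical, not taken from `qobserver_test.py`. The point is only that histogram observers get CPU-only entries while the other observers keep both devices, and that a larger shape is added to the grid.

```python
from itertools import product

# Hypothetical shapes (C, M, N): one small shape and one larger shape.
SHAPES = [(3, 512, 512), (8, 1024, 1024)]

# Hypothetical observer groups; the names are only labels in this sketch.
FAST_OBSERVERS = ["MinMaxObserver", "MovingAverageMinMaxObserver"]
HISTOGRAM_OBSERVERS = ["HistogramObserver"]


def build_configs(observers, shapes, devices):
    """Return one config dict per (observer, shape, device) combination."""
    return [
        {"observer": obs, "shape": shape, "device": dev}
        for obs, shape, dev in product(observers, shapes, devices)
    ]


# Non-histogram observers are benchmarked on both devices.
configs = build_configs(FAST_OBSERVERS, SHAPES, ["cpu", "cuda"])
# Histogram observers are restricted to CPU, mirroring "disable CUDA".
configs += build_configs(HISTOGRAM_OBSERVERS, SHAPES, ["cpu"])

for cfg in configs:
    print(cfg)
```

Keeping the device list as a per-group argument means a slow observer can be dropped from CUDA runs without touching the shape grid.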

Test Plan:

cd benchmarks/operator_benchmark
python -m pt.qobserver_test

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D23093996

dr-ci Bot commented Aug 13, 2020

💊 CI failures summary and remediations

As of commit d96e549 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



vkuzo added a commit that referenced this pull request Aug 16, 2020
ghstack-source-id: 6047570
Pull Request resolved: #42956

 def forward(self):
-    return self.op_func(self.f_input)
+    self.op_func(self.f_input)
Contributor

@raghuramank100 raghuramank100 Aug 17, 2020

Previously we had separate forward and qparam benchmarks, which might be more useful in practice. We call forward for multiple iterations and `calculate_qparams` once at convert. With separate benchmarks, we can also synthesize the time taken for the combined forward + `calculate_qparams` call. Is there a reason to prefer this way of profiling?
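
The synthesis this comment describes can be sketched as simple arithmetic: given separately measured per-call times, the total cost of a run is a weighted sum of the two. The timings and the `combined_time_us` helper below are made-up placeholders, not measurements or code from this benchmark.

```python
def combined_time_us(t_forward_us, t_qparams_us, n_forward, n_qparams=1):
    """Estimate total observer time from two separately measured per-call costs."""
    return n_forward * t_forward_us + n_qparams * t_qparams_us


# Placeholder per-call timings in microseconds (not real measurements).
t_forward_us = 40.0
t_qparams_us = 15.0

# Calibration-style usage: many forward passes, qparams computed once at convert.
calibration_total = combined_time_us(t_forward_us, t_qparams_us, n_forward=100)

# Combined usage: qparams computed on every pass alongside forward.
per_pass_total = combined_time_us(
    t_forward_us, t_qparams_us, n_forward=100, n_qparams=100
)

print(calibration_total, per_pass_total)
```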

Contributor Author

This makes the benchmark represent what happens inside the observer during QAT. I'm not keeping the old code around because I'm not aware of a need for it in the near future. We have separate benchmarks for histogram observers, and I'm not aware of any requests to optimize observers outside of QAT + histogram observers.

Contributor Author

`calculate_qparams` is called on every pass through the observer during QAT, when observers are enabled.
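
To make the QAT call pattern concrete, here is a toy min/max observer in plain Python. It is a simplified stand-in, not PyTorch's observer implementation: the class name, the 0 to 255 quantized range, and the affine formula are assumptions chosen for illustration. The timed function mirrors the pattern discussed here, updating statistics and then calling `calculate_qparams` in the same pass.

```python
import timeit


class ToyMinMaxObserver:
    """Tracks a running min/max and derives affine qparams (illustrative only)."""

    def __init__(self, quant_min=0, quant_max=255):
        self.quant_min = quant_min
        self.quant_max = quant_max
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def forward(self, values):
        # Update running statistics from the observed values.
        self.min_val = min(self.min_val, min(values))
        self.max_val = max(self.max_val, max(values))
        return values

    def calculate_qparams(self):
        # Extend the range to include zero so that 0.0 maps to an integer.
        lo = min(self.min_val, 0.0)
        hi = max(self.max_val, 0.0)
        scale = (hi - lo) / (self.quant_max - self.quant_min)
        if scale == 0.0:
            scale = 1.0  # degenerate range; fall back to a safe scale
        zero_point = self.quant_min - round(lo / scale)
        zero_point = max(self.quant_min, min(self.quant_max, zero_point))
        return scale, zero_point


def qat_style_pass(observer, values):
    # One pass in the QAT pattern: observe, then compute qparams.
    observer.forward(values)
    return observer.calculate_qparams()


obs = ToyMinMaxObserver()
data = [-1.0, 0.5, 3.0]
scale, zero_point = qat_style_pass(obs, data)

# Time the combined call, which is what a QAT-oriented benchmark would measure.
elapsed = timeit.timeit(lambda: qat_style_pass(obs, data), number=1000)
print(scale, zero_point, elapsed)
```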

@facebook-github-bot
Contributor

This pull request has been merged in 5aa61af.

@facebook-github-bot facebook-github-bot deleted the gh/vkuzo/122/head branch August 21, 2020 14:16
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
Summary:
Pull Request resolved: pytorch#42956

Imported from OSS

Reviewed By: supriyar

Differential Revision: D23093996

fbshipit-source-id: 5dc477c9bd5490d79d85ff8537270cd25aca221a