
observers: use clamp instead of min/max in calculate_qparams #43150

Closed
vkuzo wants to merge 1 commit into gh/vkuzo/127/base from gh/vkuzo/127/head

Conversation

Contributor

@vkuzo vkuzo commented Aug 17, 2020

Stack from ghstack:

Summary:

The current logic was expensive because it created tensors on CUDA.
Switching to clamp, which can do the same comparison without needing to create tensors.

This yields a ~20% latency improvement on the CUDA microbenchmark for small tensors
(previous diff: P139074571, this diff: P139074706).
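A minimal sketch of the kind of change described above (the function and variable names here are illustrative, not the exact code in the PR): comparing against a freshly allocated zeros tensor forces a tensor creation on every call, while `clamp` accepts Python scalars and avoids that allocation entirely.

```python
import torch

def zero_bound_old(min_val: torch.Tensor, max_val: torch.Tensor):
    # Old pattern: min/max against a newly created tensor. On CUDA this
    # allocates a device tensor on every call, which is what made the
    # observer's calculate_qparams path expensive.
    min_val_neg = torch.min(min_val, torch.zeros_like(min_val))
    max_val_pos = torch.max(max_val, torch.zeros_like(max_val))
    return min_val_neg, max_val_pos

def zero_bound_new(min_val: torch.Tensor, max_val: torch.Tensor):
    # New pattern: clamp with a scalar bound; no intermediate tensor is
    # created, so the result is the same but cheaper.
    return min_val.clamp(max=0), max_val.clamp(min=0)
```

Both versions bound `min_val` above by zero and `max_val` below by zero (quantization ranges must include zero), so the outputs are numerically identical; only the allocation behavior differs.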

Test Plan:

benchmarks

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D23170427

dr-ci Bot commented Aug 17, 2020

💊 CI failures summary and remediations

As of commit 9724ed7 (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)

ci.pytorch.org: 1 failed



@facebook-github-bot
Contributor

This pull request has been merged in 3264ba0.

@facebook-github-bot facebook-github-bot deleted the gh/vkuzo/127/head branch August 21, 2020 14:16
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
…#43150)

Summary:
Pull Request resolved: pytorch#43150

The current logic was expensive because it created tensors on CUDA.
Switching to clamp since it can work without needing to create tensors.

Test Plan:
benchmarks

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D23170427

fbshipit-source-id: 6fe3a728e737aca9f6c2c4d518c6376738577e21

4 participants