[float8] Re-enable slow-accum in the bwd of axis-wise scaling schemes by lw · Pull Request #1377 · pytorch/ao

lw · 2024-12-04T15:55:49Z

And circumvent the issue with the slow CUTLASS kernel by using the cuBLAS kernel + manual scaling.
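The description does not spell out what "manual scaling" involves. One plausible reading, sketched below in plain Python, is that the per-row and per-column scales of an axis-wise scheme can be factored out of the matrix product, so the GEMM can run without them and the scales can be applied to its output afterwards. The function names (`matmul`, `scaled_mm_reference`, `scaled_mm_manual`) are hypothetical and are not taken from torchao or PyTorch; nested lists stand in for float8 tensors, and no real cuBLAS or CUTLASS call is made.

```python
def matmul(a, b):
    """Plain matrix product of two nested-list matrices (stand-in for the GEMM kernel)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def scaled_mm_reference(a_q, b_q, a_scale, b_scale):
    """Reference path: apply the axis-wise scales to the inputs, then multiply.

    a_scale holds one scale per row of a_q, b_scale one scale per column of b_q
    (an assumed layout for illustration).
    """
    a = [[v * a_scale[i] for v in row] for i, row in enumerate(a_q)]
    b = [[v * b_scale[j] for j, v in enumerate(row)] for row in b_q]
    return matmul(a, b)


def scaled_mm_manual(a_q, b_q, a_scale, b_scale):
    """Manual-scaling path: multiply the unscaled inputs, then scale the output.

    Output element (i, j) is multiplied by a_scale[i] * b_scale[j], which gives
    the same result as the reference path up to floating-point rounding.
    """
    out = matmul(a_q, b_q)
    return [
        [v * a_scale[i] * b_scale[j] for j, v in enumerate(row)]
        for i, row in enumerate(out)
    ]


# Small example with power-of-two scales so both paths agree exactly.
a_q = [[1.0, 2.0], [3.0, 4.0]]
b_q = [[0.5, 1.0], [2.0, -1.0]]
a_scale = [0.5, 2.0]   # one scale per row of a_q
b_scale = [4.0, 0.25]  # one scale per column of b_q

print(scaled_mm_manual(a_q, b_q, a_scale, b_scale))
```

In real float8 training the two paths can differ slightly in rounding, so this only illustrates the algebraic identity, not the numerical behavior of the actual kernels.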


lw · 2024-12-04T15:55:50Z

Stack from ghstack (oldest at bottom):

pytorch-bot · 2024-12-04T15:55:53Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1377

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d3aba66 with merge base 53d2486:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

And circumvent the issue with the slow CUTLASS kernel by using the cuBLAS kernel + manual scaling.

ghstack-source-id: 54eb6ce
ghstack-comment-id: 2517855458
Pull Request resolved: #1377

lw · 2024-12-04T15:57:21Z

Re-submission of #1325



Update

b91f59b


facebook-github-bot added the CLA Signed label Dec 4, 2024

This was referenced Dec 4, 2024

[float8] Allow specifying arbitrary dtype for each tensor #1378

Merged

[float8] Re-enable slow-accum in the bwd of axis-wise scaling schemes #1325

Merged

lw added the topic: performance label Dec 4, 2024

vkuzo approved these changes Dec 4, 2024


Update

d3aba66


lw merged commit 6a177c9 into main Dec 4, 2024

lw deleted the gh/lw/3/head branch December 4, 2024 21:13

yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024

Bug fix: Enable fast to override quantize json (pytorch#1377)

b809b69

* Bug fix: Enable fast to override quantize json * collapse conditional

amdfaa pushed a commit that referenced this pull request Jan 10, 2025

[float8] Re-enable slow-accum in the bwd of axis-wise scaling schemes (…

a85c2bb

…#1377)


lw merged 2 commits into main from gh/lw/3/head
