Skip to content

Add dont constant fold flag#154945

Closed
LevelDownRefine wants to merge 9 commits into
pytorch:mainfrom
LevelDownRefine:wengshiy/dont_constant_fold
Closed

Add dont constant fold flag#154945
LevelDownRefine wants to merge 9 commits into
pytorch:mainfrom
LevelDownRefine:wengshiy/dont_constant_fold

Conversation

@LevelDownRefine

@LevelDownRefine LevelDownRefine commented Jun 3, 2025

Copy link
Copy Markdown
Contributor

For support pytorch/ao#2228

What we want to do now is to enable FP8 quantization in PyTorch. And similar as INT8 quantization, we need to insert quantize and dequantize ops into the graph.

However we met problems with these q/dq ops both in the PyTorch core and Torchao.

PyTorch core:

The quantize_per_tensor op does not support FP8. We want to fix it via #153601. And as you commented, the op is deprecated.
Torchao:

In the fusion pass in Inductor, we want to match the pattern fp8_weight -> torchao.dequantize_affine_float8 -> fp32_op and fuse it as fp8_weight -> weight_pack -> fp8_op. We have done so for INT8 PT2E quantization. However, the pattern matching pass is applied after a constant folding pass in Inductor:
https://github.com/pytorch/pytorch/blob/100ec0b34aeff2b948dae33937857d0c86cf1646/torch/_inductor/fx_passes/freezing_patterns.py#L69C1-L74C1
After constant_fold(gm), the pattern will be folded as fp32_weight -> fp32_op. Then the original pattern cannot be found any more and the FP8 semantics is lost since the pattern is entirely in fp32 now.
For INT8, the int8_weight -> quantized_decomposed.dequantize_per_channel -> fp32_op pattern won't be folded because we mark quantized_decomposed.dequantize_per_channel impure so that it won't be folded: https://github.com/pytorch/pytorch/blob/100ec0b34aeff2b948dae33937857d0c86cf1646/torch/_inductor/constant_folding.py#L139C1-L149C1 . But for the torchao.dequantize_affine_float8, we cannot do this because
It is an op from Torchao, which is unknown to the constant folder
It is decomposed to smaller ops, so we cannot put it in the list as a single op.
So, we think an easy and short-term solution is to modify the ops in PyTorch core via #153601.
However, if we want to resolve the issue with Torchao, we need to
Add a method in the constant folder in Inductor to allow registration of impure ops

Based on Jansel‘s reply, add dont constant fold flag on this patch

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov

@pytorch-bot

pytorch-bot Bot commented Jun 3, 2025

Copy link
Copy Markdown

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/154945

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 6437a0d with merge base be2ad70 (image):

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@LevelDownRefine LevelDownRefine changed the title Wengshiy/dont constant fold Add dont constant fold flag Jun 3, 2025
@Xia-Weiwen

Copy link
Copy Markdown
Collaborator

Thanks for the PR. I would suggest elaborating more about the background, like we have discussed elsewhere.

@LevelDownRefine

Copy link
Copy Markdown
Contributor Author

I want to add the flag dont_constant_fold on torch/_inductor/config.py, but get following warming.

W0529 15:14:03.946369 3148528 torch/_inductor/codecache.py:632] [0/0] Failed to pickle cache key
W0529 15:14:03.946369 3148528 torch/_inductor/codecache.py:632] [0/0] Traceback (most recent call last):
W0529 15:14:03.946369 3148528 torch/_inductor/codecache.py:632] [0/0]   File "/home/wengshiy/pytorch/torch/_inductor/codecache.py", line 628, in dumps
W0529 15:14:03.946369 3148528 torch/_inductor/codecache.py:632] [0/0]     self.dump(obj)         
W0529 15:14:03.946369 3148528 torch/_inductor/codecache.py:632] [0/0] TypeError: cannot pickle 'PyCapsule' object

So I add this flag on torch/_inductor/constant_folding.py firstly

@LevelDownRefine

Copy link
Copy Markdown
Contributor Author

Thanks for the PR. I would suggest elaborating more about the background, like we have discussed elsewhere.

Done

@leslie-fang-intel leslie-fang-intel left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lint Failure?

@LevelDownRefine LevelDownRefine marked this pull request as ready for review June 4, 2025 09:08
@LevelDownRefine

Copy link
Copy Markdown
Contributor Author

Lint Failure?

Fixed

@LevelDownRefine

Copy link
Copy Markdown
Contributor Author

Hi @eellison @jansel , could you help review this patch?

Comment thread torch/_inductor/constant_folding.py Outdated
jansel
jansel previously approved these changes Jun 4, 2025
Co-authored-by: Jason Ansel <jansel@jansel.net>
@LevelDownRefine

Copy link
Copy Markdown
Contributor Author

@pytorchbot merge

@pytorch-bot

pytorch-bot Bot commented Jun 5, 2025

Copy link
Copy Markdown

This PR has pending changes requested. Please address the comments and update the PR before merging.

@LevelDownRefine

Copy link
Copy Markdown
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot Bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jun 5, 2025
@pytorchmergebot

Copy link
Copy Markdown
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@LevelDownRefine

LevelDownRefine commented Jun 9, 2025

Copy link
Copy Markdown
Contributor Author

skip halide on my test to avoid CI issue.

In my test(test/inductor/test_torchinductor.py), I check the function name to see if mul is constant_fold.
But on test/inductor/test_halide.py, function name will always be halide_kernel_0.

@jansel

jansel commented Jun 9, 2025

Copy link
Copy Markdown
Contributor

@pytorchbot merge

@pytorch-bot pytorch-bot Bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jun 9, 2025
@pytorchmergebot

Copy link
Copy Markdown
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot

Copy link
Copy Markdown
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: inductor / unit-test / linux-jammy-cpu-py3.12-gcc11-inductor-triton-cpu / test (inductor-triton-cpu, 1, 1, linux.12xlarge)

Details for Dev Infra team Raised by workflow job

@pytorch-bot pytorch-bot Bot removed ciflow/trunk Trigger trunk jobs on your pull request ciflow/inductor labels Jun 9, 2025
@LevelDownRefine

Copy link
Copy Markdown
Contributor Author

@pytorchbot merge

@pytorch-bot

pytorch-bot Bot commented Jun 9, 2025

Copy link
Copy Markdown

Pull workflow has not been scheduled for the PR yet. It could be because author doesn't have permissions to run those or skip-checks keywords were added to PR/commits, aborting merge. Please get/give approval for the workflows and/or remove skip ci decorators before next merge attempt. If you think this is a mistake, please contact PyTorch Dev Infra.

@jansel jansel added ciflow/trunk Trigger trunk jobs on your pull request ciflow/inductor labels Jun 9, 2025
@jansel

jansel commented Jun 10, 2025

Copy link
Copy Markdown
Contributor

@pytorchbot merge

@pytorchmergebot

Copy link
Copy Markdown
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci-no-td Do not run TD on this PR ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request Merged module: inductor open source release notes: inductor Reverted triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants