[Inductor] Fix conditional codegen #129492

Closed

sijiac wants to merge 1 commit into pytorch:main from sijiac:export-D58973730

Contributor

sijiac commented Jun 25, 2024 • edited by pytorch-bot bot

Summary:
A cache (`computed_sizes`) guarantees that each precomputed-size symbol is code-generated only once; see the following code:

def ensure_size_computed(self, sym: sympy.Symbol):
    if isinstance(sym, sympy.Symbol) and symbol_is_type(sym, SymT.PRECOMPUTED_SIZE):
        if sym in self.computed_sizes:
            return
        self.computed_sizes.add(sym)
        expr = V.graph.sizevars.inv_precomputed_replacements[sym]
        self.writeline(
            f"{self.declare}{sym} = {self.expr_printer(expr)}{self.ending}"
        )

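However, this does not handle the case where the same symbols need to be code-generated in both branches of a conditional (the true branch and the false branch). Once a symbol has been emitted in one branch, the cache skips it in the other, which caused undefined-symbol errors: P1441378833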

To fix the issue, we use a stack to capture the cache state before generating code for a conditional and restore that state after the code has been generated.
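
The save/restore idea can be illustrated with a small standalone sketch. This is not the Inductor implementation: `WrapperSketch`, `push_computed_sizes`, `pop_computed_sizes`, `codegen_branch`, and the `isolate_branches` flag are hypothetical names chosen for illustration, and symbols are plain strings rather than `sympy.Symbol` objects.

```python
from typing import Dict, List, Set


class WrapperSketch:
    """Toy stand-in for a wrapper code generator (hypothetical, not Inductor's class)."""

    def __init__(self, size_exprs: Dict[str, str], isolate_branches: bool) -> None:
        self.size_exprs = size_exprs              # symbol name -> expression text
        self.isolate_branches = isolate_branches  # True mimics the proposed fix
        self.computed_sizes: Set[str] = set()     # symbols already emitted
        self.computed_sizes_stack: List[Set[str]] = []
        self.lines: List[str] = []
        self.indent = 0

    def writeline(self, line: str) -> None:
        self.lines.append("    " * self.indent + line)

    def ensure_size_computed(self, sym: str) -> None:
        # Same caching pattern as in the snippet above: emit each symbol once.
        if sym in self.computed_sizes:
            return
        self.computed_sizes.add(sym)
        self.writeline(f"{sym} = {self.size_exprs[sym]}")

    def push_computed_sizes(self) -> None:
        # Save a copy so symbols added inside the branch do not leak out.
        self.computed_sizes_stack.append(set(self.computed_sizes))

    def pop_computed_sizes(self) -> None:
        # Restore the cache to its state before the branch was generated.
        self.computed_sizes = self.computed_sizes_stack.pop()

    def codegen_branch(self, header: str, syms: List[str]) -> None:
        self.writeline(header)
        self.indent += 1
        if self.isolate_branches:
            self.push_computed_sizes()
        for sym in syms:
            self.ensure_size_computed(sym)
            self.writeline(f"use({sym})")
        if self.isolate_branches:
            self.pop_computed_sizes()
        self.indent -= 1


def emit_cond(isolate_branches: bool) -> List[str]:
    """Generate both branches of a conditional that each need the symbol ps0."""
    gen = WrapperSketch({"ps0": "s0 * 2"}, isolate_branches)
    gen.codegen_branch("if pred:", ["ps0"])
    gen.codegen_branch("else:", ["ps0"])
    return gen.lines


buggy = emit_cond(isolate_branches=False)
fixed = emit_cond(isolate_branches=True)
print("\n".join(buggy))
print("---")
print("\n".join(fixed))
```

With `isolate_branches=False`, `ps0 = s0 * 2` is emitted only under `if pred:`, so the `else:` branch uses `ps0` without defining it. With `isolate_branches=True`, the definition appears in both branches because the cache is restored between them.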

Test Plan:
TORCH_LOGS="+inductor" buck2 run mode/dev-nosan -c fbcode.nvcc_arch=h100 -c fbcode.enable_gpu_sections=true --config 'cxx.extra_cxxflags=-g1' -c fbcode.platform010_cuda_version=12 //scripts/hhh:repro_cond_torch_compile

PYTORCH_TEST_FBCODE=1 TORCH_COMPILE_DEBUG=1 buck2 run mode/opt -c=python.package_style=inplace -c fbcode.enable_gpu_sections=true -c fbcode.platform=platform010 -c fbcode.split-dwarf=true //caffe2/test/inductor:control_flow -- -r test_cond_control_flow_with_precomputed_size

Differential Revision: D58973730

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

pytorch-bot bot commented Jun 25, 2024 • edited

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/129492

✅ No Failures

As of commit 84c787b with merge base d1b832e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot bot added the ciflow/inductor and module: inductor labels

Contributor

facebook-github-bot commented Jun 25, 2024

This pull request was exported from Phabricator. Differential Revision: D58973730

facebook-github-bot added the fb-exported label

sijiac added a commit to sijiac/pytorch that referenced this pull request

[Inductor] Fix conditional codegen (pytorch#129492)

41b26d6

sijiac force-pushed the export-D58973730 branch from 0145dc3 to 41b26d6

June 25, 2024 18:58

sijiac requested a review from aakhundov

June 26, 2024 04:48

chenyang78 commented Jun 26, 2024

Could you fix the lint issue and the test failure? Thanks!

Contributor

facebook-github-bot commented Jun 27, 2024

This pull request was exported from Phabricator. Differential Revision: D58973730

sijiac force-pushed the export-D58973730 branch from 41b26d6 to f277df8

June 27, 2024 05:41

pytorch-bot bot pushed a commit that referenced this pull request

[Inductor] Fix conditional codegen (#129492)

f277df8

Contributor

facebook-github-bot commented Jul 1, 2024

This pull request was exported from Phabricator. Differential Revision: D58973730

sijiac force-pushed the export-D58973730 branch from f277df8 to 944af20

July 1, 2024 04:57

pytorch-bot bot pushed a commit that referenced this pull request

[Inductor] Fix conditional codegen (#129492)

944af20

Contributor

facebook-github-bot commented Jul 1, 2024

This pull request was exported from Phabricator. Differential Revision: D58973730

sijiac force-pushed the export-D58973730 branch from 944af20 to 084a9c9

July 1, 2024 20:57

Contributor

facebook-github-bot commented Jul 1, 2024

This pull request was exported from Phabricator. Differential Revision: D58973730

sijiac force-pushed the export-D58973730 branch from 084a9c9 to c88aabe

July 1, 2024 21:08

sijiac added a commit to sijiac/pytorch that referenced this pull request

[Inductor] Fix conditional codegen (pytorch#129492)

c88aabe

sijiac force-pushed the export-D58973730 branch from c88aabe to e6a856b

July 2, 2024 03:06

Contributor

facebook-github-bot commented Jul 2, 2024

This pull request was exported from Phabricator. Differential Revision: D58973730

sijiac added a commit to sijiac/pytorch that referenced this pull request

[Inductor] Fix conditional codegen (pytorch#129492)

e6a856b

Contributor

facebook-github-bot commented Jul 7, 2024

This pull request was exported from Phabricator. Differential Revision: D58973730

sijiac force-pushed the export-D58973730 branch from e6a856b to 584ad0f

July 7, 2024 23:07


[Inductor] Fix conditional codegen (pytorch#129492)

84c787b

sijiac force-pushed the export-D58973730 branch from 584ad0f to 84c787b

July 7, 2024 23:14

Contributor

facebook-github-bot commented Jul 7, 2024

This pull request was exported from Phabricator. Differential Revision: D58973730

Contributor Author

sijiac commented Jul 8, 2024

@pytorchbot merge

pytorch-bot bot commented Jul 8, 2024

This PR needs to be approved by an authorized maintainer before merge.

aakhundov approved these changes

Contributor

aakhundov left a comment

Thanks for the fix!

pytorch-bot bot added the ciflow/trunk label

Contributor Author

sijiac commented Jul 8, 2024

@pytorchbot merge

pytorchmergebot added the merging label

Collaborator

pytorchmergebot commented Jul 8, 2024

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

pytorchmergebot removed the merging label

Contributor Author

sijiac commented Jul 8, 2024

@pytorchbot label "topic: not user facing"

pytorch-bot bot added the topic: not user facing label

Contributor Author

sijiac commented Jul 8, 2024

@pytorchbot merge

pytorchmergebot added the merging label

Collaborator

pytorchmergebot commented Jul 8, 2024

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Contributor

facebook-github-bot commented Jul 8, 2024

@pytorchbot merge -f 'Landed internally'

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

Collaborator

pytorchmergebot commented Jul 8, 2024

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

Collaborator

pytorchmergebot commented Jul 8, 2024

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

pytorchmergebot added the Merged label

pytorchmergebot closed this in 31bb65d

pytorchmergebot removed the merging label

etaf mentioned this pull request

[Break XPU] The newly added test_cond_control_flow_with_precomputed_size use requires_gpu, but hard code cuda. #130426

Closed

Labels

ciflow/inductor ciflow/trunk fb-exported Merged module: inductor topic: not user facing