Skip to content

OpInfo for nn.functional.batch_norm#63218

Closed
krshrimali wants to merge 27 commits intopytorch:masterfrom
krshrimali:opinfo/high_priority/nn/functional/batch_norm
Closed

OpInfo for nn.functional.batch_norm#63218
krshrimali wants to merge 27 commits intopytorch:masterfrom
krshrimali:opinfo/high_priority/nn/functional/batch_norm

Conversation

@krshrimali
Copy link
Contributor

@krshrimali krshrimali commented Aug 13, 2021

Addresses pytorch/functorch#78 and #54261.

  • There exists torch.batch_norm but it takes an extra arg: cudnn_enabled (not there in functional variant). This is passed from the functional variant to torch.batch_norm here: https://github.com/pytorch/pytorch/blob/master/torch/nn/functional.py#L2282. test_variant_consistency_jit fails with an error: (when passed an alias)
    File "/home/krshrimali/Documents/Projects/Quansight/pytorch/test/test_ops.py", line 457, in _test_consistency_helper
    variant_forward = variant(cloned,
    TypeError: batch_norm() missing 1 required positional arguments: "cudnn_enabled"
    • I'm not sure of a solution to this, as AFIK - there is no way to pass a lambda wrapper for an alias. Hence, I've skipped adding this as an alias there.
    • On second thought, is this even an alias?

cc: @mruberry @zou3519 @kshitij12345

@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented Aug 13, 2021

🔗 Helpful links

💊 CI failures summary and remediations

As of commit 84998cc (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_old_gradcheck_test1 (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Sep 21 23:58:26 RuntimeError: test_ops failed!
Sep 21 23:58:23 Generating XML reports...
Sep 21 23:58:23 Generated XML report: test-reports/python-unittest/test_ops/TEST-TestCommonCUDA-20210921202126.xml
Sep 21 23:58:23 Generated XML report: test-reports/python-unittest/test_ops/TEST-TestGradientsCUDA-20210921202126.xml
Sep 21 23:58:23 Generated XML report: test-reports/python-unittest/test_ops/TEST-TestJitCUDA-20210921202126.xml
Sep 21 23:58:23 Generated XML report: test-reports/python-unittest/test_ops/TEST-TestMathBitsCUDA-20210921202126.xml
Sep 21 23:58:26 Traceback (most recent call last):
Sep 21 23:58:26   File "test/run_test.py", line 1027, in <module>
Sep 21 23:58:26     main()
Sep 21 23:58:26   File "test/run_test.py", line 1005, in main
Sep 21 23:58:26     raise RuntimeError(err_message)
Sep 21 23:58:26 RuntimeError: test_ops failed!
Sep 21 23:58:27 + cleanup
Sep 21 23:58:27 + retcode=1
Sep 21 23:58:27 + set +x
Sep 21 23:58:27 =================== sccache compilation log ===================
Sep 21 23:58:27 =========== If your build fails, please take a look at the log above for possible reasons ===========
Sep 21 23:58:27 Compile requests                      0
Sep 21 23:58:27 Compile requests executed             0
Sep 21 23:58:27 Cache hits                            0
Sep 21 23:58:27 Cache misses                          0
Sep 21 23:58:27 Cache timeouts                        0

This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

@krshrimali krshrimali added the module: testing Issues related to the torch.testing module (not tests) label Aug 13, 2021
@krshrimali krshrimali marked this pull request as draft August 13, 2021 06:50
@kshitij12345
Copy link
Collaborator

There exists torch.batch_norm but it takes an extra arg: cudnn_enabled (not there in functional variant). This is passed from the functional variant to torch.batch_norm here: https://github.com/pytorch/pytorch/blob/master/torch/nn/functional.py#L2282. test_variant_consistency_jit fails with an error: (when passed an alias)

In that case, wouldn't torch.batch_norm be the main implementation and nn.functional.batch_norm be an alias for it? So OpInfo should be for torch.batch_norm and nn.functional.batch_norm should be in the alias list?

@krshrimali
Copy link
Contributor Author

There exists torch.batch_norm but it takes an extra arg: cudnn_enabled (not there in functional variant). This is passed from the functional variant to torch.batch_norm here: https://github.com/pytorch/pytorch/blob/master/torch/nn/functional.py#L2282. test_variant_consistency_jit fails with an error: (when passed an alias)

In that case, wouldn't torch.batch_norm be the main implementation and nn.functional.batch_norm be an alias for it? So OpInfo should be for torch.batch_norm and nn.functional.batch_norm should be in the alias list?

I don't think that will also work since the functional variant doesn't accept an arg of cudnn_enabled.

ERROR: test_variant_consistency_eager_batch_norm_cpu_float32 (__main__.TestCommonCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/krshrimali/Documents/Projects/Quansight/pytorch/torch/testing/_internal/common_device_type.py", line 373, in instantiated_test
    result = test(self, **param_kwargs)
  File "/home/krshrimali/Documents/Projects/Quansight/pytorch/torch/testing/_internal/common_device_type.py", line 780, in test_wrapper
    return test(*args, **kwargs)
  File "/home/krshrimali/Documents/Projects/Quansight/pytorch/test/test_ops.py", line 468, in test_variant_consistency_eager
    _test_consistency_helper(samples, variants)
  File "/home/krshrimali/Documents/Projects/Quansight/pytorch/test/test_ops.py", line 457, in _test_consistency_helper
    variant_forward = variant(cloned,
TypeError: batch_norm() got an unexpected keyword argument 'cudnn_enabled'

@krshrimali krshrimali marked this pull request as ready for review August 13, 2021 11:24
Comment on lines +6823 to +6825
# RuntimeError: deepEquals(input.iValue, deepCopiedInput) INTERNAL ASSERT FAILED at
# "../torch/csrc/jit/passes/utils/check_alias_annotation.cpp":142, please report a bug to PyTorch
SkipInfo('TestJit', 'test_variant_consistency_jit'),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure about this test failure.

cc: @eellison

@zou3519
Copy link
Contributor

zou3519 commented Aug 13, 2021

There exists torch.batch_norm

IMO torch.batch_norm should not exist and we just forgot to exclude it and now we can't because of a deprecation cycle... but I wouldn't worry about the additional "cudnn_enabled" argument

@astaff astaff added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Aug 16, 2021
@zou3519 zou3519 mentioned this pull request Aug 26, 2021
@krshrimali
Copy link
Contributor Author

Have fixed the merge conflicts, @zou3519 - could you please take a look whenever you find the time? Thanks! :)

@zou3519
Copy link
Contributor

zou3519 commented Sep 15, 2021

Have fixed the merge conflicts, @zou3519 - could you please take a look whenever you find the time? Thanks! :)

yes, will do! Sorry for the delayed response.

Copy link
Contributor

@zou3519 zou3519 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I added some minor suggestions for cases, please let me know what you think

@facebook-github-bot
Copy link
Contributor

@zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@zou3519
Copy link
Contributor

zou3519 commented Sep 20, 2021

Test failures are unrelated; Attempting to merge

@facebook-github-bot
Copy link
Contributor

@zou3519 merged this pull request in 28bfdbb.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla signed Merged module: testing Issues related to the torch.testing module (not tests) open source triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants