Skip to content

Fix generate_not_implemented_tests not testing unknown types correctly#56997

Closed
asi1024 wants to merge 4 commits intopytorch:masterfrom
asi1024:fix-operator-test
Closed

Fix generate_not_implemented_tests not testing unknown types correctly#56997
asi1024 wants to merge 4 commits intopytorch:masterfrom
asi1024:fix-operator-test

Conversation

@asi1024
Copy link
Copy Markdown
Contributor

@asi1024 asi1024 commented Apr 27, 2021

Currently, the test code is not testing unknown types correctly because op is overwritten in the for-loop (i.e., currently only __ior__ is tested).
This PR fixes the test generate_not_implemented_tests to bind operator name to each method, and remove operators currently unsupported (__rand__, …).

cc/ @mruberry This fix is be needed to add tests for the operator we are going to introduce (e.g., __rand__)

@facebook-github-bot
Copy link
Copy Markdown
Contributor

facebook-github-bot commented Apr 27, 2021

💊 CI failures summary and remediations

As of commit 61407c1 (more details on the Dr. CI page):



🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_build (1/2)

Step: "(Optional) Merge target branch" (full log | diagnosis details | 🔁 rerun)

Automatic merge failed; fix conflicts and then commit the result.
CONFLICT (add/add): Merge conflict in .circleci/config.yml
Auto-merging .circleci/config.yml
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/windows_build_definitions.py
Auto-merging .circleci/cimodel/data/windows_build_definitions.py
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/simple/docker_definitions.py
Auto-merging .circleci/cimodel/data/simple/docker_definitions.py
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/pytorch_build_data.py
Auto-merging .circleci/cimodel/data/pytorch_build_data.py
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/binary_build_definitions.py
Auto-merging .circleci/cimodel/data/binary_build_definitions.py
Automatic merge failed; fix conflicts and then commit the result.


Exited with code exit status 1

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_build (2/2)

Step: "(Optional) Merge target branch" (full log | diagnosis details | 🔁 rerun)

Automatic merge failed; fix conflicts and then commit the result.
CONFLICT (add/add): Merge conflict in .circleci/config.yml
Auto-merging .circleci/config.yml
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/windows_build_definitions.py
Auto-merging .circleci/cimodel/data/windows_build_definitions.py
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/simple/docker_definitions.py
Auto-merging .circleci/cimodel/data/simple/docker_definitions.py
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/pytorch_build_data.py
Auto-merging .circleci/cimodel/data/pytorch_build_data.py
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/binary_build_definitions.py
Auto-merging .circleci/cimodel/data/binary_build_definitions.py
Automatic merge failed; fix conflicts and then commit the result.


Exited with code exit status 1


1 failure not recognized by patterns:

Job Step Action
GitHub Actions clang-format Run clang-format 🔁 rerun

❄️ 1 failure tentatively classified as flaky

but reruns have not yet been triggered to confirm:

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test2 (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun) ❄️

May 06 14:30:17 RuntimeError: Process 0 terminated or timed out after 110.04512786865234 seconds
May 06 14:30:17 ======================================================================
May 06 14:30:17 ERROR [110.068s]: test_nccl_high_priority_stream (__main__.TestDistBackendWithSpawn)
May 06 14:30:17 ----------------------------------------------------------------------
May 06 14:30:17 Traceback (most recent call last):
May 06 14:30:17   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_distributed.py", line 374, in wrapper
May 06 14:30:17     self._join_processes(fn)
May 06 14:30:17   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_distributed.py", line 566, in _join_processes
May 06 14:30:17     self._check_return_codes(elapsed_time)
May 06 14:30:17   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_distributed.py", line 614, in _check_return_codes
May 06 14:30:17     raise RuntimeError('Process {} terminated or timed out after {} seconds'.format(i, elapsed_time))
May 06 14:30:17 RuntimeError: Process 0 terminated or timed out after 110.04512786865234 seconds
May 06 14:30:17 
May 06 14:30:17 ----------------------------------------------------------------------
May 06 14:30:17 Ran 196 tests in 657.476s
May 06 14:30:17 
May 06 14:30:17 FAILED (errors=4, skipped=117)
May 06 14:30:17 
May 06 14:30:17 Generating XML reports...
May 06 14:30:17 Generated XML report: test-reports/dist-nccl/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20210506141919.xml
May 06 14:30:17 Traceback (most recent call last):
May 06 14:30:17   File "test/run_test.py", line 1156, in <module>

This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

@mruberry mruberry self-requested a review April 27, 2021 08:11
Copy link
Copy Markdown
Collaborator

@mruberry mruberry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @asi1024! Thanks for noticing this issue; this is often a tricky Python nuance.

It looks like some of the tests are failing, probably because this test wasn't running for so long and it regressed. Do you the test can be updated? Alternatively, is there something else these failures are telling us? That is, is there something outside this test we should be fixing?

@mruberry
Copy link
Copy Markdown
Collaborator

cc @pmeier, who's investigating updates to the binary pwise tests

@asi1024
Copy link
Copy Markdown
Contributor Author

asi1024 commented Apr 27, 2021

@mruberry I updated the test to pass in b913ad5!
I also fixed some routines to return NotImlemented correctly in 1d209db, but I can split the commit to another PR if preferred.

@mruberry mruberry self-requested a review April 30, 2021 12:52
Copy link
Copy Markdown
Collaborator

@mruberry mruberry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @asi1024! Test failures appear unrelated

@facebook-github-bot
Copy link
Copy Markdown
Contributor

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@mruberry
Copy link
Copy Markdown
Collaborator

mruberry commented May 1, 2021

With apologies, @asi1024, I believe I spoke too soon about the test failures being unrelated. I suspect the following test failure is related to this change:

Apr 27 11:51:54 =================================== FAILURES ===================================
Apr 27 11:51:54 ___________________________ TestOperators.test_rsub ____________________________
Apr 27 11:51:54 Traceback (most recent call last):
Apr 27 11:51:54   File "/var/lib/jenkins/workspace/test/onnx/test_operators.py", line 168, in test_rsub
Apr 27 11:51:54     self.assertONNX(lambda x: 1 - x, (x,))
Apr 27 11:51:54   File "/var/lib/jenkins/workspace/test/onnx/test_operators.py", line 65, in assertONNX
Apr 27 11:51:54     self.assertExpected(onnx_model_pbtxt, subname)
Apr 27 11:51:54   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 1553, in assertExpected
Apr 27 11:51:54     self.assertMultiLineEqual(expected, s)
Apr 27 11:51:54   File "/opt/conda/lib/python3.6/unittest/case.py", line 1203, in assertMultiLineEqual
Apr 27 11:51:54     self.fail(self._formatMessage(msg, standardMsg))
Apr 27 11:51:54   File "/opt/conda/lib/python3.6/unittest/case.py", line 670, in fail
Apr 27 11:51:54     raise self.failureException(msg)
Apr 27 11:51:54 AssertionError: 'ir_v[351 chars]ut: "x"\n    output: "2"\n    name: "Sub_1"\n [571 chars]n}\n' != 'ir_v[351 chars]ut: "0"\n    output: "2"\n    name: "Sub_1"\n [571 chars]n}\n'
Apr 27 11:51:54 Diff is 1126 characters long. Set self.maxDiff to None to see it.
Apr 27 11:51:54 =============================== warnings summary ===============================

These tests work by generating text files and comparing the new text file to an old "expect" file. So when one of these tests breaks it doesn't mean something is wrong, necessarily, simple that something has changed.

For simplicity, however, would you revert the changes to _tensor.py and python_variable_methods.cpp, then skip testing the ops which will cause the test to fail? Then this PR will fix the issue it was originally trying to solve. Additionally, however, would you please file an issue for the other changes this PR was making so we don't forget about them?

This reverts commit 1d209db.
@asi1024
Copy link
Copy Markdown
Contributor Author

asi1024 commented May 6, 2021

@mruberry I reverted the commit and will open an issue for this change!

@facebook-github-bot
Copy link
Copy Markdown
Contributor

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Copy Markdown
Contributor

@mruberry merged this pull request in 1428223.

@asi1024 asi1024 deleted the fix-operator-test branch May 10, 2021 06:24
krshrimali pushed a commit to krshrimali/pytorch that referenced this pull request May 19, 2021
…tly (pytorch#56997)

Summary:
Currently, the test code is not testing unknown types correctly because `op` is overwritten in the for-loop (i.e., currently only `__ior__` is tested).
This PR fixes the test `generate_not_implemented_tests` to bind operator name to each method, and remove operators currently unsupported (`__rand__`, …).

cc/ mruberry This fix is be needed to add tests for the operator we are going to introduce (e.g., `__rand__`)

Pull Request resolved: pytorch#56997

Reviewed By: astaff

Differential Revision: D28118465

Pulled By: mruberry

fbshipit-source-id: c5a466a7604262ed5490862300d47043aff63d0b
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 25, 2026
…tly (pytorch#56997)

Summary:
Currently, the test code is not testing unknown types correctly because `op` is overwritten in the for-loop (i.e., currently only `__ior__` is tested).
This PR fixes the test `generate_not_implemented_tests` to bind operator name to each method, and remove operators currently unsupported (`__rand__`, …).

cc/ mruberry This fix is be needed to add tests for the operator we are going to introduce (e.g., `__rand__`)

Pull Request resolved: pytorch#56997

Reviewed By: astaff

Differential Revision: D28118465

Pulled By: mruberry

fbshipit-source-id: c5a466a7604262ed5490862300d47043aff63d0b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants