
Adds unit test for validating correctness #40715

Closed
durumu wants to merge 2 commits into gh/durumu/7/base from gh/durumu/7/head

Conversation

@durumu
Contributor

@durumu durumu commented Jun 29, 2020

Stack from ghstack:

Differential Revision: D22290872

@dr-ci

dr-ci Bot commented Jun 29, 2020

💊 CI failures summary and remediations

As of commit 956a0d3 (more details on the Dr. CI page):



🕵️ 6 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_bionic_py3_8_gcc9_test (1/6)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Jun 30 00:08:05 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function ‘int main()’:\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:22: error: expected ‘;’ before ‘}’ token\n 2 | int main() { return 0 }\n | ^~\n | ;\n" }
Jun 30 00:08:05     raise RuntimeError(message) 
Jun 30 00:08:05 RuntimeError: test_type_hints failed! 
Jun 30 00:08:05  
Jun 30 00:08:05 real	24m19.173s 
Jun 30 00:08:05 user	26m52.369s 
Jun 30 00:08:05 sys	1m16.765s 
Jun 30 00:08:05 + cleanup 
Jun 30 00:08:05 + retcode=1 
Jun 30 00:08:05 + set +x 
Jun 30 00:08:05 =================== sccache compilation log =================== 
Jun 30 00:08:05 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function ‘int main()’:\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:22: error: expected ‘;’ before ‘}’ token\n    2 | int main() { return 0 }\n      |                      ^~\n      |                      ;\n" } 
Jun 30 00:08:05  
Jun 30 00:08:05 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Jun 30 00:08:05 Compile requests                 64 
Jun 30 00:08:05 Compile requests executed        35 
Jun 30 00:08:05 Cache hits                       27 
Jun 30 00:08:05 Cache misses                      7 
Jun 30 00:08:05 Cache timeouts                    0 
Jun 30 00:08:05 Cache read errors                 0 
Jun 30 00:08:05 Forced recaches                   0 
Jun 30 00:08:05 Cache write errors                0 

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_build (2/6)

Step: "(Optional) Merge target branch" (full log | diagnosis details | 🔁 rerun)

Automatic merge failed; fix conflicts and then commit the result.
CONFLICT (add/add): Merge conflict in .circleci/config.yml 
Auto-merging .circleci/config.yml 
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/windows_build_definitions.py 
Auto-merging .circleci/cimodel/data/windows_build_definitions.py 
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/simple/util/docker_constants.py 
Auto-merging .circleci/cimodel/data/simple/util/docker_constants.py 
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/pytorch_build_definitions.py 
Auto-merging .circleci/cimodel/data/pytorch_build_definitions.py 
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/pytorch_build_data.py 
Auto-merging .circleci/cimodel/data/pytorch_build_data.py 
Automatic merge failed; fix conflicts and then commit the result. 

See CircleCI build pytorch_linux_xenial_py3_clang5_asan_test (3/6)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Jun 29 23:43:28 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:11:3 in
Jun 29 23:43:28     #7 0x5595b174074b in PyEval_EvalCode /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:731 
Jun 29 23:43:28     #8 0x5595b17c0633 in run_mod /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:1025 
Jun 29 23:43:28     #9 0x5595b17c06cc in PyRun_StringFlags /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:949 
Jun 29 23:43:28     #10 0x5595b17c072e in PyRun_SimpleStringFlags /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:445 
Jun 29 23:43:28     #11 0x5595b17c4532 in run_command /tmp/build/80754af9/python_1585002248360/work/Modules/main.c:301 
Jun 29 23:43:28     #12 0x5595b17c4532 in Py_Main /tmp/build/80754af9/python_1585002248360/work/Modules/main.c:749 
Jun 29 23:43:28     #13 0x5595b168f1fd in main /tmp/build/80754af9/python_1585002248360/work/Programs/python.c:69 
Jun 29 23:43:28     #14 0x7f0aa965482f in __libc_start_main /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291 
Jun 29 23:43:28     #15 0x5595b176dc29 in _start /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534865402226/work/.build/src/glibc-2.12.2/csu/../sysdeps/x86_64/elf/start.S:103 
Jun 29 23:43:28  
Jun 29 23:43:28 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:11:3 in  
Jun 29 23:43:28 + retcode=1 
Jun 29 23:43:28 + set -e 
Jun 29 23:43:28 + return 1 
Jun 29 23:43:28 + [[ pytorch-linux-xenial-py3-clang5-asan-test == *-NO_AVX-* ]] 
Jun 29 23:43:28 + [[ pytorch-linux-xenial-py3-clang5-asan-test == *-NO_AVX2-* ]] 
Jun 29 23:43:28 + '[' -n https://github.com/pytorch/pytorch/pull/40715 ']' 
Jun 29 23:43:28 ++ mktemp 
Jun 29 23:43:28 + DETERMINE_FROM=/tmp/tmp.omcuDTYoV7 
Jun 29 23:43:28 + file_diff_from_base /tmp/tmp.omcuDTYoV7 
Jun 29 23:43:28 + set +e 

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_build (4/6)

Step: "(Optional) Merge target branch" (full log | diagnosis details | 🔁 rerun)

Automatic merge failed; fix conflicts and then commit the result.
CONFLICT (add/add): Merge conflict in .circleci/config.yml 
Auto-merging .circleci/config.yml 
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/windows_build_definitions.py 
Auto-merging .circleci/cimodel/data/windows_build_definitions.py 
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/simple/util/docker_constants.py 
Auto-merging .circleci/cimodel/data/simple/util/docker_constants.py 
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/pytorch_build_definitions.py 
Auto-merging .circleci/cimodel/data/pytorch_build_definitions.py 
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/pytorch_build_data.py 
Auto-merging .circleci/cimodel/data/pytorch_build_data.py 
Automatic merge failed; fix conflicts and then commit the result. 

See CircleCI build pytorch_linux_bionic_py3_6_clang9_test (5/6)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Jun 30 00:06:44 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n ^\n" }
Jun 30 00:06:44     raise RuntimeError(message) 
Jun 30 00:06:44 RuntimeError: test_type_hints failed! 
Jun 30 00:06:44  
Jun 30 00:06:44 real	26m22.748s 
Jun 30 00:06:44 user	31m58.011s 
Jun 30 00:06:44 sys	2m28.419s 
Jun 30 00:06:44 + cleanup 
Jun 30 00:06:44 + retcode=1 
Jun 30 00:06:44 + set +x 
Jun 30 00:06:44 =================== sccache compilation log =================== 
Jun 30 00:06:44 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n                       ^\n" } 
Jun 30 00:06:44  
Jun 30 00:06:44 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Jun 30 00:06:44 Compile requests                 64 
Jun 30 00:06:44 Compile requests executed        35 
Jun 30 00:06:44 Cache hits                       27 
Jun 30 00:06:44 Cache misses                      7 
Jun 30 00:06:44 Cache timeouts                    0 
Jun 30 00:06:44 Cache read errors                 0 
Jun 30 00:06:44 Forced recaches                   0 
Jun 30 00:06:44 Cache write errors                0 

See CircleCI build pytorch_macos_10_13_py3_test (6/6)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

Jun 29 17:20:59 [E request_callback_impl.cpp:168] Received error while processing request type 2: RuntimeError: Can not pickle torch.futures.Future
Jun 29 17:20:59 At: 
Jun 29 17:20:59   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(90): serialize 
Jun 29 17:20:59   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(142): serialize 
Jun 29 17:20:59  
Jun 29 17:20:59 [E request_callback_impl.cpp:168] Received error while processing request type 2: RuntimeError: Can not pickle torch.futures.Future 
Jun 29 17:20:59  
Jun 29 17:20:59 At: 
Jun 29 17:20:59   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(90): serialize 
Jun 29 17:20:59   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(142): serialize 
Jun 29 17:20:59  
Jun 29 17:20:59 [E request_callback_impl.cpp:168] Received error while processing request type 2: RuntimeError: Can not pickle torch.futures.Future 
Jun 29 17:20:59  
Jun 29 17:20:59 At: 
Jun 29 17:20:59   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(90): serialize 
Jun 29 17:20:59   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(142): serialize 
Jun 29 17:20:59  
Jun 29 17:20:59 [W tensorpipe_agent.cpp:491] RPC agent for worker2 encountered error when reading incoming request from worker0: EOF: end of file (this is expected to happen during shutdown) 
Jun 29 17:20:59 [W tensorpipe_agent.cpp:491] RPC agent for worker3 encountered error when reading incoming request from worker0: EOF: end of file (this is expected to happen during shutdown) 
Jun 29 17:20:59 [W tensorpipe_agent.cpp:491] RPC agent for worker0 encountered error when reading incoming request from worker3: EOF: end of file (this is expected to happen during shutdown) 
Jun 29 17:20:59 [W tensorpipe_agent.cpp:491] RPC agent for worker1 encountered error when reading incoming request from worker0: EOF: end of file (this is expected to happen during shutdown) 
Jun 29 17:20:59 ok (1.555s) 

❄️ 2 failures tentatively classified as flaky

but reruns have not yet been triggered to confirm:

See CircleCI build pytorch_windows_vs2019_py36_cpu_build (1/2)

Step: "Build" (full log | diagnosis details | 🔁 rerun) ❄️

CondaHTTPError: HTTP 000 CONNECTION FAILED for url
 
circleci@PACKER-5ECD3249 C:\Users\circleci\project>call C:\Jenkins\Miniconda3\Scripts\activate.bat C:\Jenkins\Miniconda3  
 
(base) circleci@PACKER-5ECD3249 C:\Users\circleci\project>if "" == "" ( 
call conda install -y -q python=3.6 numpy cffi pyyaml boto3   
 call conda install -y -q -c conda-forge cmake  
)  
Collecting package metadata (current_repodata.json): ...working... done 
Solving environment: ...working... done 
 
CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://repo.anaconda.com/pkgs/main/win-64/mkl-2020.1-216.conda> 
Elapsed: - 
 
An HTTP error occurred when trying to retrieve this URL. 
HTTP errors are often intermittent, and a simple retry will get you on your way. 
 
 
 
## Package Plan ## 
 
  environment location: C:\Jenkins\Miniconda3 

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test (2/2)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun) ❄️

Jun 30 00:33:22 ConnectionResetError: [Errno 104] Connection reset by peer
Jun 30 00:33:22   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 455, in accept 
Jun 30 00:33:22     deliver_challenge(c, self._authkey) 
Jun 30 00:33:22   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 722, in deliver_challenge 
Jun 30 00:33:22     response = connection.recv_bytes(256)        # reject large message 
Jun 30 00:33:22   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes 
Jun 30 00:33:22     buf = self._recv_bytes(maxlength) 
Jun 30 00:33:22   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes 
Jun 30 00:33:22     buf = self._recv(4) 
Jun 30 00:33:22   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 379, in _recv 
Jun 30 00:33:22     chunk = read(handle, remaining) 
Jun 30 00:33:22 ConnectionResetError: [Errno 104] Connection reset by peer 
Jun 30 00:33:22 /opt/conda/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 14 leaked semaphores to clean up at shutdown 
Jun 30 00:33:22   len(cache)) 
Jun 30 00:33:24 Process ErrorTrackingProcess-126: 
Jun 30 00:33:24 Traceback (most recent call last): 
Jun 30 00:33:24   File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap 
Jun 30 00:33:24     self.run() 
Jun 30 00:33:24   File "/var/lib/jenkins/workspace/test/test_dataloader.py", line 360, in run 
Jun 30 00:33:24     super(ErrorTrackingProcess, self).run() 
Jun 30 00:33:24   File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 93, in run 
Jun 30 00:33:24     self._target(*self._args, **self._kwargs) 

🚧 1 fixed upstream failure:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch (expand for instructions)

Since your merge base is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

Check out the recency history of this "viable master" tracking branch.


This comment was automatically generated by Dr. CI (expand for details). Follow this link to opt out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 14 times.

durumu added a commit that referenced this pull request Jun 29, 2020
ghstack-source-id: a1cef3a
Pull Request resolved: #40715
durumu added a commit that referenced this pull request Jul 14, 2020
ghstack-source-id: f7c00c9
Pull Request resolved: #40532

Adds unit test for validating correctness

ghstack-source-id: f7c00c9
Pull Request resolved: #40715
durumu added a commit that referenced this pull request Jul 21, 2020
ghstack-source-id: f185ce9
Pull Request resolved: #40532

Adds unit test for validating correctness

ghstack-source-id: f185ce9
Pull Request resolved: #40715
@smessmer smessmer requested a review from supriyar August 10, 2020 21:34
@smessmer smessmer added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Aug 10, 2020
MauiDesign pushed a commit to MauiDesign/PyTorchPyTorch that referenced this pull request Aug 16, 2020
ghstack-source-id: 331009d
Pull Request resolved: pytorch/pytorch#40532

Adds unit test for validating correctness

ghstack-source-id: 331009d
Pull Request resolved: pytorch/pytorch#40715
@facebook-github-bot
Contributor

Hi @durumu!

Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but we do not have a signature on file.

In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@supriyar supriyar closed this Oct 30, 2020
@facebook-github-bot facebook-github-bot deleted the gh/durumu/7/head branch November 30, 2020 15:17

Labels

open source; triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
