
Revert "Allow Tensor-likes in torch.autograd.gradcheck (#43877)" #44554

Closed
ezyang wants to merge 1 commit into gh/ezyang/836/base from gh/ezyang/836/head

@ezyang ezyang commented Sep 11, 2020

Stack from ghstack:

This reverts commit f9a0d0c.

ezyang added a commit that referenced this pull request Sep 11, 2020
This reverts commit f9a0d0c.

ghstack-source-id: d662f56
Pull Request resolved: #44554
@anjali411 anjali411 self-requested a review September 11, 2020 15:36

dr-ci Bot commented Sep 11, 2020

💊 CI failures summary and remediations

As of commit 7c6878f (more details on the Dr. CI page):



🕵️ 13 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_windows_vs2019_py36_cuda10.1_test1 (1/13)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

RuntimeError: test_jit_profiling failed!
Generated XML report: test-reports\python-unittest\TEST-jit.test_tracer.TestTracer-20200911160101.xml 
Generated XML report: test-reports\python-unittest\TEST-jit.test_type_sharing.TestTypeSharing-20200911160101.xml 
Generated XML report: test-reports\python-unittest\TEST-jit.test_unsupported_ops.TestUnsupportedOps-20200911160101.xml 
Generated XML report: test-reports\python-unittest\TEST-jit.test_with.TestWith-20200911160101.xml 
Generated XML report: test-reports\python-unittest\TEST-jit.test_data_parallel.TestDataParallel-20200911160101.xml 
Traceback (most recent call last): 
  File "run_test.py", line 740, in <module> 
    main() 
  File "run_test.py", line 723, in main 
    raise RuntimeError(err_message) 
RuntimeError: test_jit_profiling failed! 
+ cleanup
+ retcode=1
+ set +x

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_ge_config_legacy_test (2/13)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Sep 11 16:11:19 ERROR [0.009s]: test_trace_tensor_factory (jit.test_tracer.TestTracer)
Sep 11 16:11:19     self.checkTrace(stuff, (example, example[0] + 1)) 
Sep 11 16:11:19   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/jit_utils.py", line 518, in checkTrace 
Sep 11 16:11:19     allow_unused=allow_unused) 
Sep 11 16:11:19   File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 179, in grad 
Sep 11 16:11:19     grad_outputs_ = _tensor_or_tensors_to_tuple(grad_outputs, len(outputs)) 
Sep 11 16:11:19   File "/opt/conda/lib/python3.6/site-packages/torch/tensor.py", line 563, in __len__ 
Sep 11 16:11:19     raise TypeError("len() of a 0-d tensor") 
Sep 11 16:11:19 TypeError: len() of a 0-d tensor 
Sep 11 16:11:19  
Sep 11 16:11:19 ====================================================================== 
Sep 11 16:11:19 ERROR [0.009s]: test_trace_tensor_factory (jit.test_tracer.TestTracer) 
Sep 11 16:11:19 ---------------------------------------------------------------------- 
Sep 11 16:11:19 Traceback (most recent call last): 
Sep 11 16:11:19   File "/var/lib/jenkins/workspace/test/jit/test_tracer.py", line 531, in test_trace_tensor_factory 
Sep 11 16:11:19     run() 
Sep 11 16:11:19   File "/var/lib/jenkins/workspace/test/jit/test_tracer.py", line 527, in run 
Sep 11 16:11:19     self.checkTrace(fn, (input,), inputs_require_grads=inputs_require_grads) 
Sep 11 16:11:19   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/jit_utils.py", line 518, in checkTrace 
Sep 11 16:11:19     allow_unused=allow_unused) 
Sep 11 16:11:19   File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 179, in grad 
Sep 11 16:11:19     grad_outputs_ = _tensor_or_tensors_to_tuple(grad_outputs, len(outputs)) 
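The traceback above ends in `len(outputs)` raising `TypeError: len() of a 0-d tensor`. The following is a minimal sketch of that failure mode, not PyTorch's code: `ZeroDimStandIn` and `as_tuple` are hypothetical names introduced only for illustration, and the stand-in class merely mimics the error message shown in the log. It shows why a lone scalar-like object has to be wrapped in a tuple before its length is taken.

```python
class ZeroDimStandIn:
    """Hypothetical stand-in for a 0-d tensor: asking for its length fails."""

    def __len__(self):
        # Mirrors the message in the log above; this is not torch code.
        raise TypeError("len() of a 0-d tensor")


def as_tuple(value):
    """Hypothetical helper: wrap a lone object in a 1-tuple.

    Real sequences (tuple or list) are passed through as tuples.
    """
    if isinstance(value, (tuple, list)):
        return tuple(value)
    return (value,)


scalar = ZeroDimStandIn()

# Calling len() directly on the scalar-like object raises TypeError.
try:
    len(scalar)
    failed = False
    message = ""
except TypeError as exc:
    failed = True
    message = str(exc)

# Wrapping first makes the length well defined.
outputs = as_tuple(scalar)
count = len(outputs)
```

Whether a value counts as "a single tensor" or "a sequence of tensors" is decided by the check inside the wrapping step, so changing that check changes which inputs reach `len()` unwrapped.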

See CircleCI build pytorch_linux_xenial_py3_clang5_asan_test2 (3/13)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Sep 11 16:10:24 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:11:3 in
Sep 11 16:10:24     #7 0x55c649af07eb in PyEval_EvalCode /tmp/build/80754af9/python_1588903631989/work/Python/ceval.c:731 
Sep 11 16:10:24     #8 0x55c649b70e73 in run_mod /tmp/build/80754af9/python_1588903631989/work/Python/pythonrun.c:1025 
Sep 11 16:10:24     #9 0x55c649b70f0c in PyRun_StringFlags /tmp/build/80754af9/python_1588903631989/work/Python/pythonrun.c:949 
Sep 11 16:10:24     #10 0x55c649b70f6e in PyRun_SimpleStringFlags /tmp/build/80754af9/python_1588903631989/work/Python/pythonrun.c:445 
Sep 11 16:10:24     #11 0x55c649b74d72 in run_command /tmp/build/80754af9/python_1588903631989/work/Modules/main.c:301 
Sep 11 16:10:24     #12 0x55c649b74d72 in Py_Main /tmp/build/80754af9/python_1588903631989/work/Modules/main.c:749 
Sep 11 16:10:24     #13 0x55c649a3ef2d in main /tmp/build/80754af9/python_1588903631989/work/Programs/python.c:69 
Sep 11 16:10:24     #14 0x7fac6bf2483f in __libc_start_main /build/glibc-e6zv40/glibc-2.23/csu/../csu/libc-start.c:291 
Sep 11 16:10:24     #15 0x55c649b1e27e in _start /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534865402226/work/.build/src/glibc-2.12.2/csu/../sysdeps/x86_64/elf/start.S:103 
Sep 11 16:10:24  
Sep 11 16:10:24 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:11:3 in  
Sep 11 16:10:24 + retcode=1 
Sep 11 16:10:24 + set -e 
Sep 11 16:10:24 + return 1 
Sep 11 16:10:24 + [[ pytorch-linux-xenial-py3-clang5-asan-test2 == *-NO_AVX-* ]] 
Sep 11 16:10:24 + [[ pytorch-linux-xenial-py3-clang5-asan-test2 == *-NO_AVX2-* ]] 
Sep 11 16:10:24 + '[' -n https://github.com/pytorch/pytorch/pull/44554 ']' 
Sep 11 16:10:24 + [[ pytorch-linux-xenial-py3-clang5-asan-test2 != *coverage* ]] 
Sep 11 16:10:24 ++ mktemp 
Sep 11 16:10:24 + DETERMINE_FROM=/tmp/tmp.CgKMmdFlCq 
Sep 11 16:10:24 + file_diff_from_base /tmp/tmp.CgKMmdFlCq 

See CircleCI build pytorch_macos_10_13_py3_test (4/13)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

Sep 11 09:02:13 RuntimeError: test_autograd failed! Received signal: SIGSEGV
Sep 11 09:02:13   test_nansum_scalar_keepdim_dim_cpu (__main__.TestAutogradDeviceTypeCPU) ... ERROR (0.012s) 
Sep 11 09:02:13   test_nansum_scalar_keepdim_dim_neg0_cpu (__main__.TestAutogradDeviceTypeCPU) ... ERROR (0.012s) 
Sep 11 09:02:13   test_narrow_dim_cpu (__main__.TestAutogradDeviceTypeCPU) ... ERROR (0.012s) 
Sep 11 09:02:13   test_narrow_dim_neg0_cpu (__main__.TestAutogradDeviceTypeCPU) ... ERROR (0.012s) 
Sep 11 09:02:13   test_narrow_empty_dim_cpu (__main__.TestAutogradDeviceTypeCPU) ... ERROR (0.034s) 
Sep 11 09:02:13   test_narrow_empty_dim_neg0_cpu (__main__.TestAutogradDeviceTypeCPU) ... Traceback (most recent call last): 
Sep 11 09:02:13   File "test/run_test.py", line 740, in <module> 
Sep 11 09:02:13     main() 
Sep 11 09:02:13   File "test/run_test.py", line 723, in main 
Sep 11 09:02:13     raise RuntimeError(err_message) 
Sep 11 09:02:13 RuntimeError: test_autograd failed! Received signal: SIGSEGV 
Sep 11 09:02:13 + cleanup 
Sep 11 09:02:13 + retcode=1 
Sep 11 09:02:13 + set +x 

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_ge_config_profiling_test (5/13)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Sep 11 16:26:50 ERROR [0.007s]: test_trace_tensor_factory (jit.test_tracer.TestTracer)
Sep 11 16:26:50     self.checkTrace(stuff, (example, example[0] + 1)) 
Sep 11 16:26:50   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/jit_utils.py", line 518, in checkTrace 
Sep 11 16:26:50     allow_unused=allow_unused) 
Sep 11 16:26:50   File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 179, in grad 
Sep 11 16:26:50     grad_outputs_ = _tensor_or_tensors_to_tuple(grad_outputs, len(outputs)) 
Sep 11 16:26:50   File "/opt/conda/lib/python3.6/site-packages/torch/tensor.py", line 563, in __len__ 
Sep 11 16:26:50     raise TypeError("len() of a 0-d tensor") 
Sep 11 16:26:50 TypeError: len() of a 0-d tensor 
Sep 11 16:26:50  
Sep 11 16:26:50 ====================================================================== 
Sep 11 16:26:50 ERROR [0.007s]: test_trace_tensor_factory (jit.test_tracer.TestTracer) 
Sep 11 16:26:50 ---------------------------------------------------------------------- 
Sep 11 16:26:50 Traceback (most recent call last): 
Sep 11 16:26:50   File "/var/lib/jenkins/workspace/test/jit/test_tracer.py", line 531, in test_trace_tensor_factory 
Sep 11 16:26:50     run() 
Sep 11 16:26:50   File "/var/lib/jenkins/workspace/test/jit/test_tracer.py", line 527, in run 
Sep 11 16:26:50     self.checkTrace(fn, (input,), inputs_require_grads=inputs_require_grads) 
Sep 11 16:26:50   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/jit_utils.py", line 518, in checkTrace 
Sep 11 16:26:50     allow_unused=allow_unused) 
Sep 11 16:26:50   File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 179, in grad 
Sep 11 16:26:50     grad_outputs_ = _tensor_or_tensors_to_tuple(grad_outputs, len(outputs)) 

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_test (6/13)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Sep 11 17:09:39 ERROR [0.054s]: test_polygamma_xla_float64 (__main__.TestDevicePrecisionXLA)
Sep 11 17:09:37   test_multidevice_serialization_xla (__main__.TestDevicePrecisionXLA) ... skip (0.002s) 
Sep 11 17:09:37   test_polygamma_xla_float64 (__main__.TestDevicePrecisionXLA) ... ERROR (0.054s) 
Sep 11 17:09:37   test_solve_methods_arg_device_xla (__main__.TestDevicePrecisionXLA) ... skip (0.004s) 
Sep 11 17:09:37   test_sum_cpu_device_mismatch_xla (__main__.TestDevicePrecisionXLA) ... skip (0.002s) 
Sep 11 17:09:38   test_sum_noncontig_xla_float64 (__main__.TestDevicePrecisionXLA) ... ok (0.994s) 
Sep 11 17:09:38   test_type_conversions_same_device_xla (__main__.TestDevicePrecisionXLA) ... skip (0.003s) 
Sep 11 17:09:39   test_var_large_input_xla (__main__.TestDevicePrecisionXLA) ... ok (0.424s) 
Sep 11 17:09:39   test_var_xla (__main__.TestDevicePrecisionXLA) ... ok (0.122s) 
Sep 11 17:09:39  
Sep 11 17:09:39 ====================================================================== 
Sep 11 17:09:39 ERROR [0.054s]: test_polygamma_xla_float64 (__main__.TestDevicePrecisionXLA) 
Sep 11 17:09:39 ---------------------------------------------------------------------- 
Sep 11 17:09:39 Traceback (most recent call last): 
Sep 11 17:09:39   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 270, in instantiated_test 
Sep 11 17:09:39     result = test_fn(self, *args) 
Sep 11 17:09:39   File "/var/lib/jenkins/workspace/xla/test/../../test/test_torch.py", line 18853, in test_polygamma 
Sep 11 17:09:39     cpu_tensor) 
Sep 11 17:09:39   File "/opt/conda/lib/python3.6/site-packages/torch/autograd/gradcheck.py", line 308, in gradcheck 
Sep 11 17:09:39     nondet_tol=nondet_tol) 
Sep 11 17:09:39   File "/opt/conda/lib/python3.6/site-packages/torch/autograd/gradcheck.py", line 156, in get_analytical_jacobian 
Sep 11 17:09:39     retain_graph=True, allow_unused=True) 

See CircleCI build pytorch_linux_xenial_py3_clang5_asan_test1 (7/13)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Sep 11 16:11:39 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:11:3 in
Sep 11 16:11:39     #7 0x559ea6a917eb in PyEval_EvalCode /tmp/build/80754af9/python_1588903631989/work/Python/ceval.c:731 
Sep 11 16:11:39     #8 0x559ea6b11e73 in run_mod /tmp/build/80754af9/python_1588903631989/work/Python/pythonrun.c:1025 
Sep 11 16:11:39     #9 0x559ea6b11f0c in PyRun_StringFlags /tmp/build/80754af9/python_1588903631989/work/Python/pythonrun.c:949 
Sep 11 16:11:39     #10 0x559ea6b11f6e in PyRun_SimpleStringFlags /tmp/build/80754af9/python_1588903631989/work/Python/pythonrun.c:445 
Sep 11 16:11:39     #11 0x559ea6b15d72 in run_command /tmp/build/80754af9/python_1588903631989/work/Modules/main.c:301 
Sep 11 16:11:39     #12 0x559ea6b15d72 in Py_Main /tmp/build/80754af9/python_1588903631989/work/Modules/main.c:749 
Sep 11 16:11:39     #13 0x559ea69dff2d in main /tmp/build/80754af9/python_1588903631989/work/Programs/python.c:69 
Sep 11 16:11:39     #14 0x7f48144cf83f in __libc_start_main /build/glibc-e6zv40/glibc-2.23/csu/../csu/libc-start.c:291 
Sep 11 16:11:39     #15 0x559ea6abf27e in _start /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534865402226/work/.build/src/glibc-2.12.2/csu/../sysdeps/x86_64/elf/start.S:103 
Sep 11 16:11:39  
Sep 11 16:11:39 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:11:3 in  
Sep 11 16:11:39 + retcode=1 
Sep 11 16:11:39 + set -e 
Sep 11 16:11:39 + return 1 
Sep 11 16:11:39 + [[ pytorch-linux-xenial-py3-clang5-asan-test1 == *-NO_AVX-* ]] 
Sep 11 16:11:39 + [[ pytorch-linux-xenial-py3-clang5-asan-test1 == *-NO_AVX2-* ]] 
Sep 11 16:11:39 + '[' -n https://github.com/pytorch/pytorch/pull/44554 ']' 
Sep 11 16:11:39 + [[ pytorch-linux-xenial-py3-clang5-asan-test1 != *coverage* ]] 
Sep 11 16:11:39 ++ mktemp 
Sep 11 16:11:39 + DETERMINE_FROM=/tmp/tmp.cA4bsz4RCS 
Sep 11 16:11:39 + file_diff_from_base /tmp/tmp.cA4bsz4RCS 

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_test (8/13)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Sep 11 16:10:04 caused by: Connection refused (os error 111)
Sep 11 16:10:04 ++++ extract_trap_cmd 
Sep 11 16:10:04 ++++ printf '%s\n' '' 
Sep 11 16:10:04 +++ printf '%s\n' cleanup 
Sep 11 16:10:04 ++ trap -- ' 
Sep 11 16:10:04 cleanup' EXIT 
Sep 11 16:10:04 ++ [[ pytorch-linux-xenial-py3.6-gcc5.4-test != *pytorch-win-* ]] 
Sep 11 16:10:04 ++ which sccache 
Sep 11 16:10:04 ++ sccache --stop-server 
Sep 11 16:10:04 Stopping sccache server... 
Sep 11 16:10:04 error: couldn't connect to server 
Sep 11 16:10:04 caused by: Connection refused (os error 111) 
Sep 11 16:10:04 ++ true 
Sep 11 16:10:04 ++ rm /var/lib/jenkins/sccache_error.log 
Sep 11 16:10:04 ++ [[ pytorch-linux-xenial-py3.6-gcc5.4-test == *rocm* ]] 
Sep 11 16:10:04 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Sep 11 16:10:04 ++ SCCACHE_IDLE_TIMEOUT=1200 
Sep 11 16:10:04 ++ RUST_LOG=sccache::server=error 
Sep 11 16:10:04 ++ sccache --start-server 
Sep 11 16:10:04 Starting sccache server... 
Sep 11 16:10:04 ++ sccache --zero-stats 
Sep 11 16:10:04 Compile requests                 0 

See CircleCI build pytorch_linux_bionic_py3_6_clang9_test (9/13)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Sep 11 16:05:55 caused by: Connection refused (os error 111)
Sep 11 16:05:55 ++++ extract_trap_cmd 
Sep 11 16:05:55 ++++ printf '%s\n' '' 
Sep 11 16:05:55 +++ printf '%s\n' cleanup 
Sep 11 16:05:55 ++ trap -- ' 
Sep 11 16:05:55 cleanup' EXIT 
Sep 11 16:05:55 ++ [[ pytorch-linux-bionic-py3.6-clang9-test != *pytorch-win-* ]] 
Sep 11 16:05:55 ++ which sccache 
Sep 11 16:05:55 ++ sccache --stop-server 
Sep 11 16:05:55 Stopping sccache server... 
Sep 11 16:05:55 error: couldn't connect to server 
Sep 11 16:05:55 caused by: Connection refused (os error 111) 
Sep 11 16:05:55 ++ true 
Sep 11 16:05:55 ++ rm /var/lib/jenkins/sccache_error.log 
Sep 11 16:05:55 ++ [[ pytorch-linux-bionic-py3.6-clang9-test == *rocm* ]] 
Sep 11 16:05:55 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Sep 11 16:05:55 ++ SCCACHE_IDLE_TIMEOUT=1200 
Sep 11 16:05:55 ++ RUST_LOG=sccache::server=error 
Sep 11 16:05:55 ++ sccache --start-server 
Sep 11 16:05:55 Starting sccache server... 
Sep 11 16:05:55 ++ sccache --zero-stats 
Sep 11 16:05:55 Compile requests                 0 

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_ge_config_simple_test (10/13)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Sep 11 16:09:42 caused by: Connection refused (os error 111)
Sep 11 16:09:42 ++++ extract_trap_cmd 
Sep 11 16:09:42 ++++ printf '%s\n' '' 
Sep 11 16:09:42 +++ printf '%s\n' cleanup 
Sep 11 16:09:42 ++ trap -- ' 
Sep 11 16:09:42 cleanup' EXIT 
Sep 11 16:09:42 ++ [[ pytorch-linux-xenial-py3.6-gcc5.4-ge_config_simple-test != *pytorch-win-* ]] 
Sep 11 16:09:42 ++ which sccache 
Sep 11 16:09:42 ++ sccache --stop-server 
Sep 11 16:09:42 Stopping sccache server... 
Sep 11 16:09:42 error: couldn't connect to server 
Sep 11 16:09:42 caused by: Connection refused (os error 111) 
Sep 11 16:09:42 ++ true 
Sep 11 16:09:42 ++ rm /var/lib/jenkins/sccache_error.log 
Sep 11 16:09:42 ++ [[ pytorch-linux-xenial-py3.6-gcc5.4-ge_config_simple-test == *rocm* ]] 
Sep 11 16:09:42 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Sep 11 16:09:42 ++ SCCACHE_IDLE_TIMEOUT=1200 
Sep 11 16:09:42 ++ RUST_LOG=sccache::server=error 
Sep 11 16:09:42 ++ sccache --start-server 
Sep 11 16:09:42 Starting sccache server... 
Sep 11 16:09:42 ++ sccache --zero-stats 
Sep 11 16:09:42 Compile requests                 0 

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_ge_config_legacy_test (11/13)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Sep 11 16:27:44 ERROR [0.007s]: test_trace_tensor_factory (jit.test_tracer.TestTracer)
Sep 11 16:27:44     self.checkTrace(stuff, (example, example[0] + 1)) 
Sep 11 16:27:44   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/jit_utils.py", line 518, in checkTrace 
Sep 11 16:27:44     allow_unused=allow_unused) 
Sep 11 16:27:44   File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 179, in grad 
Sep 11 16:27:44     grad_outputs_ = _tensor_or_tensors_to_tuple(grad_outputs, len(outputs)) 
Sep 11 16:27:44   File "/opt/conda/lib/python3.6/site-packages/torch/tensor.py", line 563, in __len__ 
Sep 11 16:27:44     raise TypeError("len() of a 0-d tensor") 
Sep 11 16:27:44 TypeError: len() of a 0-d tensor 
Sep 11 16:27:44  
Sep 11 16:27:44 ====================================================================== 
Sep 11 16:27:44 ERROR [0.007s]: test_trace_tensor_factory (jit.test_tracer.TestTracer) 
Sep 11 16:27:44 ---------------------------------------------------------------------- 
Sep 11 16:27:44 Traceback (most recent call last): 
Sep 11 16:27:44   File "/var/lib/jenkins/workspace/test/jit/test_tracer.py", line 531, in test_trace_tensor_factory 
Sep 11 16:27:44     run() 
Sep 11 16:27:44   File "/var/lib/jenkins/workspace/test/jit/test_tracer.py", line 527, in run 
Sep 11 16:27:44     self.checkTrace(fn, (input,), inputs_require_grads=inputs_require_grads) 
Sep 11 16:27:44   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/jit_utils.py", line 518, in checkTrace 
Sep 11 16:27:44     allow_unused=allow_unused) 
Sep 11 16:27:44   File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 179, in grad 
Sep 11 16:27:44     grad_outputs_ = _tensor_or_tensors_to_tuple(grad_outputs, len(outputs)) 

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_ge_config_profiling_test (12/13)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Sep 11 16:11:22 ERROR [0.012s]: test_trace_tensor_factory (jit.test_tracer.TestTracer)
Sep 11 16:11:22     self.checkTrace(stuff, (example, example[0] + 1)) 
Sep 11 16:11:22   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/jit_utils.py", line 518, in checkTrace 
Sep 11 16:11:22     allow_unused=allow_unused) 
Sep 11 16:11:22   File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 179, in grad 
Sep 11 16:11:22     grad_outputs_ = _tensor_or_tensors_to_tuple(grad_outputs, len(outputs)) 
Sep 11 16:11:22   File "/opt/conda/lib/python3.6/site-packages/torch/tensor.py", line 563, in __len__ 
Sep 11 16:11:22     raise TypeError("len() of a 0-d tensor") 
Sep 11 16:11:22 TypeError: len() of a 0-d tensor 
Sep 11 16:11:22  
Sep 11 16:11:22 ====================================================================== 
Sep 11 16:11:22 ERROR [0.012s]: test_trace_tensor_factory (jit.test_tracer.TestTracer) 
Sep 11 16:11:22 ---------------------------------------------------------------------- 
Sep 11 16:11:22 Traceback (most recent call last): 
Sep 11 16:11:22   File "/var/lib/jenkins/workspace/test/jit/test_tracer.py", line 531, in test_trace_tensor_factory 
Sep 11 16:11:22     run() 
Sep 11 16:11:22   File "/var/lib/jenkins/workspace/test/jit/test_tracer.py", line 527, in run 
Sep 11 16:11:22     self.checkTrace(fn, (input,), inputs_require_grads=inputs_require_grads) 
Sep 11 16:11:22   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/jit_utils.py", line 518, in checkTrace 
Sep 11 16:11:22     allow_unused=allow_unused) 
Sep 11 16:11:22   File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 179, in grad 
Sep 11 16:11:22     grad_outputs_ = _tensor_or_tensors_to_tuple(grad_outputs, len(outputs)) 

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test (13/13)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Sep 11 16:20:25 caused by: Connection refused (os error 111)
Sep 11 16:20:25 ++++ extract_trap_cmd 
Sep 11 16:20:25 ++++ printf '%s\n' '' 
Sep 11 16:20:25 +++ printf '%s\n' cleanup 
Sep 11 16:20:25 ++ trap -- ' 
Sep 11 16:20:25 cleanup' EXIT 
Sep 11 16:20:25 ++ [[ pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7-test != *pytorch-win-* ]] 
Sep 11 16:20:25 ++ which sccache 
Sep 11 16:20:25 ++ sccache --stop-server 
Sep 11 16:20:25 Stopping sccache server... 
Sep 11 16:20:25 error: couldn't connect to server 
Sep 11 16:20:25 caused by: Connection refused (os error 111) 
Sep 11 16:20:25 ++ true 
Sep 11 16:20:25 ++ rm /var/lib/jenkins/sccache_error.log 
Sep 11 16:20:25 ++ [[ pytorch-linux-xenial-cuda10.2-cudnn7-py3-gcc7-test == *rocm* ]] 
Sep 11 16:20:25 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Sep 11 16:20:25 ++ SCCACHE_IDLE_TIMEOUT=1200 
Sep 11 16:20:25 ++ RUST_LOG=sccache::server=error 
Sep 11 16:20:25 ++ sccache --start-server 
Sep 11 16:20:25 Starting sccache server... 
Sep 11 16:20:25 ++ sccache --zero-stats 
Sep 11 16:20:25 Compile requests                 0 

❄️ 1 failure tentatively classified as flaky

but reruns have not yet been triggered to confirm:

See CircleCI build pytorch_windows_vs2019_py36_cuda10.1_test2 (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun) ❄️

WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))': /simple/scikit-learn/
Collecting pillow 
  Downloading Pillow-7.2.0-cp36-cp36m-win_amd64.whl (2.0 MB) 
Collecting unittest-xml-reporting 
  Downloading unittest_xml_reporting-3.0.4-py2.py3-none-any.whl (19 kB) 
Collecting attrs>=19.2.0 
  Downloading attrs-20.2.0-py2.py3-none-any.whl (48 kB) 
Collecting audioread>=2.0.0 
  Downloading audioread-2.1.8.tar.gz (21 kB) 
Requirement already satisfied: numpy>=1.15.0 in c:\jenkins\miniconda3\lib\site-packages (from librosa>=0.6.2) (1.19.1) 
Requirement already satisfied: scipy>=1.0.0 in c:\jenkins\miniconda3\lib\site-packages (from librosa>=0.6.2) (1.5.0) 
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))': /simple/scikit-learn/ 
Collecting scikit-learn!=0.19.0,>=0.14.0 
  Downloading scikit_learn-0.23.2-cp36-cp36m-win_amd64.whl (6.8 MB) 
Collecting joblib>=0.14 
  Downloading joblib-0.16.0-py3-none-any.whl (300 kB) 
Collecting decorator>=3.0.0 
  Downloading decorator-4.4.2-py2.py3-none-any.whl (9.2 kB) 
Collecting resampy>=0.2.2 
  Downloading resampy-0.2.2.tar.gz (323 kB) 
Requirement already satisfied: numba>=0.43.0 in c:\jenkins\miniconda3\lib\site-packages (from librosa>=0.6.2) (0.44.0) 
Collecting soundfile>=0.9.0 

🚧 1 ongoing upstream failure:

These were probably caused by upstream breakages that are not fixed yet:


ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI (expand for details). Follow this link to opt out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 7 times.

ezyang commented Sep 14, 2020

@anjali411 and I agreed that we would just disable the tensor-like test instead.
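For readers unfamiliar with the alternative mentioned here, disabling a single test rather than reverting a commit can be sketched with the standard library's `unittest.skip` decorator. This is only an illustration under assumptions: the class name, test names, and skip reason below are hypothetical and are not the actual test or mechanism used in the PyTorch test suite.

```python
import unittest


class GradcheckStandInTest(unittest.TestCase):
    """Hypothetical test case; names are illustrative only."""

    @unittest.skip("temporarily disabled pending a fix")
    def test_tensor_like_inputs(self):
        # Never executed while the decorator is in place.
        self.fail("should not run while skipped")

    def test_plain_inputs(self):
        # Other tests in the class keep running as usual.
        self.assertEqual(1 + 1, 2)


# Run the suite programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GradcheckStandInTest)
result = unittest.TestResult()
suite.run(result)
```

The skipped test is reported in `result.skipped` together with its reason, so the disabled coverage stays visible in test reports instead of silently disappearing.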

@ezyang ezyang closed this Sep 14, 2020
@facebook-github-bot facebook-github-bot deleted the gh/ezyang/836/head branch October 15, 2020 14:21