
[WIP] Add type promotion to a fast path for foreach APIs #52449

Closed
izdeby wants to merge 18 commits into gh/izdeby/88/base from gh/izdeby/88/head

Conversation

@izdeby
Contributor

@izdeby izdeby commented Feb 18, 2021

Stack from ghstack:

Differential Revision: D26520924

@facebook-github-bot
Contributor

facebook-github-bot commented Feb 18, 2021

💊 CI failures summary and remediations

As of commit aa03124 (more details on the Dr. CI page):


  • 14/14 failures possibly* introduced in this PR
    • 2/14 non-scanned failure(s)

🕵️ 10 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_clang5_asan_test2 (1/10)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Apr 01 20:30:44 RuntimeError: test_foreach failed!
Apr 01 20:30:40 Executing ['/opt/conda/bin/python', 'test_foreach.py', '-v'] ... [2021-04-01 20:30:40.999088]
Apr 01 20:30:44 Traceback (most recent call last):
Apr 01 20:30:44   File "test_foreach.py", line 8, in <module>
Apr 01 20:30:44     from torch.testing._internal.common_methods_invocations import \
Apr 01 20:30:44 ImportError: cannot import name 'foreach_binary_op_tensor_list_db'
Apr 01 20:30:44 Traceback (most recent call last):
Apr 01 20:30:44   File "test/run_test.py", line 1094, in <module>
Apr 01 20:30:44     main()
Apr 01 20:30:44   File "test/run_test.py", line 1073, in main
Apr 01 20:30:44     raise RuntimeError(err_message)
Apr 01 20:30:44 RuntimeError: test_foreach failed!
Apr 01 20:30:45 =================== sccache compilation log ===================
Apr 01 20:30:45 + cleanup
Apr 01 20:30:45 + retcode=1
Apr 01 20:30:45 + set +x
Apr 01 20:30:45 =========== If your build fails, please take a look at the log above for possible reasons ===========
Apr 01 20:30:45 Compile requests                      4
Apr 01 20:30:45 Compile requests executed             2
Apr 01 20:30:45 Cache hits                            2
Apr 01 20:30:45 Cache hits (C/C++)                    2
Apr 01 20:30:45 Cache misses                          0

See CircleCI build pytorch_windows_vs2019_py36_cuda11.1_build (2/10)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

Error generating file

C:/Users/circleci/project/aten/src/ATen/native/cuda/ForeachBinaryOpScalarList.cu(94): error: function "at::native::foreach_tensor_div_scalarlist_kernel_cuda" has already been defined

C:/Users/circleci/project/aten/src/ATen/native/cuda/ForeachBinaryOpScalarList.cu(102): error: expected a declaration

9 errors detected in the compilation of "C:/Users/circleci/project/aten/src/ATen/native/cuda/ForeachBinaryOpScalarList.cu".
ForeachBinaryOpScalarList.cu
-- Removing C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda_cu.dir/__/aten/src/ATen/native/cuda/./torch_cuda_cu_generated_ForeachBinaryOpScalarList.cu.obj
C:/Jenkins/Miniconda3/Library/bin/cmake.exe -E remove C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda_cu.dir/__/aten/src/ATen/native/cuda/./torch_cuda_cu_generated_ForeachBinaryOpScalarList.cu.obj
CMake Error at torch_cuda_cu_generated_ForeachBinaryOpScalarList.cu.obj.Release.cmake:281 (message):
  Error generating file
  C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda_cu.dir/__/aten/src/ATen/native/cuda/./torch_cuda_cu_generated_ForeachBinaryOpScalarList.cu.obj


ninja: build stopped: subcommand failed.
-- Building version 1.9.0a0+gitaa03124
 --- Trying to initialize submodules
 --- Submodule initialization took 163.98 sec
cmake -GNinja -DBUILD_ENVIRONMENT=pytorch-win-vs2019-cuda11-cudnn8-py3 -DBUILD_PYTHON=True -DBUILD_SPLIT_CUDA=ON -DBUILD_TEST=True -DBUILD_TYPE=release -DCMAKE_BUILD_TYPE=Release -DCMAKE_GENERATOR=Ninja -DCMAKE_INCLUDE_PATH=C:\Users\circleci\project\build\win_tmp\mkl\include -DCMAKE_INSTALL_PREFIX=C:\Users\circleci\project\torch -DCMAKE_PREFIX_PATH=C:\Jenkins\Miniconda3\Lib\site-packages -DCMAKE_VERBOSE_MAKEFILE=1 -DCUDA_NVCC_EXECUTABLE=C:\Users\circleci\project\build\win_tmp\bin\randomtemp.exe -DCUDNN_LIBRARY=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64 -DJAVA_HOME=C:\Program Files\OpenJDK\jdk-12.0.2 -DNUMPY_INCLUDE_DIR=C:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -DPYTHON_EXECUTABLE=C:\Jenkins\Miniconda3\python.exe -DPYTHON_INCLUDE_DIR=C:\Jenkins\Miniconda3\include -DPYTHON_LIBRARY=C:\Jenkins\Miniconda3/libs/python36.lib -DTORCH_BUILD_VERSION=1.9.0a0+gitaa03124 -DUSE_CUDA=1 -DUSE_NUMPY=True C:\Users\circleci\project
cmake --build . --target install --config Release -- -j 16
Traceback (most recent call last):

See CircleCI build pytorch_linux_bionic_rocm3_9_py3_6_build (3/10)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

Apr 01 18:03:18 Error generating file
Apr 01 18:03:18 /var/lib/jenkins/workspace/aten/src/ATen/native/hip/ForeachBinaryOpList.hip:40:31: error: use of undeclared identifier 'alpha'
Apr 01 18:03:18 /var/lib/jenkins/workspace/aten/src/ATen/native/hip/ForeachBinaryOpList.hip:40:31: error: use of undeclared identifier 'alpha'
Apr 01 18:03:18 /var/lib/jenkins/workspace/aten/src/ATen/native/hip/ForeachBinaryOpList.hip:40:31: error: use of undeclared identifier 'alpha'
Apr 01 18:03:18 /var/lib/jenkins/workspace/aten/src/ATen/native/hip/ForeachBinaryOpList.hip:40:31: error: use of undeclared identifier 'alpha'
Apr 01 18:03:18 /var/lib/jenkins/workspace/aten/src/ATen/native/hip/ForeachBinaryOpList.hip:40:31: error: use of undeclared identifier 'alpha'
Apr 01 18:03:18 /var/lib/jenkins/workspace/aten/src/ATen/native/hip/ForeachBinaryOpList.hip:40:31: error: use of undeclared identifier 'alpha'
Apr 01 18:03:18 /var/lib/jenkins/workspace/aten/src/ATen/native/hip/ForeachBinaryOpList.hip:40:31: error: use of undeclared identifier 'alpha'
Apr 01 18:03:18 /var/lib/jenkins/workspace/aten/src/ATen/native/hip/ForeachBinaryOpList.hip:40:31: error: use of undeclared identifier 'alpha'
Apr 01 18:03:18 2 warnings and 12 errors generated when compiling for gfx900.
Apr 01 18:03:18 CMake Error at torch_hip_generated_ForeachBinaryOpList.hip.o.cmake:192 (message):
Apr 01 18:03:18   Error generating file
Apr 01 18:03:18   /var/lib/jenkins/workspace/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/./torch_hip_generated_ForeachBinaryOpList.hip.o
Apr 01 18:03:18 
Apr 01 18:03:18 
Apr 01 18:03:18 caffe2/CMakeFiles/torch_hip.dir/build.make:880: recipe for target 'caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_ForeachBinaryOpList.hip.o' failed
Apr 01 18:03:18 make[2]: *** [caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/torch_hip_generated_ForeachBinaryOpList.hip.o] Error 1
Apr 01 18:03:21 In file included from /var/lib/jenkins/workspace/aten/src/ATen/native/hip/AmpKernels.hip:10:
Apr 01 18:03:21 In file included from /var/lib/jenkins/workspace/aten/src/ATen/native/hip/ForeachFunctors.cuh:3:
Apr 01 18:03:21 In file included from /var/lib/jenkins/workspace/aten/src/ATen/native/hip/MultiTensorApply.cuh:6:
Apr 01 18:03:21 In file included from /var/lib/jenkins/workspace/aten/src/ATen/native/hip/Loops.cuh:18:
Apr 01 18:03:21 /var/lib/jenkins/workspace/aten/src/ATen/native/hip/MemoryAccess.cuh:38:26: warning: template template parameter using 'typename' is a C++17 extension [-Wc++17-extensions]

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_build (4/10)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

Apr 01 18:09:07 Error generating file
Apr 01 18:09:07 /var/lib/jenkins/workspace/aten/src/ATen/native/cuda/ForeachBinaryOpList.cu(31): error: expected an expression
Apr 01 18:09:07 
Apr 01 18:09:07 /var/lib/jenkins/workspace/aten/src/ATen/native/cuda/ForeachBinaryOpList.cu(31): error: identifier "alpha" is undefined
Apr 01 18:09:07 
Apr 01 18:09:07 /var/lib/jenkins/workspace/aten/src/ATen/native/cuda/ForeachBinaryOpList.cu(31): error: type name is not allowed
Apr 01 18:09:07 
Apr 01 18:09:07 /var/lib/jenkins/workspace/aten/src/ATen/native/cuda/ForeachBinaryOpList.cu(31): error: expected an expression
Apr 01 18:09:07 
Apr 01 18:09:07 36 errors detected in the compilation of "/tmp/tmpxft_00002154_00000000-6_ForeachBinaryOpList.cpp1.ii".
Apr 01 18:09:07 CMake Error at torch_cuda_generated_ForeachBinaryOpList.cu.o.Release.cmake:281 (message):
Apr 01 18:09:07   Error generating file
Apr 01 18:09:07   /var/lib/jenkins/workspace/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_ForeachBinaryOpList.cu.o
Apr 01 18:09:07 
Apr 01 18:09:07 
Apr 01 18:09:07 caffe2/CMakeFiles/torch_cuda.dir/build.make:553: recipe for target 'caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_ForeachBinaryOpList.cu.o' failed
Apr 01 18:09:07 make[2]: *** [caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/torch_cuda_generated_ForeachBinaryOpList.cu.o] Error 1
Apr 01 18:09:07 make[2]: *** Waiting for unfinished jobs....
Apr 01 18:09:09 make[1]: *** [caffe2/CMakeFiles/torch_cuda.dir/all] Error 2
Apr 01 18:09:09 CMakeFiles/Makefile2:10902: recipe for target 'caffe2/CMakeFiles/torch_cuda.dir/all' failed
Apr 01 18:09:09 make: *** [all] Error 2
Apr 01 18:09:09 Makefile:138: recipe for target 'all' failed

See CircleCI build pytorch_linux_bionic_py3_8_gcc9_coverage_test1 (5/10)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Apr 01 19:42:14 [E request_callback_no_python.cpp:656] Received error while processing request type 256: The following operation failed in the TorchScript interpreter.
Apr 01 19:42:14 
Apr 01 19:42:14 [E request_callback_no_python.cpp:656] Received error while processing request type 256: The following operation failed in the TorchScript interpreter.
Apr 01 19:42:14 Traceback of TorchScript (most recent call last):
Apr 01 19:42:14   File "/opt/conda/lib/python3.8/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 334, in raise_func_script
Apr 01 19:42:14 @torch.jit.script
Apr 01 19:42:14 def raise_func_script(expected_err: str) -> torch.Tensor:
Apr 01 19:42:14     raise ValueError(expected_err)
Apr 01 19:42:14     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
Apr 01 19:42:14 RuntimeError: Expected error
Apr 01 19:42:14 
Apr 01 19:42:14 [E request_callback_no_python.cpp:656] Received error while processing request type 256: The following operation failed in the TorchScript interpreter.
Apr 01 19:42:14 Traceback of TorchScript (most recent call last):
Apr 01 19:42:14   File "/opt/conda/lib/python3.8/site-packages/torch/testing/_internal/distributed/rpc/rpc_test.py", line 334, in raise_func_script
Apr 01 19:42:14 @torch.jit.script
Apr 01 19:42:14 def raise_func_script(expected_err: str) -> torch.Tensor:
Apr 01 19:42:14     raise ValueError(expected_err)
Apr 01 19:42:14     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
Apr 01 19:42:14 RuntimeError: Expected error
Apr 01 19:42:14 
Apr 01 19:42:15 ok (2.759s)
Apr 01 19:42:17   test_wait_all_multiple_call (__main__.ProcessGroupRpcTestWithSpawn) ... RPC was initialized with the PROCESS_GROUP backend which is deprecated and slated to be removed and superseded by the TENSORPIPE backend. It is recommended to migrate to the TENSORPIPE backend.

See CircleCI build pytorch_windows_vs2019_py36_cuda10.1_build (6/10)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

Error generating file
          detected during instantiation of "std::vector<at::Tensor, std::allocator<at::Tensor>> at::native::foreach_tensor_list_op<Op>(at::TensorList, at::TensorList, const c10::Scalar &, __nv_bool) [with Op=std::multiplies]" 
(103): here

Error limit reached.
100 errors detected in the compilation of "C:/Users/circleci/project/build/win_tmp/bin/.tmpQKQltW/tmpxft_000015a4_00000000-7_ForeachBinaryOpList.cpp1.ii".
Compilation terminated.
ForeachBinaryOpList.cu
-- Removing C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_ForeachBinaryOpList.cu.obj
C:/Jenkins/Miniconda3/Library/bin/cmake.exe -E remove C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_ForeachBinaryOpList.cu.obj
CMake Error at torch_cuda_generated_ForeachBinaryOpList.cu.obj.Release.cmake:281 (message):
  Error generating file
  C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/./torch_cuda_generated_ForeachBinaryOpList.cu.obj


[4694/5565] cmd.exe /C "cd /D C:\Users\circleci\project\build\caffe2\CMakeFiles\torch_cuda.dir\operators && C:\Jenkins\Miniconda3\Library\bin\cmake.exe -E make_directory C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/operators/. && C:\Jenkins\Miniconda3\Library\bin\cmake.exe -D verbose:BOOL=ON -D build_configuration:STRING=Release -D generated_file:STRING=C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/operators/./torch_cuda_generated_channelwise_conv3d_op_cudnn.cu.obj -D generated_cubin_file:STRING=C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/operators/./torch_cuda_generated_channelwise_conv3d_op_cudnn.cu.obj.cubin.txt -P C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/operators/torch_cuda_generated_channelwise_conv3d_op_cudnn.cu.obj.Release.cmake"
-- Removing C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/operators/./torch_cuda_generated_channelwise_conv3d_op_cudnn.cu.obj
C:/Jenkins/Miniconda3/Library/bin/cmake.exe -E remove C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/operators/./torch_cuda_generated_channelwise_conv3d_op_cudnn.cu.obj
-- Generating dependency file: C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/operators/torch_cuda_generated_channelwise_conv3d_op_cudnn.cu.obj.NVCC-depend
C:/Users/circleci/project/build/win_tmp/bin/randomtemp.exe -M -D__CUDACC__ C:/Users/circleci/project/caffe2/operators/channelwise_conv3d_op_cudnn.cu -o C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/operators/torch_cuda_generated_channelwise_conv3d_op_cudnn.cu.obj.NVCC-depend -ccbin cl.exe -m64 -Dtorch_cuda_EXPORTS -DUSE_CUDA -DTORCH_CUDA_BUILD_MAIN_LIB -DWIN32_LEAN_AND_MEAN -DTH_BLAS_MKL -D_OPENMP_NOFORCE_MANIFEST -DONNX_ML=1 -DONNXIFI_ENABLE_EXT=1 -DONNX_NAMESPACE=onnx_torch -D_CRT_SECURE_NO_DEPRECATE=1 -DMAGMA_V2 -DIDEEP_USE_MKL -DUSE_EXTERNAL_MZCRC -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -Xcompiler ,\"/DWIN32\",\"/D_WINDOWS\",\"/GR\",\"/EHsc\",\"/w\",\"/bigobj\",\"-DUSE_PTHREADPOOL\",\"-openmp:experimental\",\"-IC:/Users/circleci/project/build/win_tmp/mkl/include\",\"-DNDEBUG\",\"-DUSE_FBGEMM\",\"-DUSE_XNNPACK\",\"-DHAVE_AVX_CPU_DEFINITION\",\"-DHAVE_AVX2_CPU_DEFINITION\",\"/MD\",\"/O2\",\"/Ob2\",\"/DNDEBUG\",\"/w\",\"/bigobj\",\"-DNDEBUG\" -Xcompiler /w -w -Xfatbin -compress-all -DONNX_NAMESPACE=onnx_torch --use-local-env -gencode arch=compute_75,code=sm_75 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=integer_sign_change,--diag_suppress=useless_using_declaration,--diag_suppress=set_but_not_used,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=implicit_return_from_non_void_function,--diag_suppress=unsigned_compare_with_zero,--diag_suppress=declared_but_not_referenced,--diag_suppress=bad_friend_decl --Werror cross-execution-space-call --no-host-device-move-forward -Xcompiler -MD --expt-relaxed-constexpr --expt-extended-lambda -Xcompiler=/wd4819,/wd4503,/wd4190,/wd4244,/wd4251,/wd4275,/wd4522 -Wno-deprecated-gpu-targets --expt-extended-lambda -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ 
-D__CUDA_NO_HALF2_OPERATORS__ -DNVCC "-IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1/include" -IC:/Users/circleci/project/build/aten/src -IC:/Users/circleci/project/aten/src -IC:/Users/circleci/project/build -IC:/Users/circleci/project -IC:/Users/circleci/project/build/third_party/gloo -IC:/Users/circleci/project/cmake/../third_party/gloo -IC:/Users/circleci/project/cmake/../third_party/googletest/googlemock/include -IC:/Users/circleci/project/cmake/../third_party/googletest/googletest/include -IC:/Users/circleci/project/third_party/protobuf/src -IC:/Users/circleci/project/build/win_tmp/mkl/include -IC:/Users/circleci/project/third_party/XNNPACK/include -IC:/Users/circleci/project/cmake/../third_party/benchmark/include -IC:/Users/circleci/project/third_party -IC:/Users/circleci/project/cmake/../third_party/eigen -IC:/Jenkins/Miniconda3/include -IC:/Jenkins/Miniconda3/lib/site-packages/numpy/core/include -IC:/Users/circleci/project/cmake/../third_party/pybind11/include -IC:/Users/circleci/project/cmake/../third_party/cub -IC:/Users/circleci/project/build/caffe2/contrib/aten -IC:/Users/circleci/project/third_party/onnx -IC:/Users/circleci/project/build/third_party/onnx -IC:/Users/circleci/project/third_party/foxi -IC:/Users/circleci/project/build/third_party/foxi -IC:/Users/circleci/project/build/win_tmp/magma/include -IC:/Users/circleci/project/third_party/ideep/mkl-dnn/include -IC:/Users/circleci/project/third_party/ideep/include -IC:/Users/circleci/project/build/include -IC:/Users/circleci/project/build/caffe2/aten/src/TH -IC:/Users/circleci/project/aten/src/TH -IC:/Users/circleci/project/build/caffe2/aten/src/THC -IC:/Users/circleci/project/aten/src/THC -IC:/Users/circleci/project/aten/src/THCUNN -IC:/Users/circleci/project/aten/src/ATen/cuda -IC:/Users/circleci/project/build/caffe2/aten/src -IC:/Users/circleci/project/aten/../third_party/catch/single_include -IC:/Users/circleci/project/aten/src/ATen/.. 
-IC:/Users/circleci/project/build/caffe2/aten/src/ATen -IC:/Users/circleci/project/c10/cuda/../.. -IC:/Users/circleci/project/c10/../ "-IC:/Program Files/NVIDIA Corporation/NvToolsExt/include" -IC:/Users/circleci/project/torch/csrc/api -IC:/Users/circleci/project/torch/csrc/api/include -IC:/Users/circleci/project/build/third_party/ideep/mkl-dnn/include -IC:/Users/circleci/project/third_party/ideep/mkl-dnn/src/../include
channelwise_conv3d_op_cudnn.cu
-- Generating temporary cmake readable file: C:/Users/circleci/project/build/caffe2/CMakeFiles/torch_cuda.dir/operators/torch_cuda_generated_channelwise_conv3d_op_cudnn.cu.obj.depend.tmp

See CircleCI build pytorch_macos_10_13_py3_test (7/10)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

Apr 01 19:33:36 RuntimeError: test_foreach failed!
Apr 01 19:33:35 Executing ['/Users/distiller/workspace/miniconda3/bin/python', 'test_foreach.py', '-v'] ... [2021-04-01 19:33:35.126377]
Apr 01 19:33:36 Traceback (most recent call last):
Apr 01 19:33:36   File "test_foreach.py", line 8, in <module>
Apr 01 19:33:36     from torch.testing._internal.common_methods_invocations import \
Apr 01 19:33:36 ImportError: cannot import name 'foreach_binary_op_tensor_list_db' from 'torch.testing._internal.common_methods_invocations' (/Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/testing/_internal/common_methods_invocations.py)
Apr 01 19:33:36 Traceback (most recent call last):
Apr 01 19:33:36   File "test/run_test.py", line 1094, in <module>
Apr 01 19:33:36     main()
Apr 01 19:33:36   File "test/run_test.py", line 1073, in main
Apr 01 19:33:36     raise RuntimeError(err_message)
Apr 01 19:33:36 RuntimeError: test_foreach failed!
Apr 01 19:33:37 + cleanup
Apr 01 19:33:37 + retcode=1
Apr 01 19:33:37 + set +x


Exited with code exit status 1

See CircleCI build pytorch_linux_bionic_py3_6_clang9_noarch_test (8/10)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Apr 01 18:50:44 RuntimeError: test_foreach failed!
Apr 01 18:50:42 Executing ['/opt/conda/bin/python', 'test_foreach.py', '-v'] ... [2021-04-01 18:50:42.693068]
Apr 01 18:50:43 Traceback (most recent call last):
Apr 01 18:50:43   File "test_foreach.py", line 8, in <module>
Apr 01 18:50:43     from torch.testing._internal.common_methods_invocations import \
Apr 01 18:50:43 ImportError: cannot import name 'foreach_binary_op_tensor_list_db'
Apr 01 18:50:44 Traceback (most recent call last):
Apr 01 18:50:44   File "test/run_test.py", line 1094, in <module>
Apr 01 18:50:44     main()
Apr 01 18:50:44   File "test/run_test.py", line 1073, in main
Apr 01 18:50:44     raise RuntimeError(err_message)
Apr 01 18:50:44 RuntimeError: test_foreach failed!
Apr 01 18:50:44 
Apr 01 18:50:44 real	21m22.861s
Apr 01 18:50:44 user	27m3.757s
Apr 01 18:50:44 sys	4m18.638s
Apr 01 18:50:44 + cleanup
Apr 01 18:50:44 + retcode=1
Apr 01 18:50:44 + set +x
Apr 01 18:50:44 =================== sccache compilation log ===================
Apr 01 18:50:44 ERROR 2021-04-01T18:39:45Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function ‘int main()’:\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected ‘;’ before ‘}’ token\n int main() { return 0 }\n                       ^\n" }
Apr 01 18:50:44 

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_test (9/10)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Apr 01 18:49:27 RuntimeError: test_foreach failed!
Apr 01 18:49:26 Executing ['/opt/conda/bin/python', 'test_foreach.py', '-v'] ... [2021-04-01 18:49:26.021023]
Apr 01 18:49:27 Traceback (most recent call last):
Apr 01 18:49:27   File "test_foreach.py", line 8, in <module>
Apr 01 18:49:27     from torch.testing._internal.common_methods_invocations import \
Apr 01 18:49:27 ImportError: cannot import name 'foreach_binary_op_tensor_list_db'
Apr 01 18:49:27 Traceback (most recent call last):
Apr 01 18:49:27   File "test/run_test.py", line 1094, in <module>
Apr 01 18:49:27     main()
Apr 01 18:49:27   File "test/run_test.py", line 1073, in main
Apr 01 18:49:27     raise RuntimeError(err_message)
Apr 01 18:49:27 RuntimeError: test_foreach failed!
Apr 01 18:49:27 + cleanup
Apr 01 18:49:27 + retcode=1
Apr 01 18:49:27 + set +x
Apr 01 18:49:27 =================== sccache compilation log ===================
Apr 01 18:49:27 ERROR 2021-04-01T18:37:39Z: sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function ‘int main()’:\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected ‘;’ before ‘}’ token\n int main() { return 0 }\n                       ^\n" }
Apr 01 18:49:27 
Apr 01 18:49:27 =========== If your build fails, please take a look at the log above for possible reasons ===========
Apr 01 18:49:27 Compile requests                      83
Apr 01 18:49:27 Compile requests executed             54
Apr 01 18:49:27 Cache hits                            29

See CircleCI build pytorch_libtorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build (10/10)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

Apr 01 18:18:04 Error generating file
Apr 01 18:18:04 /var/lib/jenkins/workspace/aten/src/ATen/native/cuda/ForeachBinaryOpList.cu(31): error: identifier "alpha" is undefined
Apr 01 18:18:04 
Apr 01 18:18:04 /var/lib/jenkins/workspace/aten/src/ATen/native/cuda/ForeachBinaryOpList.cu(31): error: type name is not allowed
Apr 01 18:18:04 
Apr 01 18:18:04 /var/lib/jenkins/workspace/aten/src/ATen/native/cuda/ForeachBinaryOpList.cu(31): error: expected an expression
Apr 01 18:18:04 
Apr 01 18:18:04 36 errors detected in the compilation of "/var/lib/jenkins/workspace/aten/src/ATen/native/cuda/ForeachBinaryOpList.cu".
Apr 01 18:18:04 -- Removing /var/lib/jenkins/cpp-build/caffe2/build/caffe2/CMakeFiles/torch_cuda_cu.dir/__/aten/src/ATen/native/cuda/./torch_cuda_cu_generated_ForeachBinaryOpList.cu.o
Apr 01 18:18:04 /usr/bin/cmake -E remove /var/lib/jenkins/cpp-build/caffe2/build/caffe2/CMakeFiles/torch_cuda_cu.dir/__/aten/src/ATen/native/cuda/./torch_cuda_cu_generated_ForeachBinaryOpList.cu.o
Apr 01 18:18:04 CMake Error at torch_cuda_cu_generated_ForeachBinaryOpList.cu.o.Debug.cmake:281 (message):
Apr 01 18:18:04   Error generating file
Apr 01 18:18:04   /var/lib/jenkins/cpp-build/caffe2/build/caffe2/CMakeFiles/torch_cuda_cu.dir/__/aten/src/ATen/native/cuda/./torch_cuda_cu_generated_ForeachBinaryOpList.cu.o
Apr 01 18:18:04 
Apr 01 18:18:04 
Apr 01 18:18:04 make[2]: *** [caffe2/CMakeFiles/torch_cuda_cu.dir/__/aten/src/ATen/native/cuda/torch_cuda_cu_generated_ForeachBinaryOpList.cu.o] Error 1
Apr 01 18:18:04 make[2]: *** Waiting for unfinished jobs....
Apr 01 18:18:04 caffe2/CMakeFiles/torch_cuda_cu.dir/build.make:553: recipe for target 'caffe2/CMakeFiles/torch_cuda_cu.dir/__/aten/src/ATen/native/cuda/torch_cuda_cu_generated_ForeachBinaryOpList.cu.o' failed
Apr 01 18:18:05 -- Generating temporary cmake readable file: /var/lib/jenkins/cpp-build/caffe2/build/caffe2/CMakeFiles/torch_cuda_cu.dir/__/aten/src/ATen/native/cuda/torch_cuda_cu_generated_AdaptiveAveragePooling.cu.o.depend.tmp
Apr 01 18:18:05 /usr/bin/cmake -D input_file:FILEPATH=/var/lib/jenkins/cpp-build/caffe2/build/caffe2/CMakeFiles/torch_cuda_cu.dir/__/aten/src/ATen/native/cuda/torch_cuda_cu_generated_AdaptiveAveragePooling.cu.o.NVCC-depend -D output_file:FILEPATH=/var/lib/jenkins/cpp-build/caffe2/build/caffe2/CMakeFiles/torch_cuda_cu.dir/__/aten/src/ATen/native/cuda/torch_cuda_cu_generated_AdaptiveAveragePooling.cu.o.depend.tmp -D verbose=1 -P /var/lib/jenkins/workspace/cmake/Modules_CUDA_fix/upstream/FindCUDA/make2cmake.cmake
Apr 01 18:18:05 CMake Warning at /var/lib/jenkins/workspace/cmake/Modules_CUDA_fix/upstream/FindCUDA/make2cmake.cmake:76 (message):
Apr 01 18:18:05    Removing non-existent dependency file: TH/generic/THStorage.h

2 failures not recognized by patterns:

Job | Step | Action
GitHub Actions flake8-py3 | Fail if there were any warnings | 🔁 rerun
GitHub Actions quick-checks | Ensure correct trailing newlines | 🔁 rerun

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI (expand for details). Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


ScalarType result_type = get_result_type(tensors[0], scalars[0], promote_integer_float);
for (size_t i = 0; i < tensors.size(); i++) {
  tensor_lists[0].emplace_back(tensors[i].to(result_type));
}
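For context, the snippet above computes a single result dtype and converts every tensor to it. A hypothetical pure-Python sketch of such a promotion rule (the signature and the `float32` default are assumptions for illustration, not the ATen implementation):

```python
# Hypothetical sketch (not the actual ATen helper) of what a
# get_result_type(tensor, scalar, promote_integer_float) call might
# compute for a tensor-scalar op. All names and the "float32" default
# are assumptions for illustration.

_FLOAT_DTYPES = {"float16", "float32", "float64"}

def get_result_type(tensor_dtype: str, scalar_is_float: bool,
                    promote_integer_float: bool) -> str:
    # Floating tensors keep their dtype; a scalar cannot demote them.
    if tensor_dtype in _FLOAT_DTYPES:
        return tensor_dtype
    # Integral tensors promote to the assumed default float dtype when
    # either the scalar is floating or the op (e.g. true division)
    # forces integer-to-float promotion.
    if scalar_is_float or promote_integer_float:
        return "float32"
    return tensor_dtype
```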
Collaborator


Lest I forget: this is not a good way to do type promotion. It will launch as many kernels as the "slow" path (each .to launches a kernel), and will probably result in more memory traffic.
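The objection above can be made concrete with a back-of-the-envelope launch count. Under the simplifying assumption that each `.to(result_type)` copy is one kernel launch and the slow path runs one op kernel per tensor (a cost model for illustration, not numbers measured from PyTorch):

```python
# Back-of-the-envelope cost model for the review comment above.
# Assumption: one kernel launch per .to() copy and per slow-path op.

def launches_promote_then_fused(n_tensors: int) -> int:
    # Convert every input tensor up front (n launches), then run one
    # fused multi-tensor kernel for the actual op.
    return n_tensors + 1

def launches_slow_path(n_tensors: int) -> int:
    # Slow path: one ordinary binary-op kernel per tensor; promotion
    # happens inside each kernel, so there are no extra launches.
    return n_tensors

def extra_bytes_moved(n_tensors: int, numel: int, itemsize: int) -> int:
    # Each up-front conversion reads and writes every element once,
    # which is memory traffic the slow path does not pay separately.
    return n_tensors * numel * itemsize * 2

if __name__ == "__main__":
    n = 100
    print(launches_promote_then_fused(n), launches_slow_path(n))
    print(extra_bytes_moved(n, numel=1 << 20, itemsize=4))
```

So converting up front gives no launch-count win over the slow path and adds a full read plus write of every input, which is the memory-traffic concern.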

Iurii Zdebskyi and others added 8 commits March 1, 2021 14:58
@izdeby izdeby changed the title Add type promotion to a fast path for foreach APIs [WIP] Add type promotion to a fast path for foreach APIs Mar 16, 2021
@facebook-github-bot
Contributor

Hi @izdeby!

Thank you for your pull request.

We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient, and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

facebook-github-bot pushed a commit that referenced this pull request May 25, 2021
Summary:
This is based on #48224.

To make `foreach` more flexible, this PR pushes unsupported cases to the slow path.
Also, this adds some tests to verify that
- `foreach` functions work with tensors of different dtypes and/or memory layouts in 7bd4b2c
- `foreach` functions work with tensors on different devices in a list, as long as tensors at the same index are on the same device: def4b9b

Future plans:
1. Improve the coverage of unit tests using the `ops` decorator, updating `foreach_unary_op_db`, and creating `foreach_(binary|pointwise|minmax)_db`.
2. Support broadcasting in the slow path. Ref: #52448
3. Support type promotion in the fast path. Ref: #52449

CC: ngimel mcarilli  ptrblck

Pull Request resolved: #56993

Reviewed By: zou3519

Differential Revision: D28630580

Pulled By: ngimel

fbshipit-source-id: e26ee74a39a591025e18c1ead48948cb7ec53c19
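The fast/slow dispatch described in this summary can be sketched as a simple eligibility check (hypothetical Python with assumed field names; the real check lives in ATen's foreach kernels):

```python
# Hypothetical sketch of fast-path eligibility for a foreach op: the
# fused path is taken only when every tensor in the list agrees on
# dtype, device, and memory layout; anything else falls back to a
# per-tensor loop. Field names are assumptions for illustration.
from dataclasses import dataclass

@dataclass(frozen=True)
class TensorMeta:
    dtype: str
    device: str
    layout: str

def can_use_fast_path(tensors):
    first = tensors[0]
    return all(
        t.dtype == first.dtype
        and t.device == first.device
        and t.layout == first.layout
        for t in tensors
    )

mixed = [TensorMeta("float32", "cuda:0", "contiguous"),
         TensorMeta("int64", "cuda:0", "contiguous")]
uniform = [TensorMeta("float32", "cuda:0", "contiguous")] * 3
print(can_use_fast_path(mixed), can_use_fast_path(uniform))
```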
deniskokarev pushed a commit to deniskokarev/pytorch that referenced this pull request Jun 9, 2021
@pytorchbot
Collaborator

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
Stale pull requests will automatically be closed 30 days after being marked Stale.

@github-actions github-actions bot closed this May 12, 2022
@facebook-github-bot facebook-github-bot deleted the gh/izdeby/88/head branch June 11, 2022 14:19