Add `padding='same'` mode to conv{1,2,3}d, nn.Conv{1,2,3}d #42190
peterbell10 wants to merge 3 commits into pytorch:master
Conversation
Force-pushed from 922b06f to 735a7f9
Force-pushed from 48ffdf3 to c2996d8
Okay, it turns out I was mistaken and native_functions.yaml does indeed support str. It was actually failing to compile because it won't accept a str padding without a default while the earlier arguments bias and stride do have defaults.
If padding is given a default value it becomes ambiguous with the int[1] overload and the C++ code fails to compile. At the same time, bias and stride need to have defaults in python for calls like conv1d(input, weight, padding='same'). I found an ugly workaround: define two extra overloads where padding is keyword-only and sits before all the defaulted arguments.
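The shape of that workaround can be illustrated with a purely hypothetical Python sketch (the real overloads are declared in native_functions.yaml and resolved in C++; the function names below are made up for illustration):

```python
# Hypothetical Python sketch of the overload situation; the real
# signatures live in native_functions.yaml and C++, not here.

# Existing overload: numeric padding, with defaults on bias/stride/padding.
def conv1d_int(input, weight, bias=None, stride=1, padding=0):
    return ("int-overload", padding)

# Workaround overload: str padding is keyword-only and placed before the
# defaulted arguments, so it needs no default of its own and cannot be
# ambiguous with the int[1] overload's default.
def conv1d_str(input, weight, *, padding, bias=None, stride=1):
    return ("str-overload", padding)

def conv1d(input, weight, bias=None, stride=1, padding=0):
    # Dispatch on the type of `padding`, as the bindings effectively do.
    if isinstance(padding, str):
        return conv1d_str(input, weight, padding=padding,
                          bias=bias, stride=stride)
    return conv1d_int(input, weight, bias, stride, padding)
```

With this arrangement both `conv1d(input, weight, padding='same')` and the plain numeric call resolve cleanly while `bias` and `stride` keep their defaults.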
Force-pushed from 436104c to c836ae5
aten/src/ATen/native/mkldnn/Conv.cpp
Tensor.contiguous() here doesn't work for mkldnn tensors. There are also other places where the grads are all returned as CPU tensors and autograd will raise an error about expecting layout=Mkldnn.
It seems the mkldnn backend is not actually tested properly with mkldnn tensors as inputs. Is that not the expected use case?
The JIT schema changes look... pretty gnarly. We may need to figure out if there are some core changes we should make first to support this addition.
What do people think of the approach in https://github.com/pytorch/pytorch/pull/22484/files ?
Oh, I somehow missed that PR reading through the issue. The obvious problem is that by calculating the padding in python, there is no support in the C++ frontend.
Force-pushed from 573dfcb to c28fb97
Force-pushed from 6905d13 to 438bca6
@peterbell10 it's true that it doesn't cover the C++ frontend, but I think we should also have some consideration for the op library. Adding overloads like
We're going to talk about this at the composability meeting today
@ezyang what was the outcome of that discussion?
@bhosmer signed up to scribe here
Confirming - I'll put up a summary of our current thinking in the next day or so. [edit: next day or two 😬 ]
Hi @peterbell10, many apologies for the delayed recap. Here are the main points we arrived at in that discussion, along with some subsequent thoughts:
I think our next steps are:
Also cc @eellison - I don't think we gave your misgivings a full airing in our discussion; can you elaborate a little on the downside of adding the padding-mode overloads for
Force-pushed from 438bca6 to aea0c7f
Force-pushed from aea0c7f to 28ff463
I think I've found a nice solution in the latest commit. I noticed that the tracer stops at [...]. If I understand correctly, the tracer will stop at the first function call that isn't in the [...]
This PR was getting pretty large, with changes touching codegen, convolution, and mkldnn. So I've split it into a ghstack of PRs, the main one being #45667
First part of #3867 (Pooling operators still to do)
This adds a `padding='same'` mode to the interface of `conv{n}d` and `nn.Conv{n}d`. This should match the behaviour of tensorflow. I couldn't find it explicitly documented, but through experimentation I found tensorflow returns the shape `ceil(len/stride)` and always adds any extra asymmetric padding onto the right side of the input.

Since the `native_functions.yaml` schema doesn't seem to support strings or enums, I've moved the function interface into python and it now dispatches between the numerically padded `conv{n}d` and the `_conv{n}d_same` variant. Underscores because I couldn't see any way to avoid exporting a function into the `torch` namespace.

A note on asymmetric padding. The total padding required can be odd if both the kernel-length is even and the dilation is odd. mkldnn has native support for asymmetric padding, so there is no overhead there, but for other backends I resort to padding the input tensor by 1 on the right hand side to make the remaining padding symmetrical. In these cases, I use `TORCH_WARN_ONCE` to notify the user of the performance implications.
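The 'same' padding arithmetic described above can be sketched in plain Python (a hypothetical helper matching the observed tensorflow behaviour for one spatial dimension, not code from this PR):

```python
import math

def same_padding(length, kernel, stride=1, dilation=1):
    # Hypothetical helper: output length follows tensorflow's
    # ceil(len/stride) rule for 'same' padding.
    out_len = math.ceil(length / stride)
    effective_kernel = (kernel - 1) * dilation + 1
    total = max((out_len - 1) * stride + effective_kernel - length, 0)
    # Any odd leftover goes on the right, as tensorflow does.
    pad_left = total // 2
    pad_right = total - pad_left
    return pad_left, pad_right

# With stride=1 the total padding is (kernel - 1) * dilation, which is
# odd exactly when the kernel length is even and the dilation is odd:
same_padding(10, 3)  # symmetric: (1, 1)
same_padding(10, 4)  # asymmetric, extra on the right: (1, 2)
```

This is where the single extra right-hand pad comes from on backends without native asymmetric-padding support.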