torch.clamp with tensor min and max#52695

Closed
peterbell10 wants to merge 6 commits into pytorch:master from peterbell10:clamp-tensor

Conversation

@peterbell10
Collaborator

Fixes gh-2793
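For context, a minimal sketch of the behavior this PR adds (values are illustrative): `min` and `max` may now be tensors that broadcast elementwise against the input, rather than only Python scalars.

```python
import torch

# Sketch of the feature this PR adds: clamp with tensor bounds.
# Previously min/max had to be Python scalars; with this change they
# may be tensors, applied elementwise against the input.
x = torch.tensor([-2.0, 0.5, 3.0])
lo = torch.tensor([-1.0, -1.0, -1.0])
hi = torch.tensor([1.0, 1.0, 2.0])
clamped = torch.clamp(x, min=lo, max=hi)
print(clamped)  # tensor([-1.0000,  0.5000,  2.0000])
```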

@facebook-github-bot
Contributor

facebook-github-bot commented Feb 23, 2021

💊 CI failures summary and remediations

As of commit b841b02 (more details on the Dr. CI page):


  • 3/3 failures possibly* introduced in this PR
    • 1/3 non-scanned failure(s)

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_build (1/2)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

May 03 15:45:54 torch_xla/csrc/aten_xla_type_de...ional' to 'const optional'
May 03 15:45:30 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/device.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/device.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 03 15:45:31 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/init_python_bindings.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/init_python_bindings.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 03 15:45:37 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/debug_util.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/debug_util.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 03 15:45:39 In file included from torch_xla/csrc/python_util.cpp:6:
May 03 15:45:39 /var/lib/jenkins/workspace/torch/csrc/utils/python_strings.h:80:19: warning: unused function 'PyObject_FastGetAttrString' [-Wunused-function]
May 03 15:45:39 static py::object PyObject_FastGetAttrString(PyObject *obj, char *name)
May 03 15:45:39                   ^
May 03 15:45:44 1 warning generated.
May 03 15:45:44 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/aten_xla_type_default.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/aten_xla_type_default.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 03 15:45:50 clang-9 -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.6/site-packages/torch/include -I/opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/include/python3.6m -c torch_xla/csrc/batch_norm.cpp -o build/temp.linux-x86_64-3.6/torch_xla/csrc/batch_norm.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -Wno-macro-redefined -Wno-return-std-move -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
May 03 15:45:54 torch_xla/csrc/aten_xla_type_default.cpp:962:49: error: no viable conversion from 'const optional<at::Tensor>' to 'const optional<at::Scalar>'
May 03 15:45:54   auto clamp_out_tmp = AtenXlaType::clamp(self, min, max);
May 03 15:45:54                                                 ^~~
May 03 15:45:54 /var/lib/jenkins/workspace/c10/util/Optional.h:508:13: note: candidate constructor not viable: no known conversion from 'const c10::optional<at::Tensor>' to 'c10::nullopt_t' for 1st argument
May 03 15:45:54   constexpr optional(nullopt_t) noexcept : OptionalBase<T>(){};
May 03 15:45:54             ^
May 03 15:45:54 /var/lib/jenkins/workspace/c10/util/Optional.h:510:3: note: candidate constructor not viable: no known conversion from 'const c10::optional<at::Tensor>' to 'const c10::optional<c10::Scalar> &' for 1st argument
May 03 15:45:54   optional(const optional& rhs) = default;
May 03 15:45:54   ^
May 03 15:45:54 /var/lib/jenkins/workspace/c10/util/Optional.h:515:3: note: candidate constructor not viable: no known conversion from 'const c10::optional<at::Tensor>' to 'c10::optional<c10::Scalar> &&' for 1st argument
May 03 15:45:54   optional(optional&& rhs) = default;

See CircleCI build pytorch_libtorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build (2/2)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

May 03 13:55:27 sccache: error: couldn't connect to server
May 03 13:55:27 +++ eval 'extract_trap_cmd '
May 03 13:55:27 ++++ extract_trap_cmd
May 03 13:55:27 ++++ printf '%s\n' ''
May 03 13:55:27 +++ printf '%s\n' cleanup
May 03 13:55:27 ++ trap -- '
May 03 13:55:27 cleanup' EXIT
May 03 13:55:27 ++ [[ pytorch-libtorch-linux-xenial-cuda11.1-cudnn8-py3-gcc7-build != *pytorch-win-* ]]
May 03 13:55:27 ++ which sccache
May 03 13:55:27 ++ sccache --stop-server
May 03 13:55:27 Stopping sccache server...
May 03 13:55:27 sccache: error: couldn't connect to server
May 03 13:55:27 sccache: caused by: Connection refused (os error 111)
May 03 13:55:27 ++ true
May 03 13:55:27 ++ rm /var/lib/jenkins/sccache_error.log
May 03 13:55:27 rm: cannot remove '/var/lib/jenkins/sccache_error.log': No such file or directory
May 03 13:55:27 ++ true
May 03 13:55:27 ++ [[ -n '' ]]
May 03 13:55:27 ++ [[ pytorch-libtorch-linux-xenial-cuda11.1-cudnn8-py3-gcc7-build == *rocm* ]]
May 03 13:55:27 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log
May 03 13:55:27 ++ SCCACHE_IDLE_TIMEOUT=1200
May 03 13:55:27 ++ RUST_LOG=sccache::server=error

1 job timed out:

  • pytorch_libtorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build

This comment was automatically generated by Dr. CI.

@facebook-github-bot facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Feb 23, 2021
@codecov

codecov bot commented Feb 24, 2021

Codecov Report

Merging #52695 (8d7faa2) into master (643dd26) will increase coverage by 0.04%.
The diff coverage is n/a.

❗ Current head 8d7faa2 differs from pull request most recent head 44a4a9a. Consider uploading reports for the commit 44a4a9a to get more accurate results

@@            Coverage Diff             @@
##           master   #52695      +/-   ##
==========================================
+ Coverage   77.01%   77.05%   +0.04%     
==========================================
  Files        1921     1917       -4     
  Lines      190292   190157     -135     
==========================================
- Hits       146554   146534      -20     
+ Misses      43738    43623     -115     

@peterbell10 peterbell10 requested a review from mruberry February 25, 2021 16:48
@anjali411 anjali411 added triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module and removed oncall: jit Add this issue/PR to JIT oncall triage queue labels Feb 25, 2021
@mruberry
Collaborator

mruberry commented Mar 1, 2021

XLA failure looks real. @JackCaoG - looks like this will require some coordination?

Collaborator

@mruberry mruberry Mar 1, 2021


We have so many ways to create tensors from scalars (and I should really take the time to fix them)... Creating device tensors from scalars seems like a perf issue, however. cc @ngimel

Collaborator


Agree, this is a serious perf issue.
We don't actually have many ways of creating device tensors; it's scalar_tensor and wrapped_scalar_tensor that have different behavior w.r.t. type promotion.
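To illustrate the distinction being discussed (an illustrative sketch, not code from this PR): wrapped Python scalars and 0-dim tensors are treated as scalar-like for type promotion, while a 1-dim or larger tensor operand promotes the result dtype.

```python
import torch

x = torch.ones(3, dtype=torch.float32)

# A Python scalar is "wrapped": it does not promote the tensor's dtype.
print((x + 2.5).dtype)                                     # torch.float32

# A 0-dim tensor is also treated as scalar-like for promotion purposes.
print((x + torch.tensor(2.5, dtype=torch.float64)).dtype)  # torch.float32

# A 1-dim tensor participates fully in type promotion.
print((x + torch.ones(1, dtype=torch.float64)).dtype)      # torch.float64
```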

Collaborator

@mruberry mruberry left a comment


Hey @peterbell10! This is an expansive PR and I'll have to get back to giving it a proper review. The two key things with this feature are:

  • clearly explaining the scalar x scalar, scalar x tensor, tensor x scalar, and tensor x tensor behavior
  • ensuring performance isn't regressed

In particular, I worry about creating CUDA tensors out of scalars, but if that's only happening in new scalar x tensor and tensor x scalar scenarios maybe that's OK. We'll need to measure the performance of the historic scalar x scalar scenario, however.
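As a sketch of the tensor x scalar and tensor x tensor cases under discussion (values hypothetical), tensor bounds broadcast against the input like any elementwise binary op:

```python
import torch

x = torch.arange(6.0).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
lo = torch.tensor([[1.0], [3.0]])    # per-row lower bound, broadcast over columns
clamped = torch.clamp(x, min=lo)
print(clamped)                       # [[1, 1, 2], [3, 4, 5]]
```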

@peterbell10
Collaborator Author

@mruberry PTAL. Added CPU scalar handling and made the doc and error message tweaks.

@peterbell10 peterbell10 force-pushed the clamp-tensor branch 2 times, most recently from 34d1565 to 50811f3 Compare March 3, 2021 15:14
@peterbell10
Collaborator Author

I did some benchmarking and saw a 10x slowdown for large self tensors with scalar min/max on CPU. So, I moved back to separate scalar kernels, and now there is no performance difference.
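A minimal sketch of the kind of comparison described (sizes and iteration counts here are arbitrary, not the benchmark actually used in the PR):

```python
import timeit

import torch

x = torch.randn(1_000_000)

# Scalar bounds hit the dedicated scalar kernels mentioned above;
# tensor bounds go through the broadcasting elementwise path.
t_scalar = timeit.timeit(lambda: torch.clamp(x, -1.0, 1.0), number=100)
t_tensor = timeit.timeit(
    lambda: torch.clamp(x, torch.tensor(-1.0), torch.tensor(1.0)), number=100
)
print(f"scalar bounds: {t_scalar:.4f}s, tensor bounds: {t_tensor:.4f}s")
```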

@rgommers
Collaborator

This has accumulated some merge conflicts.

Collaborator


This is really nicely written

Collaborator


The scalar versions of clamp aren't changing with this PR, right? These modifications (and the test modifications in common_methods_invocations.py) are just because the code is moving around?

Collaborator Author


Yeah, it just made more sense to me that they all be in the same file.

Collaborator


Agreed.

Collaborator


Really nice impls

Collaborator


Really great code parallelism with the scalar impls

Collaborator


@albanD would you sanity check these?

Collaborator


Assigning ties to self makes sense to me; I'm curious what @albanD thinks.

Collaborator Author


This is also what the Scalar overload does.
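A sketch of the tie-breaking behavior being discussed (illustrative values; behavior as described in this thread): when an input element exactly equals the bound, the gradient is assigned to self rather than to the bound tensor.

```python
import torch

x = torch.tensor([0.0, 1.0, 2.0], requires_grad=True)
hi = torch.tensor([1.0, 1.0, 1.0], requires_grad=True)

y = torch.clamp(x, max=hi)  # y = [0, 1, 1]
y.sum().backward()

# At the tie (x == hi at index 1) the gradient goes to self, not the bound.
print(x.grad)   # tensor([1., 1., 0.])
print(hi.grad)  # tensor([0., 0., 1.])
```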

Collaborator


Why is the backward for the scalar case changing?

Collaborator Author

@peterbell10 peterbell10 Apr 29, 2021


As @lezcano pointed out, the old formulation propagates nans from the grad even when the gradient should just be zero.
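To illustrate the point (a sketch; values are hypothetical): with a where-based backward, a NaN in the incoming gradient is masked out where the clamp is saturated, instead of leaking through a `0 * nan` multiplication as in the old mask-multiply formulation.

```python
import torch

x = torch.tensor([-2.0, 0.5], requires_grad=True)
y = torch.clamp(x, min=-1.0, max=1.0)  # y = [-1.0, 0.5]

# Incoming gradient with a NaN in the saturated region.
g = torch.tensor([float("nan"), 1.0])
y.backward(g)

# A mask-multiply backward would compute 0 * nan = nan at index 0;
# the where-based formulation yields a clean zero there.
print(x.grad)  # tensor([0., 1.])
```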

Collaborator


Nice work here

Collaborator


Nice update

Collaborator

@mruberry mruberry left a comment


Hey @peterbell10! This PR looks very good (as usual). I made a few comments. The updates to native_functions.yaml and the actual implementations look incredibly precise to me.

I've asked @albanD to sanity-check the gradients. Then we just need to follow up with ONNX and XLA, and we'll get this landed!

@peterbell10
Collaborator Author

@mruberry PTAL, addressed review comments and rebased to fix merge conflicts.

@mruberry mruberry self-requested a review April 30, 2021 12:14
Collaborator

@mruberry mruberry left a comment


Awesome!

And it looks like ONNX is happy. @JackCaoG, think we can bother you for the XLA side of this PR?

@peterbell10
Collaborator Author

@mruberry XLA side is already implemented: pytorch/xla#2900

@facebook-github-bot
Contributor

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@ezyang merged this pull request in 33eea14.

krshrimali pushed a commit to krshrimali/pytorch that referenced this pull request May 19, 2021
Summary:
Fixes pytorchgh-2793

Pull Request resolved: pytorch#52695

Reviewed By: mruberry

Differential Revision: D27395977

Pulled By: ezyang

fbshipit-source-id: f86aa240feb034d42e4c45447e72218f6a773c24

Labels

cla signed · Merged · open source · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)


Development

Successfully merging this pull request may close these issues.

Support clamp() with tensor min and max