torch.clamp with tensor min and max #52695
peterbell10 wants to merge 6 commits into pytorch:master
Conversation
💊 CI failures summary and remediations — as of commit b841b02 (more details on the Dr. CI page):
🕵️ 2 new failures recognized by patterns. The following CI failures do not appear to be due to upstream breakages.
Codecov Report

@@            Coverage Diff             @@
##           master   #52695      +/-   ##
==========================================
+ Coverage   77.01%   77.05%   +0.04%
==========================================
  Files        1921     1917       -4
  Lines      190292   190157     -135
==========================================
- Hits       146554   146534      -20
+ Misses      43738    43623     -115
XLA failure looks real. @JackCaoG - looks like this will require some coordination?
We have so many ways to create tensors from scalars (and I should really take the time to fix them)... Creating device tensors from scalars seems like a perf issue, however. cc @ngimel
Agree, this is a serious perf issue.
We don't actually have many ways of creating device tensors; it's scalar_tensor and wrapped_scalar_tensor that have different behavior with respect to type promotion.
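For context, a minimal pure-Python sketch of the promotion rule at play here (the function name and dtype strings are illustrative, not the actual ATen code): a wrapped Python scalar can upgrade the dtype *category* of the result, but does not force the tensor up to the scalar's own precision.

```python
# Hypothetical sketch of wrapped-scalar type promotion (not the ATen code).
# A wrapped scalar participates in promotion at lower priority than a real
# tensor: it can change the dtype category (int -> float) but never forces
# the tensor up to the scalar's own precision (float64 scalar + float32
# tensor stays float32).
def promote_with_scalar(tensor_dtype: str, scalar_kind: str) -> str:
    default_float = "float32"  # stand-in for torch.get_default_dtype()
    if scalar_kind == "float" and tensor_dtype.startswith("int"):
        return default_float
    return tensor_dtype

print(promote_with_scalar("float32", "float"))  # float32, not float64
print(promote_with_scalar("int64", "float"))    # float32
print(promote_with_scalar("int64", "int"))      # int64
```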
mruberry left a comment:
Hey @peterbell10! This is an expansive PR and I'll have to get back to giving it a proper review. The two key things with this feature are:
- clearly explaining the scalar x scalar, scalar x tensor, tensor x scalar, and tensor x tensor behavior
- ensuring performance isn't regressed
In particular, I worry about creating CUDA tensors out of scalars, but if that's only happening in new scalar x tensor and tensor x scalar scenarios maybe that's OK. We'll need to measure the performance of the historic scalar x scalar scenario, however.
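For reference, the four cases above can be sketched in plain Python (an illustrative model of the elementwise semantics only, not the actual implementation; real tensor bounds would also broadcast by shape):

```python
def clamp(x, lo=None, hi=None):
    """Illustrative clamp: x is a list standing in for a tensor; lo and hi
    may each be a Python scalar (broadcast to every element) or a list of
    the same length as x (elementwise bound)."""
    def at(bound, i):
        # A scalar bound applies to every element; a list bound is indexed.
        return bound[i] if isinstance(bound, list) else bound

    out = []
    for i, e in enumerate(x):
        if lo is not None:
            e = max(e, at(lo, i))
        if hi is not None:
            e = min(e, at(hi, i))
        out.append(e)
    return out

print(clamp([1, 5, 9], lo=2, hi=[4, 6, 8]))  # [2, 5, 8]
```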
Force-pushed from a12c9d3 to 976745c
@mruberry PTAL. Added CPU scalar handling and made the doc and error message tweaks.
Force-pushed from 34d1565 to 50811f3
I did some benchmarking and I saw a 10x slowdown for large

This accumulated some merge conflicts.
Force-pushed from 50811f3 to 446833f
This is really nicely written
The scalar versions of clamp aren't changing with this PR, right? These modifications (and the test modifications in common_methods_invocations.py) are just because the code is moving around?
Yeah, it just made more sense to me that they all be in the same file.
Really great code parallelism with the scalar impls
tools/autograd/derivatives.yaml (Outdated)
Assigning ties to self makes sense to me; I'm curious what @albanD thinks.
This is also what the Scalar overload does.
Why is the backward for the scalar case changing?
As @lezcano pointed out, the old formulation propagates NaNs from the grad even when the gradient should just be zero.
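A small sketch of the issue (plain Python, illustrative only; function names are made up for the example): a mask-multiply backward leaks NaNs from the incoming grad, while a select-style backward returns an exact zero outside the clamp range.

```python
import math

def clamp_backward_mul(grad, x, lo, hi):
    # Old-style formulation: multiply grad by a 0/1 in-range mask.
    # Since nan * 0.0 is nan, a NaN in grad leaks through even where
    # the gradient should be exactly zero.
    mask = 1.0 if lo <= x <= hi else 0.0
    return grad * mask

def clamp_backward_where(grad, x, lo, hi):
    # Select-style formulation: out-of-range entries are a literal zero,
    # so a NaN in grad cannot propagate there.
    return grad if lo <= x <= hi else 0.0

g = math.nan
print(clamp_backward_mul(g, 5.0, 0.0, 1.0))    # nan leaks through
print(clamp_backward_where(g, 5.0, 0.0, 1.0))  # 0.0
```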
mruberry left a comment:
Hey @peterbell10! This PR looks very good (as usual). I made a few comments. The updates to native_functions.yaml and the actual implementations look incredibly precise to me.
I've asked @albanD to sanity-check the gradients. Then we just need to follow up with ONNX and XLA, too, and we'll get this landed!
Force-pushed from e1ab2cf to 385b03b
@mruberry PTAL, addressed review comments and rebased to fix merge conflicts.
@mruberry XLA side is already implemented: pytorch/xla#2900
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: Fixes pytorch gh-2793

Pull Request resolved: pytorch#52695
Reviewed By: mruberry
Differential Revision: D27395977
Pulled By: ezyang
fbshipit-source-id: f86aa240feb034d42e4c45447e72218f6a773c24