Implement copysign #46396
Conversation
[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)
- No in-place function
- No method
- Optional output
- Available: bool, int, short, long, float, double, half
- Not available: byte, char, float/double complex

TODO:
- [ ] test
- [ ] doc
- [ ] kernel_vec
```cpp
Tensor copysign_tensor_backward(Tensor grad, Tensor self, Tensor other) {
  auto result = grad * self.sign() * other.sign();
```
I am not sure that you need the self.sign() here.
It will fail gradcheck when you add it to the list in common_method_invocation.py if the formula is wrong. But I think you will need to change this.
For instance:

```python
a = tensor(-1.)
b = tensor(1.)
c = torch.copysign(a, b)  # tensor(1.)
```

The derivative of a is -1 rather than b.sign() = 1. Any thought on that?
Oh right, the derivative is -1 when the sign changes and 1 otherwise, so you need both! Agree with you.
Also there is a corner case at 0 here where sign() returns 0. What is copysign() doing for that? Is the backward formula good for this case as well?
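The backward formula under discussion can be checked numerically away from `a == 0`. A minimal sketch using NumPy (the array values here are illustrative, and the check uses `np.copysign` as a stand-in for the PyTorch op, since the two agree):

```python
import numpy as np

# Sample points away from a == 0, where copysign is differentiable.
a = np.array([-2.0, -1.0, 1.0, 2.0])
b = np.array([-1.0, 1.0, -1.0, 1.0])

# Proposed backward formula: d/da copysign(a, b) = sign(a) * sign(b),
# since copysign(a, b) == abs(a) * sign(b) for nonzero a.
analytic = np.sign(a) * np.sign(b)

# Central finite differences as an independent check.
eps = 1e-6
numeric = (np.copysign(a + eps, b) - np.copysign(a - eps, b)) / (2 * eps)

print(analytic)  # [ 1. -1. -1.  1.], matching numeric up to rounding
```

Note that both sign factors are needed: the derivative flips to -1 exactly when `a` and `b` have opposite signs.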
From reading your table of outputs, is the following correct? If it is the case, then it makes the gradient computation easier to derive for special points. So basically all the "?" in your table above should be 0, as they all correspond to this case where a~=0.
Related #38349

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)
- No in-place function
- No method
- Optional output
- Available: bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: byte, char, float/double complex

`c = np.copysign(a, b)`

| a | b | c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 | 1 |
| -0 | -1 | -0 | 1? |
| 0 | -1 | -0 | -1? |
| 1 | -1 | -1 | -1 |
| -1 | -0 | 1 | -1 |
| -0 | -0 | 0 | -1? |
| 0 | -0 | 0 | 1? |
| 1 | -0 | 1 | 1 |
| -1 | 0 | 1 | -1 |
| -0 | 0 | 0 | -1? |
| 0 | 0 | 0 | 1? |
| 1 | 0 | 1 | 1 |
| -1 | 1 | 1 | -1 |
| -0 | 1 | 0 | -1? |
| 0 | 1 | 0 | 1? |
| 1 | 1 | 1 | 1 |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [ ] test
- [ ] doc
- [x] ~kernel_vec~
Related #38349

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)
- No in-place function
- No method
- Optional output
- Available: byte, char, bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: float/double complex

`c = np.copysign(a, b)`

| a | b | c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 | 1 |
| -0 | -1 | -0 | 0 |
| 0 | -1 | -0 | 0 |
| 1 | -1 | -1 | -1 |
| -1 | -0 | 1 | -1 |
| -0 | -0 | 0 | 0 |
| 0 | -0 | 0 | 0 |
| 1 | -0 | 1 | 1 |
| -1 | 0 | 1 | -1 |
| -0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 1 |
| -1 | 1 | 1 | -1 |
| -0 | 1 | 0 | 0 |
| 0 | 1 | 0 | 0 |
| 1 | 1 | 1 | 1 |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [x] test
- [x] doc
- [x] ~kernel_vec~
- [ ] torch.copysign(Number input, Tensor other)
Sounds great.
Just test if the second dtype is a float type or not and only perform that part of the test if it is.
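That suggestion could be sketched roughly as follows. The helper name is hypothetical, and NumPy stands in for the actual PyTorch test harness; the point is only the guard: signed zero exists solely for floating-point dtypes, so the signed-zero assertions run only when the second dtype is floating.

```python
import numpy as np

def check_copysign(a_dtype, b_dtype):
    # Hypothetical test helper: basic checks for any dtype pair.
    a = np.array([1, -1], dtype=a_dtype)
    b = np.array([-1, 1], dtype=b_dtype)
    c = np.copysign(a, b)
    # copysign never changes magnitude, only the sign.
    assert (np.abs(c) == np.abs(a).astype(c.dtype)).all()

    # Signed zero only exists for floating-point dtypes, so guard this part.
    if np.issubdtype(np.dtype(b_dtype), np.floating):
        z = np.copysign(np.array([1.0, -1.0]), np.array(-0.0, dtype=b_dtype))
        assert np.signbit(z).all()  # -0.0 in `b` propagates its sign bit

check_copysign(np.float64, np.float32)  # float branch exercised
check_copysign(np.int32, np.int64)      # signed-zero part skipped
```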
[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)
- No in-place function
- No method
- Optional output
- Available: byte, char, bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: float/double complex

`c = np.copysign(a, b)`

| a | b | c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 | 1 |
| -0 | -1 | -0 | 0 |
| 0 | -1 | -0 | 0 |
| 1 | -1 | -1 | -1 |
| -1 | -0 | -1 | 1 |
| -0 | -0 | -0 | 0 |
| 0 | -0 | -0 | 0 |
| 1 | -0 | -1 | -1 |
| -1 | 0 | 1 | -1 |
| -0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 1 |
| -1 | 1 | 1 | -1 |
| -0 | 1 | 0 | 0 |
| 0 | 1 | 0 | 0 |
| 1 | 1 | 1 | 1 |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [x] test (cpu/gpu)
- [x] doc
- [x] ~kernel_vec~

Differential Revision: [D24401366](https://our.internmc.facebook.com/intern/diff/D24401366)
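The corrected `b = -0` rows in the table above can be reproduced directly with NumPy: copysign keys off the sign *bit* of the second argument, so `-0.0` behaves like a negative number even though `-0.0 == 0.0` compares equal.

```python
import numpy as np

# copysign takes the sign bit of the second argument, so -0.0 counts
# as negative even though it compares equal to 0.0.
print(np.copysign(-1.0, -0.0))   # -1.0 (not +1.0)
print(np.copysign(1.0, -0.0))    # -1.0
print(np.copysign(-1.0, 0.0))    # 1.0

# The sign of a zero result is only observable through its sign bit.
z = np.copysign(0.0, -1.0)
print(z == 0.0, np.signbit(z))   # True True  -> the result is -0.0
```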
Update because of the following two reasons:
Summary:
Pull Request resolved: pytorch#46396

Related pytorch#38349

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)
- No in-place function
- No method
- Optional output
- Available: byte, char, bool, int, short, long, float, double, half
- Integral promoted to float
- Not available: float/double complex

`c = np.copysign(a, b)`

| a | b | c | a.grad |
|:--:|:--:|:--:|:----:|
| -1 | -1 | -1 | 1 |
| -0 | -1 | -0 | 0 |
| 0 | -1 | -0 | 0 |
| 1 | -1 | -1 | -1 |
| -1 | -0 | -1 | 1 |
| -0 | -0 | 0 | 0 |
| 0 | -0 | 0 | 0 |
| 1 | -0 | -1 | -1 |
| -1 | 0 | 1 | -1 |
| -0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 1 |
| -1 | 1 | 1 | -1 |
| -0 | 1 | 0 | 0 |
| 0 | 1 | 0 | 0 |
| 1 | 1 | 1 | 1 |

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [x] test (cpu/gpu)
- [x] doc
- [x] ~kernel_vec~

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D24401366

Pulled By: ejguan

fbshipit-source-id: 3621c5ff74b185376a3705589983bb5197ab896d
Related #38349

Stack from ghstack:

[numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign)

`c = np.copysign(a, b)`

This function becomes **non-differentiable** at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0.

TODO:
- [x] ~kernel_vec~

Differential Revision: [D24401366](https://our.internmc.facebook.com/intern/diff/D24401366)