Add complex support for torch.nn.L1Loss #49912
soulitzer wants to merge 20 commits into pytorch:master
Conversation
💊 CI failures summary and remediations (as of commit 8aee998):
ci.pytorch.org: 1 failed. (This comment was automatically generated by Dr. CI.)
Codecov Report
@@ Coverage Diff @@
## master #49912 +/- ##
===========================================
+ Coverage 70.48% 80.71% +10.22%
===========================================
Files 1904 1904
Lines 206632 206633 +1
===========================================
+ Hits 145653 166789 +21136
+ Misses    60979    39844  -21135
anjali411
left a comment
We should also update the documentation to indicate that L1Loss now supports complex numbers.
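For reference, a minimal sketch of what the documented behavior would look like (assuming a PyTorch build that includes this PR): the elementwise difference is complex, but its absolute value, and therefore the reduced loss, is real.

```python
import torch

# L1 loss on complex tensors: |input - target| is real, so the loss is real
inp = torch.randn(4, dtype=torch.cfloat, requires_grad=True)
tgt = torch.randn(4, dtype=torch.cfloat)

loss = torch.nn.L1Loss()(inp, tgt)
assert loss.dtype == torch.float32   # C -> R: the reduced loss is real-valued
assert torch.allclose(loss.detach(), (inp.detach() - tgt).abs().mean())

loss.backward()
assert inp.grad.dtype == torch.cfloat  # gradient keeps the input's complex dtype
```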
```python
type_map = {}
if isinstance(obj, torch.Tensor):
    assert obj.is_leaf
    t = type_map.get(obj.type(), get_gpu_type(obj.type()))
```
get_gpu_type is only used in one other place, so it would be awesome if you could update that too and get rid of the get_gpu_type method:
https://github.com/pytorch/pytorch/blob/master/test/test_cuda.py#L770
Also, nit: let's just change this to `t = type_map.get(obj.type(), obj.type())` and change the line below to `res = obj.clone().type(t).cuda()`.
Might as well get rid of this test in that case, since

```python
def test_is_tensor(self):
    for t in types:
        tensor = get_gpu_type(t)()
        self.assertTrue(torch.is_tensor(tensor))
    self.assertTrue(torch.is_tensor(torch.cuda.HalfTensor()))
```

becomes something like

```python
for t in types:
    tensor = torch.tensor(data, dtype=t).cuda()
```
Yeah right, makes sense! Let's do that.
```diff
 if gradOutput is None:
     gradOutput = torch.ones(())
-criterion(*args).backward(gradOutput.to(input_tuple[0]))
+criterion(*args).backward(gradOutput.to(output))
```
For C → R functions, the input's dtype is not equal to the output's dtype. In general, we'd like gradOutput to have the same dtype as the output anyway.
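To illustrate the point (a sketch, not code from this PR): for a C → R function like `abs`, the output, and hence gradOutput, is real, while the gradient with respect to the input is still complex.

```python
import torch

x = torch.randn(3, dtype=torch.cfloat, requires_grad=True)
out = x.abs()                    # C -> R: output dtype is float32
assert out.dtype == torch.float32

grad_out = torch.ones_like(out)  # matches the *output* dtype, not the input's
out.backward(grad_out)
assert x.grad.dtype == torch.cfloat  # the input grad is still complex
```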
Hmm, what if the output is a tuple? I think we should add a check for output similar to the one for input:

`output_tuple = output if isinstance(output, tuple) else (output,)`
Can output be a tuple, though? Input might be a tuple only because, when we backward, we might want to populate the grads of multiple inputs. I'm curious which functions return tuples.
Not sure about torch.nn module functions, but some torch functions that come to mind are triangular_solve and qr.
There are functions like nn.AdaptiveMaxPool2d that do return a tuple, so I ended up adding the check for the tuple case.
```diff
 auto norm = reduction == Reduction::Mean ? grad_output / input.numel() : grad_output;
-at::sub_out(grad_input, input, target).sign_().mul_(norm);
-return grad_input;
+return at::sub_out(grad_input, input, target).sgn_().mul_(norm);
```
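The switch from `sign_` to `sgn_` is what enables the complex backward: `torch.sign` is only defined for real tensors, while `torch.sgn` extends it to complex as z / |z| (and 0 at 0), agreeing with `sign` on real inputs. A small sketch:

```python
import torch

z = torch.tensor([3 + 4j, 0j])
s = torch.sgn(z)  # z / |z| elementwise, 0 where z == 0
assert torch.allclose(s, torch.tensor([0.6 + 0.8j, 0j]))

# on real tensors sgn agrees with sign, so the change is backward compatible
r = torch.tensor([-2.0, 0.0, 5.0])
assert torch.equal(torch.sgn(r), torch.sign(r))
```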
anjali411
left a comment
Thanks @soulitzer, the PR changes look good to me overall. Let's rebase on master and check the CI tests.
Could you also remove get_gpu_type and test_is_tensor?
(force-pushed from 655c662 to 3435892)
```python
target_fn=lambda: torch.randn((2, 3, 4), requires_grad=True),
reference_fn=lambda i, t, _: 1. / i.numel() *
    sum((a - b).abs().sum() for a, b in zip(i, t)),
check_complex=True,
```
Is the target in this case supposed to be complex or real? The math makes it look like it should be complex, but the target created is real.
target_fn is only used by test_jit, which basically just tries to see if the scripted module behaves the same as the Python module. I don't see it handling check_bfloat16 or check_half either.
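The `reference_fn` being discussed can be checked directly against the module; a quick sanity sketch of that formula (mean of elementwise absolute differences):

```python
import torch

i = torch.randn(2, 3, 4)
t = torch.randn(2, 3, 4)

# the reference formula from the test definition above
ref = 1. / i.numel() * sum((a - b).abs().sum() for a, b in zip(i, t))
out = torch.nn.L1Loss()(i, t)  # default reduction='mean'
assert torch.allclose(out, ref)
```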
```cpp
} else {
  at::sub_out(result, input, target).abs_();

Tensor& l1_loss_out(Tensor& result, const Tensor& input, const Tensor& target, int64_t reduction) {
  auto diff = at::sub_out(result, input, target);
```
Does this cause warnings? We usually warn when the result is resized (unless it's resized from something 0-sized).
Should be fixed in the latest update. When the shape of result matches the post-reduce shape, a warning should no longer appear.
@gchanan do you want to take another look at this?
facebook-github-bot
left a comment
@soulitzer has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Fixes the l1_loss case for #50382.
facebook-github-bot
left a comment
@soulitzer has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@anjali411 @albanD Made a non-trivial change to the code. In the latest commit, everything is now routed through the …
```cpp
  Tensor result = at::empty({0}, input.options().dtype(float_type));
  return at::l1_loss_out(result, input, target, reduction);
}
Tensor result = at::empty({0}, input.options());
```
This should go in an else branch:

```cpp
Tensor result;
if (input.is_complex()) {
  ...
} else {
  result = at::empty({0}, input.options());
}
```
Oh wait, just saw you have a return statement in the if condition. I still think it might be cleaner to change it to an if-else statement with a common return.
Hmm, then you'd have to declare result before the if-else. Otherwise it would go out of scope by the time you try to return it.
Is that necessarily cleaner? :P I feel like it could be good either way.
Wait, if you declare it before, you are technically doing an extra default initialization and then copy-assigning, instead of simply copy-initializing.
A nice trick for this that @ezyang showed me is to use a lambda:

```cpp
const auto float_type = [&]() {
  if (input.is_complex()) {
    return c10::toValueType(input.scalar_type());
  } else {
    return input.scalar_type();
  }
}();
Tensor result = at::empty({0}, input.options().dtype(float_type));
return at::l1_loss_out(result, input, target, reduction);
```

But beyond that, what happens if you call c10::toValueType on a non-complex dtype? Is it just returned as-is? If so, you don't need branching in this function at all!
Ahh, you're right, c10::toValueType does handle the non-complex dtype by just returning it as-is.
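In Python terms, `toValueType`'s behavior can be sketched like this (a hypothetical mirror of the C++ helper for illustration — `to_value_type` is not an actual PyTorch API):

```python
import torch

# hypothetical Python mirror of c10::toValueType: complex dtypes map to the
# corresponding real "value type", everything else passes through unchanged
_COMPLEX_TO_REAL = {
    torch.complex64: torch.float32,
    torch.complex128: torch.float64,
}

def to_value_type(dtype):
    return _COMPLEX_TO_REAL.get(dtype, dtype)

assert to_value_type(torch.complex64) == torch.float32
assert to_value_type(torch.float32) == torch.float32  # returned as-is
```

This pass-through property is exactly why the branching in the composite `l1_loss` can be dropped.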
albanD
left a comment
LGTM, just a small potential simplification of the composite l1_loss.
facebook-github-bot
left a comment
@soulitzer has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@soulitzer thanks again! This PR looks good to me with the current changes. Is there anything that's blocking this PR?

@anjali411 I actually almost landed this yesterday, but held off due to the CI issues. One thing I wanted to check again was to see if there were any updates to …

@soulitzer merged this pull request in 6e3e570.
In the meantime, you can implement your own:

```python
def complex_mse_loss(output, target):
    return (0.5 * (output - target) ** 2).mean(dtype=torch.complex64)
```

You can also implement layers or any custom utils needed:

```python
class CLinear(nn.Module):
    def __init__(self, size_in, size_out):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(size_in, size_out, dtype=torch.complex64))
        self.bias = nn.Parameter(torch.zeros(size_out, dtype=torch.complex64))

    def forward(self, x):
        if x.dtype != torch.complex64:
            x = x.type(torch.complex64)
        return x @ self.weights + self.bias
```
Summary: Building on top of the work of anjali411 (pytorch#46640). Things added in this PR:
1. Modify backward and double-backward formulas
2. Add complex support for `new module tests` and criterion tests (and add complex tests for L1)
3. Modify some existing tests to support complex
Pull Request resolved: pytorch#49912
Reviewed By: zhangguanheng66
Differential Revision: D25853036
Pulled By: soulitzer
fbshipit-source-id: df619f1b71c450ab2818eb17804e0c55990aa8ad
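The modified backward formula can be exercised numerically with gradcheck (a sketch, assuming a PyTorch build that includes this PR; gradcheck wants double precision, hence cdouble):

```python
import torch
from torch.autograd import gradcheck

# check the complex l1_loss backward against numerical gradients
tgt = torch.randn(4, dtype=torch.cdouble)
inp = torch.randn(4, dtype=torch.cdouble, requires_grad=True)

ok = gradcheck(lambda i: torch.nn.functional.l1_loss(i, tgt), (inp,))
assert ok
```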