Add type promotion support in sparse tensors #26051

Merged
nairbv merged 16 commits into gh/nairbv/2/base from gh/nairbv/2/head
Sep 27, 2019

Conversation

@nairbv
Collaborator

@nairbv nairbv commented Sep 11, 2019

Stack from ghstack:

Differential Revision: D17556195

@pytorchbot pytorchbot added module: cuda Related to torch.cuda, and CUDA support in general module: operators module: sparse Related to torch.sparse labels Sep 11, 2019
nairbv added a commit that referenced this pull request Sep 11, 2019
ghstack-source-id: 4d9b7ed
Pull Request resolved: #26051
@nairbv nairbv requested a review from gchanan September 11, 2019 20:58
nairbv added a commit that referenced this pull request Sep 16, 2019
ghstack-source-id: a7b07fd
Pull Request resolved: #26051
nairbv added a commit that referenced this pull request Sep 17, 2019
ghstack-source-id: 8ff8369
Pull Request resolved: #26051
nairbv added a commit that referenced this pull request Sep 17, 2019
ghstack-source-id: a8bb04e
Pull Request resolved: #26051
@gchanan
Contributor

gchanan commented Sep 17, 2019

@ezyang do you think this would cause any problems with your multiple dispatch patches? I'm thinking of the cases of mixed sparse/strided matrices.

Comment thread test/test_type_promotion.py Outdated
auto promotedType = promoteTypes(t.scalar_type(), src.scalar_type());
TORCH_CHECK(canCast(promotedType, r_.scalar_type()), "Can't convert result type ", promotedType, " to output ", r_.scalar_type());

if (s_values_.scalar_type() != promotedType) {
Contributor


Err... what happens if you did all of this with s instead of s_values? Would it copy the indices tensor?

Collaborator Author


I think it would, yeah. It'd also be a bit messier to convert src directly, since the code below still needs the values and indices separately, and src is a reference, so there'd be an additional temporary variable that isn't really needed.
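The promote-then-check pattern in the C++ snippet above (promoteTypes followed by a canCast guard on the output's dtype) can be sketched as a stdlib-only Python toy. The dtype lattice below is a simplified assumption, not PyTorch's full promotion table:

```python
# Toy sketch of the promoteTypes/canCast check for an in-place op like r_ += src.
# The four-dtype lattice is an assumption for illustration, not torch's real table.
PROMOTION_ORDER = ["bool", "int64", "float32", "float64"]

def promote_types(a, b):
    # The common dtype is whichever of the two sits higher in the toy lattice.
    return max(a, b, key=PROMOTION_ORDER.index)

def can_cast(from_dtype, to_dtype):
    # A result can be written into an output whose dtype is at least as wide.
    return PROMOTION_ORDER.index(from_dtype) <= PROMOTION_ORDER.index(to_dtype)

def check_inplace_result(t_dtype, src_dtype, out_dtype):
    # Mirrors the TORCH_CHECK above: promote first, then verify the output
    # dtype can hold the promoted result.
    promoted = promote_types(t_dtype, src_dtype)
    if not can_cast(promoted, out_dtype):
        raise TypeError(
            f"Can't convert result type {promoted} to output {out_dtype}")
    return promoted
```

So float32 += int64 is fine (the result type float32 fits the output), while float32 += float64 is rejected because float64 can't be cast down to the float32 output.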

@ezyang
Contributor

ezyang commented Sep 18, 2019

Yes, this will conflict. I've been trying to land multiple dispatch since yesterday. I can give advice on where code will have moved afterwards, but all in all it should be pretty straightforward.

nairbv added a commit that referenced this pull request Sep 18, 2019
ghstack-source-id: 4c125c5
Pull Request resolved: #26051
@gchanan
Contributor

gchanan commented Sep 18, 2019

@ezyang: just to clarify, you are suggesting we get your patch in first, correct?

@ezyang
Contributor

ezyang commented Sep 19, 2019

Yes, that would be preferred!

nairbv added a commit that referenced this pull request Sep 20, 2019
ghstack-source-id: 260fca2
Pull Request resolved: #26051
nairbv added a commit that referenced this pull request Sep 20, 2019
ghstack-source-id: a5e0863
Pull Request resolved: #26051
nairbv added a commit that referenced this pull request Sep 20, 2019
ghstack-source-id: fbab083
Pull Request resolved: #26051
@nairbv
Collaborator Author

nairbv commented Sep 21, 2019

@pytorchbot retest this please

nairbv added a commit that referenced this pull request Sep 21, 2019
ghstack-source-id: 5904bdd
Pull Request resolved: #26051
nairbv added a commit that referenced this pull request Sep 23, 2019
ghstack-source-id: de71015
Pull Request resolved: #26051
nairbv added a commit that referenced this pull request Sep 23, 2019
ghstack-source-id: 9f999b0
Pull Request resolved: #26051
@nairbv nairbv added this to the 1.3 milestone Sep 24, 2019
@nairbv nairbv requested a review from gchanan September 24, 2019 20:29
Comment thread aten/src/ATen/native/sparse/SparseTensorMath.cpp Outdated
@nairbv nairbv requested a review from gchanan September 25, 2019 13:36
gchanan pushed a commit to gchanan/pytorch that referenced this pull request Sep 26, 2019
ghstack-source-id: 229bc9a
Pull Request resolved: pytorch#26051
Contributor

@gchanan gchanan left a comment


rename looks fine. @zou3519 can you take a look?

Comment thread test/test_type_promotion.py
@nairbv nairbv requested a review from zou3519 September 27, 2019 16:22
TORCH_CHECK(!r.is_cuda(), "add: expected 'out' to be CPU tensor, but got CUDA tensor");
TORCH_CHECK(!src.is_cuda(), "add: expected 'other' to be a CPU tensor, but got a CUDA tensor");

auto commonDtype = promoteTypes(t.scalar_type(), src.scalar_type());
Contributor


This seems wrong because t, src could be zero-dim tensors:

i = torch.zeros([0, 1], dtype=torch.long)
v = torch.tensor([3.14], dtype=torch.float32)
sparse_scalar = torch.sparse_coo_tensor(i, v, [])

Are we saying that sparse zero-dim scalars belong in a different category for type promotion?

Collaborator Author


Ah, I didn't realize it was possible to construct a zero-dim sparse tensor.

Contributor


It is, but on the other hand, I'm not sure who does.

Collaborator Author


Actually, I just tried this: we don't support adding dimensioned tensors to zero-dim tensors. The current implementation of sparse tensors requires t.dim() == src.dim(), so as long as the dims match, promoteTypes will be correct anyway. We should probably expand sparse capabilities, but that'll be another issue.

Contributor


Cool, then there should be an assert here before the promoteTypes call so it's clear that we're assuming the inputs are in the same category.
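The suggested guard could look roughly like the following stdlib-only sketch (the helper name and the toy promotion lattice are assumptions for illustration, not the PR's code): assert that the dimensionalities match before promoting, so zero-dim sparse "scalars" can't silently enter tensor/tensor promotion.

```python
# Sketch of the suggested dims-match assert before type promotion.
# The lattice is a simplified assumption, not PyTorch's promotion table.
_ORDER = ["bool", "int64", "float32", "float64"]

def promote_for_sparse_add(t_dim, src_dim, t_dtype, src_dtype):
    # Sparse add currently assumes both operands have the same number of
    # dimensions, so plain tensor/tensor promotion is safe here.
    assert t_dim == src_dim, "sparse add requires t.dim() == src.dim()"
    return max(t_dtype, src_dtype, key=_ORDER.index)
```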

get_sparse_impl(r)->set_indices_and_values_unsafe(r_indices, r_values);

if (commonDtype != r_values.scalar_type()) {
r_values = r_values.to(commonDtype);
Contributor


We can speed this up by allocating r_values = torch.empty_like(r_values). We don't actually care about the data that is initially in the result tensor.

);

if (r.scalar_type() != commonDtype) {
r_values = r_values.to(r.scalar_type());
Contributor


Instead of doing a .to, we can save memory here by doing a copy from r_values directly into the values tensor instead of creating a new tensor here and assigning it.

Contributor


I guess the values tensor of the result might not be large enough to hold all of that...

Collaborator Author


I think for this one we may still need the to; otherwise we would have to create a new empty tensor to copy the values into anyway, and I assume that's basically the implementation of to?

r_values here is the promoted version (e.g. for float += double, r_values is a double tensor), but we'd need to set r._values to a yet-uncreated float tensor containing those values.

Contributor


r is the result tensor. It's possible that r already has data and r._values() and r._indices() are tensors with some non-trivial storage. If that is the case, instead of discarding values and indices in favor of the new values, we could copy r_values directly into r._values(), assuming there is enough memory in r._values().
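The trade-off being discussed can be sketched with plain Python lists standing in for value tensors (the helper names are illustrative, not PyTorch APIs): `.to()` always allocates fresh storage, while copying into the output's existing buffer reuses memory but only works if that buffer is big enough.

```python
# Toy contrast of the two options: allocate-new (like .to(dtype)) versus
# copy-into-existing (like r._values().copy_(r_values)). Lists stand in for
# tensors and a Python conversion function stands in for a dtype cast.
def to_new(values, conv):
    # Like .to(dtype): always allocates a new container.
    return [conv(v) for v in values]

def copy_into(dst, src, conv):
    # Like copying into r._values(): reuse dst's storage when it fits,
    # mirroring the "might not be large enough" concern above.
    if len(dst) < len(src):
        raise ValueError("destination too small")
    for i, v in enumerate(src):
        dst[i] = conv(v)
    return dst
```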

});
if (r.scalar_type() != commonDtype) {
r.copy_(promoted_result);
}
Contributor


The implementation is not the nicest; to do type promotion, the current implementation adds the logic to each possible pathway. Instead, we could write a wrapper function around the implementation that promotes the input sparse tensors to the correct dtype, calls the implementation (which operates on tensors of the same dtype), and then copies the result back to the result tensor.

Collaborator Author


Yeah, I'd like to clean this up eventually, but in doing so I think I'd want to extend support for more combinations of tensors too. Right now arithmetic ops on sparse tensors are pretty limited, and I didn't want to try to make the implementation generic without actually making the availability of sparse ops more generic.

Contributor


I don't see why "adding more support for sparse ops" and "making the implementation more generic" have to go together. Right now, the implementation of add_out_sparse_cpu is hard to review because there are two code paths that need to be considered: one for when the values are contiguous, and one for when they're not.

It would make it easier to review (and also make the implementation look better) if we did something like the following (pseudocode)

add_out_sparse_cpu(result, tensor, other) {
  // do type promotion
  common_dtype = result_type(tensor, other)
  result_tmp = empty_like(result, common_dtype)
  tensor_tmp = tensor.to(common_dtype)
  other_tmp = other.to(common_dtype)

  // call the base op
  add_out_sparse_cpu_no_type_promotion(result_tmp, tensor_tmp, other_tmp)

  // shove the result back into its original type.
  result.set_inplace(result_tmp.indices(), result_tmp.values().to(result.dtype), result_tmp.sizes())
}
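The wrapper pattern in the pseudocode above can be made runnable as a stdlib-only Python toy, where a "sparse tensor" is just a (dtype, values) pair, the base op assumes matching dtypes, and the wrapper owns all the promotion. All names here are illustrative, not PyTorch's:

```python
# Runnable toy of the promote -> call base op -> cast back wrapper pattern.
# A "tensor" is (dtype, values); the lattice is an assumed simplification.
_ORDER = ["bool", "int64", "float32", "float64"]

def _promote(a, b):
    return max(a, b, key=_ORDER.index)

def _cast(values, dtype):
    conv = int if dtype == "int64" else float
    return [conv(v) for v in values]

def add_no_promotion(t, other):
    # Base op: both inputs must already share a dtype.
    assert t[0] == other[0]
    return (t[0], [a + b for a, b in zip(t[1], other[1])])

def add_with_promotion(result_dtype, t, other):
    # Wrapper: promote the inputs, run the dtype-agnostic base op, then
    # shove the result back into the output's original dtype.
    common = _promote(t[0], other[0])
    tmp = add_no_promotion((common, _cast(t[1], common)),
                           (common, _cast(other[1], common)))
    return (result_dtype, _cast(tmp[1], result_dtype))
```

The point of the design is that add_no_promotion stays simple enough to review on its own, while promotion lives in exactly one place.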

Contributor


In general, putting the type promotion code in helper functions or a wrapper function like this would help, because the body of add_out_sparse_cpu, as well as the other functions, is really, really long.

self.assertEqual(actual, expected)
self.assertTrue(actual.dtype == torch.bool)

def test_sparse(self):
Contributor


If I'm reading the code correctly, for add and mul, type promotion is implemented for the following:

  • add_out_sparse (cpu + cuda)
  • add_out_sparse_dense_sparse (cpu only)
  • mul_out_sparse (cpu + cuda)

Why is there a discrepancy in what is supported in add_out_sparse_dense_sparse?

self.assertEqual((s + s.to(torch.double)).to_dense(), t + t)
self.assertEqual((s + s.to(torch.double)).dtype, torch.double)

inplace = s.to(torch.float, copy=True)
Contributor


There are a lot of test cases here. I imagine when we add type promotion to more sparse ops, we will not want to just append LOC to this test case (test_sparse). Can we split this up into semantically meaningful tests for different ops and different overloads?
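One way to split the monolithic test_sparse is a shared checking helper plus one focused test per op, sketched below with a stdlib-only stand-in for the real dtype check (the class, method, and helper names are hypothetical, and the toy promote function is an assumption, not torch):

```python
import unittest

# Toy stand-in for "result dtype of a sparse binary op" (an assumption).
_ORDER = ["bool", "int64", "float32", "float64"]

def _toy_promote(a, b):
    return max(a, b, key=_ORDER.index)

class TestSparseAddPromotion(unittest.TestCase):
    # In the real suite, the helper would build sparse tensors of the two
    # dtypes, apply the op, and check the result's dtype and values.
    def test_add_promotes_to_wider_dtype(self):
        self.assertEqual(_toy_promote("float32", "float64"), "float64")

    def test_add_same_dtype_is_unchanged(self):
        self.assertEqual(_toy_promote("int64", "int64"), "int64")

class TestSparseMulPromotion(unittest.TestCase):
    def test_mul_promotes_bool_to_int(self):
        self.assertEqual(_toy_promote("bool", "int64"), "int64")
```

Splitting per op/overload this way keeps each test small and makes failures point directly at the overload that regressed.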

@nairbv nairbv merged commit 4b40b96 into gh/nairbv/2/base Sep 27, 2019
nairbv added a commit that referenced this pull request Sep 27, 2019
ghstack-source-id: c8133b6
Pull Request resolved: #26051
@ezyang
Contributor

ezyang commented Sep 30, 2019

FYI: this was fake merged due to a bug in ghstack
