add narrow() support for sparse tensors re: #8853 (#11342)
realdoug wants to merge 7 commits into pytorch:master
Conversation
Force-pushed from c5bd567 to bce0904
I didn't review the code closely, but IIUC you are not sharing storage with the old tensor, are you? Since this is the case, you should not call it `narrow`.
aten/src/ATen/native/TensorShape.cpp (outdated)

    Tensor _narrow_sparse(const Tensor& self, int64_t dim, int64_t start, int64_t length){
      LongTensor indices = self._indices();
      Tensor values = self._values();
      int64_t numCoords = indices.size(1);
This comment was marked as off-topic.
ezyang
left a comment
Unfortunately, this needs some work.
The big problem is that, on CUDA, this kernel is written in a hugely inefficient way, because the iterated calls to toCLong each do a CUDA synchronization. That's a lot of synchronizations. There is also a code smell: the loops are written by hand, which means there's no opportunity for vectorization or parallelism.
Assuming that you want to write a non-storage sharing narrow, I think a better strategy is to compute a boolean mask of indices to keep, and then use masked_select to grab them from indices and values. That should eliminate all of the loops.
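The mask-and-select strategy can be sketched in Python against the public sparse API. This is a hypothetical `narrow_copy_sparse` helper for illustration, not the PR's actual C++ kernel, and it assumes `dim` indexes a sparse dimension of a COO tensor:

```python
import torch

def narrow_copy_sparse(t, dim, start, length):
    # Boolean-mask version of narrow for sparse COO tensors: keep only the
    # nonzeros whose coordinate along `dim` falls in [start, start + length).
    indices = t._indices()                # shape (sparse_dim, nnz)
    values = t._values()
    keep = (indices[dim] >= start) & (indices[dim] < start + length)
    new_indices = indices[:, keep].clone()
    new_indices[dim] -= start             # shift coordinates into the new range
    new_size = list(t.shape)
    new_size[dim] = length
    return torch.sparse_coo_tensor(new_indices, values[keep], new_size)
```

Everything here is one vectorized comparison plus boolean indexing, so there are no per-element host/device round trips of the kind the `toCLong` loop incurred.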
To answer your questions:
skipIfROCM turns off the test for AMD GPUs. I would advise you to omit that for now, and then add it if the AMD tests fail.
Moot if you don't call this `narrow`.
@ezyang Thanks for the comments. My intention is to mirror the functionality of dense `narrow()` as closely as I can, so sharing storage is the preferred approach as much as possible. A) For … B) For the … If I'm correct in A & B, do you still suggest a name change for the function?
Force-pushed from 8187c8b to 0cc6561
I've pushed a new version with the rename and replaced the for loops. This implementation does not share storage, but it is an improvement over `to_dense().narrow()`. If there is an easy way to do …
Yes, I think I agree that writing a storage sharing narrow seems difficult. It might be possible for coalesced tensors, if you only narrow a trailing dimension, but that seems like a limited enough case that we shouldn't bother.
I don't see how your second sentence follows from the first. There's no way to do a …
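To illustrate why a storage-sharing sparse narrow is hard in general (an assumed toy example, not from the PR): the surviving nonzeros are usually scattered through the nnz axis, so no strided view over `values()` can represent the result.

```python
import torch

x = torch.tensor([[1., 0., 2.],
                  [0., 3., 0.]]).to_sparse().coalesce()
# Coalesced COO stores coordinates in lexicographic order:
# entries (0,0)=1., (0,2)=2., (1,1)=3. sit at nnz positions 0, 1, 2.
print(x.indices())
# Narrowing dim=1 to [0, 2) keeps columns {0, 1}: that is nnz positions
# 0 and 2 -- not contiguous, so the result cannot be a view of x's values.
keep = (x.indices()[1] >= 0) & (x.indices()[1] < 2)
print(keep)  # tensor([ True, False,  True])
```

Only in special cases (e.g. a coalesced tensor narrowed along a trailing dimension, as noted above) do the kept positions form a contiguous run.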
aten/src/ATen/native/TensorShape.cpp (outdated)

    Tensor newIndices = indices.masked_select(mask).view({dims, -1});
    return self.type().sparse_coo_tensor(newIndices, newValues, newSizes);
  } else {
    return self.clone().narrow(dim, start, length);
This comment was marked as off-topic.
CC @yf225 @weiyangfb @gchanan if you have any opinions about the naming. Maybe …
aten/src/ATen/native/TensorShape.cpp (outdated)

    newSizes[dim] = length;

    Tensor narrowDim = at::zeros_like(indices[dim]);
    narrowDim.copy_(indices[dim]);
This comment was marked as off-topic.
aten/src/ATen/native/TensorShape.cpp (outdated)

    narrowDim.copy_(indices[dim]);
    Tensor mask = (narrowDim >= start).__and__((narrowDim < end));

    indices[dim] = indices[dim].add(-start);
This comment was marked as off-topic.
New functions would need docs.
Sorry, I skipped a few logical steps there. I was just referring to the fact that you can theoretically do …
Alrighty, added docs & some simplifications re: convo above.
torch/_tensor_docs.py (outdated)

    r"""
    narrow(dimension, start, length) -> Tensor

    Same functionality as :meth:`Tensor.narrow` except returning a full copy,
This comment was marked as off-topic.
aten/src/ATen/native/TensorShape.cpp (outdated)
This comment was marked as off-topic.
weiyangfb
left a comment
This PR still requires some changes
aten/src/ATen/native/TensorShape.cpp (outdated)
This comment was marked as off-topic.
aten/src/ATen/native/TensorShape.cpp (outdated)
This comment was marked as off-topic.
Force-pushed from 0d9577a to c4f0628
test/test_sparse.py (outdated)
This comment was marked as off-topic.
This comment was marked as off-topic.
Force-pushed from 75baf75 to 126f6a2
    narrow_copy(dimension, start, length) -> Tensor

    Same as :meth:`Tensor.narrow` except returning a copy rather
    than shared storage. This is primarily for sparse tensors, which
This comment was marked as off-topic.
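For context, the behavior this docstring describes can be exercised as follows (a usage sketch of the `narrow`/`narrow_copy` distinction, not code from the PR):

```python
import torch

x = torch.arange(12.0).reshape(3, 4)
view = x.narrow(1, 1, 2)        # shares storage with x
copy = x.narrow_copy(1, 1, 2)   # independent storage

x[0, 1] = 100.0
print(view[0, 0])   # the view sees the write: tensor(100.)
print(copy[0, 0])   # the copy does not:      tensor(1.)

# Sparse tensors get the copying variant only (no view semantics):
s = x.to_sparse()
print(s.narrow_copy(1, 1, 2).to_dense())
```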
aten/src/ATen/native/TensorShape.cpp (outdated)

    Tensor narrow_copy_sparse(const Tensor& self, int64_t dim, int64_t start, int64_t length){
      AT_CHECK(self.dim() > 0, "narrow() cannot be applied to a 0-dim tensor.");
      auto cur_size = self.size(dim);
      AT_CHECK(length >= 0 && start <= cur_size - length,
This comment was marked as off-topic.
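These `AT_CHECK` guards surface in Python as `RuntimeError`. A quick sketch of calls the bounds check rejects (an assumed example, not from the PR's test suite):

```python
import torch

s = torch.eye(3).to_sparse()
for bad_args in [(0, 0, 4),    # length runs past size(0) == 3
                 (0, 4, 1)]:   # start beyond the end of the dimension
    try:
        s.narrow_copy(*bad_args)
    except RuntimeError:
        print("rejected:", bad_args)
```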
    with_dense, _, _ = self._gen_sparse(2, 7, shape)
    for narrow_args in self._all_narrow_combs(shape):
        self._test_narrow(with_dense, narrow_args)

This comment was marked as off-topic.
test/test_sparse.py (outdated)

        self._test_narrow(with_dense, narrow_args)

    self.assertRaises(RuntimeError, lambda: input.narrow_copy(10, 0, 3))  # dim > sparseDim + denseDim

This comment was marked as off-topic.
test/test_sparse.py (outdated)

    input, _, _ = self._gen_sparse(4, 19, shape)
    for narrow_args in self._all_narrow_combs(shape):
        self._test_narrow(input, narrow_args)
        self._test_narrow(input.coalesce(), narrow_args)

This comment was marked as off-topic.
    Tensor narrow_copy_sparse(const Tensor& self, int64_t dim, int64_t start, int64_t length){
      int64_t allDim = self.dim();
      int64_t end = start + length;

This comment was marked as off-topic.
facebook-github-bot
left a comment
weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Looks good! Thanks @realdoug!
Summary:
Couple questions:

1) I used the log1p implementation in #8969 as a guide, especially for testing. I'm not sure what the `skipIfROCM` annotation is for, so I'm unsure whether I need it for my test.
2) I implemented the branching logic in the narrow function itself; is this the right place to do so? I noticed that there are a number of places where sparse-specific logic is handled with just an if statement in this file. Or should I implement a separate dispatch in native_functions.yaml, as with log1p?

And of course, happy to make any other updates/changes that I may have missed as well. This is my first PR to the project.

Pull Request resolved: pytorch/pytorch#11342
Differential Revision: D9978430
Pulled By: weiyangfb
fbshipit-source-id: e73dc20302ab58925afb19e609e31f4a38c634ad
* upstream/master: (117 commits)
  - Add full namespace resolution in CAFFE_DURATION (pytorch#12065)
  - T33898723: Simple put operators for caffe2 stats (pytorch#12057)
  - add narrow() support for sparse tensors re: pytorch#8853 (pytorch#11342)
  - Fix ONNX bug, add symbolic for full
  - Enable tracing of tensor factories with an out argument
  - Fix warnings emitted when testing distributions (pytorch#12038)
  - Unify versions across setup.py, libtorch, and libcaffe2 (pytorch#12053)
  - add autodiff expressions for common operations (pytorch#11832)
  - Blob doesn't allow access to destroyCall anymore (pytorch#11548)
  - IValue can store Blob (pytorch#11414)
  - Move Blob to ATen/core (pytorch#11924)
  - Use tempfile during serialized test comparison (pytorch#12021)
  - fix segfault when grad to a hook fn is None (pytorch#12028)
  - Fallback CreateMutex/AtomicIter operators for mkl-dnn
  - Unify all *_EXPORT and *_IMPORT macros across c++ backend (pytorch#12019)
  - Add safety asserts for methods on TensorImpl which don't work on Variable. (pytorch#12058)
  - Make USE_IDEEP work again (pytorch#12026)
  - Fix "identifier following the 'template' keyword does not refer to a template" (pytorch#12037)
  - Delete some unused variables. (pytorch#12059)
  - Support TypeIdentifier::name() (pytorch#12036)
  - ...