
[jit] Fix scalar tensor assert in fusion compiler#10952

Closed
zou3519 wants to merge 6 commits into pytorch:master from zou3519:scalar-fusion

Conversation

@zou3519
Contributor

@zou3519 zou3519 commented Aug 28, 2018

Fixes #8560.
Unblocks #10715.

The assert `(nDim <= uncompressedDims)` was being triggered for a scalar
tensor because we compute `nDim` to be 1 for a scalar tensor but
`uncompressedDim` is 0.

This PR changes it so that we compute `nDim` to be 0 for a scalar tensor. This
works because indexing in a kernel depends on `nDim`: if `nDim == 0`, the
offset is always 0, which is what we want.

Some other (small) changes were necessary to make this work:

  • One cannot define a 0-length array `IndexType arr[0]`, so the code
    guards against that.
  • Some of the `maxTensorInfoSize` logic needed changes to handle the
    case when `uncompressedDim == 0`.

cc @apaszke @zdevito

@zou3519 zou3519 added the oncall: jit label Aug 28, 2018
Contributor

@facebook-github-bot facebook-github-bot left a comment

zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@neerajprad
Contributor

neerajprad commented Aug 28, 2018

@zou3519 : This fixes the particular assert (nDim <= uncompressedDims) issue on the Pyro examples in #10715, but they are now segfaulting. I think it might be related?

Process 2322 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x0000000115059c6d libtorch.dylib`torch::jit::(anonymous namespace)::compressContiguous(at::ArrayRef<long long>, at::ArrayRef<long long>, std::__1::vector<bool, std::__1::allocator<bool> > const&, unsigned int*, unsigned int*) [inlined] std::__1::__bit_const_reference<std::__1::vector<bool, std::__1::allocator<bool> > >::operator bool(this=0x00007ffeefbf7840) const at __bit_reference:140
   137 	        : __seg_(__x.__seg_), __mask_(__x.__mask_) {}
   138
   139 	    _LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR operator bool() const _NOEXCEPT
-> 140 	        {return static_cast<bool>(*__seg_ & __mask_);}
   141
   142 	    _LIBCPP_INLINE_VISIBILITY __bit_iterator<_Cp, true> operator&() const _NOEXCEPT
   143 	        {return __bit_iterator<_Cp, true>(__seg_, static_cast<unsigned>(__ctz(__mask_)));}
Target 0: (python) stopped.

@zou3519
Contributor Author

zou3519 commented Aug 28, 2018

Oh I see, yes I think that is related. Sending a fix...

@neerajprad
Contributor

neerajprad commented Aug 28, 2018

Thanks! I think that this fixes the scalar issue, and the Hamiltonian Monte Carlo examples pass with this change. The VAE example is now throwing a different error, but I think that might be a separate issue. I'll either create a new one or update #10715.

Stack Trace
  $ python examples/vae/vae.py --jit
clang: error: unsupported option '-fopenmp'
clang: error: unsupported option '-fopenmp'
warning: pytorch jit fuser failed to compile with openmp, trying without it...
Traceback (most recent call last):
  File "examples/vae/vae.py", line 212, in <module>
    model = main(args)
  File "examples/vae/vae.py", line 154, in main
    epoch_loss += svi.step(x)
  File "/Users/npradhan/workspace/pyro_dev/pyro/pyro/infer/svi.py", line 96, in step
    loss = self.loss_and_grads(self.model, self.guide, *args, **kwargs)
  File "/Users/npradhan/workspace/pyro_dev/pyro/pyro/infer/trace_elbo.py", line 202, in loss_and_grads
    loss, surrogate_loss = self._loss_and_surrogate_loss(*args)
  File "/Users/npradhan/workspace/pyro_dev/pyro/pyro/ops/jit.py", line 59, in __call__
    ret = self.compiled[argc](*params_and_args)
  File "/Users/npradhan/miniconda2/envs/pytorch-master/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/npradhan/miniconda2/envs/pytorch-master/lib/python3.6/site-packages/torch/jit/__init__.py", line 736, in forward
    return self._get_method('forward')(*args, **kwargs)
RuntimeError:
The size of tensor a (256) must match the size of tensor b (96) at non-singleton dimension 0 (infer_size at /Users/npradhan/workspace/pyro_dev/pytorch/pytorch/aten/src/ATen/ExpandUtils.cpp:22)
frame #0: at::TensorIterator::compute_shape() + 381 (0x1142a1a7d in libcaffe2.dylib)
frame #1: at::TensorIterator::Builder::build() + 159 (0x1142a11bf in libcaffe2.dylib)
frame #2: at::TensorIterator::binary_op(at::Tensor&, at::Tensor const&, at::Tensor const&) + 126 (0x1142a0fee in libcaffe2.dylib)
frame #3: at::native::mul_out(at::Tensor&, at::Tensor const&, at::Tensor const&) + 319 (0x114163d3f in libcaffe2.dylib)
frame #4: at::native::mul(at::Tensor const&, at::Tensor const&) + 60 (0x11416439c in libcaffe2.dylib)
frame #5: at::Type::mul(at::Tensor const&, at::Tensor const&) const + 64 (0x11456a190 in libcaffe2.dylib)
frame #6: torch::autograd::VariableType::mul(at::Tensor const&, at::Tensor const&) const + 1861 (0x116674315 in libtorch.dylib)
frame #7: at::mul(at::Tensor const&, at::Tensor const&) + 86 (0x116b4e0b6 in libtorch.dylib)
frame #8: torch::jit::(anonymous namespace)::$_445::operator()(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) const + 162 (0x116b4df92 in libtorch.dylib)
frame #9: int std::__1::__invoke_void_return_wrapper<int>::__call<torch::jit::(anonymous namespace)::$_445&, std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&>(torch::jit::(anonymous namespace)::$_445&&&, std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&&&) + 77 (0x116b4dedd in libtorch.dylib)
frame #10: std::__1::__function::__func<torch::jit::(anonymous namespace)::$_445, std::__1::allocator<torch::jit::(anonymous namespace)::$_445>, int (std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)>::operator()(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 68 (0x116b4ddd4 in libtorch.dylib)
frame #11: std::__1::function<int (std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)>::operator()(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) const + 142 (0x113256c4e in _C.cpython-36m-darwin.so)
frame #12: torch::jit::InterpreterStateImpl::runOneStage(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 315 (0x116d5a5eb in libtorch.dylib)
frame #13: torch::jit::InterpreterState::runOneStage(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 40 (0x116d5a4a8 in libtorch.dylib)
frame #14: torch::jit::FusedKernelCache::runFallback(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 50 (0x1168b0de2 in libtorch.dylib)
frame #15: torch::jit::FusedKernelCache::run(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 271 (0x1168ae3af in libtorch.dylib)
frame #16: torch::jit::(anonymous namespace)::$_0::operator()(torch::jit::Node*) const::'lambda'(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)::operator()(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) const + 78 (0x116e8325e in libtorch.dylib)
frame #17: int std::__1::__invoke_void_return_wrapper<int>::__call<torch::jit::(anonymous namespace)::$_0::operator()(torch::jit::Node*) const::'lambda'(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)&, std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&>(torch::jit::(anonymous namespace)::$_0::operator()(torch::jit::Node*) const::'lambda'(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)&&&, std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&&&) + 77 (0x116e831fd in libtorch.dylib)
frame #18: std::__1::__function::__func<torch::jit::(anonymous namespace)::$_0::operator()(torch::jit::Node*) const::'lambda'(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&), std::__1::allocator<torch::jit::(anonymous namespace)::$_0::operator()(torch::jit::Node*) const::'lambda'(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)>, int (std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)>::operator()(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 57 (0x116e82f19 in libtorch.dylib)
frame #19: std::__1::function<int (std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)>::operator()(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) const + 142 (0x113256c4e in _C.cpython-36m-darwin.so)
frame #20: torch::jit::InterpreterStateImpl::runOneStage(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 315 (0x116d5a5eb in libtorch.dylib)
frame #21: torch::jit::InterpreterState::runOneStage(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 40 (0x116d5a4a8 in libtorch.dylib)
frame #22: torch::jit::(anonymous namespace)::ExecutionPlan::runWithGrad(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) const + 1224 (0x116cbf9d8 in libtorch.dylib)
frame #23: torch::jit::(anonymous namespace)::ExecutionPlan::run(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) const + 62 (0x116caff9e in libtorch.dylib)
frame #24: torch::jit::GraphExecutorImpl::run(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 2195 (0x116caa533 in libtorch.dylib)
frame #25: torch::jit::GraphExecutor::run(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 40 (0x116ca9c98 in libtorch.dylib)
frame #26: torch::jit::CodeImpl::getInterpreterOperation(torch::jit::Node*)::'lambda'(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)::operator()(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 78 (0x116d804be in libtorch.dylib)
frame #27: int std::__1::__invoke_void_return_wrapper<int>::__call<torch::jit::CodeImpl::getInterpreterOperation(torch::jit::Node*)::'lambda'(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)&, std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&>(torch::jit::CodeImpl::getInterpreterOperation(torch::jit::Node*)::'lambda'(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)&&&, std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&&&) + 77 (0x116d8045d in libtorch.dylib)
frame #28: std::__1::__function::__func<torch::jit::CodeImpl::getInterpreterOperation(torch::jit::Node*)::'lambda'(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&), std::__1::allocator<torch::jit::CodeImpl::getInterpreterOperation(torch::jit::Node*)::'lambda'(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)>, int (std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)>::operator()(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 57 (0x116d80179 in libtorch.dylib)
frame #29: std::__1::function<int (std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&)>::operator()(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) const + 142 (0x113256c4e in _C.cpython-36m-darwin.so)
frame #30: torch::jit::InterpreterStateImpl::runOneStage(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 315 (0x116d5a5eb in libtorch.dylib)
frame #31: torch::jit::InterpreterState::runOneStage(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 40 (0x116d5a4a8 in libtorch.dylib)
frame #32: torch::jit::GraphExecutorImpl::runFallback(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 62 (0x116cae51e in libtorch.dylib)
frame #33: torch::jit::GraphExecutorImpl::run(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 2110 (0x116caa4de in libtorch.dylib)
frame #34: torch::jit::GraphExecutor::run(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 40 (0x116ca9c98 in libtorch.dylib)
frame #35: torch::jit::script::Method::run(std::__1::vector<torch::jit::IValue, std::__1::allocator<torch::jit::IValue> >&) + 1144 (0x113399198 in _C.cpython-36m-darwin.so)
frame #36: torch::jit::invokeScriptMethodFromPython(torch::jit::script::Method&, pybind11::args, pybind11::kwargs) + 175 (0x11337b9df in _C.cpython-36m-darwin.so)
frame #37: pybind11::object pybind11::detail::argument_loader<torch::jit::script::Method&, pybind11::args, pybind11::kwargs>::call_impl<pybind11::object, pybind11::object (*&)(torch::jit::script::Method&, pybind11::args, pybind11::kwargs), 0ul, 1ul, 2ul, pybind11::detail::void_type>(pybind11::object (*&&&)(torch::jit::script::Method&, pybind11::args, pybind11::kwargs), pybind11::detail::index_sequence<0ul, 1ul, 2ul>, pybind11::detail::void_type&&) + 276 (0x1133e5624 in _C.cpython-36m-darwin.so)
frame #38: std::__1::enable_if<!(std::is_void<pybind11::object>::value), pybind11::object>::type pybind11::detail::argument_loader<torch::jit::script::Method&, pybind11::args, pybind11::kwargs>::call<pybind11::object, pybind11::detail::void_type, pybind11::object (*&)(torch::jit::script::Method&, pybind11::args, pybind11::kwargs)>(pybind11::object (*&&&)(torch::jit::script::Method&, pybind11::args, pybind11::kwargs)) + 56 (0x1133e4e68 in _C.cpython-36m-darwin.so)
frame #39: void pybind11::cpp_function::initialize<pybind11::object (*&)(torch::jit::script::Method&, pybind11::args, pybind11::kwargs), pybind11::object, torch::jit::script::Method&, pybind11::args, pybind11::kwargs, pybind11::name, pybind11::is_method, pybind11::sibling>(pybind11::object (*&&&)(torch::jit::script::Method&, pybind11::args, pybind11::kwargs), pybind11::object (*)(torch::jit::script::Method&, pybind11::args, pybind11::kwargs), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::'lambda'(pybind11::detail::function_call&)::operator()(pybind11::detail::function_call&) const + 225 (0x1133e4d21 in _C.cpython-36m-darwin.so)
frame #40: void pybind11::cpp_function::initialize<pybind11::object (*&)(torch::jit::script::Method&, pybind11::args, pybind11::kwargs), pybind11::object, torch::jit::script::Method&, pybind11::args, pybind11::kwargs, pybind11::name, pybind11::is_method, pybind11::sibling>(pybind11::object (*&&&)(torch::jit::script::Method&, pybind11::args, pybind11::kwargs), pybind11::object (*)(torch::jit::script::Method&, pybind11::args, pybind11::kwargs), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::'lambda'(pybind11::detail::function_call&)::__invoke(pybind11::detail::function_call&) + 24 (0x1133e4c28 in _C.cpython-36m-darwin.so)
frame #41: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) + 6919 (0x112c432b7 in _C.cpython-36m-darwin.so)
<omitting python frames>
:
operation failed in interpreter:
/Users/npradhan/miniconda2/envs/pytorch-master/lib/python3.6/site-packages/torch/distributions/normal.py(59): rsample
/Users/npradhan/miniconda2/envs/pytorch-master/lib/python3.6/site-packages/torch/distributions/independent.py(75): rsample
/Users/npradhan/workspace/pyro_dev/pyro/pyro/distributions/torch_distribution.py(42): __call__
/Users/npradhan/workspace/pyro_dev/pyro/pyro/poutine/runtime.py(119): default_process_message
/Users/npradhan/workspace/pyro_dev/pyro/pyro/poutine/runtime.py(181): apply_stack
/Users/npradhan/workspace/pyro_dev/pyro/pyro/primitives.py(84): sample
examples/vae/vae.py(105): guide
/Users/npradhan/workspace/pyro_dev/pyro/pyro/poutine/messenger.py(27): _wraps
/Users/npradhan/workspace/pyro_dev/pyro/pyro/poutine/trace_messenger.py(176): __call__
/Users/npradhan/workspace/pyro_dev/pyro/pyro/poutine/trace_messenger.py(192): get_trace
/Users/npradhan/workspace/pyro_dev/pyro/pyro/infer/enum.py(40): get_importance_trace
/Users/npradhan/workspace/pyro_dev/pyro/pyro/infer/trace_elbo.py(52): _get_trace
/Users/npradhan/workspace/pyro_dev/pyro/pyro/infer/elbo.py(111): _get_traces
/Users/npradhan/workspace/pyro_dev/pyro/pyro/infer/trace_elbo.py(168): loss_and_surrogate_loss
/Users/npradhan/workspace/pyro_dev/pyro/pyro/poutine/messenger.py(27): _wraps
/Users/npradhan/workspace/pyro_dev/pyro/pyro/ops/jit.py(49): compiled
/Users/npradhan/miniconda2/envs/pytorch-master/lib/python3.6/site-packages/torch/jit/__init__.py(290): wrapper
/Users/npradhan/workspace/pyro_dev/pyro/pyro/ops/jit.py(39): __call__
/Users/npradhan/workspace/pyro_dev/pyro/pyro/infer/trace_elbo.py(202): loss_and_grads
/Users/npradhan/workspace/pyro_dev/pyro/pyro/infer/svi.py(96): step
examples/vae/vae.py(154): main
examples/vae/vae.py(212): <module>

@zou3519
Contributor Author

zou3519 commented Aug 29, 2018

@neerajprad the stack trace you have there is probably a different issue. Let me know (or update/open an issue) if you can find a small example that causes the crash.

@neerajprad
Contributor

Let me know (or update/open an issue) if you can find a small example that causes the crash

Thanks, @zou3519! I'll debug to see if there is a minimal example that we can isolate.

Contributor

@apaszke apaszke left a comment

This will also conflict with @mruberry's fuser split 😕


@zou3519
Contributor Author

zou3519 commented Aug 31, 2018

I can sit on this until @mruberry's fuser split happens :)

@neerajprad
Contributor

@zou3519: Regarding my comment above, I think that is most likely a Pyro-specific issue: pyro-ppl/pyro#1358.

@mruberry
Collaborator

mruberry commented Sep 4, 2018

I can rebase my PR with your changes. My PR is still waiting on review and I don't want to hold things up.

@zou3519
Contributor Author

zou3519 commented Sep 4, 2018

Thanks, @mruberry, I'll let you know when this goes in.

@neerajprad based on the error message it looks like the tracer is specializing on input sizes -- I think we changed the behavior last week and it shouldn't do that anymore. Could you try pulling from master again and see if your bug still exists?

@neerajprad
Contributor

@neerajprad based on the error message it looks like the tracer is specializing on input sizes -- I think we changed the behavior last week and it shouldn't do that anymore. Could you try pulling from master again and see if your bug still exists?

I have verified that the PyTorch tracer correctly generalizes to other input sizes (at least for the examples I tried it on), so this is most likely a bug within Pyro (in the way we are using the tracing functionality). This is being tracked in pyro-ppl/pyro#1358.


@apaszke
Contributor

apaszke commented Sep 4, 2018

My only concern is that this doesn't really check that the fusion even happened on those scalars, but LGTM.


Contributor

@zdevito zdevito left a comment

Looks good.



@zou3519 zou3519 deleted the scalar-fusion branch September 6, 2018 15:22
petrex pushed a commit to petrex/pytorch that referenced this pull request Sep 6, 2018
* upstream/master: (26 commits)
  cudnn 7 upgrade with spatialBN fix (pytorch#11291)
  Ignore FuseGraph Call on Windows (pytorch#11015)
  defer resolution of mkl to a cmake wrapper library (pytorch#11298)
  Cleanup dependency of distributed flags (pytorch#11221)
  Move minimal wrapdim functionality to core, remove THTensor include i… (pytorch#11283)
  Change includes from ATen/Storage.h to ATen/core/Storage.h (pytorch#11217)
  Fix scalar tensor assert in fusion compiler (pytorch#10952)
  Add dead code elimination pass (pytorch#10101)
  Distributed Data Parallel CPU module for C10D (pytorch#11168)
  Back out "[pt1][tensor] Add strides to caffe2::Tensor"
  Fix conv gradient conversion (pytorch#11312)
  Bag of clang tidy fixes for torch/csrc/ and torch/csrc/autograd (pytorch#11050)
  Sparse tensor printing; add NotImplemented autograd fn (pytorch#10181)
  Add convertToCaffe2Proto to python API
  fix doc for functional.dropout* (pytorch#10417)
  typo fix Tranpose2D -> Transpose2D (pytorch#11281)
  Remove THFinalizer
  Forward declarations of needed curand functions (pytorch#10911)
  nomnigraph - simplify core graph API and test (pytorch#11256)
  Small fixes to cppdocs for sync script (pytorch#11300)
  ...
PenghuiCheng pushed a commit to PenghuiCheng/pytorch that referenced this pull request Sep 11, 2018
Pull Request resolved: pytorch#10952

Differential Revision: D9544607

Pulled By: zou3519

fbshipit-source-id: 2b873f47e2377125e1f94eb1b310a95cda51476c
@ezyang ezyang added the merged label Jun 26, 2019


Development

Successfully merging this pull request may close these issues.

[JIT] scalars tensors inside the fuser cause an assertion failure in dimension compression.

7 participants