
Add __tensor_wrap__ method similar to numpy __array_wrap__ #17249 #18610

Closed

dylanbespalko wants to merge 11 commits into pytorch:master from dylanbespalko:master

Conversation

@dylanbespalko
Contributor

Added __array_ufunc__ for Tensor Python subclass type propagation during numpy function calls.
Added result_ptype for Tensor Python subclass type propagation during torch function calls.
Added a workaround so that torch.nn.Parameter does not participate in Python type propagation.
Added test_ptype_propagation to test/test_torch.py.
Added test_ptype_non_propagation to test/test_torch.py.

Example

class A(th.Tensor):
    ...
class B(th.Tensor):
    ...
class C(A, B):
    ...

a = A(th.ones((1,), dtype=th.double, requires_grad=False))  # -> A
c = C(th.ones((1,), dtype=th.double, requires_grad=False))  # -> C
a + c  # -> C

p = th.nn.Parameter(th.randn(5, 5))
print(p + 1)  # -> Tensor
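For intuition, the propagation rule can be mimicked with plain Python classes, no torch required. This is only a sketch: the `result_ptype` helper below is a hypothetical stand-in for the C++ `THPVariable_result_ptype` added by this PR, not its actual implementation.

```python
class Tensor: ...
class A(Tensor): ...
class B(Tensor): ...
class C(A, B): ...

def result_ptype(*operands):
    """Return the most derived Tensor subclass among the operands."""
    ptype = Tensor
    for op in operands:
        # Assumes simple inheritance: a more derived type replaces the
        # current candidate only if it subclasses it.
        if isinstance(op, Tensor) and issubclass(type(op), ptype):
            ptype = type(op)
    return ptype

print(result_ptype(A(), C()).__name__)       # C
print(result_ptype(Tensor(), A()).__name__)  # A
```

The result of a mixed-operand call takes the most derived subclass, so `a + c` above yields a `C`.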

@ezyang
Contributor

ezyang commented Apr 5, 2019

Could you post what the autogenerated code looks like before and after the change? (Just for one sample, representative function.)

@dylanbespalko
Contributor Author

dylanbespalko commented Apr 5, 2019

Hi @ezyang,

Here is a method that uses both the static namedtuple return type and the dynamic ptype subclass return type. I have to check the Python type on each call because the subclass type is only determined at runtime.

Before: python_variable_methods.cpp

static PyObject * THPVariable_max(PyObject* self_, PyObject* args, PyObject* kwargs)
{
  HANDLE_TH_ERRORS
  static PythonArgParser parser({
    "max()",
    "max(Tensor other)",
    "max(int64_t dim, bool keepdim=False)",
  }, /*traceable=*/true);
  auto& self = reinterpret_cast<THPVariable*>(self_)->cdata;
  ParsedArgs<3> parsed_args;
  auto r = parser.parse(args, kwargs, parsed_args);

  static PyStructSequence_Field fields0[] = {
    {"values", ""}, {"indices", ""}, {nullptr}
  };
  static PyStructSequence_Desc desc0 = {
    "torch.return_types.max", nullptr,
    fields0, 2
  };
  static PyTypeObject type0;
  static bool namedtuple_type_initialized0 = false;
  if (!namedtuple_type_initialized0) {
    PyStructSequence_InitType(&type0, &desc0);
    type0.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
    namedtuple_type_initialized0 = true;
  }
  if (r.idx == 0) {
    return wrap(dispatch_max(self));
  } else if (r.idx == 1) {
    return wrap(dispatch_max(self, r.tensor(0)));
  } else if (r.idx == 2) {
    return wrap(&type0, dispatch_max(self, r.toInt64(0), r.toBool(1)));
  }
  Py_RETURN_NONE;
  END_HANDLE_TH_ERRORS
}

After: python_variable_methods.cpp

static PyObject * THPVariable_max(PyObject* self_, PyObject* args, PyObject* kwargs)
{
  HANDLE_TH_ERRORS
  static PythonArgParser parser({
    "max()",
    "max(Tensor other)",
    "max(int64_t dim, bool keepdim=False)",
  }, /*traceable=*/true);
  auto& self = reinterpret_cast<THPVariable*>(self_)->cdata;
  ParsedArgs<3> parsed_args;
  auto r = parser.parse(args, kwargs, parsed_args);
  auto *ptype = THPVariable_result_ptype(self_, args); // NEW!
  static PyStructSequence_Field fields0[] = {
    {"values", ""}, {"indices", ""}, {nullptr}
  };
  static PyStructSequence_Desc desc0 = {
    "torch.return_types.max", nullptr,
    fields0, 2
  };
  static PyTypeObject type0;
  static bool namedtuple_type_initialized0 = false;
  if (!namedtuple_type_initialized0) {
    PyStructSequence_InitType(&type0, &desc0);
    type0.tp_repr = (reprfunc)torch::utils::returned_structseq_repr;
    namedtuple_type_initialized0 = true;
  }
  if (r.idx == 0) {
    return wrap(ptype, dispatch_max(self)); // NEW ARGUMENT!
  } else if (r.idx == 1) {
    return wrap(ptype, dispatch_max(self, r.tensor(0)));
  } else if (r.idx == 2) {
    return wrap(&type0, dispatch_max(self, r.toInt64(0), r.toBool(1)));
  }
  Py_RETURN_NONE;
  END_HANDLE_TH_ERRORS
}

Aside:
I keep trying to pull the latest changes from master, but it's been a week since I could pull something that compiles and passes the unit tests. I also could not use your ghstack tool because it won't work on a fork of the repository (I don't have contributor access to pytorch). I will keep pulling the changes from master until it works.

@apaszke
Contributor

apaszke commented Apr 7, 2019

Also, we should carefully benchmark this change before it goes in.

@ezyang
Contributor

ezyang commented Apr 8, 2019

I keep trying to pull the latest changes from master, but it's been a week since I could pull something that compiles and passes the unit tests

Ah, that's because you have a bunch of spurious submodule changes in your PR. If you revert them that will solve your problem, e.g., as per https://stackoverflow.com/questions/7718780/how-to-undo-git-submodule-update

@ezyang
Contributor

ezyang commented Apr 9, 2019

@dylanbespalko Also, I'll get you set up with ghstack access shortly

@ezyang added labels on Apr 9, 2019: module: internals (Related to internal abstractions in c10 and ATen), module: pybind (Related to our Python bindings / interactions with other Python libraries), module: performance (Issues related to performance, either of kernel code or framework glue)

ptype = (PyTypeObject*)base;
Py_INCREF(ptype);
if (self != NULL && PyObject_IsInstance(self, base) && !PyObject_IsInstance(self, parameter) && PyObject_IsInstance(self, (PyObject*)ptype)){
Contributor

I expected to see some sort of fast path here for the case of Tensor and Parameter. Also, I see you have special-cased Parameter here, and it would be good if it were described what exactly this special case is doing (and why we actually need to special-case Parameter in this way; it's not obvious to me that Parameter should be special-cased here.)

Contributor

For example, the Numpy __array_ufunc__ has a special case for the case when the ufunc is identical to ndarray.__array_ufunc__, which I believe lets you get "Parameter" like behavior, because Parameter inherits the ufunc from Tensor, which creates Tensors (not Parameters!)

Contributor

(It is difficult for me to argue about this, because there is not a spec of the desired behavior clearly written down anywhere!)

Contributor Author

"I expected to see some sort of fast path here..."

A: It's not a fast path. The simplest implementation has to look at the Python type of every input argument as follows:

  1. self (the 1st argument).
  2. The args tuple (the 2nd through nth arguments).

I have assumed simple inheritance, therefore I only analyze the ptype of each arg from left to right to find the most derived subclass of either np.ndarray or torch.Tensor.

"Also, I see you have special-cased Parameter here...."

I will add comments saying that torch.nn.Parameter explicitly does not perform ptype propagation. This is because:

  1. Parameters designate inputs to the neural network.
  2. All other derived quantities should upcast to torch.Tensor so that they are not treated as tunable parameters in the neural network.

I return the most derived subclass of Tensor that is not a Parameter.

Eg. Parameter + Tensor_Subclass = Tensor_Subclass.
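A pure-Python sketch of this exclusion rule (the class names below are hypothetical stand-ins for the torch types, not the PR's actual code):

```python
class Tensor: ...
class Parameter(Tensor): ...         # stand-in for torch.nn.Parameter
class TensorSubclass(Tensor): ...    # some user-defined Tensor subclass

def result_ptype(*operands):
    """Most derived Tensor subclass among the operands, never Parameter."""
    ptype = Tensor
    for op in operands:
        t = type(op)
        if (isinstance(op, Tensor)
                and not issubclass(t, Parameter)  # Parameter never propagates
                and issubclass(t, ptype)):
            ptype = t
    return ptype

# Parameter + TensorSubclass -> TensorSubclass
print(result_ptype(Parameter(), TensorSubclass()).__name__)  # TensorSubclass
```

Because Parameter operands are skipped, a Parameter combined only with plain Tensors falls back to Tensor.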

Contributor Author

"For example, the Numpy array_ufunc has a special case for the case when the ufunc is identical to ndarray.array_ufunc, which I believe lets you get "Parameter" like behaviour"

This could arguably go two ways:

  1. Parameter + Tensor_Subclass = Tensor_Subclass. This is the desirable behaviour for me, but I need to process the Python type of every input arg.
  2. Parameter + Tensor_Subclass = Tensor. This would allow for a fast path if, for example, the first argument was a Parameter, but I can't see this being much faster.

Contributor Author

"(It is difficult for me to argue about this, because there is not a spec of the desired behavior clearly written down anywhere!)"

Yes, I'm conflicted about the implementation of Parameter. I wish the difference between a Tensor and a Parameter were a boolean flag stored inside the metadata of the tensor, but I'm sure others will disagree.

Contributor

Let's talk about this more on #22235. I don't mind Parameter too much but subclass propagation rules are a bit funny.

END_HANDLE_TH_ERRORS
}

// Implements the default __array_ufunc__ python subclass type propagation. Used in torch._C._TensorBase.
Contributor

We need some link to the specification of what __array_ufunc__ does, or a recapitulation of its spec here.

Contributor Author

"We need some link to the specification of what array_ufunc does, or a recapitulation of its spec here."

__array_ufunc__ traditionally does three things:

  1. Cast all input args and output kwargs to the np.ndarray base class.
  2. Look for any subclass overrides of the ufunc being called (eg. did the subclass override np.add?).
  3. Cast all result args to the np.ndarray subclass.

When __array_ufunc__ is implemented inside torch.Tensor, we should note that torch.Tensor is NOT a subclass of np.ndarray. It should do the following:

  1. Cast all input and output args to the np.ndarray base class.
  2. DO NOT look for any subclass overrides of the ufunc being called (Tensor is NOT a subclass of np.ndarray and cannot override ufuncs).
  3. Cast all result args to the Tensor subclass (the ptype).

My implementation of __array_ufunc__ does not support subclass overrides of numpy ufuncs because Tensor is not a subclass of np.ndarray.
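The traditional three steps can be sketched in pure numpy for the ndarray-subclass case (the class name `MyArray` is hypothetical and not part of this PR; this is a minimal illustration of the cast-down / dispatch / cast-up pattern, not the PR's Tensor implementation):

```python
import numpy as np

class MyArray(np.ndarray):
    """Toy ndarray subclass showing the cast-down / dispatch / cast-up pattern."""
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # 1. Cast all ndarray inputs down to the base class.
        cast = [np.asarray(x) if isinstance(x, np.ndarray) else x
                for x in inputs]
        # 2. Run the ufunc on plain ndarrays (no subclass override lookup here).
        result = getattr(ufunc, method)(*cast, **kwargs)
        # 3. Cast any ndarray result back up to the subclass.
        if isinstance(result, np.ndarray):
            return result.view(MyArray)
        return result

a = np.ones(3).view(MyArray)
b = a + 1  # dispatches through MyArray.__array_ufunc__
print(type(b).__name__)  # MyArray
```

In the Tensor case, step 3 would instead rebuild the result as the resolved torch.Tensor subclass, since Tensor cannot use ndarray views.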

Contributor

OK, don't tell me here, put this in the code, if you can :)

if (PyObject_IsInstance(obj, base)){
auto& var = ((THPVariable*)obj)->cdata;
result_requires_grad |= var.requires_grad();
var.detach_();
Contributor

An in-place detach as is done here is unlikely to be the desired semantics. If we pass a tensor through numpy we should end up with a Tensor which errors when you try to backward() through it (since the autograd is not defined). No mutations to inputs should occur.

Contributor Author

"An in-place detach as is done here is unlikely to be the desired semantics. If we pass a tensor through numpy we should end up with a Tensor which errors when you try to backward() through it (since the autograd is not defined). No mutations to inputs should occur."

I agree. I have fixed the mistake.

// Up-Cast Results to ptype
if (((PyUFuncObject*)ufunc)->nout == 1){
if (PyTypeNum_ISBOOL(PyArray_DESCR((PyArrayObject*)result)->type_num)){
//Workaround, torch has no built-in bool tensor
Contributor

We do now! cc @izdeby

obj = PyTuple_GET_ITEM(results, i);
}
auto var = torch::utils::tensor_from_numpy(obj);
result_requires_grad &= at::is_floating_point(var);
Contributor

This looks... questionable.

Contributor Author

"This looks... questionable."

I totally agree. I have fixed this so that requires_grad=False when calling a numpy __array_ufunc__, because autograd is not possible. This was a bogus workaround to pass test_multiplication_numpy_scalar in test_torch.py.


Py_DECREF(in_args);
Py_DECREF(out_args);
Py_DECREF(ptype);
Py_DECREF(base);
Contributor

Explicit refcounting is giving me hives. We have C++ helper classes for doing RAII on these, please use them!

Contributor Author

I have cleaned up this function so that it uses pybind11 whenever possible.

@ezyang
Contributor

ezyang commented Apr 10, 2019

It seems like there are two orthogonal features in this patchset, which could be merged separately. Is that true?

@dylanbespalko
Contributor Author

Thanks @ezyang for looking at my code. I will look through these tomorrow. Sorry about the explicit reference counting; I will look into pybind11.

@ezyang ezyang self-requested a review April 19, 2019 22:01
@dylanbespalko
Contributor Author

@ezyang,

"It seems like there are two orthogonal features in this patchset, which could be merged separately. Is that true?"

Yes. I will split:

  1. PyTorch tensor ptype propagation. PR: ptype propagation on torch functions #22235
  2. Numpy tensor ptype propagation via an __array_ufunc__ implementation. (Coming soon.)

@ezyang
Contributor

ezyang commented Jun 26, 2019

Closing this PR as obsoleted by #22247 and #22235

Labels

module: internals (Related to internal abstractions in c10 and ATen)
module: performance (Issues related to performance, either of kernel code or framework glue)
module: pybind (Related to our Python bindings / interactions with other Python libraries)
open source
