fix static init issue with JIT container types #76085

Closed

bdhirsh wants to merge 9 commits into gh/bdhirsh/211/base from gh/bdhirsh/211/head

Conversation

bdhirsh (Collaborator) commented Apr 20, 2022

This should fix at::index.Tensor for functionalization and address pytorch/torchdynamo#88.

The bug

We have a bunch of code that expects all of our c10::Type objects to have a unique singleton instance, but it turns out that this isn't true for container types (like c10::optional).

It turns out that the IValue::isOptionalTensorList function I added earlier in #75716 doesn't always work:

```
  const auto& ty = static_cast<detail::ListImpl*>(payload.u.as_intrusive_ptr)->elementType;
  return ty == c10::getTypePtr<c10::optional<at::Tensor>>();
```

That equality check calls [this](https://github.com/pytorch/pytorch/blob/e20793b05426284d973ac25e563046c037b4e4b2/aten/src/ATen/core/jit_type_base.h#L586) code:

```
template <typename T, typename U>
bool operator==(const SingletonOrSharedTypePtr<T>& x, const SingletonOrSharedTypePtr<U>& y) {
  return (void*)x.get() == (void*)y.get();
}
```

Every c10::Type can be compared structurally, but each type also has its own singleton instance to make equality checks cheaper: just check that the two pointers refer to the same singleton, instead of comparing the full type objects.

You can call c10::getTypePtr<T>() and get back a pointer to the singleton instance of that type. For optional<T>, that singleton instance lives [here](https://github.com/pytorch/pytorch/blob/e20793b05426284d973ac25e563046c037b4e4b2/aten/src/ATen/core/jit_type.h#L1871).

When I was debugging, I noticed that isOptionalTensorList was returning false because the two pointers being compared were different, even though the actual type objects were equal. I was able to repro this with functionalize(), but I couldn't repro it directly with a test in core. Changing to this code worked:

```
  const auto& ty = static_cast<detail::ListImpl*>(payload.u.as_intrusive_ptr)->elementType;
  const auto& expected_ty = c10::getTypePtr<c10::optional<at::Tensor>>();
  // compare pointers, but if that fails compare the actual type objects
  return expected_ty == ty || *expected_ty == *ty;
```

So why do we have more than one "static singleton" instance of the same type object? The singleton instance for c10::optional lives [here](https://github.com/pytorch/pytorch/blob/e20793b05426284d973ac25e563046c037b4e4b2/aten/src/ATen/core/jit_type.h#L1871), and is defined in a header file (it has to be, because it's a template).

I think that's because function-local statics are duplicated across DSOs. We have a similar comment about why the dispatcher singleton needs to live in a .cpp file [here](https://github.com/pytorch/pytorch/blob/e20793b05426284d973ac25e563046c037b4e4b2/aten/src/ATen/core/dispatch/Dispatcher.h#L95). Basically, since functorch and pytorch core live in two separate .so files, we end up with a new static singleton instance for each library.

We can't just move the singleton into a .cpp file, though, since the function is templated - we want one singleton instance per optional<T> type.

I ended up fixing it by converting each T into a TypePtr object, and keeping a mapping in the .cpp file from the TypePtr of the inner type to the static singleton instance of the container type.

Testing?

I couldn't figure out how to repro this failure in core, since I think the functionalize() failure comes from the fact that c10::getTypePtr is invoked from multiple loaded libraries (libtorch_cpu.so and functorch/_C.so).

I confirmed that with this patch, this code runs successfully (it would break before):

```
import torch
from functorch import make_fx
from functorch.experimental import functionalize

def f(x, y):
    return x[y]

t1 = make_fx(functionalize(f))(torch.arange(3), torch.ones(2, dtype=torch.long))
print("Functionalized:\n", t1.graph)
```

Generalizing this fix to other container types?

This bug probably affects the other container c10::Types, like List/Dict. I put this up as a PoC first, but if this seems like a reasonable fix then I can use the same fix for the other container types too.


facebook-github-bot (Contributor) commented Apr 20, 2022

CI failures summary (Dr. CI): as of commit c8b962f, no failures.

@bdhirsh bdhirsh requested a review from ezyang April 20, 2022 14:11
bdhirsh (Collaborator, Author) commented Apr 20, 2022

I tagged @ezyang for this fix, but I'm not sure if there's someone from JIT who would also want to look at this PR.

```
if (optionalTypePtrs.find(inner) == optionalTypePtrs.end()) {
  TypePtr t = TypeFactory::create<OptionalType>(inner);
  optionalTypePtrs.emplace(inner, t);
}
```
Contributor:

do you need locking here?

bdhirsh (Collaborator, Author):

yep 😬 will add (and it shouldn't matter if it's slow since it's behind a static singleton)

bdhirsh (Collaborator, Author):

update - locking fixed the bug described above lol. With the lock, this now works:

```
static const auto& call() {
    static auto inner_type = getTypePtr_<T>::call();
    static auto type = OptionalType::get(inner_type);
    return type;
}
```

I'm going to try to update the other container types to do something similar.

@ezyang ezyang requested a review from eellison April 21, 2022 02:00
ezyang (Contributor) commented Apr 21, 2022

@eellison do you think you'd be able to help review this?

ezyang (Contributor) commented Apr 21, 2022

OK, I'm A+ for this approach then. Here's the approve.

eellison (Contributor):

@ezyang hey sorry, do you still need me to take a look?

@bdhirsh bdhirsh changed the title [POC] fix static init issue with JIT container types fix static init issue with JIT container types Apr 21, 2022
ezyang (Contributor) commented Apr 21, 2022

I think a high level "this makes sense for JIT C++ types" nod would be a help. PR is not long.

eellison left a comment:

Took a look... I don't know if I'm the best person to review this, as this is a little deep in the weeds of C++ for me. Maybe @swolchok, who has also done a good amount of work optimizing TypePtrs?

```
// the type List<T>.
// The extra "identifier" argument is needed because we have multiple container types
// that all re-use this function (List<T>, array<T, N>, etc.)
static TypePtr get(std::string identifier, TypePtr inner);
```
Contributor:

Can we change the std::string identifier to an enum?

bdhirsh (Collaborator, Author):

We can't make it an enum because of array<T, N> - we basically need a different output type for every different container template (and std::string gives maybe too much freedom, but I figured it was simple)

@eellison eellison requested a review from swolchok April 22, 2022 00:36
bdhirsh added 2 commits April 22, 2022 05:45
```
// otherwise we'll end up with one singleton instance per shared library.
// (Concatenating the length onto the end of the string because we want a unique
// type_ptr created for every std::array<T, N> type).
static auto type = ListType::get(std::string("array") + std::to_string(N), inner_type);
```
Contributor:

it is definitely possible to produce the string here at compile time, but maybe it doesn't matter enough.

Comment on lines +18 to +19
```
// This hashing is all hidden behind a static initializer so it
// doesn't have to be optimal
```
Contributor:

great, might be a good idea to use hash_combine then

bdhirsh (Collaborator, Author):

Just saw that we have an at::hash_combine - I'll use that, thanks!

Comment on lines +267 to +268
```
TypePtr t = TypeFactory::create<OptionalType>(inner);
containerTypePtrs.emplace(inner, t);
```
Contributor:

good: std::move(t)
better: just eliminate t and create it directly

also, std::move(inner) in the emplace call

Comment on lines +280 to +284
```
if (containerTypePtrs.find(key) == containerTypePtrs.end()) {
  TypePtr t = ListType::create(inner);
  containerTypePtrs.emplace(key, t);
}
return containerTypePtrs[key];
```
Contributor:

same as above, but needs some extra care to avoid use-after-move, and this still does some extra refcount bumps we could potentially avoid

```
auto it = containerTypePtrs.find(key);
if (it == containerTypePtrs.end()) {
  it = containerTypePtrs.emplace(std::move(key), ListType::create(std::move(inner))).first;
}
return it->second;
```

```
if (optionalTypePtrs.find(inner) == optionalTypePtrs.end()) {
  TypePtr t = TypeFactory::create<OptionalType>(inner);
  optionalTypePtrs.emplace(inner, t);
}
```
Contributor:

same as above -- move & be careful not to use after move in the return statement

This should fix `at::index.Tensor` for functionalization and address pytorch/torchdynamo#88.

### The bug
we have a bunch of code that expects all of our `c10::Type` objects to have a unique singleton instance, but it turns out that this isn't true for container types (like `c10::optional`).

It turns out that the `IValue::isOptionalTensorList` function I added earlier in #75716 doesn't always work:
```
  const auto& ty = static_cast<detail::ListImpl*>(payload.u.as_intrusive_ptr)->elementType;
  return ty == c10::getTypePtr<c10::optional<at::Tensor>>();
```

That equality check calls [this](https://github.com/pytorch/pytorch/blob/e20793b05426284d973ac25e563046c037b4e4b2/aten/src/ATen/core/jit_type_base.h#L586) code:
```
template <typename T, typename U>
bool operator==(const SingletonOrSharedTypePtr<T>& x, const SingletonOrSharedTypePtr<U>& y) {
  return (void*)x.get() == (void*)y.get();
}
``` 

Every `c10::Type` can be compared, but it also has its own singleton instance to make equality checks cheaper (just check that the two singleton instances are the same, instead of comparing the full type objects).

You can call `c10::getTypePtr<T>()`, and get back the singleton instance of the pointer to that type. For `optional<T>`, that singleton instance lives [here](https://github.com/pytorch/pytorch/blob/e20793b05426284d973ac25e563046c037b4e4b2/aten/src/ATen/core/jit_type.h#L1871).

When I was debugging, I noticed that `isOptionalTensorList` was returning false because the two pointers being compared were different, but the actual type objects were equal. I was able to repro this with `functionalize()`, but I couldn't repro it directly with test in core. Changing to this code worked:
```
  const auto& ty = static_cast<detail::ListImpl*>(payload.u.as_intrusive_ptr)->elementType;
  const auto& expected_ty = c10::getTypePtr<c10::optional<at::Tensor>>();
  // Compare pointers first; if that fails, compare the actual type objects.
  return expected_ty == ty || *expected_ty == *ty;
```

So why do we have more than one "static singleton" instance of the same type object? The singleton instance for `c10::optional` lives [here](https://github.com/pytorch/pytorch/blob/e20793b05426284d973ac25e563046c037b4e4b2/aten/src/ATen/core/jit_type.h#L1871), and is defined in a header file (it has to be because it's a template).

I think that's because function-local statics are duplicated across DSOs. We have a similar comment about the dispatcher singleton and why it needs to live in a .cpp file [here](https://github.com/pytorch/pytorch/blob/e20793b05426284d973ac25e563046c037b4e4b2/aten/src/ATen/core/dispatch/Dispatcher.h#L95). Basically, since functorch and PyTorch core live in two separate `.so` files, we end up with a new static singleton instance for each library.

We can't just move the singleton into a .cpp file though, since the function is templated - we want one singleton instance *per `optional<T>` type*.

I ended up doing it by converting each `T` into a `TypePtr` object, and keeping a mapping from `TypePtr` objects of the inner type to the static singleton instances in the .cpp file.

### Testing?

I couldn't figure out how to repro this failure in core, since I think the `functionalize()` failure comes from the fact that we invoke `c10::getTypePtr` from multiple loaded libraries (`libtorch_cpu.so` and `functorch/_C.so`).

I confirmed that with this patch, this code runs successfully (it would have broken before):

```
import torch
from functorch import make_fx
from functorch.experimental import functionalize

def f(x, y):
    return x[y]

t1 = make_fx(functionalize(f))(torch.arange(3), torch.ones(2, dtype=torch.long))
print("Functionalized:\n", t1.graph)
```


### Generalizing this fix to other container types?

This bug probably affects the other container `c10::Type`s, like `List`/`Dict`. I put this up as a PoC first, but if this seems like a reasonable fix then I can apply the same fix to the other container types too.




[ghstack-poisoned]
@bdhirsh
Copy link
Collaborator Author

bdhirsh commented Apr 25, 2022

@pytorchbot merge this please

@github-actions
Copy link
Contributor

Hey @bdhirsh.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request Apr 26, 2022
Summary:
Pull Request resolved: #76085

Approved by: https://github.com/ezyang

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/40f3e85005b60dab26b89c6b09e3fb7cf686d2eb

Reviewed By: osalpekar

Differential Revision: D35938182

Pulled By: bdhirsh

fbshipit-source-id: f21f52255d49daac558c1c595dee2894405d14fd
@facebook-github-bot facebook-github-bot deleted the gh/bdhirsh/211/head branch April 29, 2022 14:17
bdhirsh added a commit that referenced this pull request May 19, 2022
…ton bug"

This should fix the remaining dynamo <> functionalization integration issue.


I had a fix earlier for JIT containers not correctly having singleton instances [here](#76085), which fixed a bug in this code for functionalization:
```
def f(a, b):
    return a[b]

functionalize(f)(torch.arange(3), torch.ones(2, dtype=torch.long))
```

But apparently the following code is still broken:
```
def f(a, b):
    return torch.ops.aten.index(a, b)

functionalize(f)(torch.arange(3), torch.ones(2, dtype=torch.long))
```

Why? We have separate schema parsing logic for the ops in `torch.ops.aten`, and that logic circumvented my fix, creating its own singleton instance for `Optional[Tensor]`.


We can't test this in core for the same reason as the last fix, so companion functorch tests here: pytorch/functorch#820




[ghstack-poisoned]