
Upstream review process. #10

Closed
csarofeen wants to merge 85 commits into pytorch_fusion from MinorRefactor

Conversation

@csarofeen
Owner

No description provided.

jerryzh168 and others added 30 commits March 18, 2020 14:35
…h#34347)

Summary: Pull Request resolved: pytorch#34347

Test Plan:
python test/test_jit.py

Imported from OSS

Differential Revision: D20504453

fbshipit-source-id: 1bab29e21d0564ed88cdeb4894addfe00ebbd390
Summary:
(Updated per review feedback)

`torch.floor_divide` is currently a function that can operate on two tensors or a tensor and a scalar (scalar x scalar floor division is handled natively by Python and the JIT has a builtin function for it). This PR updates it to:

- have an out variant: `floor_divide(x, y, out=z)`
- be a method on a tensor: `x.floor_divide(y)`
- have an in-place variant: `x.floor_divide_(y)`
- work with sparse tensors

Tests are added to test_sparse.py and test_torch.py for these new behaviors.

In addition, this PR:

- cleans up the existing sparse division and true_division code and improves their error message
- adds testing of sparse true_division to test_sparse.py
- extends existing floor_divide testing in test_torch to run on CUDA, too, not just the CPU

Unfortunately, making floor_divide a method requires breaking backwards compatibility, and floor_divide has been added to the BC whitelist since this is intentional. The BC issue is that the first parameter name to torch.floor_divide is changing from input to self. If you previously called torch.floor_divide with keyword arguments, e.g. torch.floor_divide(input=x, other=y), you will need to update to torch.floor_divide(self=x, other=y), or the more common torch.floor_divide(x, y).

The intent of this PR is to allow floor_divide to be substituted for division (torch.div, /) wherever division was previously used. In 1.6 we expect torch.div to perform true_division, and floor_divide is how users can continue to perform integer division with tensors.

There are two potential follow-up issues suggested by this PR:

- the test framework might benefit from additional tensor construction classes, like one to create dividends and divisors for multiple dtypes
- the test framework might benefit from a universal function test class. While methods have reasonable coverage as part of test_torch.py's TestTensorOp tests, function coverage is spotty. Universal functions are similar enough that it should be possible to generate tests for them.
Pull Request resolved: pytorch#34552

Differential Revision: D20509850

Pulled By: mruberry

fbshipit-source-id: 2cd3c828aad67191c77f2ed8470411e246f604f8
Summary:
Previously there was no indication of why you would get an `OSError` for something (such as the generated methods of a `dataclass`).
Pull Request resolved: pytorch#34669

Pulled By: driazati

Differential Revision: D20426570

fbshipit-source-id: 45d63631984fa26a87c03de5523fb10d8abbc6db
…upport layer model transfer learning

Summary: Add transfer_learning_blob_name_mappings into layer_model_helper to support layer model transfer learning

Reviewed By: mraway

Differential Revision: D20286298

fbshipit-source-id: de3e029611d843f38d3f42ecd4148358f7e14a2b
…4803)

Summary: Pull Request resolved: pytorch#34803

Test Plan:
python test/test_jit.py

Imported from OSS

Differential Revision: D20504457

fbshipit-source-id: 5ca691ef4880c72d30d62390e63e3288b2f06dce
Summary:
fix flake, add overload names
Pull Request resolved: pytorch#34974

Differential Revision: D20519191

Pulled By: eellison

fbshipit-source-id: d08d36b64397287cad484690074e694d8a0e472e
Summary: Pull Request resolved: pytorch#35001

Differential Revision: D20524479

Pulled By: anjali411

fbshipit-source-id: 3413779676ab95c1ee82298f95d3441a89873107
Summary: Pull Request resolved: pytorch#34942

Differential Revision: D20505894

Pulled By: Krovatkin

fbshipit-source-id: 7b442fae6aa2b1a29891b94f824094a1fddae4a2
… mobile callsites

Summary:
There are three guards related to mobile build:
* AutoGradMode
* AutoNonVariableTypeMode
* GraphOptimizerEnabledGuard

Today we need to set some of these guards before calling libtorch APIs because we customized the mobile build to only support inference (for both OSS and most FB use cases) to optimize binary size.

Several changes were made since the 1.3 release, so there are already inconsistent uses of these guards in the codebase. I did a sweep of all mobile related model loading & forward() call sites, trying to unify the use of these guards:

Full JIT: still set all three guards. More specifically:
* OSS: Fixed a bug of not setting the guard at model load time correctly in Android JNI.
* FB: Not covered by this diff (as we are using mobile interpreter for most internal builds).

Lite JIT (mobile interpreter): only needs AutoNonVariableTypeMode guard. AutoGradMode doesn't seem to be relevant (so removed from a few places) and GraphOptimizerEnabledGuard definitely not relevant (only full JIT has graph optimizer). More specifically:
* OSS: At this point we are not committed to support Lite-JIT. For Android it shares the same code with FB JNI callsites.
* FB:
** JNI callsites: Use the unified LiteJITCallGuard.
** For iOS/C++: manually set AutoNonVariableTypeMode for _load_for_mobile() & forward() callsites.

Ideally we should avoid having to set AutoNonVariableTypeMode for mobile interpreter. It's currently needed for dynamic dispatch + inference-only mobile build (where variable kernels are not registered) - without the guard it will try to run `variable_fallback_kernel` and crash (PR pytorch#34038). The proper fix will take some time so using this workaround to unblock selective BUCK build which depends on dynamic dispatch.

PS. The current status (of having to set AutoNonVariableTypeMode) should not block running FL model + mobile interpreter - if all necessary variable kernels are registered then it can call _load_for_mobile()/forward() against the FL model without setting the AutoNonVariableTypeMode guard. It's still inconvenient for JAVA callsites as it's set unconditionally inside JNI methods.

Test Plan: - CI

Reviewed By: xta0

Differential Revision: D20498017

fbshipit-source-id: ba6740f66839a61790873df46e8e66e4e141c728
Summary:
**Summary**
This commit parallelizes the invocation of `clang-format` on all files
in `tools/clang_format_new.py` using `asyncio`.
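
The parallelization described above can be sketched roughly as follows. This is a minimal illustration of fanning out subprocesses with `asyncio`, not the code from `tools/clang_format_new.py`; the names `run_one`, `run_all`, `format_commands` and the `max_parallel` limit are assumptions made for the example.

```python
import asyncio


async def run_one(cmd, limit):
    """Run one command, holding a semaphore slot while it executes."""
    async with limit:
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        stdout, _ = await proc.communicate()
        return proc.returncode, stdout.decode()


async def run_all(cmds, max_parallel=8):
    """Run every command concurrently, at most max_parallel at a time."""
    limit = asyncio.Semaphore(max_parallel)
    # gather() returns results in the same order as the input commands.
    return await asyncio.gather(*(run_one(cmd, limit) for cmd in cmds))


def format_commands(paths, binary="clang-format"):
    """Hypothetical helper: build one `clang-format <file>` command per path."""
    return [[binary, path] for path in paths]
```

Something like `asyncio.run(run_all(format_commands(paths)))` would then collect one `(returncode, stdout)` pair per file. Because the work happens in child processes, wall-clock time can drop while total CPU time goes up, which is consistent with the `real` and `user` figures in the timings.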

**Testing**
Ran and timed the script.

*Before*
```
$ time ./tools/clang_format_new.py  --diff
...
real	0m7.615s
user	0m6.012s
sys	0m1.634s
```

*After*
```
$ time ./tools/clang_format_new.py  --diff
...
Some files not formatted correctly

real	0m2.156s
user	0m8.488s
sys	0m3.201s
```
Pull Request resolved: pytorch#34750

Differential Revision: D20523133

Pulled By: SplitInfinity

fbshipit-source-id: 509741a0b4fcfcdcd7c5a45654e3453b4874d256
…er (pytorch#34957)

Summary:
1. Removed LossClosureOptimizer, and merged Optimizer into OptimizerBase (and renamed the merged class to Optimizer)
2. Merged the LBFGS-specific serialize test function and the generic test_serialize_optimizer function.
3. BC-compatibility serialization test for LBFGS
4. Removed mentions of parameters_ in optimizer.cpp, de-virtualize all functions
5. Made defaults_ optional argument in all optimizers except SGD
Pull Request resolved: pytorch#34957

Test Plan: Imported from GitHub, without a `Test Plan:` line.

Differential Revision: D20518647

Pulled By: anjali411

fbshipit-source-id: 4760d1d29df1784e2d01e2a476d2a08e9df4ea1c
…wner, instead of adding nothing to pending users (pytorch#34988)

Summary:
Pull Request resolved: pytorch#34988

In pytorch#31893, we introduced a confirmedUsers_ map in RRefContext.

When the fork is shared from the owner, there is no `pendingUsers_` intermediate phase for this fork, so we should put it into `confirmedUsers_` immediately.

Test Plan:
```
buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork
```

```
buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork
```

Differential Revision: D7735909

fbshipit-source-id: 14c36a16486f0cc9618dcfb111fe5223781b647d
Summary:
`GetEmptyStringAlreadyInited` invocation pattern in protobuf generated header files changed to
 `:PROTOBUF_NAMESPACE_ID::internal::GetEmptyStringAlreadyInited`, where `PROTOBUF_NAMESPACE_ID` is defined in `protobuf/port_def.inc` as `google::protobuf`

This is likely to have changed around protobuf-3.8.x time, but I've only tested it using protobuf-3.11.4
Pull Request resolved: pytorch#35008

Test Plan: Update `third-party/protobuf` submodule to 3.11.4, compile and run `pattern_net_transform_test`

Differential Revision: D20526949

Pulled By: malfet

fbshipit-source-id: fddaa3622c48ad883612c73c40a20d306d88d66b
…ilure during Distributed Autograd (pytorch#34638)

Summary:
Pull Request resolved: pytorch#34638

Fixes: pytorch#27643

This PR manages notifying workers in the event of a failure during distributed autograd. Gracefully handles propagating errors across all nodes in the backward pass and sets state in the local autograd engines accordingly.

(Note: this ignores all push blocking failures!)

Test Plan: Added 2 new tests checking errors when they are thrown in an intermediate node during distributed autograd. Ensured that all existing distributed autograd tests pass.

Differential Revision: D20164420

fbshipit-source-id: 3d4ed74230969ac70bb763f1b5b1c16d979f66a2
Summary:
Update gloo submodule to `113bde13035594cafdca247be953610b53026553` to be compatible with separate compilation introduced by
pytorch/gloo#251
Pull Request resolved: pytorch#34969

Test Plan: CI

Differential Revision: D20527163

Pulled By: malfet

fbshipit-source-id: 300d83d8fe95d57b8d740543efada3c56ac7b493
Summary: Pull Request resolved: pytorch#35018

Test Plan: Imported from OSS

Differential Revision: D20528402

Pulled By: suo

fbshipit-source-id: badb487a4fbb0299b49c1b1022bcd7b61eba1e88
Summary:
Pull Request resolved: pytorch#34980

We were passing sample inputs to `torch.jit.script` (as if it was
`torch.jit.trace`), but this parameter was treated as an optional
`optimize` parameter. That parameter is deprecated and that caused a
warning.
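
The mix-up described above can be reproduced with a small stand-in. This sketch does not call `torch.jit.script`; `script_like` is a hypothetical function that only mimics a signature with a deprecated optional `optimize` parameter, to show how a second positional argument ends up bound to it.

```python
import warnings


def script_like(obj, optimize=None):
    """Hypothetical stand-in for a scripting entry point with a deprecated flag."""
    if optimize is not None:
        warnings.warn("`optimize` is deprecated and has no effect", DeprecationWarning)
    return obj


def model(x):
    return x + 1


sample_inputs = (1,)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    script_like(model, sample_inputs)  # sample inputs are bound to `optimize`
    script_like(model)                 # passing only the callable avoids the warning

warning_count = len(caught)
```

Only the first call triggers the warning, so dropping the extra argument is enough to silence it in this sketch.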

Differential Revision: D20520369

Test Plan: Imported from OSS

Pulled By: ZolotukhinM

fbshipit-source-id: 87b40a5e35bfc4a3d7a5d95494632bfe117e40b7
Summary: Pull Request resolved: pytorch#35007

Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D20525680

Pulled By: jamesr66a

fbshipit-source-id: aaa768f395e30dcec8007d50e17f21837c306719
…test for lbfgs

Test Plan: revert-hammer

Differential Revision:
D20524479

Original commit changeset: 3413779676ab

fbshipit-source-id: ef8007ed6c184bc8b8751eb713aac2a891260048
…e path" (pytorch#34962)

Summary:
Pull Request resolved: pytorch#34962

Relanding pytorch#34733. Fix is in pytorch#34988.

Test Plan:
```
buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork
```

```
buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork

buck build mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork \
&& buck-out/gen/caffe2/test/distributed/rpc/jit/rpc_fork\#binary.par \
-r test_return_local_script_class_rref_in_py_and_use_in_script

buck build mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork \
&& buck-out/gen/caffe2/test/distributed/rpc/jit/rpc_fork\#binary.par \
-r test_return_local_script_module_rref_in_py_and_use_in_script
```

```
buck test mode/dev //caffe2/test/distributed/rpc/jit:rpc_fork_thrift -- test_return_local_script_module_rref_in_py_and_use_in_script
```

Differential Revision: D7778113

fbshipit-source-id: b830c03ac9463075fca248eba75be364b0e8b080
Summary:
Issue: pytorch#33780
After this PR:
1. dtype promotion logic will correctly work for ops involving complex scalars
2. torch.ComplexFloatTensor, torch.ComplexDoubleTensor works
3. added alias for complex64 (cfloat) and complex128 (cdouble)
4. added an internal function get_complex_default_dtype (consciously not exposed in public API)

>>> 1j*torch.ones(2)
tensor([(0.0000 + 1.0000j), (0.0000 + 1.0000j)], dtype=torch.complex64)

>>> torch.set_default_dtype(torch.float64)
>>> 1j*torch.ones(2)
tensor([(0.0000 + 1.0000j), (0.0000 + 1.0000j)], dtype=torch.complex128)

>>> 1j + torch.ones(2)
tensor([(1.0000 + 1.0000j), (1.0000 + 1.0000j)], dtype=torch.complex128)

>>> torch.tensor(1j) + torch.ones(2,2)
tensor([[(1.0000 + 1.0000j), (1.0000 + 1.0000j)],
        [(1.0000 + 1.0000j), (1.0000 + 1.0000j)]], dtype=torch.complex128)
Pull Request resolved: pytorch#34093

Differential Revision: D20312366

Pulled By: anjali411

fbshipit-source-id: 90f00a1a916d9c8eeda101eb6e9d250fce569815
Summary:
And few typos
Pull Request resolved: pytorch#34791

Test Plan: CI

Differential Revision: D20524879

Pulled By: malfet

fbshipit-source-id: 58fa03bd6356979e77cd1bffb6370d41a177c409
…34570)

Summary:
Per title.

In the future we want to make div(), the division operator, and addcdiv perform true division as in Python 3, NumPy, and JAX. To do this without silently breaking users we plan to:

- Warn (once) in 1.5 when a user performs integer division using div or addcdiv
- RuntimeError in 1.6 when a user attempts to perform integer division using div or addcdiv
- Always perform true division in 1.7 using div, /, and addcdiv

Users can use true_divide or floor_divide today to explicitly specify the type of division they like.
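
As a point of reference for the Python 3 semantics mentioned above, the snippet below contrasts true division and floor division on plain Python integers. It uses only the standard `operator` module and does not exercise `torch.div`, `true_divide`, or `floor_divide`; it is an analogy for the two behaviors, not a test of the tensor ops.

```python
from operator import floordiv, truediv

pairs = [(7, 2), (6, 3), (1, 4)]

# Each row is (a, b, a / b, a // b) for non-negative integer inputs.
rows = [(a, b, truediv(a, b), floordiv(a, b)) for a, b in pairs]

for a, b, true_result, floor_result in rows:
    print(f"{a} / {b} = {true_result}    {a} // {b} = {floor_result}")
```

True division always yields a float here, even when the inputs divide evenly, while floor division keeps the integer type.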

A test for this behavior is added to test_type_promotion. Unfortunately, because we are only warning once (to avoid a deluge), the test only uses maybeWarnsRegex.

The XLA failure is real but will be solved by pytorch#34552. I'll be sure to land that PR first to avoid temporarily breaking the XLA build.
Pull Request resolved: pytorch#34570

Differential Revision: D20529211

Pulled By: mruberry

fbshipit-source-id: 65af5a9641c5825175d029e8413c9e1730c661d0
…x numbers

Test Plan: revert-hammer

Differential Revision:
D20312366

Original commit changeset: 90f00a1a916d

fbshipit-source-id: 4510739a888b2eec5d8a72e792998ac46da6d82a
… torch script code path"

Test Plan: revert-hammer

Differential Revision:
D7778113

Original commit changeset: b830c03ac946

fbshipit-source-id: ef08b287a6db58320c738cde0c99b3333f5724eb
…Optimizer and LossClosureOptimizer

Test Plan: revert-hammer

Differential Revision:
D20518647

Original commit changeset: 4760d1d29df1

fbshipit-source-id: b84f1a06c2de27e147716279223a6844ef89f760
…tify Workers on Failure during Distributed Autograd

Test Plan: revert-hammer

Differential Revision:
D20164420

Original commit changeset: 3d4ed7423096

fbshipit-source-id: 67f0f9c11cee84df6dbe37db7821dd601227df66
Summary:
Pull Request resolved: pytorch#34066

Basic implementation of pytorch#30632

Test Plan: Imported from OSS

Differential Revision: D20260307

Pulled By: albanD

fbshipit-source-id: 7db5c2411ddc3e954ff8fbbe93eb3b96a2bcfb2f
…orch#34978)

Summary: Pull Request resolved: pytorch#34978

Differential Revision: D20535920

Pulled By: mrshenli

fbshipit-source-id: 3baa8608dd3b0dd5578bc32e56a2e6c1fe69492d
Summary:
By default, CircleCI runs no jobs on tags, meaning that when we
tag a build, no job is run if a dependent job does not contain the
correct filters.

This adds an explicit configuration to run the setup job on every branch
and every tag that CircleCI can run on.

For more information on CircleCI filters and what they do (and more
importantly what they do not do) visit:

https://circleci.com/docs/2.0/configuration-reference/#filters-1

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Pull Request resolved: pytorch#35013

Differential Revision: D20535560

Pulled By: seemethere

fbshipit-source-id: 7ee5dddbc0a9416fd76ed198e5447318c53e1873
anjali411 and others added 17 commits March 20, 2020 06:57
Summary: Pull Request resolved: pytorch#35001

Differential Revision: D20548983

Pulled By: anjali411

fbshipit-source-id: 1f858635d0680c0109d1ef348b7df4d3844fe0a6
Summary:
Pull Request resolved: pytorch#34394

# SWA operator
In this diff, we added a new operator `SWA` which will be used in `AdaGradOptimizer`.

The algorithm looks like:

{F230902995}

# Background

In our testing, we found that this operator could improve our models' reproducibility a lot (KT: 0.86 -> 0.92).

So we hope to land this operator and, in the future, enable it by default in our models.

Test Plan:
Local build `aml.dper3:30f068668cfb408fbb40141fb17129f2` and bento kernel.
- Local test: n215857
- f174600345

Reviewed By: chocjy

Differential Revision: D20165239

fbshipit-source-id: c03cdd048cb10b091e5f06323f4c0f3999f95d8a
Summary:
This PR would fix pytorch#33986.

The meaning of cbid 13 and 211 can be found here:

https://github.com/ezyang/nvprof2json/blob/837c094852c9c5164344db7c19432da37d9a8b09/nvprof2json.py#L238

https://github.com/ezyang/nvprof2json/blob/837c094852c9c5164344db7c19432da37d9a8b09/nvprof2json.py#L436

or it can also be found in the header file at `/usr/local/cuda/extras/CUPTI/include/cupti_runtime_cbid.h`.

Please also check [this at stackoverflow](https://stackoverflow.com/questions/48552390/whats-the-difference-between-launching-with-an-api-call-vs-the-triple-chevron-s). I also executed the profiling code (in the issue) on CUDA 9.2, and the cbid is already changed to 211. Just in case someone would build pytorch against older CUDA versions, I leave both 13 and 211 in the assertion.

cc csarofeen ptrblck ezyang ngimel
Pull Request resolved: pytorch#35016

Differential Revision: D20550879

Pulled By: ezyang

fbshipit-source-id: 968efc5e1126f1dd31acc9f5f4463f351d8a4c4f
Summary:
This adds the `trunc_normal_` function to `torch.nn.init` which allows for modifying tensors in-place to values drawn from a truncated normal distribution. I chose to use the inverse CDF method to implement this. I have included the appropriate code in `test_nn.py` for verifying that the values are from the correct distribution.

Reasons I chose this method:
1. Easily implemented to operate on memory in place, as the other initializers are.
2. No resampling delays
3. This method's main weakness is unlikely to be an issue. While the inverse CDF method can fail to generate the correct distribution when `b < mean` or `mean < a`,  I expect users will choose `a` and `b` so that `a < mean < b`. This method is extremely effective in this case.
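
The inverse CDF idea can be sketched in plain Python as below. This is not the implementation added to `torch.nn.init`; it is a scalar illustration using `statistics.NormalDist`, and the function name `trunc_normal_sample`, its defaults, and the clamping constants are assumptions made for the example.

```python
import random
from statistics import NormalDist


def trunc_normal_sample(mean=0.0, std=1.0, a=-2.0, b=2.0, rng=random):
    """Draw one value from N(mean, std^2) truncated to [a, b] via the inverse CDF."""
    unit = NormalDist()                    # standard normal distribution
    lo = unit.cdf((a - mean) / std)        # CDF value at the lower bound
    hi = unit.cdf((b - mean) / std)        # CDF value at the upper bound
    u = lo + (hi - lo) * rng.random()      # uniform draw between the two CDF values
    u = min(max(u, 1e-12), 1.0 - 1e-12)    # keep inv_cdf away from exactly 0 and 1
    x = mean + std * unit.inv_cdf(u)       # map back through the inverse CDF
    return min(max(x, a), b)               # guard against rounding at the edges
```

Every draw lands inside `[a, b]` without any rejection loop, which matches the "no resampling" point. When both bounds sit far out on one side of `mean`, `lo` and `hi` are both very close to 0 or to 1 and floating-point resolution degrades, which is the weakness the list refers to.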
Pull Request resolved: pytorch#32397

Differential Revision: D20550996

Pulled By: ezyang

fbshipit-source-id: 298a325043a3fd7d1e24d266e3b9b6cc14f81829
Summary:
Benchmark: (Debian 10, Release build, gcc 8.3, no turbo, Intel(R) Xeon(R) E-2136 CPU @ 3.30GHz)

```python
import timeit
for op in ('gt', 'lt', 'ge', 'le', 'eq', 'ne'):
    for dtype in ('torch.float', 'torch.double', 'torch.int16', 'torch.int32', 'torch.int64'):
        for n, t in [(10_000, 100000),
                    (100_000, 10000)]:
            print(f'a.{op}_(b), numel() == {n} for {t} times, dtype={dtype}')
            print(timeit.timeit(f'a.{op}_(b)', setup=f'import torch; a = torch.arange(1, {n}, dtype={dtype}); b = torch.arange({n}, 1, -1, dtype={dtype})', number=t))
```

Before:

```
a.gt_(b), numel() == 10000 for 100000 times, dtype=torch.float
0.778998922000028
a.gt_(b), numel() == 100000 for 10000 times, dtype=torch.float
0.6359690249992127
a.gt_(b), numel() == 10000 for 100000 times, dtype=torch.double
1.0801493119997758
a.gt_(b), numel() == 100000 for 10000 times, dtype=torch.double
0.9360321379990637
a.gt_(b), numel() == 10000 for 100000 times, dtype=torch.int16
0.7341018620008981
a.gt_(b), numel() == 100000 for 10000 times, dtype=torch.int16
0.6345281440007966
a.gt_(b), numel() == 10000 for 100000 times, dtype=torch.int32
0.7396387640001194
a.gt_(b), numel() == 100000 for 10000 times, dtype=torch.int32
0.6429641230006382
a.gt_(b), numel() == 10000 for 100000 times, dtype=torch.int64
0.7759611700003006
a.gt_(b), numel() == 100000 for 10000 times, dtype=torch.int64
0.6672059659995284
a.lt_(b), numel() == 10000 for 100000 times, dtype=torch.float
0.7724312530008319
a.lt_(b), numel() == 100000 for 10000 times, dtype=torch.float
0.6392585769990546
a.lt_(b), numel() == 10000 for 100000 times, dtype=torch.double
0.7917451840003196
a.lt_(b), numel() == 100000 for 10000 times, dtype=torch.double
0.6455550159989798
a.lt_(b), numel() == 10000 for 100000 times, dtype=torch.int16
0.739991647998977
a.lt_(b), numel() == 100000 for 10000 times, dtype=torch.int16
0.6572993859990675
a.lt_(b), numel() == 10000 for 100000 times, dtype=torch.int32
0.7627949479992822
a.lt_(b), numel() == 100000 for 10000 times, dtype=torch.int32
0.6476544910001394
a.lt_(b), numel() == 10000 for 100000 times, dtype=torch.int64
0.7965036850000615
a.lt_(b), numel() == 100000 for 10000 times, dtype=torch.int64
0.6780715599998075
a.ge_(b), numel() == 10000 for 100000 times, dtype=torch.float
0.7653547080008138
a.ge_(b), numel() == 100000 for 10000 times, dtype=torch.float
0.6383065829995758
a.ge_(b), numel() == 10000 for 100000 times, dtype=torch.double
0.7895260240002244
a.ge_(b), numel() == 100000 for 10000 times, dtype=torch.double
0.6508346030004759
a.ge_(b), numel() == 10000 for 100000 times, dtype=torch.int16
0.7409299750015634
a.ge_(b), numel() == 100000 for 10000 times, dtype=torch.int16
0.6383492870008922
a.ge_(b), numel() == 10000 for 100000 times, dtype=torch.int32
0.7620547579990671
a.ge_(b), numel() == 100000 for 10000 times, dtype=torch.int32
0.6474270239996258
a.ge_(b), numel() == 10000 for 100000 times, dtype=torch.int64
0.8070051169997896
a.ge_(b), numel() == 100000 for 10000 times, dtype=torch.int64
0.6712598600006459
a.le_(b), numel() == 10000 for 100000 times, dtype=torch.float
0.7627660060006747
a.le_(b), numel() == 100000 for 10000 times, dtype=torch.float
0.6406353189995571
a.le_(b), numel() == 10000 for 100000 times, dtype=torch.double
1.0826010620003217
a.le_(b), numel() == 100000 for 10000 times, dtype=torch.double
0.9391552950000914
a.le_(b), numel() == 10000 for 100000 times, dtype=torch.int16
0.7427801039993938
a.le_(b), numel() == 100000 for 10000 times, dtype=torch.int16
0.6365172640016681
a.le_(b), numel() == 10000 for 100000 times, dtype=torch.int32
0.7679271510005492
a.le_(b), numel() == 100000 for 10000 times, dtype=torch.int32
0.6453389289999905
a.le_(b), numel() == 10000 for 100000 times, dtype=torch.int64
0.788032889000533
a.le_(b), numel() == 100000 for 10000 times, dtype=torch.int64
0.6708840760002204
a.eq_(b), numel() == 10000 for 100000 times, dtype=torch.float
1.078837263999958
a.eq_(b), numel() == 100000 for 10000 times, dtype=torch.float
0.9397531720005645
a.eq_(b), numel() == 10000 for 100000 times, dtype=torch.double
1.1031508050000411
a.eq_(b), numel() == 100000 for 10000 times, dtype=torch.double
0.9412319389994082
a.eq_(b), numel() == 10000 for 100000 times, dtype=torch.int16
0.7509566959997755
a.eq_(b), numel() == 100000 for 10000 times, dtype=torch.int16
0.638570957000411
a.eq_(b), numel() == 10000 for 100000 times, dtype=torch.int32
0.7592877549996047
a.eq_(b), numel() == 100000 for 10000 times, dtype=torch.int32
0.6458840529994632
a.eq_(b), numel() == 10000 for 100000 times, dtype=torch.int64
0.7984061539991671
a.eq_(b), numel() == 100000 for 10000 times, dtype=torch.int64
0.6776346309998189
a.ne_(b), numel() == 10000 for 100000 times, dtype=torch.float
0.7724407899986545
a.ne_(b), numel() == 100000 for 10000 times, dtype=torch.float
0.6581534130000364
a.ne_(b), numel() == 10000 for 100000 times, dtype=torch.double
0.8303323249983805
a.ne_(b), numel() == 100000 for 10000 times, dtype=torch.double
0.6954390920000151
a.ne_(b), numel() == 10000 for 100000 times, dtype=torch.int16
0.745512373998281
a.ne_(b), numel() == 100000 for 10000 times, dtype=torch.int16
0.6360954970004968
a.ne_(b), numel() == 10000 for 100000 times, dtype=torch.int32
0.7569978400006221
a.ne_(b), numel() == 100000 for 10000 times, dtype=torch.int32
0.6450422030011396
a.ne_(b), numel() == 10000 for 100000 times, dtype=torch.int64
0.7889118379989668
a.ne_(b), numel() == 100000 for 10000 times, dtype=torch.int64
0.6693385389989999
```

After:

```
a.gt_(b), numel() == 10000 for 100000 times, dtype=torch.float
0.2444220920006046
a.gt_(b), numel() == 100000 for 10000 times, dtype=torch.float
0.2031730359994981
a.gt_(b), numel() == 10000 for 100000 times, dtype=torch.double
0.35491806199934217
a.gt_(b), numel() == 100000 for 10000 times, dtype=torch.double
0.3905606850003096
a.gt_(b), numel() == 10000 for 100000 times, dtype=torch.int16
0.16665379499863775
a.gt_(b), numel() == 100000 for 10000 times, dtype=torch.int16
0.10095906300011848
a.gt_(b), numel() == 10000 for 100000 times, dtype=torch.int32
0.21650469999985944
a.gt_(b), numel() == 100000 for 10000 times, dtype=torch.int32
0.18737469400002738
a.gt_(b), numel() == 10000 for 100000 times, dtype=torch.int64
0.35481256200000644
a.gt_(b), numel() == 100000 for 10000 times, dtype=torch.int64
0.36696120199849247
a.lt_(b), numel() == 10000 for 100000 times, dtype=torch.float
0.21976138800164335
a.lt_(b), numel() == 100000 for 10000 times, dtype=torch.float
0.20275393200063263
a.lt_(b), numel() == 10000 for 100000 times, dtype=torch.double
0.3695997209997586
a.lt_(b), numel() == 100000 for 10000 times, dtype=torch.double
0.39441510399956314
a.lt_(b), numel() == 10000 for 100000 times, dtype=torch.int16
0.15657078300137073
a.lt_(b), numel() == 100000 for 10000 times, dtype=torch.int16
0.0992998069996247
a.lt_(b), numel() == 10000 for 100000 times, dtype=torch.int32
0.20425128799979575
a.lt_(b), numel() == 100000 for 10000 times, dtype=torch.int32
0.20352934599941364
a.lt_(b), numel() == 10000 for 100000 times, dtype=torch.int64
0.35883567900054913
a.lt_(b), numel() == 100000 for 10000 times, dtype=torch.int64
0.39059587599876977
a.ge_(b), numel() == 10000 for 100000 times, dtype=torch.float
0.21457727400047588
a.ge_(b), numel() == 100000 for 10000 times, dtype=torch.float
0.18836135499986995
a.ge_(b), numel() == 10000 for 100000 times, dtype=torch.double
0.35971907199927955
a.ge_(b), numel() == 100000 for 10000 times, dtype=torch.double
0.3688875009993353
a.ge_(b), numel() == 10000 for 100000 times, dtype=torch.int16
0.1576009280015569
a.ge_(b), numel() == 100000 for 10000 times, dtype=torch.int16
0.09524034199966991
a.ge_(b), numel() == 10000 for 100000 times, dtype=torch.int32
0.2064543649994448
a.ge_(b), numel() == 100000 for 10000 times, dtype=torch.int32
0.18726435600001423
a.ge_(b), numel() == 10000 for 100000 times, dtype=torch.int64
0.35351785300008487
a.ge_(b), numel() == 100000 for 10000 times, dtype=torch.int64
0.3680737989998306
a.le_(b), numel() == 10000 for 100000 times, dtype=torch.float
0.2132134399998904
a.le_(b), numel() == 100000 for 10000 times, dtype=torch.float
0.2140274829998816
a.le_(b), numel() == 10000 for 100000 times, dtype=torch.double
0.36539215199991304
a.le_(b), numel() == 100000 for 10000 times, dtype=torch.double
0.39128020300086064
a.le_(b), numel() == 10000 for 100000 times, dtype=torch.int16
0.15712150600120367
a.le_(b), numel() == 100000 for 10000 times, dtype=torch.int16
0.10149904400168452
a.le_(b), numel() == 10000 for 100000 times, dtype=torch.int32
0.2103407699996751
a.le_(b), numel() == 100000 for 10000 times, dtype=torch.int32
0.2134442910009966
a.le_(b), numel() == 10000 for 100000 times, dtype=torch.int64
0.35387034300038067
a.le_(b), numel() == 100000 for 10000 times, dtype=torch.int64
0.38917528399906587
a.eq_(b), numel() == 10000 for 100000 times, dtype=torch.float
0.2190484450002259
a.eq_(b), numel() == 100000 for 10000 times, dtype=torch.float
0.2030815980015177
a.eq_(b), numel() == 10000 for 100000 times, dtype=torch.double
0.3710030169986567
a.eq_(b), numel() == 100000 for 10000 times, dtype=torch.double
0.36419657899932645
a.eq_(b), numel() == 10000 for 100000 times, dtype=torch.int16
0.15986497499943653
a.eq_(b), numel() == 100000 for 10000 times, dtype=torch.int16
0.10145393699895067
a.eq_(b), numel() == 10000 for 100000 times, dtype=torch.int32
0.21011781599918322
a.eq_(b), numel() == 100000 for 10000 times, dtype=torch.int32
0.20121852699958254
a.eq_(b), numel() == 10000 for 100000 times, dtype=torch.int64
0.36681504499938455
a.eq_(b), numel() == 100000 for 10000 times, dtype=torch.int64
0.364472848999867
a.ne_(b), numel() == 10000 for 100000 times, dtype=torch.float
0.2290963309988001
a.ne_(b), numel() == 100000 for 10000 times, dtype=torch.float
0.21674784300012107
a.ne_(b), numel() == 10000 for 100000 times, dtype=torch.double
0.3829616689999966
a.ne_(b), numel() == 100000 for 10000 times, dtype=torch.double
0.39437660300063726
a.ne_(b), numel() == 10000 for 100000 times, dtype=torch.int16
0.1661020749997988
a.ne_(b), numel() == 100000 for 10000 times, dtype=torch.int16
0.10052955100036343
a.ne_(b), numel() == 10000 for 100000 times, dtype=torch.int32
0.21827425599985872
a.ne_(b), numel() == 100000 for 10000 times, dtype=torch.int32
0.21522501399886096
a.ne_(b), numel() == 10000 for 100000 times, dtype=torch.int64
0.37058242300008715
a.ne_(b), numel() == 100000 for 10000 times, dtype=torch.int64
0.39304063900090114
```
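
To put the two listings side by side, the snippet below computes the speedup for a few of the rows. The timing values are copied from the "Before" and "After" outputs; the choice of rows and the `timings` and `speedups` names are just for illustration.

```python
# (op, dtype, numel) -> (seconds before, seconds after), copied from the listings
timings = {
    ("gt_", "torch.float", 10_000): (0.778998922000028, 0.2444220920006046),
    ("gt_", "torch.int16", 100_000): (0.6345281440007966, 0.10095906300011848),
    ("eq_", "torch.double", 10_000): (1.1031508050000411, 0.3710030169986567),
    ("ne_", "torch.int64", 100_000): (0.6693385389989999, 0.39304063900090114),
}

# Speedup is the ratio of the old time to the new time for the same workload.
speedups = {key: before / after for key, (before, after) in timings.items()}

for (op, dtype, numel), ratio in speedups.items():
    print(f"a.{op}(b) dtype={dtype} numel={numel}: {ratio:.2f}x faster")
```

For these rows the narrower dtypes show the larger gains, with the 64-bit rows improving the least.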
Pull Request resolved: pytorch#33252

Differential Revision: D20254663

Pulled By: ezyang

fbshipit-source-id: 68b7109ec4359434afbeb96df372e29608f501bb
Summary: Pull Request resolved: pytorch#34506

Test Plan: Imported from OSS

Differential Revision: D20559619

Pulled By: anjali411

fbshipit-source-id: c63cb3c07f694c10328fc17f99d69d7134e5c67a
Summary: Pull Request resolved: pytorch#35056

Differential Revision: D20559396

Pulled By: anjali411

fbshipit-source-id: 64b911f893e9c54aef89e8c1e643998d8b70e613
Summary:
Pull Request resolved: pytorch#35104

I missed this in pytorch#34959
after a rebase, fixing.

Test Plan:
running benchmarks no longer crashes
CI

Imported from OSS

Differential Revision: D20560908

fbshipit-source-id: a5494e23953d3c9007e9874d673896291b5322e0
Summary: Pull Request resolved: pytorch#35064

Test Plan: Imported from OSS

Differential Revision: D20543695

Pulled By: ZolotukhinM

fbshipit-source-id: 1cf294ab19465cb93557c2b195252c739b40a0f7
Summary: Pull Request resolved: pytorch#32474

Test Plan: Imported from OSS

Differential Revision: D20559815

Pulled By: IvanKobzarev

fbshipit-source-id: 69a4fe951d331eb311bf821f94b372ccecdf1fd6
Test Plan: revert-hammer

Differential Revision:
D20254663

Original commit changeset: 68b7109ec435

fbshipit-source-id: 73474d88a7bb96448428ea5ff780e77163a00f88
…ad. (pytorch#34967)

Summary: Pull Request resolved: pytorch#34967

Differential Revision: D20547478

Pulled By: resistor

fbshipit-source-id: da7df159fd6098d0f1278b8088bbbe6717b79cfc
Summary: Pull Request resolved: pytorch#35085

Test Plan: Imported from OSS

Differential Revision: D20552334

Pulled By: ZolotukhinM

fbshipit-source-id: 628fcf4719a879f18978ff8a0a64afbb045df645
@csarofeen csarofeen closed this Apr 5, 2020
@csarofeen
Owner Author

We reset master to follow upstream, this is no longer needed.

@csarofeen csarofeen deleted the MinorRefactor branch May 17, 2020 14:10
shmsong pushed a commit to shmsong/pytorch that referenced this pull request Jun 9, 2021
Summary: added more statistic info for static runtime

Test Plan:
caffe2/benchmarks/static_runtime:static_runtime_cpptest

Expected output example:

Static runtime ms per iter: 0.939483. Iters per second: 1064.41
Node #0: 0.195671 ms/iter, %wide_offset.1 : Tensor = aten::add(%wide.1, %self._mu, %4)
Node #1: 0.169457 ms/iter, %wide_normalized.1 : Tensor = aten::mul(%wide_offset.1, %self._sigma)
Node #2: 0.118218 ms/iter, %wide_preproc.1 : Tensor = aten::clamp(%wide_normalized.1, %5, %6)
Node #3: 0.038814 ms/iter, %user_emb_t.1 : Tensor = aten::transpose(%user_emb.1, %4, %7)
Node #4: 0.0860747 ms/iter, %dp_unflatten.1 : Tensor = aten::bmm(%ad_emb_packed.1, %user_emb_t.1)
Node #5: 0.0102666 ms/iter, %31 : Tensor = static_runtime::flatten_copy(%dp_unflatten.1, %4, %8)
Node #6: 0.000476333 ms/iter, %19 : Tensor[] = prim::ListConstruct(%31, %wide_preproc.1)
Node #7: 0.0707332 ms/iter, %input.1 : Tensor = aten::cat(%19, %4)
Node #8: 0.123695 ms/iter, %fc1.1 : Tensor = aten::addmm(%self._fc_b, %input.1, %29, %4, %4)
Node #9: 0.0309244 ms/iter, %23 : Tensor = aten::sigmoid(%fc1.1)
Node #10: 0.0046297 ms/iter, %24 : (Tensor) = prim::TupleConstruct(%23)
Time per node type:
       0.195671 ms.    23.0483%. aten::add (1 nodes)
       0.169457 ms.    19.9605%. aten::mul (1 nodes, out variant)
       0.123695 ms.    14.5702%. aten::addmm (1 nodes, out variant)
       0.118218 ms.     13.925%. aten::clamp (1 nodes, out variant)
      0.0860747 ms.    10.1388%. aten::bmm (1 nodes, out variant)
      0.0707332 ms.    8.33175%. aten::cat (1 nodes, out variant)
       0.038814 ms.    4.57195%. aten::transpose (1 nodes)
      0.0309244 ms.    3.64263%. aten::sigmoid (1 nodes, out variant)
      0.0102666 ms.    1.20932%. static_runtime::flatten_copy (1 nodes, out variant)
      0.0046297 ms.   0.545338%. prim::TupleConstruct (1 nodes, out variant)
    0.000476333 ms.  0.0561079%. prim::ListConstruct (1 nodes, out variant)
       0.848959 ms. in Total
StaticRuntime setup time: 0.018925 ms
Memory allocation time: 0.019808 ms
Memory deallocation time: 0.0120445 ms
Outputs deallocation time: 0.0864947 ms
Total memory managed: 19328 bytes
Total number of reused tensors: 3
Total number of 'out' variant nodes/total number of nodes: 9/11 (81.8182%)

Reviewed By: hlu1

Differential Revision: D28553029

fbshipit-source-id: 55e7eab50b4b475ae219896100bdf4f6678875a4
shmsong pushed a commit to shmsong/pytorch that referenced this pull request Jul 6, 2021
Summary:
Pull Request resolved: pytorch#60987

We were seeing deadlocks as follows during shutdown:

```
Thread 1 (LWP 2432101):
#0  0x00007efca470190b in __pause_nocancel () from /lib64/libc.so.6
#1  0x00007efca49de485 in __pthread_mutex_lock_full () from /lib64/libpthread.so.0
#2  0x00007ef91d4c42c6 in __cuda_CallJitEntryPoint () from /lib64/libnvidia-ptxjitcompiler.so.1
#3  0x00007efc651ac8f1 in ?? () from /lib64/libcuda.so
#4  0x00007efc651aee03 in ?? () from /lib64/libcuda.so
#5  0x00007efc64f76b84 in ?? () from /lib64/libcuda.so
#6  0x00007efc64f77f5d in ?? () from /lib64/libcuda.so
#7  0x00007efc64eac858 in ?? () from /lib64/libcuda.so
#8  0x00007efc64eacfbc in ?? () from /lib64/libcuda.so
#9  0x00007efc7810a924 in ?? () from /usr/local/cuda/lib64/libcublas.so.11
#10 0x00007efc780fa2be in ?? () from /usr/local/cuda/lib64/libcublas.so.11
#11 0x00007efc78111044 in ?? () from /usr/local/cuda/lib64/libcublas.so.11
#12 0x00007efc7811580a in ?? () from /usr/local/cuda/lib64/libcublas.so.11
#13 0x00007efc78115aa4 in ?? () from /usr/local/cuda/lib64/libcublas.so.11
#14 0x00007efc781079ec in ?? () from /usr/local/cuda/lib64/libcublas.so.11
#15 0x00007efc780e6a7a in ?? () from /usr/local/cuda/lib64/libcublas.so.11
#16 0x00007efc7811cfa5 in ?? () from /usr/local/cuda/lib64/libcublas.so.11
#17 0x00007efc777ea98c in ?? () from /usr/local/cuda/lib64/libcublas.so.11
#18 0x00007efc777ebd80 in ?? () from /usr/local/cuda/lib64/libcublas.so.11
#19 0x00007efc777ea2c9 in ?? () from /usr/local/cuda/lib64/libcublas.so.11
#20 0x00007efc778c2e2d in cublasDestroy_v2 () from /usr/local/cuda/lib64/libcublas.so.11
#21 0x00007efc51a3fb56 in std::_Sp_counted_ptr_inplace<at::cuda::(anonymous namespace)::DeviceThreadHandlePool<cublasContext*, &at::cuda::(anonymous namespace)::createCublasHandle, &at::cuda::(anonymous namespace)::destroyCublasHandle>, std::allocator<at::cuda::(anonymous namespace)::DeviceThreadHandlePool<cublasContext*, &at::cuda::(anonymous namespace)::createCublasHandle, &at::cuda::(anonymous namespace)::destroyCublasHandle> >, (__gnu_cxx::_Lock_policy)2>::_M_dispose() () from /data/users/pritam/pytorch/torch/lib/libtorch_cuda.so
#22 0x00007efc51a3fc5f in std::shared_ptr<at::cuda::(anonymous namespace)::DeviceThreadHandlePool<cublasContext*, &at::cuda::(anonymous namespace)::createCublasHandle, &at::cuda::(anonymous namespace)::destroyCublasHandle> >::~shared_ptr() () from /data/users/pritam/pytorch/torch/lib/libtorch_cuda.so
#23 0x00007efca4648b0c in __run_exit_handlers () from /lib64/libc.so.6
#24 0x00007efca4648c40 in exit () from /lib64/libc.so.6
#25 0x0000558c8852e5f9 in Py_Exit (sts=0) at /tmp/build/80754af9/python_1614362349910/work/Python/pylifecycle.c:2292
#26 0x0000558c8852e6a7 in handle_system_exit () at /tmp/build/80754af9/python_1614362349910/work/Python/pythonrun.c:636
#27 0x0000558c8852e742 in PyErr_PrintEx (set_sys_last_vars=<optimized out>, set_sys_last_vars=<optimized out>) at /tmp/build/80754af9/python_1614362349910/work/Python/pythonrun.c:646
#28 0x0000558c88540dd6 in PyRun_SimpleStringFlags (command=0x7efca4dc9050 "from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=9, pipe_handle=13)\n", flags=0x7ffe3a986110) at /tmp/build/80754af9/python_1614362349910/work/Python/pythonrun.c:457
#29 0x0000558c88540ead in pymain_run_command (cf=0x7ffe3a986110, command=<optimized out>) at /tmp/build/80754af9/python_1614362349910/work/Modules/main.c:420
#30 pymain_run_python (pymain=0x7ffe3a986220) at /tmp/build/80754af9/python_1614362349910/work/Modules/main.c:2907
#31 pymain_main (pymain=0x7ffe3a986220) at /tmp/build/80754af9/python_1614362349910/work/Modules/main.c:3460
#32 0x0000558c8854122c in _Py_UnixMain (argc=<optimized out>, argv=<optimized out>) at /tmp/build/80754af9/python_1614362349910/work/Modules/main.c:3495
#33 0x00007efca4632493 in __libc_start_main () from /lib64/libc.so.6
#34 0x0000558c884e5e90 in _start () at ../sysdeps/x86_64/elf/start.S:103
```

This was likely caused by a static singleton that wasn't leaky. This change
follows the guidance in https://isocpp.org/wiki/faq/ctors#construct-on-first-use-v2
and uses a leaky singleton instead.
ghstack-source-id: 132847448
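
The backtrace above shows a `shared_ptr` destructor running from the exit handlers and calling into cuBLAS. A minimal sketch of the construct-on-first-use idiom with a deliberately leaked object follows; `HandlePool` and `leakyPool` are hypothetical names for illustration, not PyTorch's actual types, and the real change may differ.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a pool of library handles whose destructor
// would call into a library that may already be torn down at exit.
struct HandlePool {
  std::vector<int> handles;
  static int destroyed;  // counts destructor runs, for illustration only
  ~HandlePool() { ++destroyed; }
};
int HandlePool::destroyed = 0;

// Construct on first use. The pool is heap-allocated and never deleted,
// so its destructor is not run by the exit handlers during shutdown.
HandlePool& leakyPool() {
  static HandlePool* pool = new HandlePool();
  return *pool;
}
```

Every call to `leakyPool()` returns the same object. The trade-off is that the pool's memory and handles are left for the operating system to reclaim at process exit.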

Test Plan: Verified locally.

Reviewed By: malfet

Differential Revision: D29468866

fbshipit-source-id: 89250594c5cd2643417b1da584c658b742dc5a5c
jjsjann123 pushed a commit that referenced this pull request Jul 26, 2021
Summary:
Pull Request resolved: pytorch#61588

As part of debugging pytorch#60290,
we discovered the following deadlock:

```
Thread 79 (Thread 0x7f52ff7fe700 (LWP 205437)):
#0  pthread_cond_timedwait@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000564880199152 in PyCOND_TIMEDWAIT (cond=0x564880346080 <gil_cond>, mut=0x564880346100 <gil_mutex>, us=5000) at /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/condvar.h:103
#2  take_gil (tstate=0x7f5254005ef0) at /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/ceval_gil.h:224
#3  0x0000564880217b62 in PyEval_AcquireThread (tstate=0x7f5254005ef0) at /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/ceval.c:278
#4  0x00007f557d54aabd in pybind11::gil_scoped_acquire::gil_scoped_acquire() () from /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so
#5  0x00007f557da7792f in (anonymous namespace)::concrete_decref_fn(c10::impl::PyInterpreter const*, _object*) () from /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so
#6  0x00007f5560dadba6 in c10::TensorImpl::release_resources() () from /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so
#7  0x00007f5574c885bc in std::_Sp_counted_ptr_inplace<torch::distributed::autograd::DistAutogradContext, std::allocator<torch::distributed::autograd::DistAutogradContext>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() () from /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so
#8  0x00007f5574c815e9 in std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<long const, std::shared_ptr<torch::distributed::autograd::DistAutogradContext> >, false> > >::_M_deallocate_node(std::__detail::_Hash_node<std::pair<long const, std::shared_ptr<torch::distributed::autograd::DistAutogradContext> >, false>*) [clone .isra.325] () from /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so
#9  0x00007f5574c81bf1 in torch::distributed::autograd::DistAutogradContainer::eraseContextIdAndReset(torch::distributed::autograd::DistAutogradContainer::ContextsShard&, long) () from /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so
#10 0x00007f5574c86e83 in torch::distributed::autograd::DistAutogradContainer::releaseContextIfPresent(long) () from /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so
#11 0x00007f5574cc6395 in torch::distributed::rpc::RequestCallbackNoPython::processCleanupAutogradContextReq(torch::distributed::rpc::RpcCommandBase&) const () from /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so
#12 0x00007f5574cccf15 in torch::distributed::rpc::RequestCallbackNoPython::processRpc(torch::distributed::rpc::RpcCommandBase&, torch::distributed::rpc::MessageType const&, std::vector<c10::Stream, std::allocator<c10::Stream> >) const () from /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so

Thread 72 (Thread 0x7f53077fe700 (LWP 205412)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f55bc62adbd in __GI___pthread_mutex_lock (mutex=0x564884396440) at ../nptl/pthread_mutex_lock.c:80
#2  0x00007f5574c82a2f in torch::distributed::autograd::DistAutogradContainer::retrieveContext(long) () from /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so
#3  0x00007f557de9bb2f in pybind11::cpp_function::initialize<torch::distributed::autograd::(anonymous namespace)::dist_autograd_init(_object*, _object*)::{lambda(long)#11}, pybind11::dict, long, pybind11::name, pybind11::scope, pybind11::sibling, char [931], pybind11::arg>(torch::distributed::autograd::(anonymous namespace)::dist_autograd_init(_object*, _object*)::{lambda(long)#11}&&, pybind11::dict (*)(long), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [931], pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call) () from /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so

```

In short, Thread 72 holds the GIL and tries to acquire the lock for
DistAutogradContainer to perform a lookup on a map. Meanwhile,
Thread 79 holds the lock on DistAutogradContainer to remove a Tensor, and as
part of the TensorImpl destructor, concrete_decref_fn is called, which waits for
the GIL. As a result, we have a deadlock.

To fix this issue, I've ensured we release the GIL when we call `retrieveContext`
and re-acquire it later when needed.
ghstack-source-id: 133493659
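
The lock-ordering idea can be sketched with two plain mutexes. This is only an illustration under stated assumptions: `gil` and `container_lock` are stand-ins, and `ScopedRelease` and `retrieveWithoutGil` are hypothetical names. In pybind11 code the analogous tool is `pybind11::gil_scoped_release`, which may or may not be how the actual fix is written.

```cpp
#include <cassert>
#include <map>
#include <mutex>

std::mutex gil;             // stand-in for the Python GIL
std::mutex container_lock;  // stand-in for the container's internal lock
std::map<long, int> contexts;

// Unlocks a mutex for the lifetime of the object and re-locks it on exit.
class ScopedRelease {
 public:
  explicit ScopedRelease(std::mutex& m) : m_(m) { m_.unlock(); }
  ~ScopedRelease() { m_.lock(); }
  ScopedRelease(const ScopedRelease&) = delete;
  ScopedRelease& operator=(const ScopedRelease&) = delete;

 private:
  std::mutex& m_;
};

// Precondition: the caller holds `gil`. The function drops it before
// taking `container_lock`, so no thread waits on the container lock
// while still holding the GIL stand-in. Returns -1 if the id is absent.
int retrieveWithoutGil(long id) {
  ScopedRelease no_gil(gil);
  std::lock_guard<std::mutex> guard(container_lock);
  auto it = contexts.find(id);
  return it == contexts.end() ? -1 : it->second;
}
```

Locals are destroyed in reverse order, so `container_lock` is released before `gil` is re-acquired. That ordering is what breaks the cycle described above.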

Test Plan: waitforbuildbot

Reviewed By: mrshenli

Differential Revision: D29682624

fbshipit-source-id: f68a1fb39040ca0447a26e456a97bce64af6b79c
jjsjann123 pushed a commit that referenced this pull request Aug 23, 2021
…ytorch#63339)

Summary:
Pull Request resolved: pytorch#63339

# Context
https://fb.workplace.com/groups/pytorch.dev/permalink/900474523864362/?comment_id=901125403799274&reply_comment_id=905023386742809

##### WHAT IS A STACK TRACE?
A stack trace (also called stack backtrace or stack traceback) is a report of the active stack frames at a certain point in time during the execution of a program.

Typically when an exception is thrown, one would expect to see the code (file:line) that threw the exception, and every intermediate frame up to and including the main function.

We are enabling Android stack traces to help with debugging on Android devices.

Test Plan:
## Steps to test
```
buck build fbsource//xplat/caffe2/mode/aibench_pytorch_android -c pt.enable_qpl=0 -c pt.has_backtraces=1 fbsource//xplat/caffe2/fb/lite_predictor:lite_predictorAndroid#android-x86_64

one_world android emulator android-28

adb push ~/fbsource/buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictorAndroid#android-x86_64 /data/local/tmp

cd /data/local/tmp
./lite_predictorAndroid#android-x86_64

./lite_predictorAndroid#android-x86_64 --model ./detect.bc --input_dims "1,3,192,192" --input_type float --warmup 20 --iter 5 --report_pep true
```

## Stack traces when the model file is not found

### before
```
./lite_predictorAndroid#android-x86_64 --model ./detect.bc --input_dims "1,3,192,192" --input_type float --warmup 20 --iter 5 --report_pep true

Run with 2 threads
Run with 2 threads
Loading model...
terminating with uncaught exception of type c10::Error: open file failed, file path: ./detect.bc
Exception raised from RAIIFile at xplat/caffe2/caffe2/serialize/file_adapter.cc:13 (most recent call first):
(no backtrace available)
Aborted
```

### after
```
134|generic_x86_64:/data/local/tmp $ ./lite_predictorAndroid#android-x86_64 --model ./detect.bc --input_dims "1,3,192,192" --input_type float --warmup 20 --iter 5 --report_pep true
Run with 2 threads
Run with 2 threads
Loading model...
terminating with uncaught exception of type c10::Error: open file failed, file path: ./detect.bc
Exception raised from RAIIFile at xplat/caffe2/caffe2/serialize/file_adapter.cc:13 (most recent call first):
 frame #0       c10::get_backtrace(unsigned long, unsigned long, bool)[0x59494274f10e]
 frame #1       [0x5949427b1eee]
 frame #2       [0x5949427b1eb2]
 frame #3       [0x5949427b1cdc]
 frame #4       std::__ndk1::function<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > ()>::operator()() const[0x5949427afc34]
 frame #5       c10::Error::Error(c10::SourceLocation, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >)[0x5949427b05b1]
 frame #6       c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)[0x5949427aca5f]
 frame #7       caffe2::serialize::FileAdapter::RAIIFile::RAIIFile(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)[0x5949426b37b2]
 frame #8       caffe2::serialize::FileAdapter::FileAdapter(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)[0x5949426b3903]
 frame #9       torch::jit::_load_for_mobile(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, c10::optional<c10::Device>, std::__ndk1::unordered_map<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::hash<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > >, std::__ndk1::equal_to<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > >, std::__ndk1::allocator<std::__ndk1::pair<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > > > >&)[0x5949422737bd]
 frame #10      torch::jit::_load_for_mobile(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, c10::optional<c10::Device>)[0x594942273769]
 frame #11      benchmark(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, bool, int, int, int, bool, int, bool, int, double, bool, bool, bool, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)[0x59494189b21d]
 frame #12      main[0x594941882aff]
 frame #13      __libc_init[0x7b699d08578d]
```

### what we get for os:linux
```
(base) [pavithran@devvm1803.vll0 /data/users/pavithran/fbsource] ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor --model ./detect.bc --input_dims "1,3,192,192" --input_type float --warmup 20 --iter 5 --report_pep true
Run with 24 threads
Run with 24 threads
Loading model...
terminate called after throwing an instance of 'c10::Error'
  what():  open file failed, file path: ./detect.bc
Exception raised from RAIIFile at xplat/caffe2/caffe2/serialize/file_adapter.cc:13 (most recent call first):
frame #0: ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor() [0x20cb7fe]
frame #1: ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor() [0x20cb6c6]
frame #2: std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const + 0x54 (0x20ca4e4 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #3: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x57 (0x20ca9a7 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #4: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x7a (0x20c823a in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #5: caffe2::serialize::FileAdapter::RAIIFile::RAIIFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x96 (0x206f3d6 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #6: caffe2::serialize::FileAdapter::FileAdapter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x42 (0x206f502 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #7: torch::jit::_load_for_mobile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) + 0x30 (0x1be826c in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #8: torch::jit::_load_for_mobile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>) + 0x35 (0x1be8214 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #9: benchmark(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, int, int, int, bool, int, bool, int, double, bool, bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x16d (0x12093ad in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #10: main + 0x25c (0x11f933c in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #11: __libc_start_main + 0x105 (0x7fc7b9f2ed95 in /usr/local/fbcode/platform009/lib/libc.so.6)
frame #12: _start + 0x2a (0x11f902a in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)

Aborted (core dumped)
```

Reviewed By: dhruvbird

Differential Revision: D30135947

fbshipit-source-id: f50c634ef4545843305cad4b4a14a8776b1aec76
jjsjann123 pushed a commit that referenced this pull request Jun 8, 2022
… of libtorch_python (pytorch#78028)

Summary:
This moves torch::class_<WorkerInfo> into `rpc_agent.cpp` so it gets registered in libtorch instead of libtorch_python. This is intermediate work toward getting torch::deploy to load an unmodified copy of libtorch. RPC is currently incompatible because of duplicate registrations:

```
unknown file: Failure
C++ exception with description "Exception Caught inside torch::deploy embedded library:
Custom class with name __torch__.torch.classes.dist_rpc.WorkerInfo is already registered. Ensure that registration with torch::class_ is only called once.
Exception raised from registerCustomClass at ../aten/src/ATen/core/custom_class.cpp:61 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7f3bd9adb92e in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5c (0x7f3bd9ab7068 in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #2: torch::registerCustomClass(std::shared_ptr<c10::ClassType>) + 0x110 (0x7f3bc2258980 in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #3: torch::detail::class_base::class_base(std::string const&, std::string const&, std::string, std::type_info const&, std::type_info const&) + 0x3b9 (0x7f3bc225a419 in /home/tristanr/venvs/multipy/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #4: [0x7f3ba45cfea1]
frame #5: <unknown function> + 0x1b5334 (0x5652bdab9334 in ./test_deploy)
frame #6: <unknown function> + 0x1b4f3e (0x5652bdab8f3e in ./test_deploy)
frame #7: <unknown function> + 0x1b519b (0x5652bdab919b in ./test_deploy)
frame #8: loadSearchFile(char const*) + 0x23e (0x7f3ba62f37f8 in /tmp/torch_deploy9ATEFg)
frame #9: deploy_set_self + 0x51 (0x7f3ba62f38f9 in /tmp/torch_deploy9ATEFg)
frame #10: torch::deploy::Interpreter::Interpreter(torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>) + 0x274 (0x5652bdaaa790 in ./test_deploy)
frame #11: void __gnu_cxx::new_allocator<torch::deploy::Interpreter>::construct<torch::deploy::Interpreter, torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(torch::deploy::Interpreter*, torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0x81 (0x5652bdaaf58b in ./test_deploy)
frame #12: void std::allocator_traits<std::allocator<torch::deploy::Interpreter> >::construct<torch::deploy::Interpreter, torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(std::allocator<torch::deploy::Interpreter>&, torch::deploy::Interpreter*, torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0x4a (0x5652bdaae320 in ./test_deploy)
frame #13: void std::vector<torch::deploy::Interpreter, std::allocator<torch::deploy::Interpreter> >::_M_realloc_insert<torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(__gnu_cxx::__normal_iterator<torch::deploy::Interpreter*, std::vector<torch::deploy::Interpreter, std::allocator<torch::deploy::Interpreter> > >, torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0xee (0x5652bdaae4a0 in ./test_deploy)
frame #14: void std::vector<torch::deploy::Interpreter, std::allocator<torch::deploy::Interpreter> >::emplace_back<torch::deploy::InterpreterManager*, std::shared_ptr<torch::deploy::Environment>&>(torch::deploy::InterpreterManager*&&, std::shared_ptr<torch::deploy::Environment>&) + 0xb6 (0x5652bdaad258 in ./test_deploy)
frame #15: torch::deploy::InterpreterManager::InterpreterManager(unsigned long, std::shared_ptr<torch::deploy::Environment>) + 0x123 (0x5652bdaa83b1 in ./test_deploy)
frame #16: TorchpyTest_InitTwice_Test::TestBody() + 0x65 (0x5652bda075a9 in ./test_deploy)
frame #17: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 0x65 (0x5652bda944b7 in ./test_deploy)
frame #18: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 0x5a (0x5652bda8cfe7 in ./test_deploy)
frame #19: testing::Test::Run() + 0x100 (0x5652bda68622 in ./test_deploy)
frame #20: testing::TestInfo::Run() + 0x10f (0x5652bda68fb3 in ./test_deploy)
frame #21: testing::TestSuite::Run() + 0x121 (0x5652bda6980d in ./test_deploy)
frame #22: testing::internal::UnitTestImpl::RunAllTests() + 0x38e (0x5652bda756e6 in ./test_deploy)
frame #23: bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 0x65 (0x5652bda9586b in ./test_deploy)
frame #24: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 0x5a (0x5652bda8e0f7 in ./test_deploy)
frame #25: testing::UnitTest::Run() + 0xc9 (0x5652bda73fd1 in ./test_deploy)
frame #26: RUN_ALL_TESTS() + 0x11 (0x5652bda169fa in ./test_deploy)
frame #27: main + 0x27 (0x5652bda10ce2 in ./test_deploy)
frame #28: <unknown function> + 0x2d310 (0x7f3bc0431310 in /usr/lib/libc.so.6)
frame #29: __libc_start_main + 0x81 (0x7f3bc04313c1 in /usr/lib/libc.so.6)
frame #30: _start + 0x25 (0x5652bda063b5 in ./test_deploy)
```
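
To illustrate why registering the same class from two places fails, here is a minimal sketch of a name registry that rejects duplicates. `registry` and `registerClass` are hypothetical names, and this is not the real `torch::class_` machinery.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Process-wide set of registered class names (hypothetical registry).
std::set<std::string>& registry() {
  static std::set<std::string> names;
  return names;
}

// Registers a name once. A second registration of the same name throws,
// which mirrors the "already registered" error in the log above.
void registerClass(const std::string& name) {
  if (!registry().insert(name).second) {
    throw std::runtime_error(
        "Custom class with name " + name + " is already registered");
  }
}
```

If two libraries each run a registration for the same name in one process, the second one hits the error path.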

Test Plan: CI

Differential Revision: D36564258

Pull Request resolved: pytorch#78028
Approved by: https://github.com/rohan-varma
jjsjann123 pushed a commit that referenced this pull request Aug 29, 2022
Hi!

I was playing with libfuzzer and found a bug when loading a model from a file via the `torch::jit::load` function.
There is an unhandled exception in caffe2/serialize when calling the `stoull` function on an unsanitized version string.

The bug can be reproduced with `aot_model_compiler` binary:
```
aot_model_compiler --model=crash-stoull --model_name=name --model_version=1 --input_dims='1,3,224,224;2,2' --input_types='float;float'
```

Crash file is provided in [crash.zip](https://github.com/pytorch/pytorch/files/8701504/crash.zip).

gdb output:
```
Temporary breakpoint 1, main (argc=6, argv=0x7ffcd160f9f8) at /pytorch_master/binaries/aot_model_compiler.cc:87
87	      "Run NNC AOT compiler for pytorch model. Example usage:\n"
(gdb) c
Continuing.
terminate called after throwing an instance of 'std::invalid_argument'
  what():  stoull

Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fa637f16859 in __GI_abort () at abort.c:79
#2  0x00007fa6381c1911 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007fa6381cd38c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007fa6381cd3f7 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007fa6381cd6a9 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007fa6381c42ce in std::__throw_invalid_argument(char const*) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x000000000247d567 in __gnu_cxx::__stoa<unsigned long long, unsigned long long, char, int> (__str=0x7ffcd160f228 "ZZ", __idx=0x0, __base=10, __convf=<optimized out>, __name=<optimized out>)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/ext/string_conversions.h:83
#8  std::__cxx11::stoull (__str="ZZ", __idx=0x0, __base=10) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/basic_string.h:6577
#9  caffe2::serialize::PyTorchStreamReader::init (this=this@entry=0x8c11ce0) at /pytorch_master/caffe2/serialize/inline_container.cc:145
#10 0x000000000247d9c7 in caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader (this=0x8c11ce0, in=std::shared_ptr<class caffe2::serialize::ReadAdapterInterface> (empty) = {...})
    at /pytorch_master/caffe2/serialize/inline_container.cc:88
#11 0x00000000035b7ba4 in __gnu_cxx::new_allocator<caffe2::serialize::PyTorchStreamReader>::construct<caffe2::serialize::PyTorchStreamReader, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (
    __p=0x2, __args=..., this=<optimized out>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/ext/new_allocator.h:150
#12 std::allocator_traits<std::allocator<caffe2::serialize::PyTorchStreamReader> >::construct<caffe2::serialize::PyTorchStreamReader, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (__a=...,
    __p=0x2, __p@entry=0x8c11ce0, __args=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/alloc_traits.h:512
#13 0x00000000035b1988 in std::_Sp_counted_ptr_inplace<caffe2::serialize::PyTorchStreamReader, std::allocator<caffe2::serialize::PyTorchStreamReader>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (this=0x8c11cd0, __a=..., __args=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/shared_ptr_base.h:551
#14 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<caffe2::serialize::PyTorchStreamReader, std::allocator<caffe2::serialize::PyTorchStreamReader>, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (this=0x7ffcd160f3a8, __p=@0x7ffcd160f3a0: 0x10, __args=..., __a=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/shared_ptr_base.h:683
#15 std::__shared_ptr<caffe2::serialize::PyTorchStreamReader, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<caffe2::serialize::PyTorchStreamReader>, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (this=0x7ffcd160f3a0, __args=..., __tag=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/shared_ptr_base.h:1371
#16 std::shared_ptr<caffe2::serialize::PyTorchStreamReader>::shared_ptr<std::allocator<caffe2::serialize::PyTorchStreamReader>, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (this=0x7ffcd160f3a0,
    __args=..., __tag=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/shared_ptr.h:408
#17 std::allocate_shared<caffe2::serialize::PyTorchStreamReader, std::allocator<caffe2::serialize::PyTorchStreamReader>, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (__args=..., __a=...)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/shared_ptr.h:859
#18 std::make_shared<caffe2::serialize::PyTorchStreamReader, std::shared_ptr<caffe2::serialize::ReadAdapterInterface> > (__args=...)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/shared_ptr.h:875
#19 torch::jit::load (rai=std::shared_ptr<class caffe2::serialize::ReadAdapterInterface> (empty) = {...}, device=device@entry=..., Python Exception <class 'gdb.error'> No type named std::__detail::_Hash_node<struct std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, true>.:
extra_files=std::unordered_map with 0 elements)
    at /pytorch_master/torch/csrc/jit/serialization/import.cpp:474
#20 0x00000000035b1ef6 in torch::jit::load (filename="crash-stoull", device=device@entry=..., Python Exception <class 'gdb.error'> No type named std::__detail::_Hash_node<struct std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, true>.:
extra_files=std::unordered_map with 0 elements) at /pytorch_master/torch/csrc/jit/serialization/import.cpp:444
#21 0x00000000035b1d22 in torch::jit::load (filename="", device=device@entry=...) at /pytorch_master/torch/csrc/jit/serialization/import.cpp:424
#22 0x00000000008f9be3 in main (argc=1, argv=0x7ffcd160f9f8) at /pytorch_master/binaries/aot_model_compiler.cc:128
```

Pull Request resolved: pytorch#77557
Approved by: https://github.com/Gamrix
ftxj pushed a commit to ftxj/pytorch that referenced this pull request May 12, 2023
…#94297)

Hi!

I've been fuzzing different PyTorch modules and found a crash inside one of them.

Specifically, the crash is in the module that processes `script_call` RPC requests, in the function `ScriptCall::fromIValues(std::vector<at::IValue>& ivalues)`.

The test case below crashes when `ivalues.back()` is called at [script_call.cpp:90](https://github.com/pytorch/pytorch/blob/abc54f93145830b502400faa92bec86e05422fbd/torch/csrc/distributed/rpc/script_call.cpp#L90), because the vector `ivalues` is empty.

All tests were performed on this PyTorch version: [abc54f9](https://github.com/pytorch/pytorch/tree/abc54f93145830b502400faa92bec86e05422fbd)

The provided patch checks that the `ivalues` vector contains enough elements.
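The kind of guard described above can be sketched as follows. This is a minimal illustration, not the actual patch: `takeQualifiedName`, `popBack`, and the use of `std::string` in place of `at::IValue` are hypothetical stand-ins, and the real fix may use a PyTorch assertion macro instead of a standard exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper. Precondition: `values` is non-empty.
// Calling back() on an empty vector is undefined behavior.
std::string popBack(std::vector<std::string>& values) {
  assert(!values.empty());
  std::string last = values.back();
  values.pop_back();
  return last;
}

// Hypothetical stand-in for a deserializer that expects the last element
// to carry a name. Untrusted input is validated before back() is touched.
std::string takeQualifiedName(std::vector<std::string>& values) {
  if (values.empty()) {
    throw std::invalid_argument("expected at least one value, got none");
  }
  return popBack(values);
}
```

With this shape, an empty input produces a catchable error rather than a segfault inside the accessor.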

### How to reproduce

1. To reproduce the crash, use the provided Docker setup: [Dockerfile](https://github.com/ispras/oss-sydr-fuzz/tree/master/projects/pytorch)

2. Build the container: `docker build -t oss-sydr-fuzz-pytorch-reproduce .`

3. Copy the crash file to the current directory:

    - [crash-9f76d4e37a2391136a4ce07d47269db1e063e4b4.zip](https://github.com/pytorch/pytorch/files/10674059/crash-9f76d4e37a2391136a4ce07d47269db1e063e4b4.zip)

4. Run the container: ``docker run --privileged --network host -v `pwd`:/homedir --rm -it oss-sydr-fuzz-pytorch-reproduce /bin/bash``

5. Execute the binary: `/message_deserialize_fuzz /homedir/crash-9f76d4e37a2391136a4ce07d47269db1e063e4b4`

After execution completes, you will see this stack trace:

```asan
AddressSanitizer:DEADLYSIGNAL
=================================================================
==57==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x0000008e7b19 bp 0x7ffd2fdded70 sp 0x7ffd2fddec40 T0)
==57==The signal is caused by a READ memory access.
==57==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
    #0 0x8e7b19 in c10::IValue::isString() const /pytorch_fuzz/aten/src/ATen/core/ivalue.h:639:27
    #1 0x8e7b19 in c10::IValue::toStringRef[abi:cxx11]() const /pytorch_fuzz/aten/src/ATen/core/ivalue_inl.h:2179:3
    #2 0xe04fb58 in torch::distributed::rpc::ScriptCall::fromIValues(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /pytorch_fuzz/torch/csrc/distributed/rpc/script_call.cpp:90:53
    #3 0xe0511f0 in torch::distributed::rpc::ScriptCall::fromMessage(torch::distributed::rpc::Message const&) /pytorch_fuzz/torch/csrc/distributed/rpc/script_call.cpp:133:10
    #4 0xe0ff71e in torch::distributed::rpc::deserializeRequest(torch::distributed::rpc::Message const&) /pytorch_fuzz/torch/csrc/distributed/rpc/utils.cpp:102:14
    #5 0x602a41 in LLVMFuzzerTestOneInput /message_deserialize_fuzz.cc:192:27
    #6 0x52ce61 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15
    #7 0x516d7c in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6
    #8 0x51cacb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9
    #9 0x546062 in main /llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #10 0x7f41e42a8082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
    #11 0x51169d in _start (/message_deserialize_fuzz+0x51169d)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /pytorch_fuzz/aten/src/ATen/core/ivalue.h:639:27 in c10::IValue::isString() const
==57==ABORTING
```
Pull Request resolved: pytorch#94297
Approved by: https://github.com/ezyang
ftxj pushed a commit to ftxj/pytorch that referenced this pull request May 12, 2023
…ytorch#94300)

Hi!

I've been fuzzing different PyTorch modules and found a crash inside one of them.

Specifically, the crash is in the unpickling module, in the function `Unpickler::readInstruction()`. Running this function on the provided crash file crashes at the call `auto dict = stack_.at(dict_pos).toGenericDict();` [unpickler.cpp:561](https://github.com/pytorch/pytorch/blob/0e94fbc0c8ab1572c88159c1a4c397b6eb824c01/torch/csrc/jit/serialization/unpickler.cpp#L561). The crash occurs because the index `dict_pos` is out of bounds, which in turn happens because the stack size is 0.

Besides this pull request, there is another one related to unpickler hardening: pytorch#84343

All tests were performed on this PyTorch version: [abc54f9](https://github.com/pytorch/pytorch/tree/abc54f93145830b502400faa92bec86e05422fbd)

### How to reproduce

1. To reproduce the crash, use the provided Docker setup: [Dockerfile](https://github.com/ispras/oss-sydr-fuzz/tree/master/projects/pytorch)

2. Build the container: `docker build -t oss-sydr-fuzz-pytorch-reproduce .`

3. Copy the crash file to the current directory:

    - [crash-042dff5e121580425d9d34d0f293918f3c9fbf1e.zip](https://github.com/pytorch/pytorch/files/10674361/crash-042dff5e121580425d9d34d0f293918f3c9fbf1e.zip)

4. Run the container: ``docker run --privileged --network host -v `pwd`:/homedir --rm -it oss-sydr-fuzz-pytorch-reproduce /bin/bash``

5. Execute the binary: `/message_deserialize_sydr /homedir/crash-042dff5e121580425d9d34d0f293918f3c9fbf1e`

After execution completes, you will see this error message:

```txt
terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 18446744073709551613) >= this->size() (which is 0)
```
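The index in the message above, 18446744073709551613, equals 2^64 - 3. That is the value an unsigned 64-bit subtraction produces when 3 is subtracted from a size of 0. The sketch below reproduces the wraparound and shows one way to guard against it. It assumes the index is computed as the stack size minus a small constant; `wrappedOffset` and `indexFromEnd` are hypothetical names, not functions from the unpickler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Unsigned subtraction wraps modulo 2^64 instead of going negative.
std::uint64_t wrappedOffset(std::uint64_t size, std::uint64_t back) {
  return size - back;
}

// Returns the index `back` slots from the end of `stack`, or std::nullopt
// when the stack holds fewer than `back` elements (or `back` is zero).
std::optional<std::size_t> indexFromEnd(const std::vector<int>& stack,
                                        std::size_t back) {
  if (back == 0 || stack.size() < back) {
    return std::nullopt;
  }
  const std::size_t idx = stack.size() - back;
  assert(idx < stack.size());  // holds because 0 < back <= size
  return idx;
}
```

Checking the size before subtracting turns the malformed input into an explicit error path instead of an out-of-range index.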

And this stack trace:

```asan
terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 18446744073709551613) >= this->size() (which is 0)
==39== ERROR: libFuzzer: deadly signal
    #0 0x5d0df1 in __sanitizer_print_stack_trace /llvm-project/compiler-rt/lib/asan/asan_stack.cpp:87:3
    #1 0x545727 in fuzzer::PrintStackTrace() /llvm-project/compiler-rt/lib/fuzzer/FuzzerUtil.cpp:210:5
    #2 0x52b933 in fuzzer::Fuzzer::CrashCallback() /llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:233:3
    #3 0x7f9118e0341f  (/lib/x86_64-linux-gnu/libpthread.so.0+0x1441f)
    #4 0x7f9118c2300a in raise (/lib/x86_64-linux-gnu/libc.so.6+0x4300a)
    #5 0x7f9118c02858 in abort (/lib/x86_64-linux-gnu/libc.so.6+0x22858)
    #6 0x7f9119040910  (/lib/x86_64-linux-gnu/libstdc++.so.6+0x9e910)
    #7 0x7f911904c38b  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xaa38b)
    #8 0x7f911904c3f6 in std::terminate() (/lib/x86_64-linux-gnu/libstdc++.so.6+0xaa3f6)
    #9 0x7f911904c6a8 in __cxa_throw (/lib/x86_64-linux-gnu/libstdc++.so.6+0xaa6a8)
    #10 0x7f91190433aa  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xa13aa)
    #11 0x63acdf in std::vector<c10::IValue, std::allocator<c10::IValue> >::_M_range_check(unsigned long) const /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_vector.h:1073:4
    #12 0xce8f93e in std::vector<c10::IValue, std::allocator<c10::IValue> >::at(unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/10/../../../../include/c++/10/bits/stl_vector.h:1094:2
    #13 0xce8f93e in torch::jit::Unpickler::readInstruction() /pytorch_fuzz/torch/csrc/jit/serialization/unpickler.cpp:546:26
    #14 0xce8d527 in torch::jit::Unpickler::run() /pytorch_fuzz/torch/csrc/jit/serialization/unpickler.cpp:235:27
    #15 0xce8d1c2 in torch::jit::Unpickler::parse_ivalue() /pytorch_fuzz/torch/csrc/jit/serialization/unpickler.cpp:192:3
    #16 0xcdf0792 in torch::jit::unpickle(std::function<unsigned long (char*, unsigned long)>, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>, c10::Type::SingletonOrSharedTypePtr<c10::Type> (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)) /pytorch_fuzz/torch/csrc/jit/serialization/pickle.cpp:127:20
    #17 0xcdf104d in torch::jit::unpickle(char const*, unsigned long, std::function<c10::StrongTypePtr (c10::QualifiedName const&)>, c10::ArrayRef<at::Tensor>, c10::Type::SingletonOrSharedTypePtr<c10::Type> (*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)) /pytorch_fuzz/torch/csrc/jit/serialization/pickle.cpp:137:10
    #18 0xe0532db in torch::distributed::rpc::ScriptRemoteCall::fromMessage(torch::distributed::rpc::Message const&) /pytorch_fuzz/torch/csrc/distributed/rpc/script_remote_call.cpp:74:16
    #19 0xe0ffa10 in torch::distributed::rpc::deserializeRequest(torch::distributed::rpc::Message const&) /pytorch_fuzz/torch/csrc/distributed/rpc/utils.cpp:108:14
    #20 0x602a41 in LLVMFuzzerTestOneInput /message_deserialize_fuzz.cc:192:27
    #21 0x52ce61 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:611:15
    #22 0x516d7c in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6
    #23 0x51cacb in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:860:9
    #24 0x546062 in main /llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #25 0x7f9118c04082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
    #26 0x51169d in _start (/message_deserialize_fuzz+0x51169d)

NOTE: libFuzzer has rudimentary signal handlers.
      Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal
```
Pull Request resolved: pytorch#94300
Approved by: https://github.com/malfet, https://github.com/apach301
ftxj pushed a commit to ftxj/pytorch that referenced this pull request May 25, 2023
When a tensor is resized, the reference array to its sizes may become invalid. Make a copy in advance.
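The copy-in-advance pattern can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual change: `FakeTensor` and `resizeAndReportChange` are hypothetical, and `std::vector` stands in for the tensor's size storage and for the non-owning `c10::ArrayRef` view.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tensor that owns its size array.
struct FakeTensor {
  std::vector<std::int64_t> sizes;

  void resize(const std::vector<std::int64_t>& new_sizes) {
    // Assignment may reallocate, which invalidates any pointer or
    // reference into the previous storage of `sizes`.
    sizes = new_sizes;
    assert(sizes == new_sizes);
  }
};

// Resizes `t` and reports whether its shape changed.
bool resizeAndReportChange(FakeTensor& t,
                           const std::vector<std::int64_t>& new_sizes) {
  // Take an owning copy of the old sizes before resizing. A non-owning
  // view into t.sizes could dangle once resize() has run.
  const std::vector<std::int64_t> old_sizes = t.sizes;
  t.resize(new_sizes);
  return old_sizes != t.sizes;
}
```

The copy costs one small allocation per call, which is the price of not reading freed memory in the comparison after the resize.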

<details>
<summary>ASAN report</summary>

```
=================================================================
==1115867==ERROR: AddressSanitizer: heap-use-after-free on address 0x61000013d790 at pc 0x03ff8e7da360 bp 0x03fff53c83a0 sp 0x03fff53c8390
READ of size 8 at 0x61000013d790 thread T0
    #0 0x3ff8e7da35f in c10::SymInt::is_heap_allocated() const /home/user/pytorch/c10/core/SymInt.h:154
    #1 0x3ff8e7da35f in c10::SymInt::maybe_as_int() const /home/user/pytorch/c10/core/SymInt.h:215
    #2 0x3ff8e7d0a6d in c10::SymInt::sym_eq(c10::SymInt const&) const /home/user/pytorch/c10/core/SymInt.cpp:69
    #3 0x3ff7a9ab0bd in c10::SymInt::operator==(c10::SymInt const&) const /home/user/pytorch/c10/core/SymInt.h:177
    #4 0x3ff7a9aaedd in bool std::__equal<false>::equal<c10::SymInt const*, c10::SymInt const*>(c10::SymInt const*, c10::SymInt const*, c10::SymInt const*) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-
v11/bits/stl_algobase.h:1162
    #5 0x3ff7a9aae4b in bool std::__equal_aux1<c10::SymInt const*, c10::SymInt const*>(c10::SymInt const*, c10::SymInt const*, c10::SymInt const*) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/
stl_algobase.h:1211
    #6 0x3ff7a9aae05 in bool std::__equal_aux<c10::SymInt const*, c10::SymInt const*>(c10::SymInt const*, c10::SymInt const*, c10::SymInt const*) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/s
tl_algobase.h:1219
    #7 0x3ff7a9aad97 in bool std::equal<c10::SymInt const*, c10::SymInt const*>(c10::SymInt const*, c10::SymInt const*, c10::SymInt const*) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_alg
obase.h:1556
    #8 0x3ff4b23c771 in c10::ArrayRef<c10::SymInt>::equals(c10::ArrayRef<c10::SymInt>) const /home/user/pytorch/c10/util/ArrayRef.h:188
    #9 0x3ff4cb91bc1 in bool c10::operator!=<c10::SymInt>(c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>) /home/user/pytorch/c10/util/ArrayRef.h:341
    #10 0x3ff6d1b57ff in torch::ADInplaceOrView::resize_(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) /home/user/pytorch/torch/csrc/autograd/Variab
leTypeManual.cpp:408
    #11 0x3ff6d1e59c7 in c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c1
0::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>
> >::operator()(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) /home/user/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13
    #12 0x3ff6d1e59c7 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10:
:ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::Sy
mInt>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::Disp
atchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) /home/user/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:480
    #13 0x3ff51ca5129 in at::Tensor const& c10::callUnboxedKernelFunction<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(void*, c10::OperatorKernel*,
c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>&&, c10::optional<c10::MemoryFormat>&&) /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
    #14 0x3ff51ca6e8f in at::Tensor const& c10::KernelFunction::call<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::OperatorHandle const&, c10::D
ispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:90
    #15 0x3ff51ca6e8f in at::Tensor const& c10::Dispatcher::redispatch<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::TypedOperatorHandle<at::Ten
sor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)> const&, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)
const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:656
    #16 0x3ff5182006b in c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::redispatch(c10::DispatchKeySet, at::Tensor const&, c
10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:492
    #17 0x3ff5182006b in at::_ops::resize_::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) aten/src/ATen/Operators_4.cpp:2144
    #18 0x3ff6d1d5e07 in at::redispatch::resize__symint(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) aten/src/ATen/RedispatchFunctions.h:2847
    #19 0x3ff6d1bbb67 in torch::autograd::VariableType::(anonymous namespace)::resize_(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) /home/user/pyto
rch/torch/csrc/autograd/VariableTypeManual.cpp:243
    #20 0x3ff6d1bd197 in c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c1
0::MemoryFormat>), &torch::autograd::VariableType::(anonymous namespace)::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10
::optional<c10::MemoryFormat> > >::operator()(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) /home/user/pytorch/aten/src/ATen/core/boxing/impl/WrapFu
nctionIntoFunctor.h:13
    #21 0x3ff6d1bd197 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10:
:ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>), &torch::autograd::VariableType::(anonymous namespace)::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor
 const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::call(c
10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) /home/user/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor
.h:480
    #22 0x3ff51ca5129 in at::Tensor const& c10::callUnboxedKernelFunction<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(void*, c10::OperatorKernel*,
c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>&&, c10::optional<c10::MemoryFormat>&&) /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
    #23 0x3ff5181ead1 in at::Tensor const& c10::KernelFunction::call<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::OperatorHandle const&, c10::D
ispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:90
    #24 0x3ff5181ead1 in at::Tensor const& c10::Dispatcher::call<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::TypedOperatorHandle<at::Tensor co
nst& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)> const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const /home/user/pytorch/at
en/src/ATen/core/dispatch/Dispatcher.h:639
    #25 0x3ff5181ead1 in c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::call(at::Tensor const&, c10::ArrayRef<c10::SymInt>,
c10::optional<c10::MemoryFormat>) const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:487
    #26 0x3ff5181ead1 in at::_ops::resize_::call(at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) aten/src/ATen/Operators_4.cpp:2137
    #27 0x3ff79b44fcf in at::Tensor::resize__symint(c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const aten/src/ATen/core/TensorBody.h:2452
    #28 0x3ff79a802db in torch::autograd::THPVariable_resize_(_object*, _object*, _object*)::$_0::operator()(at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const /home/us
er/pytorch/torch/csrc/autograd/generated/python_variable_methods.cpp:13417
    #29 0x3ff7999f1eb in torch::autograd::THPVariable_resize_(_object*, _object*, _object*) /home/user/pytorch/torch/csrc/autograd/generated/python_variable_methods.cpp:13419
    csarofeen#30 0x3ffa2c9b009 in method_vectorcall_VARARGS_KEYWORDS Objects/descrobject.c:344
    csarofeen#31 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#32 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#33 0x3ffa2e05447 in call_function Python/ceval.c:5891
    csarofeen#34 0x3ffa2dff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    csarofeen#35 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#36 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#37 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#38 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255
    csarofeen#39 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    csarofeen#40 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    csarofeen#41 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    csarofeen#42 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    csarofeen#43 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#44 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#45 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#46 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255
    csarofeen#47 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    csarofeen#48 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    csarofeen#49 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    csarofeen#50 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    csarofeen#51 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#52 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#53 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#54 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#55 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#56 0x3ffa2e05447 in call_function Python/ceval.c:5891
    csarofeen#57 0x3ffa2dff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    csarofeen#58 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#59 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#60 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#61 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#62 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    csarofeen#63 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#64 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#65 0x3ffa2e05447 in call_function Python/ceval.c:5891
    csarofeen#66 0x3ffa2dff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    csarofeen#67 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#68 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#69 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#70 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#71 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#72 0x3ffa2e05447 in call_function Python/ceval.c:5891
    csarofeen#73 0x3ffa2dff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    csarofeen#74 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#75 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#76 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#77 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#78 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    csarofeen#79 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#80 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#81 0x3ffa2e05447 in call_function Python/ceval.c:5891
    csarofeen#82 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    csarofeen#83 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#84 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#85 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#86 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#87 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    csarofeen#88 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#89 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#90 0x3ffa2e05447 in call_function Python/ceval.c:5891
    csarofeen#91 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    csarofeen#92 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#93 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#94 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#95 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#96 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    csarofeen#97 0x3ffa2c8ab9b in PyVectorcall_Call Objects/call.c:267
    csarofeen#98 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    csarofeen#99 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    csarofeen#100 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    csarofeen#101 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    csarofeen#102 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#103 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#104 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#105 0x3ffa2c8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    csarofeen#106 0x3ffa2c8b271 in _PyObject_Call_Prepend Objects/call.c:431
    csarofeen#107 0x3ffa2d3f307 in slot_tp_call Objects/typeobject.c:7494
    csarofeen#108 0x3ffa2c8a933 in _PyObject_MakeTpCall Objects/call.c:215
    csarofeen#109 0x3ffa2df0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    csarofeen#110 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#111 0x3ffa2e05447 in call_function Python/ceval.c:5891
    csarofeen#112 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    csarofeen#113 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#114 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#115 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#116 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#117 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#118 0x3ffa2e05447 in call_function Python/ceval.c:5891
    csarofeen#119 0x3ffa2dff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    csarofeen#120 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#121 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#122 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#123 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255
    csarofeen#124 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    csarofeen#125 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    csarofeen#126 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    csarofeen#127 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    csarofeen#128 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#129 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#130 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#131 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#132 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#133 0x3ffa2e05447 in call_function Python/ceval.c:5891
    csarofeen#134 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    csarofeen#135 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#136 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#137 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#138 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#139 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    csarofeen#140 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#141 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#142 0x3ffa2e05447 in call_function Python/ceval.c:5891
    csarofeen#143 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    csarofeen#144 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#145 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#146 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#147 0x3ffa2c8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    csarofeen#148 0x3ffa2c8b271 in _PyObject_Call_Prepend Objects/call.c:431
    csarofeen#149 0x3ffa2d3f307 in slot_tp_call Objects/typeobject.c:7494
    csarofeen#150 0x3ffa2c8ad17 in _PyObject_Call Objects/call.c:305
    csarofeen#151 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    csarofeen#152 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    csarofeen#153 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    csarofeen#154 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #155 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #156 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #157 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #158 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #159 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #160 0x3ffa2dff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #161 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #162 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #163 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #164 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #165 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    #166 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #167 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #168 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #169 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #170 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #171 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #172 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #173 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255
    #174 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    #175 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    #176 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    #177 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #178 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #179 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #180 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #181 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #182 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #183 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #184 0x3ffa2dff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #185 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #186 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #187 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #188 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #189 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #190 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #191 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #192 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #193 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #194 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #195 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255
    #196 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    #197 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    #198 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    #199 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #200 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #201 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #202 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #203 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #204 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #205 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #206 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #207 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #208 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #209 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #210 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #211 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    #212 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #213 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #214 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #215 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #216 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #217 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #218 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #219 0x3ffa2c8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #220 0x3ffa2c8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #221 0x3ffa2d3f307 in slot_tp_call Objects/typeobject.c:7494
    #222 0x3ffa2c8a933 in _PyObject_MakeTpCall Objects/call.c:215
    #223 0x3ffa2df0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    #224 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #225 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #226 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #227 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #228 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #229 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #230 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255
    #231 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    #232 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    #233 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    #234 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #235 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #236 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #237 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #238 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #239 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #240 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #241 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #242 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #243 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #244 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #245 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #246 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    #247 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #248 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #249 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #250 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #251 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #252 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #253 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #254 0x3ffa2c8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #255 0x3ffa2c8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #256 0x3ffa2d3f307 in slot_tp_call Objects/typeobject.c:7494
    #257 0x3ffa2c8a933 in _PyObject_MakeTpCall Objects/call.c:215

0x61000013d790 is located 80 bytes inside of 192-byte region [0x61000013d740,0x61000013d800)
freed by thread T0 here:
    #0 0x3ffa3237de5 in operator delete(void*) /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:160
    #1 0x3ff8e7e3221 in c10::TensorImpl::~TensorImpl() /home/user/pytorch/c10/core/TensorImpl.cpp:75

previously allocated by thread T0 here:
    #0 0x3ffa323734f in operator new(unsigned long) /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:99
    #1 0x3ff4aeeb3d1 in c10::intrusive_ptr<c10::TensorImpl, c10::detail::intrusive_target_default_null_type<c10::TensorImpl> > c10::intrusive_ptr<c10::TensorImpl, c10::detail::intrusive_target_default_null_type<c10::TensorImpl> >::make<c10::intrusive_ptr<c10::StorageImpl, c10::detail::intrusive_target_default_null_type<c10::StorageImpl> >, c10::DispatchKeySet&, caffe2::TypeMeta&>(c10::intrusive_ptr<c10::StorageImpl, c10::detail::intrusive_target_default_null_type<c10::StorageImpl> >&&, c10::DispatchKeySet&, caffe2::TypeMeta&) /home/user/pytorch/c10/util/intrusive_ptr.h:498
    #2 0x3ff76f79e17  (/home/user/pytorch/build/lib.linux-s390x-cpython-310/torch/lib/libtorch_cpu.so+0x2fb79e17)

SUMMARY: AddressSanitizer: heap-use-after-free /home/user/pytorch/c10/core/SymInt.h:154 in c10::SymInt::is_heap_allocated() const
Shadow bytes around the buggy address:
  0x100c2000027aa0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x100c2000027ab0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x100c2000027ac0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x100c2000027ad0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x100c2000027ae0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
=>0x100c2000027af0: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x100c2000027b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x100c2000027b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100c2000027b20: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x100c2000027b30: 00 00 00 00 04 fa fa fa fa fa fa fa fa fa fa fa
  0x100c2000027b40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==1115867==ABORTING
```
</details>
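
The address arithmetic in the report above can be checked by hand. The sketch below is only an illustration: the shadow offset `1 << 52` is an assumption inferred from the addresses printed in this report (it is not read from the ASan sources), and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: offset inferred from this report, where the faulting address
// 0x61000013d790 is shown in shadow row 0x100c2000027af0.
constexpr std::uint64_t kShadowOffset = 1ULL << 52;
// "One shadow byte represents 8 application bytes" means a shift by 3.
constexpr unsigned kShadowScale = 3;

// Hypothetical helper: shadow byte address for an application address.
constexpr std::uint64_t shadow_address(std::uint64_t app_addr) {
  return (app_addr >> kShadowScale) + kShadowOffset;
}

// Hypothetical helper: byte offset of an address inside a heap region.
constexpr std::uint64_t offset_in_region(std::uint64_t addr,
                                         std::uint64_t region_begin) {
  return addr - region_begin;
}

// Third byte of the row starting at 0x100c2000027af0, the one marked [fd].
static_assert(shadow_address(0x61000013d790ULL) == 0x100c2000027af2ULL,
              "shadow byte of the faulting address");
// "located 80 bytes inside of 192-byte region"
static_assert(offset_in_region(0x61000013d790ULL, 0x61000013d740ULL) == 80,
              "offset of the faulting address in the freed region");
static_assert(0x61000013d800ULL - 0x61000013d740ULL == 192,
              "size of the freed region");
```

The marked shadow byte is `fd`, which the legend lists as a freed heap region, matching the `heap-use-after-free` summary.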

<details>
<summary>Additional backtraces (not full)</summary>

Memory deallocation:
```
#0  operator delete (ptr=0x61000013d740) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:160
#1  0x000003ffa77e3222 in c10::TensorImpl::~TensorImpl (this=0x61000013d740) at /home/user/pytorch/c10/core/TensorImpl.cpp:75
#2  0x000003ff63e76e8c in c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::reset_ (this=0x3ffd7ec8230) at /home/user/pytorch/c10/util/intrusive_ptr.h:291
#3  0x000003ff63e76910 in c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::~intrusive_ptr (this=0x3ffd7ec8230) at /home/user/pytorch/c10/util/intrusive_ptr.h:370
#4  0x000003ff63e67240 in at::TensorBase::~TensorBase (this=0x3ffd7ec8230) at /home/user/pytorch/aten/src/ATen/core/TensorBase.h:80
#5  0x000003ff63e85ee0 in at::Tensor::~Tensor (this=0x3ffd7ec8230) at aten/src/ATen/core/TensorBody.h:90
#6  0x000003ff63f67304 in resize__functionalization (dispatchKeySet=..., self=..., size=..., memory_format=...) at /home/user/pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:173
#7  0x000003ff63f89258 in c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>), &(resize__functionalization(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>))>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat> > >::operator()(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>) (
    this=0x6030000390a0, args=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13
#8  c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>), &(resize__functionalization(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>))>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>) (functor=0x6030000390a0, dispatchKeySet=..., args=..., args=...,
    args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:480
#9  0x000003ff6aca560a in c10::callUnboxedKernelFunction<at::Tensor const&, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat> > (
    unboxed_kernel_func=0x3ff63f88a80 <c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tenso
r const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>), &(resize__functionalization(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>))>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>)>, functor=0x6030000390a0,
    dispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
#10 0x000003ff6aca715c in c10::KernelFunction::call<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > (this=0x6210005e1b28, opHandle=...,
    dispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:96
#11 c10::Dispatcher::redispatch<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)> const&, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const (
    this=0x3ff919400e0 <c10::Dispatcher::realSingleton()::_singleton>, op=..., currentDispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:656
#12 0x000003ff6a82006c in c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const (
    this=0x3ff919a07e0 <at::_ops::resize_::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)::op>, currentDispatchKeySet=..., args=...,
    args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:492
#13 at::_ops::resize_::redispatch (dispatchKeySet=..., self=..., size=..., memory_format=...) at /home/user/pytorch/build/aten/src/ATen/Operators_4.cpp:2144
#14 0x000003ff861d5e08 in at::redispatch::resize__symint (dispatchKeySet=..., self=..., size=..., memory_format=...) at aten/src/ATen/RedispatchFunctions.h:2847
#15 0x000003ff861b579e in torch::ADInplaceOrView::resize_ (ks=..., self=..., size=..., optional_memory_format=...) at /home/user/pytorch/torch/csrc/autograd/VariableTypeManual.cpp:401
```

Memory access:
```
#0  c10::SymInt::maybe_as_int (this=0x61000013d790) at /home/user/pytorch/c10/core/SymInt.h:215
#1  0x000003ff734d0a6e in c10::SymInt::sym_eq (this=0x61000013d790, sci=...) at /home/user/pytorch/c10/core/SymInt.cpp:69
#2  0x000003ff5f6ab0be in c10::SymInt::operator== (this=0x61000013d790, o=...) at /home/user/pytorch/c10/core/SymInt.h:177
#3  0x000003ff5f6aaede in std::__equal<false>::equal<c10::SymInt const*, c10::SymInt const*> (__first1=0x61000013d790, __last1=0x61000013d7a0, __first2=0x602000015c30)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_algobase.h:1162
#4  0x000003ff5f6aae4c in std::__equal_aux1<c10::SymInt const*, c10::SymInt const*> (__first1=0x61000013d790, __last1=0x61000013d7a0, __first2=0x602000015c30)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_algobase.h:1211
#5  0x000003ff5f6aae06 in std::__equal_aux<c10::SymInt const*, c10::SymInt const*> (__first1=0x61000013d790, __last1=0x61000013d7a0, __first2=0x602000015c30)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_algobase.h:1219
#6  0x000003ff5f6aad98 in std::equal<c10::SymInt const*, c10::SymInt const*> (__first1=0x61000013d790, __last1=0x61000013d7a0, __first2=0x602000015c30)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_algobase.h:1556
#7  0x000003ff2ff3c772 in c10::ArrayRef<c10::SymInt>::equals (this=0x3ffed7c9900, RHS=...) at /home/user/pytorch/c10/util/ArrayRef.h:188
#8  0x000003ff31891bc2 in c10::operator!=<c10::SymInt> (a1=..., a2=...) at /home/user/pytorch/c10/util/ArrayRef.h:341
#9  0x000003ff51eb5800 in torch::ADInplaceOrView::resize_ (ks=..., self=..., size=..., optional_memory_format=...) at /home/user/pytorch/torch/csrc/autograd/VariableTypeManual.cpp:408
#10 0x000003ff51ee59c8 in c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c
10::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>
 > >::operator()(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) (this=0x6030007dca40, args=..., args=..., args=..., args=...)
    at /home/user/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13
#11 c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt
>, c10::optional<c10::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<
c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tenso
r const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) (functor=0x6030007dca40, dispatchKeySet=..., args=..., args=..., args=...)
    at /home/user/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:480
#12 0x000003ff369a512a in c10::callUnboxedKernelFunction<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > (
    unboxed_kernel_func=0x3ff51ee51f0 <c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tenso
r const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::Ar
rayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKern
el*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>, functor=0x6030007dca40, dispatchKeySet=..., args=..., args=..., args=...)
    at /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
#13 0x000003ff369a6e90 in c10::KernelFunction::call<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > (this=0x6210005e1bc8, opHandle=...,
    dispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:90
#14 c10::Dispatcher::redispatch<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::Arr
ayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)> const&, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const (
    this=0x3ff5d6400e0 <c10::Dispatcher::realSingleton()::_singleton>, op=..., currentDispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:656
#15 0x000003ff3652006c in c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::redispatch(c10::DispatchKeySet, at::Tensor const&,
c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const (
    this=0x3ff5d6a07e0 <at::_ops::resize_::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)::op>, currentDispatchKeySet=..., args=...,
    args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:492
#16 at::_ops::resize_::redispatch (dispatchKeySet=..., self=..., size=..., memory_format=...) at /home/user/pytorch/build/aten/src/ATen/Operators_4.cpp:2144
#17 0x000003ff51ed5e08 in at::redispatch::resize__symint (dispatchKeySet=..., self=..., size=..., memory_format=...) at aten/src/ATen/RedispatchFunctions.h:2847
#18 0x000003ff51ebbb68 in torch::autograd::VariableType::(anonymous namespace)::resize_ (ks=..., self=..., size=..., optional_memory_format=...)
    at /home/user/pytorch/torch/csrc/autograd/VariableTypeManual.cpp:243
    at /home/user/pytorch/torch/csrc/autograd/VariableTypeManual.cpp:243
```
</details>
Pull Request resolved: pytorch#101064
Approved by: https://github.com/Skylion007, https://github.com/albanD
ftxj pushed a commit to ftxj/pytorch that referenced this pull request May 25, 2023
arguments() returns a vector member of the object returned by the schema() call.
When the object returned by the schema() call is destroyed, the vector is deallocated as well;
its lifetime isn't extended.

This issue was detected while running `pytest -v test/mobile/test_lite_script_type.py -k test_nest_typing_namedtuple_custom_classtype` with ASAN.
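
The lifetime problem described above can be sketched in isolation. This is a minimal illustration with hypothetical stand-in types (`Schema`, `Type`, `count_arguments`), not the PyTorch classes or the actual patch; it assumes `schema()` returns by value and `arguments()` returns a reference to a member, as the description suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the object returned by schema().
struct Schema {
  std::vector<std::string> args;
  // Returns a reference to a member, so it is only valid while the
  // Schema object itself is alive.
  const std::vector<std::string>& arguments() const { return args; }
};

// Hypothetical stand-in for the type that owns the schema.
struct Type {
  // Returns by value: every call produces a temporary Schema.
  Schema schema() const { return Schema{{"a", "b", "c"}}; }
};

// Risky pattern (shown only as a comment, do not use):
//   const auto& dangling = t.schema().arguments();
// The temporary Schema is destroyed at the end of that statement, and
// binding a reference to its member does not extend its lifetime, so
// `dangling` would point at a freed vector.

// Safer pattern: keep the Schema in a named local for as long as its
// arguments are in use.
std::size_t count_arguments(const Type& t) {
  const Schema schema = t.schema();            // owner stays alive here
  const auto& arguments = schema.arguments();  // valid while `schema` lives
  std::size_t n = 0;
  for (const auto& arg : arguments) {
    (void)arg;  // a real caller would convert each argument here
    ++n;
  }
  return n;
}
```

For example, `count_arguments(Type{})` visits all three arguments without touching freed memory, whereas the commented-out pattern is the kind of access ASAN reports as `heap-use-after-free`.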

<details>
<summary>ASAN output</summary>

```
==1134126==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d0005a5790 at pc 0x03ff844488d8 bp 0x03fff584afe8 sp 0x03fff584afd8
READ of size 8 at 0x60d0005a5790 thread T0
    #0 0x3ff844488d7 in __gnu_cxx::__normal_iterator<c10::Argument const*, std::vector<c10::Argument, std::allocator<c10::Argument> > >::__normal_iterator(c10::Argument const* const&) /usr/lib/gcc/s390x-i
bm-linux-gnu/11/include/g++-v11/bits/stl_iterator.h:1028
    #1 0x3ff8444293f in std::vector<c10::Argument, std::allocator<c10::Argument> >::begin() const /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_vector.h:821
    #2 0x3ff84d807d1 in torch::jit::toPyObject(c10::IValue) /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:617
    #3 0x3ff84d80305 in torch::jit::toPyObject(c10::IValue) /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:604
    #4 0x3ff84856871 in pybind11::detail::type_caster<c10::IValue, void>::cast(c10::IValue, pybind11::return_value_policy, pybind11::handle) /home/user/pytorch/torch/csrc/jit/python/pybind.h:138
    #5 0x3ff85318191 in pybind11::cpp_function::initialize<torch::jit::initJitScriptBindings(_object*)::$_45, c10::IValue, torch::jit::mobile::Module&, pybind11::tuple const&, pybind11::name, pybind11::is
_method, pybind11::sibling, pybind11::arg>(torch::jit::initJitScriptBindings(_object*)::$_45&&, c10::IValue (*)(torch::jit::mobile::Module&, pybind11::tuple const&), pybind11::name const&, pybind11::is_me
thod const&, pybind11::sibling const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#1}::operator()(pybind11::detail::function_call&) const /home/user/pytorch/cmake/../third_party/pybin
d11/include/pybind11/pybind11.h:249
    #6 0x3ff85317cfd in pybind11::cpp_function::initialize<torch::jit::initJitScriptBindings(_object*)::$_45, c10::IValue, torch::jit::mobile::Module&, pybind11::tuple const&, pybind11::name, pybind11::is
_method, pybind11::sibling, pybind11::arg>(torch::jit::initJitScriptBindings(_object*)::$_45&&, c10::IValue (*)(torch::jit::mobile::Module&, pybind11::tuple const&), pybind11::name const&, pybind11::is_me
thod const&, pybind11::sibling const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#1}::__invoke(pybind11::detail::function_call&) /home/user/pytorch/cmake/../third_party/pybind11/incl
ude/pybind11/pybind11.h:224
    #7 0x3ff82ee52e9 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) /home/user/pytorch/cmake/../third_party/pybind11/include/pybind11/pybind11.h:929
    #8 0x3ffab002903 in cfunction_call Objects/methodobject.c:543
    #9 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215
    #10 0x3ffaaf8e919 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    #11 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #12 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #13 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #14 0x3ffab105447 in call_function Python/ceval.c:5891
    #15 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #16 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #17 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #18 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #19 0x3ffaaf8a615 in _PyObject_FastCallDictTstate Objects/call.c:142
    #20 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #21 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494
    #22 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215
    #23 0x3ffab0f0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    #24 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #25 0x3ffab105447 in call_function Python/ceval.c:5891
    #26 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #27 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #28 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #29 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #30 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #31 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #32 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #33 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #34 0x3ffab105447 in call_function Python/ceval.c:5891
    #35 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #36 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #37 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #38 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #39 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #40 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #41 0x3ffab105447 in call_function Python/ceval.c:5891
    #42 0x3ffab0ff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    #43 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #44 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #45 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #46 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #47 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #48 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #49 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #50 0x3ffab105447 in call_function Python/ceval.c:5891
    #51 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #52 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #53 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #54 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #55 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #56 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #57 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #58 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #59 0x3ffab105447 in call_function Python/ceval.c:5891
    #60 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #61 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #62 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #63 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #64 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #65 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #66 0x3ffaaf8ab9b in PyVectorcall_Call Objects/call.c:267
    #67 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290
    #68 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317
    #69 0x3ffab1059c7 in do_call_core Python/ceval.c:5943
    #70 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #71 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #72 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #73 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #74 0x3ffaaf8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #75 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #76 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494
    #77 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215
    #78 0x3ffab0f0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    #79 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #80 0x3ffab105447 in call_function Python/ceval.c:5891
    #81 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #82 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #83 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #84 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #85 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #86 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #87 0x3ffab105447 in call_function Python/ceval.c:5891
    #88 0x3ffab0ff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    #89 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #90 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #91 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #92 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255
    #93 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290
    #94 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317
    #95 0x3ffab1059c7 in do_call_core Python/ceval.c:5943
    #96 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #97 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #98 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #99 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #100 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #101 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #102 0x3ffab105447 in call_function Python/ceval.c:5891
    #103 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #104 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #105 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #106 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #107 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #108 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #109 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #110 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #111 0x3ffab105447 in call_function Python/ceval.c:5891
    #112 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #113 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #114 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #115 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #116 0x3ffaaf8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #117 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #118 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494
    #119 0x3ffaaf8ad17 in _PyObject_Call Objects/call.c:305
    #120 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317
    #121 0x3ffab1059c7 in do_call_core Python/ceval.c:5943
    #122 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #123 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #124 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #125 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #126 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #127 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#128 0x3ffab105447 in call_function Python/ceval.c:5891
    csarofeen#129 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    csarofeen#130 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#131 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#132 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#133 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#134 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    csarofeen#135 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#136 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#137 0x3ffab105447 in call_function Python/ceval.c:5891
    csarofeen#138 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    csarofeen#139 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#140 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#141 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#142 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255
    csarofeen#143 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290
    csarofeen#144 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317
    csarofeen#145 0x3ffab1059c7 in do_call_core Python/ceval.c:5943
    csarofeen#146 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    csarofeen#147 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#148 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#149 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#150 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#151 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#152 0x3ffab105447 in call_function Python/ceval.c:5891
    csarofeen#153 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    csarofeen#154 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#155 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#156 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#157 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#158 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#159 0x3ffab105447 in call_function Python/ceval.c:5891
    csarofeen#160 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    csarofeen#161 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#162 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#163 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#164 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255
    csarofeen#165 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290
    csarofeen#166 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317
    csarofeen#167 0x3ffab1059c7 in do_call_core Python/ceval.c:5943
    csarofeen#168 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    csarofeen#169 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#170 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#171 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#172 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#173 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#174 0x3ffab105447 in call_function Python/ceval.c:5891
    csarofeen#175 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    csarofeen#176 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#177 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#178 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#179 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#180 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    csarofeen#181 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#182 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#183 0x3ffab105447 in call_function Python/ceval.c:5891
    csarofeen#184 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    csarofeen#185 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#186 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#187 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#188 0x3ffaaf8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    csarofeen#189 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431
    csarofeen#190 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494
    csarofeen#191 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215
    csarofeen#192 0x3ffab0f0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    csarofeen#193 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#194 0x3ffab105447 in call_function Python/ceval.c:5891
    csarofeen#195 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    csarofeen#196 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#197 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#198 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#199 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255
    csarofeen#200 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290
    csarofeen#201 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317
    csarofeen#202 0x3ffab1059c7 in do_call_core Python/ceval.c:5943
    csarofeen#203 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    csarofeen#204 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#205 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#206 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#207 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#208 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    csarofeen#209 0x3ffab105447 in call_function Python/ceval.c:5891
    csarofeen#210 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    csarofeen#211 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    csarofeen#212 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    csarofeen#213 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    csarofeen#214 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    csarofeen#215 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    csarofeen#216 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #217 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #218 0x3ffab105447 in call_function Python/ceval.c:5891
    #219 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #220 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #221 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #222 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #223 0x3ffaaf8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #224 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #225 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494
    #226 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215
    #227 0x3ffab0f0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    #228 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #229 0x3ffab105447 in call_function Python/ceval.c:5891
    #230 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #231 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #232 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #233 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #234 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #235 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #236 0x3ffab105447 in call_function Python/ceval.c:5891
    #237 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #238 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #239 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #240 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #241 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #242 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #243 0x3ffab105447 in call_function Python/ceval.c:5891
    #244 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #245 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #246 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #247 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #248 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255
    #249 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290

0x60d0005a5790 is located 80 bytes inside of 136-byte region [0x60d0005a5740,0x60d0005a57c8)
freed by thread T0 here:
    #0 0x3ffab537de5 in operator delete(void*) /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:160
    #1 0x3ff55984fdb in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> >::deallocate(std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2>*, unsigned long) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/ext/new_allocator.h:145

previously allocated by thread T0 here:
    #0 0x3ffab53734f in operator new(unsigned long) /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:99
    #1 0x3ff5598443f in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> >::allocate(unsigned long, void const*) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/ext/new_allocator.h:127
    #2 0x3fff5849ecf  ([stack]+0xb2ecf)

SUMMARY: AddressSanitizer: heap-use-after-free /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_iterator.h:1028 in __gnu_cxx::__normal_iterator<c10::Argument const*, std::vector<c10::Argument, std::allocator<c10::Argument> > >::__normal_iterator(c10::Argument const* const&)
Shadow bytes around the buggy address:
  0x100c1a000b4aa0: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
  0x100c1a000b4ab0: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd
  0x100c1a000b4ac0: fd fd fd fd fd fa fa fa fa fa fa fa fa fa fd fd
  0x100c1a000b4ad0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
  0x100c1a000b4ae0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
=>0x100c1a000b4af0: fd fd[fd]fd fd fd fd fd fd fa fa fa fa fa fa fa
  0x100c1a000b4b00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c1a000b4b10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c1a000b4b20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c1a000b4b30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c1a000b4b40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==1134126==ABORTING
```

Additional backtraces (not full):
Allocation:
```
#0  __memset_z196 () at ../sysdeps/s390/memset-z900.S:144
#1  0x000003ff96f3072a in __asan::Allocator::Allocate (this=this@entry=0x3ff97041eb8 <__asan::instance>, size=size@entry=136, alignment=8, alignment@entry=0, stack=<optimized out>,
    stack@entry=0x3ffdbb45d78, alloc_type=<optimized out>, can_fill=true) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_allocator.cpp:599
#2  0x000003ff96f2c088 in __asan::asan_memalign (alignment=alignment@entry=0, size=size@entry=136, stack=stack@entry=0x3ffdbb45d78, alloc_type=alloc_type@entry=__asan::FROM_NEW)
    at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_allocator.cpp:1039
#3  0x000003ff96fb73b0 in operator new (size=136) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:99
#4  0x000003ff41404440 in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> >::allocate (this=0x3ffdbb468c0,
    __n=1) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/ext/new_allocator.h:127
#5  0x000003ff414042a0 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> > >::allocate (__a=...,
    __n=1) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/alloc_traits.h:464
#6  0x000003ff41403b66 in std::__allocate_guarded<std::allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> > > (__a=...)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/allocated_ptr.h:98
#7  0x000003ff4140372a in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (this=0x3ffdbb47888, __p=@0x3ffdbb47880: 0x0, __a=..., __args=..., __args=..., __args=..., __args=...)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:648
#8  0x000003ff41403328 in std::__shared_ptr<c10::FunctionSchema, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<c10::FunctionSchema>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (this=0x3ffdbb47880, __tag=..., __args=..., __args=..., __args=..., __args=...) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:1342
#9  0x000003ff41402f06 in std::shared_ptr<c10::FunctionSchema>::shared_ptr<std::allocator<c10::FunctionSchema>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (
    this=0x3ffdbb47880, __tag=..., __args=..., __args=..., __args=..., __args=...) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:409
#10 0x000003ff41402b6e in std::allocate_shared<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (__a=...,
    __args=..., __args=..., __args=..., __args=...) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:862
#11 0x000003ff4140215c in std::make_shared<c10::FunctionSchema, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (__args=..., __args=..., __args=..., __args=...)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:878
#12 0x000003ff413d180c in c10::TupleType::createWithSpec<c10::basic_string_view<char> > (qualName=..., field_names=std::vector of length 1, capacity 1 = {...},
    field_types=std::vector of length 1, capacity 1 = {...}, field_defaults=std::vector of length 0, capacity 0) at /home/user/pytorch/aten/src/ATen/core/type.cpp:769
#13 0x000003ff413b9ca6 in c10::TupleType::createNamed (qualName=..., field_names=std::vector of length 1, capacity 1 = {...}, field_types=std::vector of length 1, capacity 1 = {...})
    at /home/user/pytorch/aten/src/ATen/core/type.cpp:725
#14 0x000003ff4115fbac in c10::ivalue::TupleTypeFactory<c10::TupleType>::fallback (type=...) at /home/user/pytorch/aten/src/ATen/core/dynamic_type.cpp:383
#15 0x000003ff708217fe in c10::ivalue::Tuple::type<c10::TupleType> (this=0x6080004b8520) at /home/user/pytorch/aten/src/ATen/core/ivalue_inl.h:781
#16 0x000003ff70800740 in torch::jit::toPyObject (ivalue=...) at /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:613
#17 0x000003ff70800306 in torch::jit::toPyObject (ivalue=...) at /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:604
#18 0x000003ff702d6872 in pybind11::detail::type_caster<c10::IValue, void>::cast (src=...) at /home/user/pytorch/torch/csrc/jit/python/pybind.h:138
#19 0x000003ff70d98192 in pybind11::cpp_function::initialize<torch::jit::initJitScriptBindings(_object*)::$_45, c10::IValue, torch::jit::mobile::Module&, pybind11::tuple const&, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg>(torch::jit::initJitScriptBindings(_object*)::$_45&&, c10::IValue (*)(torch::jit::mobile::Module&, pybind11::tuple const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#1}::operator()(pybind11::detail::function_call&) const (this=0x3ffdbb4ca20, call=...)
    at /home/user/pytorch/cmake/../third_party/pybind11/include/pybind11/pybind11.h:249
#20 0x000003ff70d97cfe in pybind11::cpp_function::initialize<torch::jit::initJitScriptBindings(_object*)::$_45, c10::IValue, torch::jit::mobile::Module&, pybind11::tuple const&, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg>(torch::jit::initJitScriptBindings(_object*)::$_45&&, c10::IValue (*)(torch::jit::mobile::Module&, pybind11::tuple const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#1}::__invoke(pybind11::detail::function_call&) (call=...)
    at /home/user/pytorch/cmake/../third_party/pybind11/include/pybind11/pybind11.h:224
#21 0x000003ff6e9652ea in pybind11::cpp_function::dispatcher (self=<PyCapsule at remote 0x3ff83e27720>,
    args_in=(<torch._C.LiteScriptModule at remote 0x3ff811844b0>, (<Tensor at remote 0x3ff814efb00>,)), kwargs_in=0x0) at /home/user/pytorch/cmake/../third_party/pybind11/include/pybind11/pybind11.h:929
```

Deallocation:
```
#0  operator delete (ptr=0x60d0005a5740) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:160
#1  0x000003ff44904fdc in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> >::deallocate (this=0x3ffc5dc8020,
    __p=0x60d0005a5740, __t=1) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/ext/new_allocator.h:145
#2  0x000003ff44904fa8 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> > >::deallocate (
    __a=..., __p=0x60d0005a5740, __n=1) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/alloc_traits.h:496
#3  0x000003ff449041f2 in std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> > >::~__allocated_ptr (
    this=0x3ffc5dc8030) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/allocated_ptr.h:74
#4  0x000003ff44904888 in std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2>::_M_destroy (this=0x60d0005a5740)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:538
#5  0x000003ff43895a62 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x60d0005a5740) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:184
#6  0x000003ff43895420 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x611000c40648) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:705
#7  0x000003ff4466e7f4 in std::__shared_ptr<c10::FunctionSchema, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x611000c40640)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:1154
#8  0x000003ff4466d820 in std::shared_ptr<c10::FunctionSchema>::~shared_ptr (this=0x611000c40640) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:122
#9  0x000003ff448d82f6 in c10::TupleType::~TupleType (this=0x611000c40580) at /home/user/pytorch/aten/src/ATen/core/jit_type.h:1142
#10 0x000003ff448d8346 in c10::TupleType::~TupleType (this=0x611000c40580) at /home/user/pytorch/aten/src/ATen/core/jit_type.h:1142
#11 0x000003ff731296a4 in std::_Sp_counted_ptr<c10::TupleType*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x603000c43ae0)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:348
#12 0x000003ff71eaf666 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x603000c43ae0) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:168
#13 0x000003ff71eaf330 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x3ffc5dc9368) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:705
#14 0x000003ff73129ee4 in std::__shared_ptr<c10::TupleType, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x3ffc5dc9360)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:1154
#15 0x000003ff73122390 in std::shared_ptr<c10::TupleType>::~shared_ptr (this=0x3ffc5dc9360) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:122
#16 0x000003ff73d00788 in torch::jit::toPyObject (ivalue=...) at /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:613
#17 0x000003ff73d00306 in torch::jit::toPyObject (ivalue=...) at /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:604
```
</details>
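
Reading the allocation and deallocation backtraces together, one plausible interpretation (an assumption, not something the traces state outright) is that the `std::shared_ptr<c10::TupleType>` created via `Tuple::type<TupleType>()` is a temporary that is destroyed at `pybind_utils.cpp:613`, which releases the `FunctionSchema` it owns while a reference into that schema's argument vector is still being read. The sketch below reproduces that lifetime pattern with hypothetical stand-in types (`Schema`, `NamedTuple`, `make_type`, and `first_field_name` are illustrative names, not PyTorch APIs) and shows the variant that keeps the owner alive:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for c10::FunctionSchema: owns the argument list.
struct Schema {
  std::vector<std::string> arguments;
};

// Hypothetical stand-in for c10::TupleType: owns its schema via shared_ptr.
struct NamedTuple {
  std::shared_ptr<Schema> schema =
      std::make_shared<Schema>(Schema{std::vector<std::string>{"x"}});
};

// Returns a freshly created type, so the caller holds the only reference.
std::shared_ptr<NamedTuple> make_type() {
  return std::make_shared<NamedTuple>();
}

std::string first_field_name() {
  // Unsafe variant (left commented out on purpose): the temporary returned by
  // make_type() dies at the end of the statement, the Schema is freed with it,
  // and `args` dangles.
  //   const auto& args = make_type()->schema->arguments;

  // Safe variant: bind the owner to a named local that outlives `args`.
  auto type = make_type();
  const auto& args = type->schema->arguments;
  assert(!args.empty());
  return args.front();
}
```

Binding the returned `shared_ptr` to a local is only one way to extend the lifetime; copying the needed data out before the temporary is destroyed would work as well.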
Pull Request resolved: pytorch#101400
Approved by: https://github.com/zou3519