[pytorch][PR] Commentary about size constraints on TensorImpl. #13126
Closed
ezyang wants to merge 51 commits into export-D10404407 from
Conversation
Summary: Pull Request resolved: #12994 Reviewed By: anderspapitto Differential Revision: D10515291 Pulled By: pjh5 fbshipit-source-id: 191054cdacff308b63e9063d22d62314398e4f88
Summary: Pull Request resolved: #13064 Differential Revision: D10561008 Pulled By: yf225 fbshipit-source-id: c48364662efa82865a1bc1a7e2db3a9fb8af10d5
Summary: Pull Request resolved: #13059 Reviewed By: llyfacebook Differential Revision: D10560147 Pulled By: sf-wind fbshipit-source-id: c8f38b30c9acdf6ae494e56a5876fd4493696e5d
Summary: Pull Request resolved: #12969 Differential Revision: D10560824 Pulled By: ezyang fbshipit-source-id: 86c21149682db5ebfd9610df9e9845688a3db3b0
Summary: Pull Request resolved: #13014 Tensor method renaming using clangr Reviewed By: ezyang Differential Revision: D10467556 fbshipit-source-id: 7d7eaf5fc59bbb493c057d5b8bfdda03b140c97e
Summary: Pull Request resolved: #13002 Batch dim wasn't handled in the CPU impl (will fail for inputs with N > 1). Fixing that here. Differential Revision: D10515159 fbshipit-source-id: ee7e4f489d2d4de793f550b31db7c0e2ba3651e8
Summary: Pull Request resolved: #13015 att Reviewed By: ezyang Differential Revision: D10469310 fbshipit-source-id: f4621fe5d17bb4663192860f81effe6bdfe21bea
Summary: rather than pass a list through a text file Pull Request resolved: #12951 Differential Revision: D10528309 Pulled By: anderspapitto fbshipit-source-id: d94befcd61b6304815859694b623046f256462df
Summary: Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: #12975 Differential Revision: D10513493 Pulled By: ezyang fbshipit-source-id: ac183aeb4ae7f0a5f91f1a369b595ae92c3e844d
Summary: Pull Request resolved: #12616 Focusing on operators in common use on mobile. Also use GRADIENT_OPERATOR_SCHEMA. Reviewed By: Yangqing Differential Revision: D10245216 fbshipit-source-id: 5cc023da170149b637fe3c729d3756af948aa265
Summary: Pull Request resolved: #13062 Gold (the linker) isn't able to gc unreferenced string constants, but converting these to arrays puts them in their own data sections and reduces (Android) binary size as a result. I'm told even in server builds, this reduces binary size by a few dozen bytes and speeds up startup by a few hundred ns. :-P Reviewed By: Yangqing Differential Revision: D10510808 fbshipit-source-id: 247ba9574e7a9b6a8204d33052994b08c401c197
Summary: Adding gemmlowp dependency in third-party folder Pull Request resolved: #12947 Differential Revision: D10794559 Pulled By: harouwu fbshipit-source-id: 7f8a649c739ccb6c307327080711379b1db8c3e0
Summary: Pull Request resolved: #13063 Expose the following operators GatherRanges Slice MergeIdLists Reviewed By: itomatik Differential Revision: D10560138 fbshipit-source-id: 90f74d7d4c2bfca40788a5fcec4c73d71b156d3b
Summary: Pull Request resolved: #13028 Codemod generated with clangr shard mode, 25 files per diff, for renaming dims() to sizes() Reviewed By: ezyang Differential Revision: D10476220 fbshipit-source-id: 3c3b3d5e2082cd6a1f0ff4a3c8641b30e6f16896
Summary: Pull Request resolved: #13029 Codemod generated with clangr shard mode, 25 files per diff, for renaming dims() to sizes() Reviewed By: ezyang Differential Revision: D10476225 fbshipit-source-id: 5e63ca80b3843967ea1661ada447bbc18661378d
Summary: Adding sse2neon in third-party as a dependency Pull Request resolved: #12948 Differential Revision: D10801574 Pulled By: harouwu fbshipit-source-id: 8b4f9f361cc1722f631830f7675b9d209a9f22ef
Summary: Pull Request resolved: #13030 Codemod generated with clangr shard mode, 25 files per diff, for renaming dims() to sizes() Reviewed By: ezyang Differential Revision: D10476226 fbshipit-source-id: 757583e3bde8d5246565433883bd328ab34f3e09
Summary: Pull Request resolved: #13031 Codemod generated with clangr shard mode, 25 files per diff, for renaming dims() to sizes() Reviewed By: ezyang Differential Revision: D10476232 fbshipit-source-id: cb4ad76be068065eb2c5e7d87f33d04423cf93c4
Summary: Pull Request resolved: #13032 Codemod generated with clangr shard mode, 25 files per diff, for renaming dims() to sizes() Reviewed By: ezyang Differential Revision: D10476235 fbshipit-source-id: 263ad75689d864b414dae63cb9a30cb3285dae31
Summary: Pull Request resolved: #13071 In the case where a process got stuck and timed out on joining, we would see a None != 1 assertion error in the code path where the exit statuses are compared. This implies that the first process exited with exit code 1 and another one didn't exit at all. With this commit the error message is more descriptive. Differential Revision: D10785266 fbshipit-source-id: c8cc02d07ea4fdc6f5374afd9a0aac72218fe61d
Summary: Pull Request resolved: #13007 No reason to use the hook if it's set, this helps fbcode traces. This slightly pessimizes the stack trace for ATen functions, because we are no longer skipping all of the frames we should. This is probably OK. Reviewed By: Yangqing Differential Revision: D10518499 fbshipit-source-id: be54e490df3c3fde7ff894b5b1473442ffc7ded3
Summary: Codemod generated with clangr shard mode, 25 files per diff, for renaming dims() to sizes() Reviewed By: ezyang Differential Revision: D10842900 fbshipit-source-id: 8d58ed4d403fb0308a8fa286659f8e830b040bec
Summary: `warnings.warn` is used commonly throughout `nn.functional`, so this adds support for it by forwarding its arguments to `print` Pull Request resolved: #12964 Differential Revision: D10559427 Pulled By: driazati fbshipit-source-id: 5b591f6f446c906418f9fc7730c17e301f263d9b
Summary: Our convolution ops and such expect three dimensional images, but the images in the MNIST dataset of the C++ frontend currently only have two. apaszke ebetica soumith Pull Request resolved: #13060 Differential Revision: D10560754 Pulled By: goldsborough fbshipit-source-id: a2cc877b4f43434482bec902c941fafb7a157d5d
Summary: added loc and scale args. Pull Request resolved: #13044 Differential Revision: D10560762 Pulled By: ezyang fbshipit-source-id: 6c98ecc04975df8993364b06c480d015a25e2061
Summary: This is mostly for reusing all the cudnn test cases in our python operator_tests. Pull Request resolved: #12278 Differential Revision: D10842592 Pulled By: bddppq fbshipit-source-id: 4b3ed91fca64ff02060837b3270393bc2f9a9898
Summary: Original commit changeset: 82583d0ad4b8 Reviewed By: enosair, ilia-cher Differential Revision: D10560741 fbshipit-source-id: e289a37d441bd2243b369810abf451292891d9ee
Summary: Pull Request resolved: #13091 Original commit changeset: 8b4f9f361cc1 Reviewed By: Maratyszcza Differential Revision: D10846301 fbshipit-source-id: 2798f1fca5c1a2362979977ef5eb724dd37c4e6d
Summary: Pull Request resolved: #13068 Basic ops.def update and converter.cc updates This is the standard way to ingest networks into nomnigraph redo of D10412639 Reviewed By: ZolotukhinM Differential Revision: D10560324 fbshipit-source-id: c8ccb0aabde6ee8f823657ee5cd3ed9ed6c45549
Summary: Pull Request resolved: #13087 API changes that simplify subgraph replacement drastically Reviewed By: duc0 Differential Revision: D10444011 fbshipit-source-id: 22c699bb5bc0f21538c70fe9401899d4f7e1b055
Summary: Fixes #12578 #9395.
* Fix and simplify print logic
* Follow numpy print rule https://github.com/numpy/numpy/blob/eb2bd11870731ea19a0eee72e616c7deb00f6c54/numpy/core/arrayprint.py#L859

> scientific notation is used when absolute value of the smallest number is < 1e-4 or maximum > 1e8 or the ratio of the maximum absolute value to the minimum is > 1e3

I hope I didn't break anything since there seems to be a lot of edge cases here... Here are some easy sanity checks.

```
In [5]: torch.tensor(1)
Out[5]: tensor(1)
Out[2]: array(1)  # numpy

In [6]: torch.tensor(10)
Out[6]: tensor(10)
Out[3]: array(10)  # numpy

In [8]: torch.tensor(99000000)
Out[8]: tensor(99000000)
Out[5]: array(99000000)  # numpy

In [9]: torch.tensor(100000000)
Out[9]: tensor(100000000)
Out[6]: array(100000000)  # numpy

In [10]: torch.tensor(100000001)
Out[10]: tensor(100000001)
Out[7]: array(100000001)  # numpy

In [11]: torch.tensor(1000000000)
Out[11]: tensor(1000000000)
Out[8]: array(1000000000)  # numpy

In [12]: torch.tensor([1, 1000])
Out[12]: tensor([   1, 1000])
Out[9]: array([   1, 1000])  # numpy

In [13]: torch.tensor([1, 1010])
Out[13]: tensor([   1, 1010])
Out[10]: array([   1, 1010])  # numpy
```

For floating points, we use scientific notation when `max/min > 1000 || max > 1e8 || min < 1e-4`. Lines marked "old" show previous behavior that either had precision issues or did not align with numpy.

```
In [14]: torch.tensor(0.01)
Out[14]: tensor(0.0100)
Out[11]: array(0.01)  # numpy

In [15]: torch.tensor(0.1)
Out[15]: tensor(0.1000)
Out[12]: array(0.1)  # numpy

In [16]: torch.tensor(0.0001)
Out[16]: tensor(0.0001)
Out[14]: array(0.0001)  # numpy

In [17]: torch.tensor(0.00002)
Out[17]: tensor(2.0000e-05)
Out[15]: array(2e-05)  # numpy
Out[5]: tensor(0.0000)  # old

In [18]: torch.tensor(1e8)
Out[18]: tensor(100000000.)
Out[16]: array(100000000.0)  # numpy

In [19]: torch.tensor(1.1e8)
Out[19]: tensor(1.1000e+08)
Out[17]: array(1.1e8)  # numpy 1.14.5; in <= 1.13 this did not use scientific print
Out[10]: tensor(110000000.)  # old

In [20]: torch.tensor([0.01, 10.])
Out[20]: tensor([ 0.0100, 10.0000])
Out[18]: array([  0.01,  10.  ])  # numpy

In [21]: torch.tensor([0.01, 11.])
Out[21]: tensor([1.0000e-02, 1.1000e+01])
Out[19]: array([ 1.00000000e-02, 1.10000000e+01])  # numpy
Out[7]: tensor([ 0.0100, 11.0000])  # old
```

When printing floating-point numbers in integer mode, we still need to respect the rules for switching to scientific mode first:

```
In [22]: torch.tensor([1., 1000.])
Out[22]: tensor([   1., 1000.])
Out[20]: array([   1., 1000.])  # numpy

In [23]: torch.tensor([1., 1010.])
Out[23]: tensor([1.0000e+00, 1.0100e+03])
Out[21]: array([ 1.00000000e+00, 1.01000000e+03])  # numpy
Out[9]: tensor([ 1., 1010.])  # old
```

Pull Request resolved: #12746 Differential Revision: D10443800 Pulled By: ailzhang fbshipit-source-id: f5e4e3fe9bf0b44af2c64c93a9ed42b73fa613f5
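The float-formatting rule in the commit above (`max/min > 1000 || max > 1e8 || min < 1e-4`) can be modeled as a small predicate. This is an illustrative sketch of the stated rule, not the actual `_tensor_str` implementation; the function name is hypothetical.

```python
def use_scientific(values):
    # Model of the rule described above: switch to scientific notation
    # when the ratio of the largest to smallest nonzero magnitude
    # exceeds 1000, the largest magnitude exceeds 1e8, or the smallest
    # is below 1e-4. Zeros are ignored so the ratio stays finite.
    nonzero = [abs(v) for v in values if v != 0]
    if not nonzero:
        return False
    lo, hi = min(nonzero), max(nonzero)
    return hi / lo > 1000.0 or hi > 1e8 or lo < 1e-4
```

This matches the sanity checks above: `[0.01, 10.]` stays in fixed-point (the ratio does not exceed 1000), while `[0.01, 11.]` switches to scientific.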
Summary: Pull Request resolved: #13090 Original commit changeset: 7f8a649c739c Reviewed By: Maratyszcza Differential Revision: D10846367 fbshipit-source-id: a5a5aad29b51287dc1cb80c707eb5a0008ec78f5
….cu (#13046) Summary: * Enable test_nn embedding tests and use correct warp size in Embedding.cu * Fix embedding_backward_feature_kernel kernel for HIP For attention: bddppq ezyang Pull Request resolved: #13046 Differential Revision: D10560721 Pulled By: bddppq fbshipit-source-id: e6c3cbeb980a34ff52a92dba8bde745a2e03f2fd
Summary: As the title says, we should always use the current stream on device in NCCL. This can unblock ezyang on his further work Pull Request resolved: #13089 Reviewed By: ezyang Differential Revision: D10847172 Pulled By: teng-li fbshipit-source-id: 7fc7c4248b5efa1971d2af4d43f62d3379debfe4
Summary: We want to move _C into the same cmake invocation that builds libcaffe2 and libtorch. However, _C depends on THD and c10d, which in turn depend on libcaffe2. That means that we can't move _C into that cmake file unless we do these two first. This change does so. Pull Request resolved: #12775 Differential Revision: D10457374 Pulled By: anderspapitto fbshipit-source-id: 2c1aa3b8a418a73d2112e93c7da53a2e70cf7bba
…ams for memcpy (#12954) Summary: - Moved sync_reduction to C++ - Use a dedicated CUDA stream for memcpy - Also use a dedicated CUDA stream for memcpy in queue_reduction Added test as well. CI should cover both DDP and unittest Pull Request resolved: #12954 Differential Revision: D10520069 Pulled By: teng-li fbshipit-source-id: 64348e4e43c15f9695a4c28b036c232587ecfb65
Summary: Pull Request resolved: #12953 Differential Revision: D10850274 Pulled By: anderspapitto fbshipit-source-id: 42296e6e49ad8c1845040e031eab95ddbaf58ae4
Summary: Pull Request resolved: #13094 Expose operator_def property Reviewed By: duc0 Differential Revision: D10847125 fbshipit-source-id: 67a066555b690715e1f5f04125fd446ab197f45a
Summary: Pull Request resolved: #13043 memset on nullptr is undefined behavior, and as a result filament_test is failing in dev builds. This diff makes the operator handle empty output properly, so we can bring that test back. I'm not sure whether it is even valid to call this op with input that would require an empty memset (empty batch?). Will leave this to ninghz and sunnieshang to decide. Reviewed By: xianjiec Differential Revision: D10525605 fbshipit-source-id: a911cdbd62fc3d948328981fd01cd205ec2ad99f
Summary: It's empty. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: #13078 Differential Revision: D10843892 Pulled By: ezyang fbshipit-source-id: 39e6f73b3a8be3e7573c1af727b65da246d4515b
Summary: Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: #13074 Differential Revision: D10852728 Pulled By: ezyang fbshipit-source-id: 6b96c941f4655ba240adaa0678844efa2af81d06
Summary: Pull Request resolved: #12656 I originally wanted to do this in two steps, but deleting the Storage-only constructor also changes the default numel state (which breaks tests), so it was easiest to do it all in one go.
- I still need a way to compute the correct TensorTypeId for all of the Caffe2 constructors; rather than hard-code it, I wrote a function in at::detail::computeTensorTypeId() to do this calculation. Maybe this function could be used more widely, but for now, it's used by Caffe2 only.
- Added a pile more TensorTypeIds for all of Caffe2's supported DeviceTypes.
- Because I still can't put arbitrary TypeMeta in TensorOptions, the TensorTypeId() calculation doesn't respect dtype. For now, this is not a problem, but this might block work to split non-POD dtypes into their own TensorTypeId.
Reviewed By: li-roy Differential Revision: D10380678 fbshipit-source-id: 10c5d12020596fc9f27d5579adffad00513af363
Summary: Pull Request resolved: #13109 The "right" strategy of creating a socket, binding to an undefined port, closing the socket, and reusing the port it was bound to, was subject to a race condition. Another process could bind to that same port sooner than the tests would, causing an "Address already in use" failure when rank 0 would try to bind to that same port. The THD tests have been using a fixed port since forever. Time will tell if this fixes #12876. Differential Revision: D10850614 fbshipit-source-id: c19f12bb4916141187ee8ddb52880f5f418310dc
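The racy pattern described in the commit above can be sketched as follows; this is an illustrative reproduction with a hypothetical helper name, not the actual test-suite code.

```python
import socket

def find_free_port():
    # The "right"-looking but racy strategy: bind to port 0 so the OS
    # assigns a free ephemeral port, record that port, then close the
    # socket. Between this close() and a later bind() to the recorded
    # port, another process may grab it, producing the
    # "Address already in use" failure described above.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    port = s.getsockname()[1]
    s.close()
    return port
```

Pinning a fixed, reserved port (as the THD tests do) trades flexibility for determinism and closes this window entirely.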
Differential Revision: D10404407 Differential Version: 61660077
Differential Revision: D10454455 Differential Version: 61660076
zdevito pushed a commit to zdevito/ATen that referenced this pull request on Oct 25, 2018
Summary: Pull Request resolved: pytorch/pytorch#13126 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: D10454455 Pulled By: ezyang fbshipit-source-id: 7018a41b94e316305751f2f8ad2c2d049799f5d4
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request on Apr 24, 2026
Summary: Pull Request resolved: pytorch#13126 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: D10454455 Pulled By: ezyang fbshipit-source-id: 7018a41b94e316305751f2f8ad2c2d049799f5d4
Stack:
:white_circle: #12713 Documentation on TensorImpl. 💚
:black_circle: #13126 [pytorch][PR] Commentary about size constraints on TensorImpl. 💚
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
GitHub Author: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D10454455