
[PyTorch] Use DimVector in at::matmul #72230

Closed
swolchok wants to merge 5 commits into gh/swolchok/444/base from gh/swolchok/444/head

Conversation

@swolchok
Contributor

swolchok commented Feb 3, 2022

Stack from ghstack:

Here's a small PR that only fixes the extra heap allocations for shapes. Hopefully won't get stuck like #64387.

Differential Revision: [D33962610](https://our.internmc.facebook.com/intern/diff/D33962610/)

[ghstack-poisoned]
@facebook-github-bot
Contributor

facebook-github-bot commented Feb 3, 2022

🔗 Helpful links

💊 CI failures summary and remediations

As of commit bb30773 (more details on the Dr. CI page):


  • 3/3 failures introduced in this PR

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build win-vs2019-cpu-py3 / test (default, 2, 2, windows.4xlarge) (1/3)

Step: "Setup Python3" (full log | diagnosis details | 🔁 rerun)

2022-02-14T19:23:10.5029367Z   SHARD_NUMBER: 2
2022-02-14T19:23:10.5029682Z   NUM_TEST_SHARDS: 2
2022-02-14T19:23:10.5029990Z   TEST_CONFIG: default
2022-02-14T19:23:10.5031097Z   http_proxy: http://internal-tf-lb-20210727220640487900000002-835786077.us-east-1.elb.amazonaws.com:3128
2022-02-14T19:23:10.5032927Z   https_proxy: http://internal-tf-lb-20210727220640487900000002-835786077.us-east-1.elb.amazonaws.com:3128
2022-02-14T19:23:10.5034056Z   PYTORCH_IGNORE_DISABLED_ISSUES: 
2022-02-14T19:23:10.5034398Z   BRANCH: 
2022-02-14T19:23:10.5034638Z   TAG: 
2022-02-14T19:23:10.5034916Z   WORKFLOW_ID: 1842658102
2022-02-14T19:23:10.5035208Z ##[endgroup]
2022-02-14T19:23:10.5579722Z C:\actions-runner\_work\_temp\c5af4a52-2237-4dfb-95cf-b9d18c23dd12.sh: line 1: python3: command not found
2022-02-14T19:23:10.5603499Z ##[error]Process completed with exit code 127.
2022-02-14T19:23:10.5707295Z ##[group]Run rm -rf ./*
2022-02-14T19:23:10.5707673Z rm -rf ./*
2022-02-14T19:23:10.5728429Z shell: C:\Program Files\Git\usr\bin\bash.EXE --noprofile --norc -e -o pipefail {0}
2022-02-14T19:23:10.5728905Z env:
2022-02-14T19:23:10.5729273Z   BUILD_ENVIRONMENT: win-vs2019-cpu-py3
2022-02-14T19:23:10.5729664Z   BUILD_WHEEL: 1
2022-02-14T19:23:10.5729933Z   MAX_JOBS: 8
2022-02-14T19:23:10.5730224Z   CUDA_VERSION: cpu
2022-02-14T19:23:10.5730489Z   IN_CI: 1

See GitHub Actions build linux-bionic-py3.7-clang9 / test (xla, 1, 1, linux.2xlarge) (2/3)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-02-14T18:21:35.2287750Z  at::Tensor XLANativeFunctions::gelu(const at::Tensor& self) {
2022-02-14T18:21:35.2287870Z             ^~~~~~~~~~~~~~~~~~
2022-02-14T18:21:35.2288141Z In file included from /var/lib/jenkins/workspace/xla/torch_xla/csrc/aten_xla_type.cpp:13:0:
2022-02-14T18:21:35.2288602Z /var/lib/jenkins/workspace/xla/torch_xla/csrc/XLANativeFunctions.h:182:19: error: candidate is: static at::Tensor torch_xla::XLANativeFunctions::gelu(const at::Tensor&, c10::string_view)
2022-02-14T18:21:35.2288848Z  static at::Tensor gelu(const at::Tensor & self, c10::string_view approximate);
2022-02-14T18:21:35.2288965Z                    ^~~~
2022-02-14T18:21:35.2289941Z /var/lib/jenkins/workspace/xla/torch_xla/csrc/aten_xla_type.cpp:1514:12: error: prototype for ‘at::Tensor torch_xla::XLANativeFunctions::gelu_backward(const at::Tensor&, const at::Tensor&)’ does not match any in class ‘torch_xla::XLANativeFunctions’
2022-02-14T18:21:35.2290186Z  at::Tensor XLANativeFunctions::gelu_backward(const at::Tensor& grad,
2022-02-14T18:21:35.2290289Z             ^~~~~~~~~~~~~~~~~~
2022-02-14T18:21:35.2290813Z In file included from /var/lib/jenkins/workspace/xla/torch_xla/csrc/aten_xla_type.cpp:13:0:
2022-02-14T18:21:35.2291349Z /var/lib/jenkins/workspace/xla/torch_xla/csrc/XLANativeFunctions.h:183:19: error: candidate is: static at::Tensor torch_xla::XLANativeFunctions::gelu_backward(const at::Tensor&, const at::Tensor&, c10::string_view)
2022-02-14T18:21:35.2291662Z  static at::Tensor gelu_backward(const at::Tensor & grad_output, const at::Tensor & self, c10::string_view approximate);
2022-02-14T18:21:35.2291787Z                    ^~~~~~~~~~~~~
2022-02-14T18:21:35.2336087Z [99/179] c++ -MMD -MF /var/lib/jenkins/workspace/xla/build/temp.linux-x86_64-3.7/torch_xla/csrc/ops/index_select.o.d -pthread -B /opt/conda/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/var/lib/jenkins/workspace/xla -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-bin -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/protobuf_archive/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_protobuf/src -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/eigen_archive -I/var/lib/jenkins/workspace/xla/third_party/tensorflow/bazel-tensorflow/external/com_google_absl -I/var/lib/jenkins/workspace -I/var/lib/jenkins/workspace/torch/csrc -I/var/lib/jenkins/workspace/torch/lib/tmp_install/include -I/opt/conda/lib/python3.7/site-packages/torch/include -I/opt/conda/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/lib/python3.7/site-packages/torch/include/TH -I/opt/conda/lib/python3.7/site-packages/torch/include/THC -I/opt/conda/include/python3.7m -c -c /var/lib/jenkins/workspace/xla/torch_xla/csrc/ops/index_select.cpp -o /var/lib/jenkins/workspace/xla/build/temp.linux-x86_64-3.7/torch_xla/csrc/ops/index_select.o -std=c++14 -Wno-sign-compare -Wno-deprecated-declarations -Wno-return-type -DNDEBUG -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_clang"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1002"' -DTORCH_EXTENSION_NAME=_XLAC -D_GLIBCXX_USE_CXX11_ABI=1
2022-02-14T18:21:35.2336594Z cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
2022-02-14T18:21:35.2336842Z In file included from /var/lib/jenkins/workspace/c10/util/Logging.h:28:0,
2022-02-14T18:21:35.2337205Z                  from /var/lib/jenkins/workspace/c10/core/TensorImpl.h:14,
2022-02-14T18:21:35.2337643Z                  from /opt/conda/lib/python3.7/site-packages/torch/include/ATen/core/TensorBody.h:21,
2022-02-14T18:21:35.2338044Z                  from /opt/conda/lib/python3.7/site-packages/torch/include/ATen/Tensor.h:3,
2022-02-14T18:21:35.2338265Z                  from /var/lib/jenkins/workspace/torch/csrc/lazy/core/hash.h:12,
2022-02-14T18:21:35.2338473Z                  from /var/lib/jenkins/workspace/xla/torch_xla/csrc/ir.h:19,

See GitHub Actions build linux-xenial-py3.7-clang7-onnx / test (default, 1, 2, linux.2xlarge) (3/3)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-02-14T18:18:28.2357836Z ++ stat --format %U /opt/conda/bin/pip
2022-02-14T18:18:28.2367902Z + PIP_USER=jenkins
2022-02-14T18:18:28.2370095Z ++ id -u -n
2022-02-14T18:18:28.2379788Z + CURRENT_USER=jenkins
2022-02-14T18:18:28.2379965Z + [[ jenkins = root ]]
2022-02-14T18:18:28.2380285Z + pip -q uninstall -y hypothesis
2022-02-14T18:18:28.6033753Z + pip -q uninstall -y coverage
2022-02-14T18:18:28.8920970Z WARNING: Skipping coverage as it is not installed.
2022-02-14T18:18:28.9340242Z + pip -q install attrs==18.1.0 -f https://s3.amazonaws.com/ossci-linux/wheels/attrs-18.1.0-py2.py3-none-any.whl
2022-02-14T18:18:29.2675860Z WARNING: Skipping page https://s3.amazonaws.com/ossci-linux/wheels/attrs-18.1.0-py2.py3-none-any.whl because the HEAD request got Content-Type: binary/octet-stream.The only supported Content-Type is text/html
2022-02-14T18:18:29.7688000Z ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
2022-02-14T18:18:29.7688519Z pytest 7.0.1 requires attrs>=19.2.0, but you have attrs 18.1.0 which is incompatible.
2022-02-14T18:18:29.8283367Z + pip -q install coverage==4.5.1 -f https://s3.amazonaws.com/ossci-linux/wheels/coverage-4.5.1-cp36-cp36m-macosx_10_12_x86_64.whl
2022-02-14T18:18:30.1594272Z WARNING: Skipping page https://s3.amazonaws.com/ossci-linux/wheels/coverage-4.5.1-cp36-cp36m-macosx_10_12_x86_64.whl because the HEAD request got Content-Type: binary/octet-stream.The only supported Content-Type is text/html
2022-02-14T18:18:31.1436640Z + pip -q install hypothesis==3.44.6 -f https://s3.amazonaws.com/ossci-linux/wheels/hypothesis-3.44.6-py3-none-any.whl
2022-02-14T18:18:31.4614504Z WARNING: Skipping page https://s3.amazonaws.com/ossci-linux/wheels/hypothesis-3.44.6-py3-none-any.whl because the HEAD request got Content-Type: binary/octet-stream.The only supported Content-Type is text/html
2022-02-14T18:18:46.4808350Z WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', timeout('timed out'))': /simple/hypothesis/
2022-02-14T18:18:47.6895063Z + EXTRA_TESTS=()
2022-02-14T18:18:47.6895568Z + [[ linux-xenial-py3.7-clang7-onnx == *-cuda* ]]
2022-02-14T18:18:47.6896060Z + [[ linux-xenial-py3.7-clang7-onnx == *-rocm* ]]
2022-02-14T18:18:47.6896283Z + rocm_ignore_test=()

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@pytorch-bot

pytorch-bot bot commented Feb 3, 2022

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/66626bad62f9a15045b0de6cd7938db33db9ae47/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Workflows Labels (bold enabled) Status
Triggered Workflows
linux-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
linux-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
linux-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
linux-binary-manywheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
linux-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk, ciflow/xla ✅ triggered
linux-bionic-rocm4.5-py3.7 ciflow/all, ciflow/default, ciflow/linux, ciflow/rocm, ciflow/trunk ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7-no-ops ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
windows-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
windows-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
windows-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-full-jit ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-full-jit ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.1-py3.7-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.1-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped

swolchok added a commit that referenced this pull request Feb 3, 2022
Here's a small PR that only fixes the extra heap allocations for shapes. Hopefully won't get stuck like #64387.

Differential Revision: [D33962610](https://our.internmc.facebook.com/intern/diff/D33962610/)

ghstack-source-id: 148261442
Pull Request resolved: #72230
@lezcano
Collaborator

lezcano commented Feb 3, 2022

I should spend some time splitting that PR into more manageable chunks to be able to isolate that mobile failure...

Collaborator

@IvanYashchuk left a comment

Maybe tensor.sizes().vec() should be disallowed? :)
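The quip refers to the pattern this PR removes: `Tensor::sizes()` returns a non-owning view (`c10::IntArrayRef`), and calling `.vec()` on it copies the dims into a freshly heap-allocated `std::vector<int64_t>`. A minimal sketch with a toy view type (`IntArrayRefToy` and `numel_from_sizes` are invented for illustration; they are not the real c10 classes):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for c10::ArrayRef<int64_t>: a non-owning (pointer, length)
// view over storage the tensor already owns. Creating one is free.
struct IntArrayRefToy {
  const int64_t* data;
  std::size_t len;

  // .vec() materialises the view as an owning std::vector, which is a
  // heap allocation plus a copy -- the hidden cost behind sizes().vec().
  std::vector<int64_t> vec() const {
    return std::vector<int64_t>(data, data + len);
  }
};

// Most consumers only need to read the sizes, which the view supports
// without any copy or allocation.
int64_t numel_from_sizes(IntArrayRefToy sizes) {
  int64_t n = 1;
  for (std::size_t i = 0; i < sizes.len; ++i) n *= sizes.data[i];
  return n;
}
```

Replacing `sizes().vec()` with the view itself, or with a `DimVector` when an owning copy is needed, keeps hot paths allocation-free for typical ranks.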

swolchok added a commit that referenced this pull request Feb 3, 2022
Pull Request resolved: #72230

Here's a small PR that only fixes the extra heap allocations for shapes. Hopefully won't get stuck like #64387.
ghstack-source-id: 148314580

Differential Revision: [D33962610](https://our.internmc.facebook.com/intern/diff/D33962610/)

[ghstack-poisoned]
swolchok added a commit that referenced this pull request Feb 4, 2022
Pull Request resolved: #72230

Here's a small PR that only fixes the extra heap allocations for shapes. Hopefully won't get stuck like #64387.
ghstack-source-id: 148446641

Differential Revision: [D33962610](https://our.internmc.facebook.com/intern/diff/D33962610/)

[ghstack-poisoned]
swolchok added a commit that referenced this pull request Feb 7, 2022
Pull Request resolved: #72230

Here's a small PR that only fixes the extra heap allocations for shapes. Hopefully won't get stuck like #64387.
ghstack-source-id: 148556529

Differential Revision: [D33962610](https://our.internmc.facebook.com/intern/diff/D33962610/)

[ghstack-poisoned]
swolchok added a commit that referenced this pull request Feb 14, 2022
Pull Request resolved: #72230

Here's a small PR that only fixes the extra heap allocations for shapes. Hopefully won't get stuck like #64387.
ghstack-source-id: 149069527

Differential Revision: [D33962610](https://our.internmc.facebook.com/intern/diff/D33962610/)
facebook-github-bot pushed a commit that referenced this pull request Feb 15, 2022
Summary:
Pull Request resolved: #72230

Here's a small PR that only fixes the extra heap allocations for shapes. Hopefully won't get stuck like #64387.
ghstack-source-id: 149069527

Test Plan: CI

Reviewed By: ngimel

Differential Revision: D33962610

fbshipit-source-id: 51e200f5237bdf225bfb2445e1e36bacd0e65e92
@github-actions
Contributor

Hey @swolchok.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Feb 15, 2022
Summary:
Pull Request resolved: pytorch/pytorch#72230

Here's a small PR that only fixes the extra heap allocations for shapes. Hopefully won't get stuck like #64387.
ghstack-source-id: 149069527

Test Plan: CI

Reviewed By: ngimel

Differential Revision: D33962610

fbshipit-source-id: 51e200f5237bdf225bfb2445e1e36bacd0e65e92
(cherry picked from commit 027537f32965d23fc78a36fec71be41cd5cbce3d)
lezcano added a commit that referenced this pull request Apr 4, 2022
This PR implements the bulk of #64387

Part of the optimisations were already merged in #72230

These optimisations include:
- Make the code `const` correct.
- Create `DimVector`s more efficiently (e.g. prefer `append` over `insert`).
- Access the sizes of the tensors via `sizes().front()` / `sizes().back()` / `sizes().end()[-2]`.
- Do not create intermediary tensors / vectors when they can be avoided.
- Call `reshape` rather than `expect_contiguous` + `view`.

On top of these, it fixes a correctness issue in `matmul_out`, where the
out parameter was not resized correctly when passed to the backends.
This involves removing the use of `set_` from the calling code, as
requested by ezyang, and it accounts for most of the complexity this
PR adds.

[ghstack-poisoned]
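The shape-building idiom the list describes can be sketched as follows. `matmul_out_shape` and `DimVec` are illustrative stand-ins, not the actual ATen code; broadcasting is ignored and both inputs are assumed to carry identical batch dims.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using DimVec = std::vector<int64_t>;  // stand-in for at::DimVector

// Output shape for (...batch, n, m) x (...batch, m, p): copy the batch
// dims once in order and append the two matrix dims, rather than building
// {n, p} first and insert()-ing the batch dims at the front (which shifts
// every element). end()[-2] reads the second-to-last size directly.
DimVec matmul_out_shape(const DimVec& a, const DimVec& b) {
  assert(a.size() >= 2 && b.size() >= 2);
  assert(a.back() == b.end()[-2]);      // inner dims must agree: m == m
  DimVec out(a.begin(), a.end() - 2);   // batch dims, appended in order
  out.push_back(a.end()[-2]);           // n: second-to-last dim of a
  out.push_back(b.back());              // p: last dim of b
  return out;
}
```

With a `DimVector` in place of `std::vector`, this construction also stays off the heap for typical ranks, which is the point of the earlier PR in the stack.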
lezcano added a commit that referenced this pull request Apr 5, 2022
…ic boogaloo"


This PR implements the bulk of #64387

Part of the optimisations were already merged in #72230

A number of these optimisations include:
- Make the code `const` correct.
- Create `DimVector`'s more efficiently (e.g. prefer `append` over
`insert`).
- Access sizes of the tensors via `sizes().front()` / `sizes().back()`
  / `sizes().end()[-2]`
- Do not create intermediary tensors / vectors when it can be avoided.
- Call `reshape` rather than `expect_contiguous`  + `view`

On top of these, it fixes a correctness issue of `matmul_out`, where the
out parameter was not resized correctly when passed to the backends.
This involves removing the use of `set_` from the calling code, as
requested by ezyang, and it incurs on most of the complexity of the
code that this PR adds.

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Apr 5, 2022
This PR implements the bulk of #64387

Part of the optimisations were already merged in #72230

A number of these optimisations include:
- Make the code `const` correct.
- Create `DimVector`'s more efficiently (e.g. prefer `append` over
`insert`).
- Access sizes of the tensors via `sizes().front()` / `sizes().back()`
  / `sizes().end()[-2]`
- Do not create intermediary tensors / vectors when it can be avoided.
- Call `reshape` rather than `expect_contiguous`  + `view`

On top of these, it fixes a correctness issue of `matmul_out`, where the
out parameter was not resized correctly when passed to the backends.
This involves removing the use of `set_` from the calling code, as
requested by ezyang, and it incurs on most of the complexity of the
code that this PR adds.

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Apr 5, 2022
This PR implements the bulk of
#64387

Part of the optimisations were already merged in
#72230

A number of these optimisations include:
- Make the code `const` correct.
- Create `DimVector`'s more efficiently (e.g. prefer `append` over
`insert`).
- Access sizes of the tensors via `sizes().front()` / `sizes().back()`
  / `sizes().end()[-2]`
- Do not create intermediary tensors / vectors when it can be avoided.
- Call `reshape` rather than `expect_contiguous`  + `view`

On top of these, it fixes a correctness issue of `matmul_out`, where the
out parameter was not resized correctly when passed to the backends.
This involves removing the use of `set_` from the calling code, as
requested by ezyang, and it incurs on most of the complexity of the
code that this PR adds.

ghstack-source-id: 0808c8f
Pull Request resolved: #75197
lezcano added a commit that referenced this pull request Apr 5, 2022
…ic boogaloo"


This PR implements the bulk of #64387

Part of the optimisations were already merged in #72230

A number of these optimisations include:
- Make the code `const` correct.
- Create `DimVector`'s more efficiently (e.g. prefer `append` over
`insert`).
- Access sizes of the tensors via `sizes().front()` / `sizes().back()`
  / `sizes().end()[-2]`
- Do not create intermediary tensors / vectors when it can be avoided.
- Call `reshape` rather than `expect_contiguous`  + `view`

On top of these, it fixes a correctness issue of `matmul_out`, where the
out parameter was not resized correctly when passed to the backends.
This involves removing the use of `set_` from the calling code, as
requested by ezyang, and it incurs on most of the complexity of the
code that this PR adds.

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Apr 5, 2022
This PR implements the bulk of #64387

Part of the optimisations were already merged in #72230

A number of these optimisations include:
- Make the code `const` correct.
- Create `DimVector`'s more efficiently (e.g. prefer `append` over
`insert`).
- Access sizes of the tensors via `sizes().front()` / `sizes().back()`
  / `sizes().end()[-2]`
- Do not create intermediary tensors / vectors when it can be avoided.
- Call `reshape` rather than `expect_contiguous`  + `view`

On top of these, it fixes a correctness issue of `matmul_out`, where the
out parameter was not resized correctly when passed to the backends.
This involves removing the use of `set_` from the calling code, as
requested by ezyang, and it incurs on most of the complexity of the
code that this PR adds.

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Apr 6, 2022
…ic boogaloo"


This PR implements the bulk of #64387

Part of the optimisations were already merged in #72230

A number of these optimisations include:
- Make the code `const` correct.
- Create `DimVector`'s more efficiently (e.g. prefer `append` over
`insert`).
- Access sizes of the tensors via `sizes().front()` / `sizes().back()`
  / `sizes().end()[-2]`
- Do not create intermediary tensors / vectors when it can be avoided.
- Call `reshape` rather than `expect_contiguous`  + `view`

On top of these, it fixes a correctness issue of `matmul_out`, where the
out parameter was not resized correctly when passed to the backends.
This involves removing the use of `set_` from the calling code, as
requested by ezyang, and it incurs on most of the complexity of the
code that this PR adds.

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Apr 6, 2022
This PR implements the bulk of
#64387

Part of the optimisations were already merged in
#72230

A number of these optimisations include:
- Make the code `const` correct.
- Create `DimVector`'s more efficiently (e.g. prefer `append` over
`insert`).
- Access sizes of the tensors via `sizes().front()` / `sizes().back()`
  / `sizes().end()[-2]`
- Do not create intermediary tensors / vectors when it can be avoided.
- Call `reshape` rather than `expect_contiguous`  + `view`

On top of these, it fixes a correctness issue of `matmul_out`, where the
out parameter was not resized correctly when passed to the backends.
This involves removing the use of `set_` from the calling code, as
requested by ezyang, and it incurs on most of the complexity of the
code that this PR adds.

ghstack-source-id: 4f44593
Pull Request resolved: #75197
lezcano added a commit that referenced this pull request Apr 6, 2022
This PR implements the bulk of #64387

Part of the optimisations were already merged in #72230

A number of these optimisations include:
- Make the code `const` correct.
- Create `DimVector`'s more efficiently (e.g. prefer `append` over
`insert`).
- Access sizes of the tensors via `sizes().front()` / `sizes().back()`
  / `sizes().end()[-2]`
- Do not create intermediary tensors / vectors when it can be avoided.
- Call `reshape` rather than `expect_contiguous`  + `view`

On top of these, it fixes a correctness issue of `matmul_out`, where the
out parameter was not resized correctly when passed to the backends.
This involves removing the use of `set_` from the calling code, as
requested by ezyang, and it incurs on most of the complexity of the
code that this PR adds.

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Apr 6, 2022
…ic boogaloo"


This PR implements the bulk of #64387

Part of the optimisations were already merged in #72230

A number of these optimisations include:
- Make the code `const` correct.
- Create `DimVector`'s more efficiently (e.g. prefer `append` over
`insert`).
- Access sizes of the tensors via `sizes().front()` / `sizes().back()`
  / `sizes().end()[-2]`
- Do not create intermediary tensors / vectors when it can be avoided.
- Call `reshape` rather than `expect_contiguous`  + `view`

On top of these, it fixes a correctness issue of `matmul_out`, where the
out parameter was not resized correctly when passed to the backends.
This involves removing the use of `set_` from the calling code, as
requested by ezyang, and it incurs on most of the complexity of the
code that this PR adds.

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Apr 6, 2022
ghstack-source-id: 429f9e4
Pull Request resolved: #75197