Skip to content

Remove variable_excluded_from_dispatch() check for factory functions.#46371

Closed
ailzhang wants to merge 2 commits intogh/ailzhang/30/basefrom
gh/ailzhang/30/head
Closed

Remove variable_excluded_from_dispatch() check for factory functions.#46371
ailzhang wants to merge 2 commits intogh/ailzhang/30/basefrom
gh/ailzhang/30/head

Conversation

@ailzhang
Copy link
Contributor

@ailzhang ailzhang commented Oct 15, 2020

Stack from ghstack:

Differential Revision: D24324545

ailzhang pushed a commit that referenced this pull request Oct 15, 2020
@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented Oct 15, 2020

💊 CI failures summary and remediations

As of commit e565991 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR---

1 failure not recognized by patterns:

Job Step Action
CircleCI docker-pytorch-linux-xenial-py3-clang5-android-ndk-r19c Check if image should be built 🔁 rerun
This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 3 times.

@ailzhang
Copy link
Contributor Author

Some perf data

Before this PR

 λ ~ numactl -C 10 python repro.py
<torch.utils.benchmark.utils.common.Measurement object at 0x7f37e5560b50>
torch.empty((1,))
  Median: 2.26 us
  IQR:    0.07 us (2.22 to 2.30)
  442 measurements, 10000 runs per measurement, 1 thread
 λ ~ numactl -C 10 python repro.py
<torch.utils.benchmark.utils.common.Measurement object at 0x7f0f34bc6510>
torch.empty((1,))
  Median: 2.19 us
  IQR:    0.07 us (2.15 to 2.22)
  458 measurements, 10000 runs per measurement, 1 thread

After this PR

 λ ~ numactl -C 10 python repro.py
<torch.utils.benchmark.utils.common.Measurement object at 0x7f9b7235ea50>
torch.empty((1,))
  Median: 1.96 us
  IQR:    0.05 us (1.93 to 1.98)
  511 measurements, 10000 runs per measurement, 1 thread
 λ ~ numactl -C 10 python repro.py
<torch.utils.benchmark.utils.common.Measurement object at 0x7f2b99b13410>
torch.empty((1,))
  Median: 2.01 us
  IQR:    0.06 us (1.98 to 2.04)
  498 measurements, 10000 runs per measurement, 1 thread

Delta of instruction counts for torch.empty((1,))

1674 ???:0x000000000024f160
200 build/../torch/csrc/jit/frontend/tracer.h:torch::PythonArgs::intlist(int)
200 build/../c10/util/intrusive_ptr.h:torch::autograd::impl::set_pyobj(at::Tensor const&, _object*)
200 build/../c10/core/TensorImpl.cpp:c10::TensorImpl::autograd_meta() const
200 build/../c10/core/DefaultDtype.cpp:c10::get_default_dtype()
200 build/../aten/src/TH/THAllocator.cpp:getTHDefaultAllocator()
200 /usr/include/c++/7/tuple:c10::TensorImpl::TensorImpl(c10::Storage&&, c10::DispatchKeySet, caffe2::TypeMeta const&, c10::optional<c10::Device>)
200 /usr/include/c++/7/bits/atomic_base.h:at::Tensor at::detail::make_tensor<c10::TensorImpl, c10::intrusive_ptr<c10::StorageImpl, c10::detail::intrusive_target_default_null_type<c10::StorageImpl> >, c10::DispatchKey, caffe2::TypeMeta&>(c10::intrusive_ptr<c10::StorageImpl, c10::detail::intrusive_target_default_null_type<c10::StorageImpl> >&&, c10::DispatchKey&&, caffe2::TypeMeta&)
100 build/../c10/core/TensorOptions.h:at::native::empty_cpu(c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>)
90 /build/glibc-OTsEL5/glibc-2.27/malloc/malloc.c:_int_free
45 /build/glibc-OTsEL5/glibc-2.27/malloc/malloc.c:_int_memalign
6 ???:0x0000000000215e40
-6 miniconda/include/python3.7m/pybind11/pytypes.h:pybind11::detail::function_call::~function_call()
-200 build/aten/src/ATen/core/TensorBody.h:at::native::empty_cpu(c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>)
-200 build/../c10/util/typeid.h:c10::TensorOptions::TensorOptions()
-200 build/../c10/util/Optional.h:torch::utils::maybe_initialize_cuda(c10::TensorOptions const&)
-200 build/../c10/core/TensorImpl.h:torch::autograd::impl::set_pyobj(at::Tensor const&, _object*)
-200 build/../c10/core/DispatchKeySet.h:at::native::empty_cpu(c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>)
-200 build/../aten/src/ATen/record_function.h:at::RecordFunction::RecordFunction(at::RecordScope)
-200 ???:0x0000000012bcaf0b
-200 /usr/include/c++/7/bits/stl_vector.h:torch::PythonArgParser::raw_parse(_object*, _object*, _object*, _object**)
-200 /usr/include/c++/7/bits/stl_vector.h:torch::PyWarningHandler::~PyWarningHandler()
-200 /usr/include/c++/7/bits/stl_construct.h:torch::PyWarningHandler::~PyWarningHandler()
-300 build/../c10/core/impl/LocalDispatchKeySet.h:c10::impl::tls_local_dispatch_key_set()
-300 build/../aten/src/ATen/native/TensorFactories.cpp:at::native::empty_cpu(c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>)
-600 build/aten/src/ATen/BackendSelectRegister.cpp:at::(anonymous namespace)::empty_memory_format(c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>)
-800 build/../c10/core/DispatchKey.cpp:c10::getAutogradKeyFromBackend(c10::DispatchKey)
-1700 ???:0x000000000024eaf0
-2800 build/../c10/core/impl/LocalDispatchKeySet.cpp:c10::impl::tls_local_dispatch_key_set()
-4400 /build/glibc-OTsEL5/glibc-2.27/elf/../sysdeps/x86_64/tls_get_addr.S:__tls_get_addr

Seems like an easy perf win if we just disable this check for factory ops. :D

@ailzhang
Copy link
Contributor Author

questions from @bhosmer on the old thread:
Just curious, how'd you arrive at the set of asserts to remove, just by spidering out from BackendSelect kernels? Asking bc I wouldn't have thought about the SparseTensor stuff as being affected by this change.

Curiosity question #2 - is anybody still asserting on variable_excluded_from_dispatch() after this change?

This is the minimal set I need to remove to make tests pass and I made sure they're called through factory ops.
There're a few left in codebase, I think we can remove them in a separate PR. :D

@ailzhang ailzhang requested review from bhosmer and ezyang October 15, 2020 01:41
@dr-ci
Copy link

dr-ci bot commented Oct 15, 2020

💊 CI failures summary and remediations

As of commit 0cc8779 (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)---
This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 6 times.

@zeeekhan77
Copy link

The upstream breakages are not fixed

ailzhang pushed a commit that referenced this pull request Oct 15, 2020
@codecov
Copy link

codecov bot commented Oct 16, 2020

Codecov Report

❗ No coverage uploaded for pull request base (gh/ailzhang/30/base@445db6b). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@                  Coverage Diff                   @@
##             gh/ailzhang/30/base   #46371   +/-   ##
======================================================
  Coverage                       ?   68.33%           
======================================================
  Files                          ?      410           
  Lines                          ?    53805           
  Branches                       ?        0           
======================================================
  Hits                           ?    36770           
  Misses                         ?    17035           
  Partials                       ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 445db6b...0cc8779. Read the comment docs.

@facebook-github-bot
Copy link
Contributor

@ailzhang merged this pull request in ec5f81f.

1 similar comment
@facebook-github-bot
Copy link
Contributor

@ailzhang merged this pull request in ec5f81f.

@facebook-github-bot facebook-github-bot deleted the gh/ailzhang/30/head branch October 19, 2020 14:21
pytorchmergebot pushed a commit that referenced this pull request Jan 15, 2023
…92168)

When tracing a model using dynamo, theses assertions fail. Following #29653 and #46371, we think it might be OK to remove these two assertions as well.

Pull Request resolved: #92168
Approved by: https://github.com/ezyang
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants