
Separate TLS for InferenceMode#55238

Closed
ailzhang wants to merge 3 commits into gh/ailzhang/58/base from gh/ailzhang/58/head

Conversation


@ailzhang ailzhang commented Apr 2, 2021

Stack from ghstack:

I tried to avoid creating a new TLS entry, but InferenceMode::is_enabled()
is on the perf-critical path (the TensorImpl constructor), so it seems
worth adding one for it.
This PR removes one source of the instruction-count increase introduced by
#55008.
```
 λ ~ python compare.py
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.FunctionCounts object at 0x7f59097ef310>
     100  0x0000000004854750
    -100  0x0000000004854760
   -4400  c10::impl::tls_is_dispatch_key_included(...)
```

Differential Revision: D27539230


[ghstack-poisoned]

facebook-github-bot commented Apr 2, 2021

💊 CI failures summary and remediations

As of commit 739fede (more details on the Dr. CI page):


  • 3/3 failures possibly* introduced in this PR
    • 1/3 non-scanned failure(s)

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test1 (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Apr 05 22:09:08 AssertionError: 0 not greater than or equal to 1
Apr 05 22:09:08   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_distributed.py", line 323, in wrapper
Apr 05 22:09:08     fn()
Apr 05 22:09:08   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/distributed/distributed_test.py", line 3652, in test_ddp_logging_data_cpu
Apr 05 22:09:08     model_DDP = self._test_ddp_logging_data(is_gpu=False)
Apr 05 22:09:08   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/distributed/distributed_test.py", line 3625, in _test_ddp_logging_data
Apr 05 22:09:08     self.assertGreaterEqual(ddp_logging_data.forward_compute_time, 1)
Apr 05 22:09:08   File "/opt/conda/lib/python3.6/unittest/case.py", line 1227, in assertGreaterEqual
Apr 05 22:09:08     self.fail(self._formatMessage(msg, standardMsg))
Apr 05 22:09:08   File "/opt/conda/lib/python3.6/unittest/case.py", line 670, in fail
Apr 05 22:09:08     raise self.failureException(msg)
Apr 05 22:09:08 AssertionError: 0 not greater than or equal to 1
Apr 05 22:09:08 
Apr 05 22:09:08 
Apr 05 22:09:08 
Apr 05 22:09:08 ----------------------------------------------------------------------
Apr 05 22:09:08 Ran 182 tests in 284.505s
Apr 05 22:09:08 
Apr 05 22:09:08 FAILED (errors=1, skipped=71)
Apr 05 22:09:08 
Apr 05 22:09:08 Generating XML reports...
Apr 05 22:09:08 Generated XML report: test-reports/dist-gloo/distributed.test_distributed_fork/TEST-TestDistBackendWithFork-20210405220424.xml

1 failure not recognized by patterns:

Job: CircleCI pytorch_ios_12_0_0_x86_64_lite_interpreter_build · Step: Build (🔁 rerun)

This comment was automatically generated by Dr. CI.

@ailzhang ailzhang requested review from bhosmer, ezyang and robieta April 2, 2021 18:18
ailzhang pushed a commit that referenced this pull request Apr 2, 2021
ghstack-source-id: 96dd9d5
Pull Request resolved: #55238
@ezyang ezyang left a comment


OK. Note that I probably would have worded the comment a little differently; saying instead that there is an invariant that set_enabled must uphold (this is NBD because it's private so you can enforce the invariant easily anyway)

ailzhang pushed a commit that referenced this pull request Apr 5, 2021
ghstack-source-id: 84dce14
Pull Request resolved: #55238
ailzhang pushed a commit to ailzhang/pytorch that referenced this pull request Apr 6, 2021
Summary:
Pull Request resolved: pytorch#55238

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D27539230

Pulled By: ailzhang

fbshipit-source-id: 54b0c141d59ac569a154a036b284001fca9fc560
@facebook-github-bot

@ailzhang merged this pull request in bc05867.

@facebook-github-bot facebook-github-bot deleted the gh/ailzhang/58/head branch April 10, 2021 14:17
ailzhang pushed a commit that referenced this pull request Apr 12, 2021
Now that #55238 has been landed for a week
with no complaints, it seems safe to say FEATURE_TORCH_MOBILE is
always true and we can do some cleanup.

[ghstack-poisoned]
ailzhang pushed a commit that referenced this pull request Apr 12, 2021

ghstack-source-id: 43cbb36
Pull Request resolved: #55835
facebook-github-bot pushed a commit that referenced this pull request Apr 14, 2021
Summary:
Pull Request resolved: #55835

Test Plan: Imported from OSS

Reviewed By: ezyang, walterddr

Differential Revision: D27721284

Pulled By: ailzhang

fbshipit-source-id: 4896bc5f736373d0922cfbe8eed0d16df62f0fa1
krshrimali pushed a commit to krshrimali/pytorch that referenced this pull request May 19, 2021