
[1.8.1] Make ideep honor torch.set_num_thread changes (#53871) #54025

Merged
malfet merged 1 commit into pytorch:release/1.8 from malfet:malfet/cp-53871
Mar 16, 2021

Conversation

@malfet
Contributor

@malfet malfet commented Mar 15, 2021

Summary:
When compiled with OpenMP support, `ideep`'s computational_cache caches the maximum number of OpenMP workers. This number can become stale after a `torch.set_num_threads` call, so clear the cache after the call.

Fixes #53565

This is a cherry-pick of #53871 into the release/1.8 branch.

Reviewed By: albanD

Differential Revision: D27003265

Pulled By: malfet

fbshipit-source-id: 1d84c23070eafb3d444e09590d64f97f99ae9d36
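
For context, a minimal Python sketch of the behavior this change targets; the convolution, shapes, and warm-up step here are illustrative assumptions, not the exact reproduction from #53565:

```python
import torch

# Illustrative sketch (assumed workload): an mkldnn/ideep-backed op is run once,
# which lets ideep cache the current OpenMP worker count; with this fix the
# cache is cleared when torch.set_num_threads is called, so later ops honor it.
x = torch.randn(1, 3, 224, 224)
conv = torch.nn.Conv2d(3, 64, kernel_size=3)

conv(x)                          # warm-up: ideep may cache the OpenMP worker count here
torch.set_num_threads(1)         # the fix clears ideep's cached value at this point
print(torch.get_num_threads())   # 1
conv(x)                          # should now run with a single thread
```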

@facebook-github-bot
Contributor

facebook-github-bot commented Mar 15, 2021

💊 CI failures summary and remediations

As of commit 111ae24 (more details on the Dr. CI page):


  • 3/3 failures possibly* introduced in this PR
    • 1/3 non-scanned failure(s)

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_test (1/2)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Mar 15 22:57:41 [E request_callback_no_python.cpp:653] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Mar 15 22:57:40 At:
Mar 15 22:57:40   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(120): serialize
Mar 15 22:57:40   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(172): serialize
Mar 15 22:57:40 
Mar 15 22:57:40 [E request_callback_no_python.cpp:653] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Mar 15 22:57:40 
Mar 15 22:57:40 At:
Mar 15 22:57:40   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(120): serialize
Mar 15 22:57:40   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(172): serialize
Mar 15 22:57:40 
Mar 15 22:57:41 [E request_callback_no_python.cpp:653] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Mar 15 22:57:41 
Mar 15 22:57:41 At:
Mar 15 22:57:41   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(120): serialize
Mar 15 22:57:41   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(172): serialize
Mar 15 22:57:41 
Mar 15 22:57:41 ok (2.541s)
Mar 15 22:57:43   test_return_future_remote (__main__.TensorPipeRpcTestWithSpawn) ... ok (2.346s)
Mar 15 22:57:46   test_return_local_rrefs (__main__.TensorPipeRpcTestWithSpawn) ... ok (2.440s)
Mar 15 22:57:49   test_rpc_profiling_async_function (__main__.TensorPipeRpcTestWithSpawn) ... ok (3.444s)
Mar 15 22:57:53   test_rpc_profiling_async_function_single_threaded (__main__.TensorPipeRpcTestWithSpawn) ... ok (3.442s)

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_test (2/2)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Mar 15 21:59:58 AssertionError: False is not true
Mar 15 21:59:58   test_where_scalar_valid_combination_xla_uint8 (__main__.TestTorchDeviceTypeXLA) ... ok (0.028s)
Mar 15 21:59:58 
Mar 15 21:59:58 ======================================================================
Mar 15 21:59:58 FAIL [0.003s]: test_pickle_gradscaler_xla (__main__.TestTorchDeviceTypeXLA)
Mar 15 21:59:58 ----------------------------------------------------------------------
Mar 15 21:59:58 Traceback (most recent call last):
Mar 15 21:59:58   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 290, in instantiated_test
Mar 15 21:59:58     result = test_fn(self, *args)
Mar 15 21:59:58   File "/var/lib/jenkins/workspace/xla/test/../../test/test_torch.py", line 6172, in test_pickle_gradscaler
Mar 15 21:59:58     self.assertTrue(a.is_enabled() if torch.cuda.is_available() else not a.is_enabled())
Mar 15 21:59:58 AssertionError: False is not true
Mar 15 21:59:58 
Mar 15 21:59:58 ----------------------------------------------------------------------
Mar 15 21:59:58 Ran 244 tests in 149.955s
Mar 15 21:59:58 
Mar 15 21:59:58 FAILED (failures=1, skipped=140)
Mar 15 21:59:58 
Mar 15 21:59:58 Generating XML reports...
Mar 15 21:59:58 Generated XML report: test-reports/python-unittest/TEST-TestTorchDeviceTypeXLA-20210315215728.xml
Mar 15 21:59:58 + cleanup
Mar 15 21:59:58 + retcode=1
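
For reference, a rough sketch of what the failing assertion above exercises; this is an assumed simplification, not the exact upstream test body from test_torch.py:

```python
import pickle
import torch

# Rough sketch (assumed simplification) of test_pickle_gradscaler:
# a GradScaler round-trips through pickle, and is_enabled() should track
# torch.cuda.is_available(); the XLA job above fails this assertion.
scaler = torch.cuda.amp.GradScaler(enabled=torch.cuda.is_available())
restored = pickle.loads(pickle.dumps(scaler))
assert restored.is_enabled() == torch.cuda.is_available()
```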

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI. Follow this link to opt out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

@malfet malfet merged commit c6139b7 into pytorch:release/1.8 Mar 16, 2021
@malfet malfet deleted the malfet/cp-53871 branch March 16, 2021 18:46
