Update foreach APIs to use scalar lists by izdeby · Pull Request #48223 · pytorch/pytorch

izdeby · 2020-11-19T00:07:57Z

Stack from ghstack:

#49039 [wip] Replace optimizers in torch.optim with the ones from torch.optim._multi_tensor
#48224 [wip] Pushed unsupported scenarios to slow path
#51170 Updated foreach pointwise ops to support complex and bfloat16 on CUDA
#49714 Update foreach max/min
#51060 Refactor foreach pointwise ops tests to use OpInfo
#51059 Refactor foreach binary ops tests with tensor lists to use OpInfo
#51058 Refactor foreach binary ops tests with scalars to use OpInfo
#49712 Refactor foreach unary ops tests to use OpInfo
#49250 Update foreach binary ops with a single scalar and list of tensors / Update foreach_div and foreach_sub logic
#49249 Update foreach binary ops with a scalar list
#51131 Refactor ForeachUtils.h
#49248 Refactor ForeachUnaryOp.cu
#48223 Update foreach APIs to use scalar lists

Differential Revision: D25074763

Motivation
Update existing _foreach APIs to use ScalarList instead of at::ArrayRef<double>.

Testing
Update the tests assuming that any scalar type can be passed now, not just double.
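
To make the change concrete, here is a minimal pure-Python model of what a foreach binary op with a scalar list computes: the i-th scalar is applied to every element of the i-th tensor. Plain lists of numbers stand in for tensors, and `foreach_add_scalarlist` is a hypothetical name used only for illustration; it is not a PyTorch function and says nothing about how the kernels are implemented. The sketch only shows that the scalars need not be doubles.

```python
from numbers import Number
from typing import List, Sequence

# Stand-in for a real tensor; an assumption made for illustration only.
FakeTensor = List[Number]


def foreach_add_scalarlist(
    tensors: Sequence[FakeTensor], scalars: Sequence[Number]
) -> List[FakeTensor]:
    """Reference model: add scalars[i] to every element of tensors[i]."""
    if len(tensors) != len(scalars):
        raise ValueError("tensor list and scalar list must have the same length")
    return [[x + s for x in t] for t, s in zip(tensors, scalars)]


tensors = [[1.0, 2.0], [3.0, 4.0]]

# Integer scalars: previously these would have had to be passed as doubles.
int_result = foreach_add_scalarlist(tensors, [1, 2])

# Complex scalars: another scalar type a double-only list cannot represent.
complex_result = foreach_add_scalarlist(tensors, [1 + 1j, 2j])
```

The model returns new lists and leaves `tensors` untouched, which corresponds to the out-of-place variant of a foreach op.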

[ghstack-poisoned]

dr-ci · 2020-11-19T00:29:35Z

💊 CI failures summary and remediations

As of commit 2a95c20 (more details on the Dr. CI page):

1/2 failures possibly* introduced in this PR
- 1/1 non-CircleCI failure(s)
1/2 tentatively recognized as flaky ❄️

❄️ 1 failure tentatively classified as flaky, but reruns have not yet been triggered to confirm:

pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test2 (1/1)

Step: "Run tests" ❄️

Feb 02 21:05:00 RuntimeError: CUDA error: an illegal memory access was encountered

Feb 02 21:05:00   File "test_optim.py", line 1994, in test_update_bn_dnn
Feb 02 21:05:00     self._test_update_bn(dnn.cuda(), dl_x, dl_xy, True)
Feb 02 21:05:00   File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in cuda
Feb 02 21:05:00     return self._apply(lambda t: t.cuda(device))
Feb 02 21:05:00   File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 387, in _apply
Feb 02 21:05:00     module._apply(fn)
Feb 02 21:05:00   File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 409, in _apply
Feb 02 21:05:00     param_applied = fn(param)
Feb 02 21:05:00   File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in <lambda>
Feb 02 21:05:00     return self._apply(lambda t: t.cuda(device))
Feb 02 21:05:00 RuntimeError: CUDA error: an illegal memory access was encountered
Feb 02 21:05:00 
Feb 02 21:05:00 ----------------------------------------------------------------------
Feb 02 21:05:00 Ran 104 tests in 38.556s
Feb 02 21:05:00 
Feb 02 21:05:00 FAILED (errors=13)
Feb 02 21:05:00 
Feb 02 21:05:00 Generating XML reports...
Feb 02 21:05:00 Generated XML report: test-reports/dist-gloo/TEST-TestLRScheduler-20210202210421.xml
Feb 02 21:05:00 Generated XML report: test-reports/dist-gloo/TEST-TestOptim-20210202210421.xml
Feb 02 21:05:00 Generated XML report: test-reports/dist-gloo/TEST-TestSWAUtils-20210202210421.xml



Differential Revision: [D25074763](https://our.internmc.facebook.com/intern/diff/D25074763) **Motivation** Update existing _foreach APIs to use ScalarList instead of at::ArrayRef<double> **Testing** Update the tests assuming that any scalar type can be passed now, not just double. [ghstack-poisoned]

aten/src/ATen/native/ForeachUtils.h

aten/src/ATen/native/native_functions.yaml

test/test_foreach.py

zou3519 · 2020-12-10T18:50:59Z

test/test_foreach.py

+                                    self.assertEqual(res, expected)
+
+                                foreach_bin_op_(tensors, scalars)
+                                self.assertEqual(res, tensors)


Some general feedback is: the tests are really hard to read because they branch so much. I'm not sure how to make them easier to read, though. Perhaps we should consider splitting them up or think about why the behavior diverges so much?

Yes, I fully agree. That's why, in the next PR in this stack, I separate all the tests. Much easier to read and maintain.
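
As a rough illustration of the kind of split discussed here (not the actual follow-up PR), the out-of-place and in-place checks from the quoted diff could live in separate, branch-free tests. `foreach_mul` and `foreach_mul_` are hypothetical pure-Python stand-ins that operate on lists of numbers; they are not PyTorch APIs.

```python
import copy
import unittest


def foreach_mul(tensors, scalars):
    """Out-of-place model: returns new lists and leaves the inputs untouched."""
    return [[x * s for x in t] for t, s in zip(tensors, scalars)]


def foreach_mul_(tensors, scalars):
    """In-place model: overwrites each inner list with the scaled values."""
    for t, s in zip(tensors, scalars):
        t[:] = [x * s for x in t]


class TestForeachMulScalarList(unittest.TestCase):
    def setUp(self):
        # Fresh inputs per test, so the in-place test cannot affect the other.
        self.tensors = [[1.0, 2.0], [3.0, 4.0]]
        self.scalars = [2, 0.5]  # one scalar per tensor

    def test_out_of_place_matches_expected(self):
        original = copy.deepcopy(self.tensors)
        res = foreach_mul(self.tensors, self.scalars)
        self.assertEqual(res, [[2.0, 4.0], [1.5, 2.0]])
        self.assertEqual(self.tensors, original)  # inputs are not modified

    def test_in_place_matches_out_of_place(self):
        # Mirrors the quoted check: the in-place result equals the out-of-place one.
        res = foreach_mul(self.tensors, self.scalars)
        foreach_mul_(self.tensors, self.scalars)
        self.assertEqual(res, self.tensors)
```

Each test covers one variant, so neither needs to branch on which variant it is exercising.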


facebook-github-bot · 2021-02-02T22:58:09Z

@izdeby merged this pull request in cce84b5.

vincentqb · 2021-02-04T00:43:55Z

I see errors on master after this pull request landed, see history. Can we confirm the errors are not related to this PR?

Example:

Traceback (most recent call last):
  File "test_optim.py", line 386, in test_adam
    lambda weight, bias: optimizer([weight, bias], lr=1e-3)
  File "test_optim.py", line 243, in _test_basic_cases
    scheduler_constructors
  File "test_optim.py", line 126, in _test_basic_cases_template
    optimizer.step(fn)
  File "/opt/conda/lib/python3.6/site-packages/torch/optim/optimizer.py", line 89, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/optim/_multi_tensor/adam.py", line 66, in step
    loss = closure()
  File "test_optim.py", line 115, in fn
    loss.backward()
  File "/opt/conda/lib/python3.6/site-packages/torch/tensor.py", line 245, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 147, in backward
    allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
RuntimeError: CUDA error: an illegal memory access was encountered

ngimel · 2021-02-04T00:56:14Z

I've reverted this PR: there's an IMA (illegal memory access) failure in this PR's CI, and master is currently broken.

facebook-github-bot · 2021-02-04T01:07:01Z

This pull request has been reverted by 443a431.

Summary: Caused by #48223 revert
Pull Request resolved: #51702
Reviewed By: mruberry
Differential Revision: D26245905
Pulled By: ngimel
fbshipit-source-id: 9fd7860ecb5c22b2e568db3347d51e648d6c5d6b

Update foreach APIs to use scalar lists

0ef22d7

[ghstack-poisoned]

This was referenced Nov 19, 2020

Added foreach_trunc, foreach_reciprocal, foreach_sigmoid APIs #47385

Closed

Enabled Scalar lists #48222

Closed

[wip] Pushed unsupported scenarios to slow path #48224

Closed

facebook-github-bot added the cla signed label Nov 19, 2020

izdeby changed the title ~~Update foreach APIs to use scalar lists~~ [WIP] Update foreach APIs to use scalar lists Nov 19, 2020

Iurii Zdebskyi added 3 commits December 2, 2020 13:59

Update on "[WIP] Update foreach APIs to use scalar lists"

c84e8a3

Differential Revision: [D25074763](https://our.internmc.facebook.com/intern/diff/D25074763) [ghstack-poisoned]

Update on "[WIP] Update foreach APIs to use scalar lists"

deb56ab

Differential Revision: [D25074763](https://our.internmc.facebook.com/intern/diff/D25074763) [ghstack-poisoned]

izdeby changed the title ~~[WIP] Update foreach APIs to use scalar lists~~ Update foreach APIs to use scalar lists Dec 4, 2020

izdeby requested review from cpuhrsch, gchanan, mcarilli, ngimel and zou3519 and removed request for mcarilli December 4, 2020 17:47

izdeby mentioned this pull request Dec 8, 2020

[wip] Replace optimizers in torch.optim with the ones from torch.optim._multi_tensor #49039

Closed

Iurii Zdebskyi added 2 commits December 8, 2020 14:51