Unify checks for normal #69629

Closed
nkaretnikov wants to merge 1 commit into gh/nkaretnikov/2/base from gh/nkaretnikov/2/head

Conversation

@nkaretnikov nkaretnikov (Collaborator) commented Dec 8, 2021

Stack from ghstack:

  • Split up checks into multiple macros
  • Use them in both meta and impl functions for now, since either might be called
  • Remove what appear to be legacy checks that are no longer relevant
  • Update the docs to not mention deprecated behavior

Backward compatibility issues:

  • Shapes must match exactly between mean and std (no automatic resizing):

Old:

>>> import torch
>>>
>>> mean = torch.rand(3, 4, 5, device='cpu')
>>> std  = torch.rand(3, 4, 1, device='cpu')
>>>
>>> torch.normal(mean=mean, std=std).size()
torch.Size([3, 4, 5])

New:

>>> import torch
>>>
>>> mean = torch.rand(3, 4, 5, device='cpu')
>>> std  = torch.rand(3, 4, 1, device='cpu')
>>>
>>> torch.normal(mean=mean, std=std)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: inconsistent tensor, output size ([3, 4, 5]) is not the same as
input size ([3, 4, 1])

  • Tensors with the same number of elements are treated as different if their
    shapes don't match exactly:

Old:

>>> import torch
>>>
>>> out  = torch.rand(2, 3, device='cpu')
>>> mean = torch.rand(2, 3, device='cpu')
>>> std  = torch.rand(1, 6, device='cpu')
>>>
>>> torch.normal(out=out, mean=mean, std=std)
<stdin>:1: UserWarning: std and mean have the same number of elements, but are
not broadcastable. This was previously a supported mode of operation, but is
now deprecated and the support will be removed in version 1.6 release. Note
that the current implementation reshapes std to the shape of mean, which may be
incur data copies. Please ensure that std and mean are broadcastable to avoid
these issues. (Triggered internally at
aten/src/ATen/native/DistributionTemplates.h:189.)
tensor([[ 0.3345,  0.6314,  1.0163],
        [-0.0397,  0.8128,  1.1436]])

New:

>>> import torch
>>>
>>> out  = torch.rand(2, 3, device='cpu')
>>> mean = torch.rand(2, 3, device='cpu')
>>> std  = torch.rand(1, 6, device='cpu')
>>>
>>> torch.normal(out=out, mean=mean, std=std)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: The size of tensor a (3) must match the size of tensor b (6) at
non-singleton dimension 1

  • An empty tensor is returned instead of an error when mean is a float and std
    is an empty tensor:

Old:

>>> import torch
>>>
>>> mean = 4.
>>> std  = torch.rand(0, device='cpu')
>>>
>>> torch.normal(mean=mean, std=std)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: min(): Expected reduction dim to be specified for input.numel()
== 0. Specify the reduction dim with the 'dim' argument.

New:

>>> import torch
>>>
>>> mean = 4.
>>> std  = torch.rand(0, device='cpu')
>>>
>>> torch.normal(mean=mean, std=std)
tensor([])
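The behavior change above can be modeled in a few lines of plain Python. This is an illustrative sketch of what the PR's CHECK_NORMAL_SIZES macro enforces, not the actual ATen code (the helper name below is made up):

```python
# Sketch of the exact-shape policy enforced by the PR's CHECK_NORMAL_SIZES
# macro (illustration only; the real check is a TORCH_CHECK in C++).
def check_normal_sizes(output_size, input_size):
    # New behavior: shapes must match exactly; broadcastable is not enough.
    if tuple(output_size) != tuple(input_size):
        raise RuntimeError(
            f"inconsistent tensor, output size ({list(output_size)}) is not "
            f"the same as input size ({list(input_size)})"
        )

check_normal_sizes((3, 4, 5), (3, 4, 5))  # matching shapes pass silently

try:
    check_normal_sizes((3, 4, 5), (3, 4, 1))  # broadcastable, but now rejected
except RuntimeError as e:
    print(e)
```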

pytorch-probot bot commented Dec 8, 2021

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/fc4c45317fed47b1f6d4851d175661c6bfc92cbe/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

Workflows Labels (bold enabled) Status
Triggered Workflows
linux-bionic-cuda11.5-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux ✅ triggered
linux-bionic-py3.6-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/xla ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux ✅ triggered
linux-vulkan-bionic-py3.6-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile ✅ triggered
linux-xenial-py3.6-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers ✅ triggered
linux-xenial-py3.6-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx ✅ triggered
linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux ✅ triggered
linux-xenial-py3.6-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux ✅ triggered
linux-xenial-py3.6-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/win ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux 🚫 skipped
docker-builds ciflow/all 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-arm64-full-jit ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
ios-12-5-1-x86-64-full-jit ciflow/all, ciflow/ios, ciflow/macos 🚫 skipped
libtorch-linux-bionic-cuda11.5-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos 🚫 skipped
parallelnative-linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux 🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.1-py3.6-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.1-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped

You can add a comment to the PR and tag @pytorchbot with the following commands:
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and trigger the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

@facebook-github-bot facebook-github-bot (Contributor) commented Dec 8, 2021

🔗 Helpful links

💊 CI failures summary and remediations

As of commit fc4c453 (more details on the Dr. CI page):


  • 6/6 failures introduced in this PR

🕵️ 6 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build Lint / quick-checks (1/6)

Step: "Ensure correct trailing newlines" (full log | diagnosis details | 🔁 rerun)

2021-12-08T20:19:09.4501518Z python: can't open..._launches.py': [Errno 2] No such file or directory
2021-12-08T20:19:09.4141204Z ##[group]Run set -eux
2021-12-08T20:19:09.4141656Z set -eux
2021-12-08T20:19:09.4142362Z python torch/testing/_check_kernel_launches.py |& tee "${GITHUB_WORKSPACE}"/cuda_kernel_launch_checks.txt
2021-12-08T20:19:09.4177411Z shell: /bin/bash -e {0}
2021-12-08T20:19:09.4177782Z env:
2021-12-08T20:19:09.4178293Z   pythonLocation: /opt/hostedtoolcache/Python/3.10.0/x64
2021-12-08T20:19:09.4179004Z   LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.10.0/x64/lib
2021-12-08T20:19:09.4179508Z ##[endgroup]
2021-12-08T20:19:09.4320119Z + tee /home/runner/work/pytorch/pytorch/cuda_kernel_launch_checks.txt
2021-12-08T20:19:09.4331073Z + python torch/testing/_check_kernel_launches.py
2021-12-08T20:19:09.4501518Z python: can't open file '/home/runner/work/pytorch/pytorch/torch/testing/_check_kernel_launches.py': [Errno 2] No such file or directory
2021-12-08T20:19:09.4562186Z ##[group]Run (! git --no-pager grep -I -no $'#include <cub/' --  ./aten  ':(exclude)aten/src/ATen/cuda/cub*.cuh' || (echo "The above files have direct cub include; please include ATen/cuda/cub.cuh instead and wrap your cub calls in at::native namespace if necessary"; false))
2021-12-08T20:19:09.4563815Z (! git --no-pager grep -I -no $'#include <cub/' --  ./aten  ':(exclude)aten/src/ATen/cuda/cub*.cuh' || (echo "The above files have direct cub include; please include ATen/cuda/cub.cuh instead and wrap your cub calls in at::native namespace if necessary"; false))
2021-12-08T20:19:09.4594895Z shell: /bin/bash -e {0}
2021-12-08T20:19:09.4595205Z env:
2021-12-08T20:19:09.4595666Z   pythonLocation: /opt/hostedtoolcache/Python/3.10.0/x64
2021-12-08T20:19:09.4596286Z   LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.10.0/x64/lib
2021-12-08T20:19:09.4596745Z ##[endgroup]
2021-12-08T20:19:09.4956073Z ##[group]Run (! git --no-pager grep -I -no $'cudaStreamSynchronize' --  ./aten ./c10 ':(exclude)aten/src/ATen/test' ':(exclude)c10/cuda/CUDAFunctions.h' || (echo "The above files call raw cuda APIs directly; please use at::cuda wrappers instead"; false))
2021-12-08T20:19:09.4958050Z (! git --no-pager grep -I -no $'cudaStreamSynchronize' --  ./aten ./c10 ':(exclude)aten/src/ATen/test' ':(exclude)c10/cuda/CUDAFunctions.h' || (echo "The above files call raw cuda APIs directly; please use at::cuda wrappers instead"; false))
2021-12-08T20:19:09.4993478Z shell: /bin/bash -e {0}

See GitHub Actions build linux-bionic-py3.6-clang9 / test (default, 1, 2, linux.2xlarge) (2/6)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2021-12-08T21:27:16.9973045Z AssertionError: can only test a child process
2021-12-08T21:27:16.9774110Z     assert self._parent_pid == os.getpid(), 'can only test a child process'
2021-12-08T21:27:16.9774661Z AssertionError: can only test a child process
2021-12-08T21:27:16.9963161Z Exception ignored in: <bound method _MultiProcessingDataLoaderIter.__del__ of <torch.utils.data.dataloader._MultiProcessingDataLoaderIter object at 0x7f53bb4f8438>>
2021-12-08T21:27:16.9964654Z Traceback (most recent call last):
2021-12-08T21:27:16.9965665Z   File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1328, in __del__
2021-12-08T21:27:16.9967096Z     self._shutdown_workers()
2021-12-08T21:27:16.9968003Z   File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1320, in _shutdown_workers
2021-12-08T21:27:16.9970802Z     if w.is_alive():
2021-12-08T21:27:16.9971600Z   File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 134, in is_alive
2021-12-08T21:27:16.9972512Z     assert self._parent_pid == os.getpid(), 'can only test a child process'
2021-12-08T21:27:16.9973045Z AssertionError: can only test a child process
2021-12-08T21:27:17.6309779Z ok (0.667s)
2021-12-08T21:27:17.6330849Z   test_iterabledataset_len (__main__.TestDataLoaderPersistentWorkers) ... ok (0.002s)
2021-12-08T21:27:19.2124414Z   test_large_sampler_indices (__main__.TestDataLoaderPersistentWorkers) ... ok (1.579s)
2021-12-08T21:27:19.2210519Z   test_len (__main__.TestDataLoaderPersistentWorkers) ... ok (0.009s)
2021-12-08T21:27:20.7416553Z   test_multiple_dataloaders (__main__.TestDataLoaderPersistentWorkers) ... ok (1.520s)
2021-12-08T21:27:24.0321338Z   test_multiprocessing_contexts (__main__.TestDataLoaderPersistentWorkers) ... ok (3.290s)
2021-12-08T21:27:24.8488572Z   test_no_segfault (__main__.TestDataLoaderPersistentWorkers) ... ok (0.817s)
2021-12-08T21:27:24.8522303Z   test_numpy (__main__.TestDataLoaderPersistentWorkers) ... ok (0.003s)
2021-12-08T21:27:24.8550810Z   test_numpy_gen_state (__main__.TestDataLoaderPersistentWorkers) ... ok (0.003s)
2021-12-08T21:27:24.8576533Z   test_numpy_scalars (__main__.TestDataLoaderPersistentWorkers) ... ok (0.003s)

See GitHub Actions build linux-xenial-py3.6-gcc7 / test (default, 2, 2, linux.2xlarge) (3/6)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2021-12-08T21:26:32.4776606Z FAIL [0.006s]: tes...d_error_cpu (__main__.TestRandomTensorCreationCPU)
2021-12-08T21:26:32.4771158Z     raise rte
2021-12-08T21:26:32.4771912Z   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2021-12-08T21:26:32.4772589Z     result = test(self, **param_kwargs)
2021-12-08T21:26:32.4773049Z   File "test_tensor_creation_ops.py", line 3327, in test_normal
2021-12-08T21:26:32.4773614Z     helper(self, device, dtype, lambda x: x, lambda t: t, lambda mean: mean)
2021-12-08T21:26:32.4774158Z   File "test_tensor_creation_ops.py", line 3267, in helper
2021-12-08T21:26:32.4774714Z     out = torch.normal(mean=torch.empty((0, 2)), std=torch.empty((0, 1)))
2021-12-08T21:26:32.4775352Z RuntimeError: inconsistent tensor, output size ([0, 2]) is not the same as input size ([0, 1])
2021-12-08T21:26:32.4775748Z 
2021-12-08T21:26:32.4776032Z ======================================================================
2021-12-08T21:26:32.4776606Z FAIL [0.006s]: test_normal_std_error_cpu (__main__.TestRandomTensorCreationCPU)
2021-12-08T21:26:32.4777378Z ----------------------------------------------------------------------
2021-12-08T21:26:32.4778012Z RuntimeError: normal expects std >= 0.0, but found std -1
2021-12-08T21:26:32.4778327Z 
2021-12-08T21:26:32.4778764Z During handling of the above exception, another exception occurred:
2021-12-08T21:26:32.4779137Z 
2021-12-08T21:26:32.4779452Z Traceback (most recent call last):
2021-12-08T21:26:32.4780280Z   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2021-12-08T21:26:32.4780948Z     result = test(self, **param_kwargs)
2021-12-08T21:26:32.4781434Z   File "test_tensor_creation_ops.py", line 3336, in test_normal_std_error
2021-12-08T21:26:32.4782003Z     torch.normal(input, -1, (10,))

See GitHub Actions build linux-bionic-py3.6-clang9 / test (default, 2, 2, linux.2xlarge) (4/6)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2021-12-08T21:35:21.6757155Z FAIL [0.006s]: tes...d_error_cpu (__main__.TestRandomTensorCreationCPU)
2021-12-08T21:35:21.6751824Z     raise rte
2021-12-08T21:35:21.6752563Z   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2021-12-08T21:35:21.6753197Z     result = test(self, **param_kwargs)
2021-12-08T21:35:21.6753665Z   File "test_tensor_creation_ops.py", line 3327, in test_normal
2021-12-08T21:35:21.6754202Z     helper(self, device, dtype, lambda x: x, lambda t: t, lambda mean: mean)
2021-12-08T21:35:21.6754745Z   File "test_tensor_creation_ops.py", line 3267, in helper
2021-12-08T21:35:21.6755314Z     out = torch.normal(mean=torch.empty((0, 2)), std=torch.empty((0, 1)))
2021-12-08T21:35:21.6755941Z RuntimeError: inconsistent tensor, output size ([0, 2]) is not the same as input size ([0, 1])
2021-12-08T21:35:21.6756330Z 
2021-12-08T21:35:21.6756604Z ======================================================================
2021-12-08T21:35:21.6757155Z FAIL [0.006s]: test_normal_std_error_cpu (__main__.TestRandomTensorCreationCPU)
2021-12-08T21:35:21.6757922Z ----------------------------------------------------------------------
2021-12-08T21:35:21.6758524Z RuntimeError: normal expects std >= 0.0, but found std -1
2021-12-08T21:35:21.6758825Z 
2021-12-08T21:35:21.6759262Z During handling of the above exception, another exception occurred:
2021-12-08T21:35:21.6759622Z 
2021-12-08T21:35:21.6759924Z Traceback (most recent call last):
2021-12-08T21:35:21.6760729Z   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2021-12-08T21:35:21.6761366Z     result = test(self, **param_kwargs)
2021-12-08T21:35:21.6761857Z   File "test_tensor_creation_ops.py", line 3336, in test_normal_std_error
2021-12-08T21:35:21.6762408Z     torch.normal(input, -1, (10,))

See GitHub Actions build linux-xenial-py3.6-gcc5.4 / test (default, 2, 2, linux.2xlarge) (5/6)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2021-12-08T21:31:03.8449844Z FAIL [0.006s]: tes...d_error_cpu (__main__.TestRandomTensorCreationCPU)
2021-12-08T21:31:03.8444276Z     raise rte
2021-12-08T21:31:03.8445064Z   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2021-12-08T21:31:03.8445750Z     result = test(self, **param_kwargs)
2021-12-08T21:31:03.8446213Z   File "test_tensor_creation_ops.py", line 3327, in test_normal
2021-12-08T21:31:03.8446785Z     helper(self, device, dtype, lambda x: x, lambda t: t, lambda mean: mean)
2021-12-08T21:31:03.8447333Z   File "test_tensor_creation_ops.py", line 3267, in helper
2021-12-08T21:31:03.8447892Z     out = torch.normal(mean=torch.empty((0, 2)), std=torch.empty((0, 1)))
2021-12-08T21:31:03.8448554Z RuntimeError: inconsistent tensor, output size ([0, 2]) is not the same as input size ([0, 1])
2021-12-08T21:31:03.8448943Z 
2021-12-08T21:31:03.8449250Z ======================================================================
2021-12-08T21:31:03.8449844Z FAIL [0.006s]: test_normal_std_error_cpu (__main__.TestRandomTensorCreationCPU)
2021-12-08T21:31:03.8450637Z ----------------------------------------------------------------------
2021-12-08T21:31:03.8451271Z RuntimeError: normal expects std >= 0.0, but found std -1
2021-12-08T21:31:03.8451588Z 
2021-12-08T21:31:03.8452028Z During handling of the above exception, another exception occurred:
2021-12-08T21:31:03.8452409Z 
2021-12-08T21:31:03.8452732Z Traceback (most recent call last):
2021-12-08T21:31:03.8453645Z   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2021-12-08T21:31:03.8454335Z     result = test(self, **param_kwargs)
2021-12-08T21:31:03.8454819Z   File "test_tensor_creation_ops.py", line 3336, in test_normal_std_error
2021-12-08T21:31:03.8455409Z     torch.normal(input, -1, (10,))

See GitHub Actions build linux-xenial-py3.6-gcc5.4 / test (default, 1, 2, linux.2xlarge) (6/6)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2021-12-08T21:29:13.8689991Z AssertionError: can only test a child process
2021-12-08T21:29:13.8589385Z     assert self._parent_pid == os.getpid(), 'can only test a child process'
2021-12-08T21:29:13.8589939Z AssertionError: can only test a child process
2021-12-08T21:29:13.8672776Z Exception ignored in: <bound method _MultiProcessingDataLoaderIter.__del__ of <torch.utils.data.dataloader._MultiProcessingDataLoaderIter object at 0x7fa09abd3240>>
2021-12-08T21:29:13.8674467Z Traceback (most recent call last):
2021-12-08T21:29:13.8675775Z   File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1328, in __del__
2021-12-08T21:29:13.8679485Z     self._shutdown_workers()
2021-12-08T21:29:13.8680554Z   File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1320, in _shutdown_workers
2021-12-08T21:29:13.8687418Z     if w.is_alive():
2021-12-08T21:29:13.8688134Z   File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 134, in is_alive
2021-12-08T21:29:13.8689232Z     assert self._parent_pid == os.getpid(), 'can only test a child process'
2021-12-08T21:29:13.8689991Z AssertionError: can only test a child process
2021-12-08T21:29:14.5319331Z ok (0.688s)
2021-12-08T21:29:14.5342017Z   test_iterabledataset_len (__main__.TestDataLoaderPersistentWorkers) ... ok (0.002s)
2021-12-08T21:29:16.2339625Z   test_large_sampler_indices (__main__.TestDataLoaderPersistentWorkers) ... ok (1.699s)
2021-12-08T21:29:16.2415554Z   test_len (__main__.TestDataLoaderPersistentWorkers) ... ok (0.008s)
2021-12-08T21:29:17.7925191Z   test_multiple_dataloaders (__main__.TestDataLoaderPersistentWorkers) ... ok (1.551s)
2021-12-08T21:29:21.1720910Z   test_multiprocessing_contexts (__main__.TestDataLoaderPersistentWorkers) ... ok (3.379s)
2021-12-08T21:29:22.0147572Z   test_no_segfault (__main__.TestDataLoaderPersistentWorkers) ... ok (0.843s)
2021-12-08T21:29:22.0181509Z   test_numpy (__main__.TestDataLoaderPersistentWorkers) ... ok (0.003s)
2021-12-08T21:29:22.0210632Z   test_numpy_gen_state (__main__.TestDataLoaderPersistentWorkers) ... ok (0.003s)
2021-12-08T21:29:22.0237011Z   test_numpy_scalars (__main__.TestDataLoaderPersistentWorkers) ... ok (0.003s)

This comment was automatically generated by Dr. CI.

@lezcano lezcano (Collaborator) left a comment:

I really like the macro approach: it makes everything very readable, and it's clear that the macros simply encapsulate TORCH_CHECKs!

I left a few minor comments, but overall this PR is looking great, thanks!

It's also great to see how much cleaner the code looks with structured kernels.

Comment on lines +7299 to +7301
The shapes of :attr:`mean`, :attr:`std`, and :attr:`out` (if present) must
match exactly. Unless the size of :attr:`out` is zero, then it will be resized
based on the supplied arguments, which must still match.
Collaborator:

This is probably not correct. Could it instead be the following?

Suggested change
The shapes of :attr:`mean`, :attr:`std`, and :attr:`out` (if present) must
match exactly. Unless the size of :attr:`out` is zero, then it will be resized
based on the supplied arguments, which must still match.
The shapes of :attr:`mean` and :attr:`std` must be broadcastable.

given that you're using infer_size in the previous PR.

I would not mention the out= argument here, as the behaviour of that argument is the same all across the PyTorch library.

@lezcano lezcano (Collaborator) commented Dec 9, 2021:

I see in the example in the OP that the following now breaks:

torch.normal(torch.rand(3, 4, 5), torch.rand(3, 4, 1))

This does not seem right, as their shapes are broadcastable. We should support this behaviour.

In fact, I believe the only change needed to support this behaviour is to remove the CHECK_NORMAL_SIZES check. Note that the fact that mean and std are broadcastable is checked within at::infer_size in the meta function, so we don't need to add any additional checks for that.
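The broadcastability check mentioned here can be sketched in plain Python. This is a conceptual model in the spirit of at::infer_size, not the actual implementation: shapes are right-aligned, and a dimension of size 1 expands to match the other shape.

```python
# Conceptual model of broadcast shape inference (in the spirit of
# at::infer_size; not the actual ATen implementation).
def infer_size(a, b):
    result = []
    for i in range(max(len(a), len(b))):
        # Walk dimensions from the trailing end; missing dims act as size 1.
        da = a[-1 - i] if i < len(a) else 1
        db = b[-1 - i] if i < len(b) else 1
        if da != db and da != 1 and db != 1:
            raise RuntimeError(
                f"The size of tensor a ({da}) must match the size of "
                f"tensor b ({db}) at non-singleton dimension "
                f"{max(len(a), len(b)) - 1 - i}"
            )
        result.append(max(da, db))
    return tuple(reversed(result))

print(infer_size((3, 4, 5), (3, 4, 1)))  # (3, 4, 5): broadcastable
# infer_size((2, 3), (1, 6)) would raise: same numel, but not broadcastable.
```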

Collaborator (Author):

I'll look into it and follow up later; I wasn't sure what behavior is desirable here.

Comment on lines -7302 to -7303
.. note:: When the shapes do not match, the shape of :attr:`mean`
is used as the shape for the returned output tensor
Collaborator:

Nice to have this deprecated, even if it was 4 versions later than we said we would deprecate it (cc @mruberry)

Comment on lines +160 to +165
#define CHECK_NORMAL_SIZES(output, input) \
TORCH_CHECK( \
output.sizes().equals(input.sizes()), \
"inconsistent tensor, output size (", output.sizes(), \
") is not the same as input size (", input.sizes(), ")");

Collaborator:

It looks like this check is not necessary now, as it should be performed by the structured kernels machinery.

Collaborator (Author):

The reason I have these is that everything calls into template implementations now, and those template versions can still be called directly (e.g., see the RNG tests I mentioned in the other PR: aten/src/ATen/test/cpu_rng_test.cpp). I can remove the checks if that's considered okay.

@lezcano lezcano (Collaborator) commented Dec 9, 2021:

We should modify that code to call into the at::normal functions. We should only avoid calling the at:: API when the performance is crucial, and even then, one needs to be careful about that.

Collaborator (Author):

We should only avoid calling the at:: API when the performance is crucial

Could you elaborate on the perf bit?

Collaborator:

PyTorch has a custom dispatcher mechanism. When you call a function in the at:: namespace, this mechanism is activated and it dispatches to the correct backend, e.g. CPU, CUDA, XLA, but also to forward- or backward-mode AD. This has a certain overhead, so sometimes you'll find code that calls the corresponding at::native:: function directly. Now, most of the time the correct thing to do is to call the at:: function, but if you are implementing your own backward AD (i.e. it's not a composite kernel), then you should be fine calling at::native::. See a counterexample, where I called an at::native:: function within a forward AD implementation and shouldn't have: #63568 (comment)
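The dispatch overhead described here can be illustrated with a toy registry in Python. This is a conceptual sketch only (the names KERNELS, register, and dispatch are made up); PyTorch's real dispatcher additionally handles autograd keys, fallbacks, and boxing.

```python
# Toy model of backend dispatch: calling through dispatch() models the at::
# entry point; calling a kernel function directly models at::native::.
KERNELS = {}

def register(op, backend):
    # Decorator that records a kernel for an (operator, backend) pair.
    def deco(fn):
        KERNELS[(op, backend)] = fn
        return fn
    return deco

def dispatch(op, backend, *args):
    # Route the call to the kernel registered for this backend.
    return KERNELS[(op, backend)](*args)

@register("normal", "CPU")
def normal_cpu(mean, std):
    # Stand-in for a CPU kernel; a real one would draw random samples.
    return f"normal(mean={mean}, std={std}) on CPU"

# Routing through dispatch() adds a lookup (the "overhead"); calling
# normal_cpu() directly skips it, along with anything else the dispatcher
# would layer on (autograd, checks from the meta function, etc.).
assert dispatch("normal", "CPU", 0.0, 1.0) == normal_cpu(0.0, 1.0)
```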

Comment on lines +214 to +216
CHECK_NORMAL_SIZES(output, mean);
CHECK_NORMAL_SIZES(output, std);
CHECK_NORMAL_TENSOR_STD(std);
Collaborator:

(pretty much) no checks should happen in the _impl side of things. They already happen in the meta part, so this should probably go.

Collaborator (Author):

see above

Comment on lines +195 to +196
CHECK_NORMAL_SIZES(output, mean);
CHECK_NORMAL_STD(std);
Collaborator:

These should go as well.

Collaborator (Author):

see above

Comment on lines +204 to +205
CHECK_NORMAL_SIZES(output, std);
CHECK_NORMAL_TENSOR_STD(std);
Collaborator:

ditto

Collaborator (Author):

see above

template<template<typename> class normal_kernel, typename RNG>
Tensor& normal_impl_(Tensor& self, double mean, double std, c10::optional<Generator> gen) {
TORCH_CHECK(std >= 0.0, "normal_ expects std >= 0.0, but found std=", std);
CHECK_NORMAL_STD(std);
Collaborator:

As mentioned in the previous PR, this check should probably go if we go down the road of having this operation implemented as a structured kernel.

Collaborator (Author):

noted, but see above

@nkaretnikov (Collaborator, Author):

To avoid confusion, I will open a new stack to address issues related to BC and broadcasting.

@facebook-github-bot facebook-github-bot deleted the gh/nkaretnikov/2/head branch January 15, 2022 15:16

Labels

cla signed, module: structured kernels (Related to new structured kernels functionality), open source

Projects

None yet

Development


4 participants