test_lstsq in test_linalg.py is flaky on ROCm #53976

@mruberry


Example failure snippet:

20:32:54 ======================================================================
20:32:54 ERROR [0.053s]: test_linalg_lstsq_cuda_float32 (__main__.TestLinalgCUDA)
20:32:54 ----------------------------------------------------------------------
20:32:54 Traceback (most recent call last):
20:32:54   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 899, in wrapper
20:32:54     method(*args, **kwargs)
20:32:54   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 292, in instantiated_test
20:32:54     raise rte
20:32:54   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 287, in instantiated_test
20:32:54     result = test_fn(self, *args)
20:32:54   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 572, in dep_fn
20:32:54     return fn(slf, device, *args, **kwargs)
20:32:54   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 572, in dep_fn
20:32:54     return fn(slf, device, *args, **kwargs)
20:32:54   File "test_linalg.py", line 210, in test_linalg_lstsq
20:32:54     a = random_well_conditioned_matrix(*shape, dtype=dtype, device=device)
20:32:54   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 1601, in random_well_conditioned_matrix
20:32:54     u, _, v = x.svd()
20:32:54 RuntimeError: magma: The value of work_size(-9223372036854775808) is too large to fit into a magma_int_t (4 bytes)
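
A note on the number in the error message: -9223372036854775808 is exactly -2^63, the smallest signed 64-bit integer, so it is unlikely to be a genuine workspace size and more likely an invalid value produced somewhere before the range check. The sketch below only verifies that arithmetic. The helper name `fits_in_magma_int` is hypothetical (not MAGMA or PyTorch API), and treating `magma_int_t` as a signed 4-byte integer is an assumption taken from the "(4 bytes)" in the message.

```python
# Hypothetical helper for illustration only; not part of MAGMA or PyTorch.
MAGMA_INT_BYTES = 4  # assumption, taken from "magma_int_t (4 bytes)" in the error


def fits_in_magma_int(value: int, nbytes: int = MAGMA_INT_BYTES) -> bool:
    """Return True if `value` is representable as a signed integer of `nbytes` bytes."""
    bits = 8 * nbytes
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return lo <= value <= hi


# Value copied verbatim from the RuntimeError above.
reported_work_size = -9223372036854775808

# The reported value is exactly the minimum signed 64-bit integer.
is_int64_min = reported_work_size == -(1 << 63)

print(is_int64_min)
print(fits_in_magma_int(reported_work_size))
```

This does not identify where the bad value originates; it only shows that the reported `work_size` is the 64-bit minimum and fails a 4-byte range check.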

cc @jianyuh @nikitaved @pearu @mruberry @heitorschueroff @walterddr @IvanYashchuk @jeffdaily @sunway513 @ROCmSupport @VitalyFedyunin

Labels

module: linear algebra (Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul)
module: rocm (AMD GPU support for PyTorch)
module: tests (Issues related to tests, not the torch.testing module)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
