Multiple failures in test_unary_ufuncs on POWER #59248

@Flamefire

🐛 Bug

A large number of tests in test_unary_ufuncs fail on POWER9 after the introduction of vector intrinsics in PyTorch 1.8. The failures look related, so I am reporting them as a single issue.

To Reproduce

Steps to reproduce the behavior:

  1. Run test_unary_ufuncs.py on PPC (POWER9)

Expected behavior

Tests pass

Environment

  • PyTorch Version (e.g., 1.0): 1.8.1
  • OS (e.g., Linux): Linux
  • How you installed PyTorch (conda, pip, source): source
  • Build command you used (if compiling from source): CMAKE_BUILD_TYPE=Release BUILD_TEST=0 PYTORCH_BUILD_VERSION=1.8.1 PYTORCH_BUILD_NUMBER=1 MAX_JOBS=$(nproc) BLAS=Eigen USE_FFMPEG=1 BUILD_CUSTOM_PROTOBUF=0 USE_IBVERBS=1 USE_CUDA=0 USE_METAL=0 python setup.py install
  • Python version: 3.8.6

Additional context

I'll try to categorize the tests into failure types and (possible) explanations:

  • test_contig_vs_every_other_angle_cpu_complex64: 256 / 513 elements failed. The greatest difference was 3.0628208592534065 (-3.096264600753784 vs. -0.033443741500377655). It looks like the VSX version produces bogus values here.
  • test_contig_vs_every_other_angle_cpu_float32/float64, test_non_contig_angle_cpu_float32/float64, test_non_contig_expand_angle_cpu_int16/int32/int64/int8: The greatest difference was 3.1415927410125732 (0.0 vs. 3.1415927410125732). This is an omission: the VSX implementation does not return PI when the argument is negative but always returns zero:
    return Vectorized<double>{0};
  • test_non_contig_angle_cpu_complex128/complex64: The greatest difference was 1.5819721883714497 (0.8557325138535072 vs. 2.437704702224957) and 2.4029039442539215 (-0.4417212903499603 vs. -2.844625234603882), respectively. This is similar to the first item, although the differences being close to pi/2 and 3/4*pi suggests that some values get mixed up.
  • test_non_contig_index_angle_cpu_complex128: greatest difference was 0.38246162968485264 (-1.3715882372626278 vs. -1.7540498669474804)
  • test_reference_numerics_angle_cpu_complex128/complex64: Similar failures; the whole complex angle implementation is probably buggy.
  • test_reference_numerics_angle_cpu_float32/float64: The greatest difference was nan (0.0 vs. nan). I am not sure about this one; it may have the same cause as the second item.
  • test_reference_numerics_angle_cpu_int16/int32/int64/int8: Same as the second item (PI vs. 0).
  • test_reference_numerics_log1p_cpu_float32 (and log10, log2, log): Compares a value (88.72283935546875, 128.0, 38.53184127807617) against inf
  • test_reference_numerics_logit_cpu_float32: The greatest difference was nan (-87.3365478515625 vs. nan).
  • test_sgn_cpu_complex128/complex64: The greatest difference was 0.5775589227244207 (0.8735823660790683 vs. 0.29602344335464753) and 0.38670969009399414 (-2.521059989929199 vs. -2.9077696800231934), respectively. It is the angle comparison that fails, so this is the same as the first item.
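
For reference, the behaviour the angle tests appear to expect can be written down as a scalar sketch. This is not the PyTorch or VSX code; `reference_angle` is a hypothetical helper, and it assumes the NumPy-style semantics that torch.angle documents for 1.8 and later: pi for negative real inputs, zero for non-negative ones, NaN propagated, and atan2(imag, real) for complex inputs.

```python
import cmath
import math


def reference_angle(x):
    """Scalar reference for angle(x) (hypothetical helper, assumed semantics)."""
    if isinstance(x, complex):
        # Complex input: the phase, i.e. atan2(imag, real).
        return cmath.phase(x)
    if math.isnan(x):
        # NaN is propagated rather than mapped to 0.
        return math.nan
    # Real (float or int) input: pi for negative values, 0 otherwise.
    return math.pi if x < 0 else 0.0


if __name__ == "__main__":
    for value in (-2.0, 3, float("nan"), complex(-1.0, 0.0), complex(0.0, 1.0)):
        print(value, "->", reference_angle(value))
```

An implementation that returns a zero vector for every real input matches this sketch only for non-negative, non-NaN values, which would be consistent with the "0.0 vs. 3.1415927410125732" and "0.0 vs. nan" differences listed above.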

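One observation that may help narrow down the log and logit failures (an inference, not something the test output states): the finite values reported there are numerically very close to the logarithms of the largest finite and smallest positive normal float32 values, which would point at the extremal inputs. The check below uses only the standard library; the pairing of each reported value with a particular function is an assumption, and `logit` is a local helper.

```python
import math

FLT_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127  # largest finite float32
FLT_MIN = 2.0 ** -126                      # smallest positive normal float32


def logit(p):
    """logit(p) = log(p / (1 - p)), evaluated in double precision."""
    return math.log(p / (1.0 - p))


# (computed in double precision, value reported in the failing test)
candidates = {
    "log(FLT_MAX)": (math.log(FLT_MAX), 88.72283935546875),
    "log2(FLT_MAX)": (math.log2(FLT_MAX), 128.0),
    "log10(FLT_MAX)": (math.log10(FLT_MAX), 38.53184127807617),
    "logit(FLT_MIN)": (logit(FLT_MIN), -87.3365478515625),
}

if __name__ == "__main__":
    for name, (computed, reported) in candidates.items():
        print(f"{name}: computed={computed:.8f} reported={reported:.8f}")
```

If this reading is right, the mismatches occur at the edges of the float32 range rather than for ordinary inputs.
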
cc @mruberry @VitalyFedyunin @walterddr

Metadata

    Labels

    module: POWER (Issues specific to the POWER/ppc architecture)
    module: tests (Issues related to tests (not the torch.testing module))
    triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)