Aarch64 unit test failures from nightly/manylinux build, jammy upgrade to gcc13 needed #166736

@robert-hardwick

🐛 Describe the bug

We have noticed two test failures on AArch64 (Neoverse-V2 / c8g) that do not occur in the linux-aarch64 CI workflow: https://github.com/pytorch/pytorch/actions/workflows/linux-aarch64.yml

Mismatched elements: 1 / 513 (0.2%)
Greatest absolute difference: 253 at index (512,)
Greatest relative difference: 1.0 at index (512,)

To execute this test, run the following from the base repo dir:
    python test/test_unary_ufuncs.py TestUnaryUfuncsCPU.test_contig_vs_every_other__refs__conversions_byte_cpu_float32

and

Mismatched elements: 9 / 40 (22.5%)
Greatest absolute difference: 1 at index (0, 0, 5)
Greatest relative difference: 1.0 at index (0, 0, 5)

The failure occurred for item [3]

To execute this test, run the following from the base repo dir:
    python test/inductor/test_torchinductor.py CpuTests.test_to_dtype_cpu

These failures reproduce on the nightly build. Our investigation suggests they first appeared in nightly 10.25, which appears to correspond to commit b31bad1.
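One way to confirm the first bad nightly is to install consecutive nightlies and rerun the two tests above against each. The sketch below only generates the pinned install commands. It rests on several assumptions that are not stated in this report: "nightly 10.25" is read as 25 October 2025, the wheels follow the `2.10.0.devYYYYMMDD+cpu` pattern seen in the Versions section, and the nightly CPU index URL is the usual one. The helper names `nightly_versions` and `install_command` are hypothetical.

```python
from datetime import date, timedelta

# Assumption: nightly CPU wheels are versioned "<base>.devYYYYMMDD+cpu",
# matching the "2.10.0.dev20251031+cpu" string in the environment report.
BASE_VERSION = "2.10.0"
# Assumption: nightly CPU wheel index; check it against the install instructions.
NIGHTLY_INDEX = "https://download.pytorch.org/whl/nightly/cpu"


def nightly_versions(first: date, last: date, base: str = BASE_VERSION) -> list[str]:
    """Return one nightly version string per day from first to last, inclusive."""
    days = (last - first).days
    return [
        f"{base}.dev{(first + timedelta(days=n)):%Y%m%d}+cpu"
        for n in range(days + 1)
    ]


def install_command(version: str) -> str:
    """Build a pip command that pins torch to a single nightly."""
    return f"pip install --pre torch=={version} --index-url {NIGHTLY_INDEX}"


# Hypothetical window around the suspected first bad nightly.
for version in nightly_versions(date(2025, 10, 23), date(2025, 10, 26)):
    print(install_command(version))
```

After each install, the two `python test/...` commands quoted earlier can be rerun; the first nightly where they fail narrows down the offending commit range.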

Actions Requested.

@malfet, can we upgrade the jammy images to GCC 13? That should surface these failures in CI, after which we may need to revert b31bad1.

Versions

Collecting environment information...
PyTorch version: 2.10.0.dev20251031+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.5 LTS (aarch64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04.2) 11.4.0
Clang version: Could not collect
CMake version: version 3.31.6
Libc version: glibc-2.35

Python version: 3.10.19 | packaged by conda-forge | (main, Oct 22 2025, 22:26:30) [GCC 14.3.0] (64-bit runtime)
Python platform: Linux-6.8.0-1040-aws-aarch64-with-glibc2.35
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: ARM
Model name: Neoverse-V2
Model: 1
Thread(s) per core: 1
Core(s) per cluster: 32
Socket(s): -
Cluster(s): 1
Stepping: r0p1
BogoMIPS: 2000.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache: 2 MiB (32 instances)
L1i cache: 2 MiB (32 instances)
L2 cache: 64 MiB (32 instances)
L3 cache: 36 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-31
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] mypy==1.16.0
[pip3] mypy_extensions==1.1.0
[pip3] numpy==1.22.4
[pip3] onnx==1.19.1
[pip3] onnx-ir==0.1.11
[pip3] onnxscript==0.5.4
[pip3] optree==0.13.0
[pip3] torch==2.10.0.dev20251031+cpu
[pip3] torchvision==0.25.0.dev20251031
[conda] No relevant packages
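Since the failures show up with the nightly wheel but not in CI, it can help to diff the environment reports from both sides. The following is a minimal sketch, not part of any PyTorch tooling: `parse_env_report` and `diff_reports` are hypothetical helpers, and the second report in the example is made up purely for illustration.

```python
def parse_env_report(text: str) -> dict[str, str]:
    """Map 'Key: value' lines of a collect_env-style report to a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a colon and bare section headers such as "CPU:".
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def diff_reports(a: str, b: str, keys: list[str]) -> dict[str, tuple[str, str]]:
    """Return the selected fields whose values differ between two reports."""
    fa, fb = parse_env_report(a), parse_env_report(b)
    return {
        k: (fa.get(k, "<missing>"), fb.get(k, "<missing>"))
        for k in keys
        if fa.get(k) != fb.get(k)
    }


# Two lines copied from the report above.
FAILING = """PyTorch version: 2.10.0.dev20251031+cpu
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04.2) 11.4.0"""

# Hypothetical report from another environment; the values are placeholders.
OTHER = """PyTorch version: <ci-build-version>
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04.2) 11.4.0"""

print(diff_reports(FAILING, OTHER, ["PyTorch version", "GCC version"]))
```

Note that the `GCC version` field here appears to describe the compiler installed on the host, not necessarily the one used to build the wheel; `torch.__config__.show()` should report the build-time compiler.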

cc @seemethere @malfet @atalman @pytorch/pytorch-dev-infra @snadampal @milpuz01 @aditew01 @nikhil-arm @fadara01

Metadata

Assignees

No one assigned

    Labels

    module: arm (Related to ARM architectures builds of PyTorch. Includes Apple M1)
    module: binaries (Anything related to official binaries that we release to users)
    module: ci (Related to continuous integration)
    triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)

    Type

    No type

    Projects

    Status

    Done

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
