ICE in gcc < 13.5 for AArch64 on Neoverse-V1

### 🐛 Describe the bug

We have seen a few ICEs (internal compiler errors) of the form

`internal compiler error: in expand_insn, at optabs.cc:8185`

https://github.com/pytorch/pytorch/actions/runs/19027214755/job/54332952760

This was due to https://github.com/pytorch/pytorch/pull/166687, which was reverted and relanded with a fix.

However, we still see the same ICE in two unit tests:

```
In static member function ‘static at::vec::CPU_CAPABILITY::Vectorized<float> at::vec::CPU_CAPABILITY::Vectorized<float>::blendv(const at::vec::CPU_CAPABILITY::Vectorized<float>&, const at::vec::CPU_CAPABILITY::Vectorized<float>&, const at::vec::CPU_CAPABILITY::Vectorized<float>&)’,
    inlined from ‘static at::vec::CPU_CAPABILITY::VectorizedN<T, N> at::vec::CPU_CAPABILITY::VectorizedN<T, N>::blendv(const at::vec::CPU_CAPABILITY::VectorizedN<T, N>&, const at::vec::CPU_CAPABILITY::VectorizedN<T, N>&, const at::vec::CPU_CAPABILITY::VectorizedN<T, N>&) [with T = float; int N = 1]’ at /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/ATen/cpu/vec/vec_n.h:142:32,
    inlined from ‘static at::vec::CPU_CAPABILITY::VecMask<T, N> at::vec::CPU_CAPABILITY::VecMask<T, N>::blendv(const at::vec::CPU_CAPABILITY::VecMask<T, N>&, const at::vec::CPU_CAPABILITY::VecMask<T, N>&, const at::vec::CPU_CAPABILITY::VecMask<T, N>&) [with T = float; int N = 1]’ at /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/ATen/cpu/vec/vec_mask.h:172:57,
    inlined from ‘void kernel(float*)’ at /tmp/tmp3w4gi3x6/65/c65mflcfwnn4ujuyupzb4sy6sejcttgastif7vtzn3zysjj3piii.main.cpp:29:61:
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/ATen/cpu/vec/sve/vec_float.h:84:32: internal compiler error: in expand_insn, at optabs.cc:8185
   84 |     return svsel_f32(mask, b, a);
      |                                ^
0x16c5913 expand_insn(insn_code, unsigned int, expand_operand*)
	../../src/gcc/optabs.cc:8185
0x16c5913 expand_insn(insn_code, unsigned int, expand_operand*)
	../../src/gcc/optabs.cc:8181
0x1746c4f expand_fn_using_insn(gcall*, insn_code, unsigned int, unsigned int) [clone .constprop.0]
	../../src/gcc/internal-fn.cc:194
0x10e7c4f expand_call_stmt
	../../src/gcc/cfgexpand.cc:2737
0x10e7c4f expand_gimple_stmt_1
	../../src/gcc/cfgexpand.cc:3880
0x10e7c4f expand_gimple_stmt
	../../src/gcc/cfgexpand.cc:4044
0x109a26f expand_gimple_basic_block
	../../src/gcc/cfgexpand.cc:6106
0x109a26f execute
	../../src/gcc/cfgexpand.cc:6841
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <file:///usr/share/doc/gcc-13/README.Bugs> for instructions.


Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"


To execute this test, run the following from the base repo dir:
    python test/inductor/test_fused_attention.py SDPAPatternRewriterCpuDynamicTests.test_sdpa_rewriter_5_cpu
```

The affected tests are:

- `python test/inductor/test_fused_attention.py SDPAPatternRewriterCpuDynamicTests.test_sdpa_rewriter_5_cpu`
- `python test/inductor/test_fused_attention.py SDPAPatternRewriterCpuTests.test_sdpa_rewriter_5_cpu`


Our team has investigated this, and it is fixed in GCC 13.5:
https://gcc.gnu.org/cgit/gcc/commit/?id=f3b7007bbabea686e133b001de1cac089afbd11f

It is also reportedly fixed in GCC 14.
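
To check quickly whether a local toolchain falls in the affected range, something like the sketch below may help. It takes the "gcc < 13.5" in the title literally, which is an assumption: the lower bound of affected versions and the exact GCC 14 point release carrying the fix are not confirmed here. The helper names (`parse_gcc_version`, `may_hit_ice`, `installed_gcc_may_hit_ice`) are hypothetical and not part of PyTorch. The check looks only at the compiler version, not at whether the build targets AArch64 with SVE.

```python
import re
import subprocess

# Assumption: versions below 13.5 are treated as potentially affected,
# following the issue title. GCC 14 is only "reportedly" fixed.
FIRST_FIXED_GCC = (13, 5)


def parse_gcc_version(text):
    """Return (major, minor) from a version string such as '13.3.0'."""
    match = re.match(r"\s*(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised GCC version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def may_hit_ice(version_text):
    """True if the given GCC version is older than the first fixed release."""
    return parse_gcc_version(version_text) < FIRST_FIXED_GCC


def installed_gcc_may_hit_ice(gcc="gcc"):
    """Query the installed compiler; -dumpfullversion prints e.g. '13.3.0'."""
    result = subprocess.run(
        [gcc, "-dumpfullversion"], capture_output=True, text=True, check=True
    )
    return may_hit_ice(result.stdout)
```

For example, `may_hit_ice("13.3.0")` flags the compiler as potentially affected, while `may_hit_ice("13.5.0")` does not.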


### Versions

nightly 20251120

cc @malfet @seemethere @snadampal @milpuz01 @aditew01 @nikhil-arm @fadara01 @nWEIdia
