[AArch64] llvm.experimental.cttz.elts.i64.v4i1 appears to be lowered incorrectly with SVE #178644

@fhahn

The IR snippet in https://godbolt.org/z/j9h98rcKT produces incorrect results with -mcpu=neoverse-v2, but works correctly with plain NEON. The major difference appears to be that llvm.experimental.cttz.elts.i64.v4i1 is lowered to SVE instructions, so I suspect this lowering does not match the definition of the intrinsic.
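
For reference, here is a minimal scalar sketch of what the intrinsic is expected to compute, based on my reading of the LangRef description: it counts the zero elements starting from element 0, and returns the element count when every element is zero (modelling only the is_zero_poison = false case). The name `cttz_elts_ref` is hypothetical and exists only for this illustration. It is not LLVM code and does not reproduce the IR from the godbolt link.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference model (not part of LLVM).
// Assumption: the result is the index of the first non-zero element,
// or N when all elements are zero (is_zero_poison = false case only).
template <std::size_t N>
constexpr std::uint64_t cttz_elts_ref(const std::array<bool, N>& mask) {
    for (std::size_t i = 0; i < N; ++i) {
        if (mask[i]) {
            return i;  // first set lane, counting from element 0
        }
    }
    return N;  // no lane set: result is the number of elements
}

// Compile-time checks for a 4-element mask, mirroring the v4i1 case.
static_assert(cttz_elts_ref(std::array<bool, 4>{{true, false, false, false}}) == 0,
              "first lane set");
static_assert(cttz_elts_ref(std::array<bool, 4>{{false, false, true, true}}) == 2,
              "third lane is the first one set");
static_assert(cttz_elts_ref(std::array<bool, 4>{{false, false, false, false}}) == 4,
              "all-zero mask yields the element count");
```

Under this model, a v4i1 input can never yield a value above 4, so comparing the SVE-lowered result against such a model for every 4-bit mask may help narrow down which inputs diverge.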

To reproduce end-to-end, check out aac5f40ab2fe91418e8727d4276bdcb5b08e1a70 from
and build with -mcpu=neoverse-v2, then run SingleSource/UnitTests/Vectorizer/early-exit.

This should print a mismatch:

Checking early_exit_find_step_1
Miscompare for interleave-forced: 4 != 8
