Skip to content

[UnitTests] Add initial set of dedicated early-exit unit tests.#264

Merged
fhahn merged 4 commits into
llvm:mainfrom
fhahn:test-suite-early-exit-vec
Jan 19, 2026
Merged

[UnitTests] Add initial set of dedicated early-exit unit tests.#264
fhahn merged 4 commits into
llvm:mainfrom
fhahn:test-suite-early-exit-vec

Conversation

@fhahn

@fhahn fhahn commented Jun 25, 2025

Copy link
Copy Markdown
Contributor

Adds initial unit tests for early-exit vectorization covering a variation of auto-vectorization and forced interleaving with pragmas.

The interleaving variant is currently mis-compiled and needs

We should probably extend the tests to make sure we cover various other scenarios, including returning the loaded element for the early exit, different index types and array sizes.

@@ -1,3 +1,12 @@

# Enable matrix types extension tests for compilers supporting -fenable-matrix.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment doesn't match the code below?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed now, givne that it has been enabled by default for a while now

}; \
auto InterleavedFn = [](std::function<void(int *)> II) -> unsigned { \
Init II(Src); \
_Pragma("clang loop vectorize(enable) interleave_count(4)") Loop \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this PR cannot be merged until the pragma has handled correctly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also be working now as expected

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Is there a way to tell that these loops did actually vectorise? It's pretty hard to encourage loop vectorisation for early exit loops now with the dereferenceable SCEV checking code having to ensure that the pointers never wrap. I wouldn't be surprised if additional builtins are required to provide attributes to the pointers.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortuantely I don't think there's a way to test if they are actually vectorized expect checking the generated code or remarks.

the current version uses a local array on the stack where the size is directly related to the bounds, which should be the most reliable way we have currently. We could also add ones using the dereferenceable builtins as follow-ups and possible ones without static info to make sure we also have coverage once first-faulting loads are supported

/// * one where VF is picked automatically and interleaving is forced.
#define DEFINE_SCALAR_AND_VECTOR_EARLY_EXIT(Init, Src, Loop) \
auto ScalarFn = [](std::function<void(int *)> II) -> unsigned { \
Init II(Src); \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty confusing as it looks like constructing a variable. Can you split this out over different lines so it's clearer, i.e.

  InitSrc;
  II(Src);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated thanks

// for to zero.
auto Init1 = [IdxToFind](int *Arr) {
std::fill_n(Arr, N, 1);
if (IdxToFind < N)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a risk that a future compiler optimisation basically figures out the answer based on this init code (after inlining) and just optimises ScalarFn to return a constant before we even reach the loop vectoriser? It might be better to have Init1 and Init2 as real functions marked as noinline.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other places we prevent this fro caling through a function marked as optnone. Done the same thing here, thanks

@fhahn fhahn force-pushed the test-suite-early-exit-vec branch from 28ae6ea to fa5c6c3 Compare January 7, 2026 10:09
@fhahn

fhahn commented Jan 12, 2026

Copy link
Copy Markdown
Contributor Author

ping

1 similar comment
@fhahn

fhahn commented Jan 19, 2026

Copy link
Copy Markdown
Contributor Author

ping

@Meinersbur Meinersbur left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

II(Src); \
_Pragma("clang loop vectorize(enable)") Loop \
}; \
auto InterleavedFn = [](std::function<void(int *)> II) -> unsigned { \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth having a variant of this where IC=4, VF=1 to test the interleaved/scalar case?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, thanks

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating! Although I actually meant something like this

  _Pragma("clang loop vectorize_width(1) interleave_count(4)") Loop

. I think both this new one and the one below could still vectorise with VF>1 right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, added an interleaved-only variant as well, thanks

}; \
auto InterleavedFn = [](std::function<void(int *)> II) -> unsigned { \
Init II(Src); \
_Pragma("clang loop vectorize(enable) interleave_count(4)") Loop \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Is there a way to tell that these loops did actually vectorise? It's pretty hard to encourage loop vectorisation for early exit loops now with the dereferenceable SCEV checking code having to ensure that the pointers never wrap. I wouldn't be surprised if additional builtins are required to provide attributes to the pointers.

fhahn added 2 commits January 19, 2026 15:05
Adds initial unit tests for early-exit vectorization covering a
variation of auto-vectorization and forced interleaving with pragmas.

The interleaving variant is currently mis-compiled and needs
 * llvm/llvm-project#145340
 * llvm/llvm-project#145394.

We should probably extend the tests to make sure we cover various other
scenarios, including returning the loaded element for the early exit,
different index types and array sizes.
@fhahn fhahn force-pushed the test-suite-early-exit-vec branch from fa5c6c3 to e60f8ff Compare January 19, 2026 15:05
@fhahn fhahn force-pushed the test-suite-early-exit-vec branch from e60f8ff to 5dbe493 Compare January 19, 2026 15:06

@david-arm david-arm left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@fhahn fhahn merged commit 10a0e5a into llvm:main Jan 19, 2026
1 check passed
@fhahn fhahn deleted the test-suite-early-exit-vec branch January 19, 2026 16:22
@luporl

luporl commented Jan 22, 2026

Copy link
Copy Markdown
Contributor

@fhahn These new unit tests fail on AArch64 with -mcpu=neoverse-v2:

Can you please fix them or revert this PR?

@fhahn

fhahn commented Jan 22, 2026

Copy link
Copy Markdown
Contributor Author

@luporl thanks for the heads up, reverted for now, will take a closer look tomorrow morning

@Meinersbur

Copy link
Copy Markdown
Member

Also failed in staging: https://lab.llvm.org/staging/#/builders/49/builds/1274

@fhahn

fhahn commented Jan 29, 2026

Copy link
Copy Markdown
Contributor Author

I looked into the neoverse-v2 failure. The only difference between NEON codegen is that we lower llvm.experimental.cttz.elts.i64.v4i1 using SVE instructions, and that is causing the incorrect result. This seems to indicate a lowering issue llvm/llvm-project#178644

@fhahn

fhahn commented Jan 29, 2026

Copy link
Copy Markdown
Contributor Author

Also failed in staging: https://lab.llvm.org/staging/#/builders/49/builds/1274

This failure looks puzzling and more like an issue with the build directory being stale unless I am missing something, as i t fails with

FAIL: test-suite :: SingleSource/UnitTests/Vectorizer/early-exit.test (2324 of 2465)
******************** TEST 'test-suite :: SingleSource/UnitTests/Vectorizer/early-exit.test' FAILED ********************
/srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/SingleSource/UnitTests/Vectorizer/Output/early-exit.test.out --redirect-input /dev/null --chdir /srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/SingleSource/UnitTests/Vectorizer --summary /srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/SingleSource/UnitTests/Vectorizer/Output/early-exit.test.time /srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/SingleSource/UnitTests/Vectorizer/early-exit
/srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/tools/fpcmp-target /srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/SingleSource/UnitTests/Vectorizer/Output/early-exit.test.out /srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/SingleSource/UnitTests/Vectorizer/early-exit.reference_output
+ /srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/tools/fpcmp-target /srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/SingleSource/UnitTests/Vectorizer/Output/early-exit.test.out /srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/SingleSource/UnitTests/Vectorizer/early-exit.reference_output
/srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/tools/fpcmp-target: error: unable to open '/srv/bbworker/polly-x86_64-fdcserver/polly-x86_64-linux-test-suite/build/testsuite.build/SingleSource/UnitTests/Vectorizer/early-exit.reference_output'

@Meinersbur

Meinersbur commented Jan 29, 2026

Copy link
Copy Markdown
Member

@fhahn You are right, the buildbot did not clean the test-suite's build dir, meaning that that *.test files, even if the test disappeared from test-suite, were kept in the build-dir. Fixed in llvm/llvm-project@e509974

fhahn added a commit that referenced this pull request Jan 29, 2026
…ts. (#264)" (#328)

This reverts commit be30466.
(#264)

Reapply the commit without interleaving forced for now, as this exposed
on AArch64 SVE lowering bug:
llvm/llvm-project#178644
@fhahn

fhahn commented Jan 29, 2026

Copy link
Copy Markdown
Contributor Author

@fhahn You are right, the buildbot do not clean the test-suite's build dir, meaning that that *.test files, even if the test disappeared from test-suite, were kept in the build-dir. Fixed in llvm/llvm-project@e509974

Great thanks!

I re-landed the PR, but with forced interleaving left out for now until llvm/llvm-project#178644 is fixed so we can make progress with #325

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants