[UnitTests] Add initial set of dedicated early-exit unit tests.#264
Conversation
b349e09 to
28ae6ea
Compare
| @@ -1,3 +1,12 @@ | |||
|
|
|||
| # Enable matrix types extension tests for compilers supporting -fenable-matrix. | |||
There was a problem hiding this comment.
I think the comment doesn't match the code below?
There was a problem hiding this comment.
Removed now, givne that it has been enabled by default for a while now
| }; \ | ||
| auto InterleavedFn = [](std::function<void(int *)> II) -> unsigned { \ | ||
| Init II(Src); \ | ||
| _Pragma("clang loop vectorize(enable) interleave_count(4)") Loop \ |
There was a problem hiding this comment.
Looks like this PR cannot be merged until the pragma has handled correctly.
There was a problem hiding this comment.
This should also be working now as expected
There was a problem hiding this comment.
Great! Is there a way to tell that these loops did actually vectorise? It's pretty hard to encourage loop vectorisation for early exit loops now with the dereferenceable SCEV checking code having to ensure that the pointers never wrap. I wouldn't be surprised if additional builtins are required to provide attributes to the pointers.
There was a problem hiding this comment.
Unfortuantely I don't think there's a way to test if they are actually vectorized expect checking the generated code or remarks.
the current version uses a local array on the stack where the size is directly related to the bounds, which should be the most reliable way we have currently. We could also add ones using the dereferenceable builtins as follow-ups and possible ones without static info to make sure we also have coverage once first-faulting loads are supported
| /// * one where VF is picked automatically and interleaving is forced. | ||
| #define DEFINE_SCALAR_AND_VECTOR_EARLY_EXIT(Init, Src, Loop) \ | ||
| auto ScalarFn = [](std::function<void(int *)> II) -> unsigned { \ | ||
| Init II(Src); \ |
There was a problem hiding this comment.
This is pretty confusing as it looks like constructing a variable. Can you split this out over different lines so it's clearer, i.e.
InitSrc;
II(Src);
| // for to zero. | ||
| auto Init1 = [IdxToFind](int *Arr) { | ||
| std::fill_n(Arr, N, 1); | ||
| if (IdxToFind < N) |
There was a problem hiding this comment.
Is there a risk that a future compiler optimisation basically figures out the answer based on this init code (after inlining) and just optimises ScalarFn to return a constant before we even reach the loop vectoriser? It might be better to have Init1 and Init2 as real functions marked as noinline.
There was a problem hiding this comment.
In other places we prevent this fro caling through a function marked as optnone. Done the same thing here, thanks
28ae6ea to
fa5c6c3
Compare
|
ping |
1 similar comment
|
ping |
| II(Src); \ | ||
| _Pragma("clang loop vectorize(enable)") Loop \ | ||
| }; \ | ||
| auto InterleavedFn = [](std::function<void(int *)> II) -> unsigned { \ |
There was a problem hiding this comment.
Is it worth having a variant of this where IC=4, VF=1 to test the interleaved/scalar case?
There was a problem hiding this comment.
Thanks for updating! Although I actually meant something like this
_Pragma("clang loop vectorize_width(1) interleave_count(4)") Loop
. I think both this new one and the one below could still vectorise with VF>1 right?
There was a problem hiding this comment.
yep, added an interleaved-only variant as well, thanks
| }; \ | ||
| auto InterleavedFn = [](std::function<void(int *)> II) -> unsigned { \ | ||
| Init II(Src); \ | ||
| _Pragma("clang loop vectorize(enable) interleave_count(4)") Loop \ |
There was a problem hiding this comment.
Great! Is there a way to tell that these loops did actually vectorise? It's pretty hard to encourage loop vectorisation for early exit loops now with the dereferenceable SCEV checking code having to ensure that the pointers never wrap. I wouldn't be surprised if additional builtins are required to provide attributes to the pointers.
Adds initial unit tests for early-exit vectorization covering a variation of auto-vectorization and forced interleaving with pragmas. The interleaving variant is currently mis-compiled and needs * llvm/llvm-project#145340 * llvm/llvm-project#145394. We should probably extend the tests to make sure we cover various other scenarios, including returning the loaded element for the early exit, different index types and array sizes.
fa5c6c3 to
e60f8ff
Compare
e60f8ff to
5dbe493
Compare
|
@fhahn These new unit tests fail on AArch64 with
Can you please fix them or revert this PR? |
…s. (#264)" (#328) This reverts commit 10a0e5a. Fails on AArch64 with -mcpu=neoverse-v2: https://lab.llvm.org/buildbot/#/builders/198/builds/11196 https://lab.llvm.org/buildbot/#/builders/199/builds/8332
|
@luporl thanks for the heads up, reverted for now, will take a closer look tomorrow morning |
|
Also failed in staging: https://lab.llvm.org/staging/#/builders/49/builds/1274 |
|
I looked into the |
This failure looks puzzling and more like an issue with the build directory being stale unless I am missing something, as i t fails with |
|
@fhahn You are right, the buildbot did not clean the test-suite's build dir, meaning that that *.test files, even if the test disappeared from test-suite, were kept in the build-dir. Fixed in llvm/llvm-project@e509974 |
…ts. (#264)" (#328) This reverts commit be30466. (#264) Reapply the commit without interleaving forced for now, as this exposed on AArch64 SVE lowering bug: llvm/llvm-project#178644
Great thanks! I re-landed the PR, but with forced interleaving left out for now until llvm/llvm-project#178644 is fixed so we can make progress with #325 |
Adds initial unit tests for early-exit vectorization covering a variation of auto-vectorization and forced interleaving with pragmas.
The interleaving variant is currently mis-compiled and needs
We should probably extend the tests to make sure we cover various other scenarios, including returning the loaded element for the early exit, different index types and array sizes.