[SYCL][Fusion] JIT compiler kernel fusion passes#7661
[SYCL][Fusion] JIT compiler kernel fusion passes#7661steffenlarsen merged 1 commit intointel:syclfrom
Conversation
|
This patch can be reviewed and merged independently of #7531. The tests are currently not yet run together with |
5e1613a to
5451db0
Compare
5451db0 to
fae8208
Compare
Co-authored-by: Lukas Sommer <lukas.sommer@codeplay.com> Co-authored-by: Victor Perez <victor.perez@codeplay.com> Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com>
fae8208 to
2021ae2
Compare
|
@intel/llvm-gatekeepers the PR is ready to be merged, AMD failure is unrelated to this patch AFAIK |
|
HIP failures reported in #7634. |
AlexeySachkov
left a comment
There was a problem hiding this comment.
Some post-commit review
| NF->copyAttributesFrom(F); | ||
| // Drop masked-out attributes. | ||
| SmallVector<AttributeSet> Attributes; | ||
| const llvm::AttributeList PAL = NF->getAttributes(); |
There was a problem hiding this comment.
nit: llvm:: namespace is used inconsistently: there is using namespace llvm;, but in some places llvm:: is still used. For example, below on line 44 AttributeList is referenced without llvm::. This comment also applies to other files
|
|
||
| { | ||
| // Copy metadata. | ||
| SmallVector<std::pair<unsigned, MDNode *>> MDs; |
There was a problem hiding this comment.
I'm a bit surprised that SmallVector doesn't require a second argument, am I missing something? What is the point of using SmallVector if we are not pre-allocating some storage on stack?
There was a problem hiding this comment.
This is actually recommended if there is no well motivated choice for N:
In the absence of a well-motivated choice for the number of inlined elements N, it is recommended to use SmallVector (that is, omitting the N). This will choose a default number of inlined elements reasonable for allocation on the stack (for example, trying to keep sizeof(SmallVector) around 64 bytes).
From https://llvm.org/doxygen/classllvm_1_1SmallVector.html#details
There was a problem hiding this comment.
Didn't know there is a default value for N, thanks!
| // Copy metadata. | ||
| SmallVector<std::pair<unsigned, MDNode *>> MDs; | ||
| F->getAllMetadata(MDs); | ||
| for (auto MD : MDs) { |
There was a problem hiding this comment.
Unnecessary copy due to missing &?
|
|
||
| using namespace llvm; | ||
|
|
||
| static FunctionType *createMaskedFunctionType(const BitVector &Mask, |
There was a problem hiding this comment.
Why not anonymous namespace?
There was a problem hiding this comment.
My understanding so far was that there doesn't seem to be a big difference between static and anonymous namespaces, also based on this.
Do you have a specific reason to prefer one over the other?
There was a problem hiding this comment.
No strong reason, just making it more C++-y
There was a problem hiding this comment.
Using static in this case allows us to keep the static functions closer to their non-static users/callers without declaring multiple anonymous namespaces, so we stick with that for now.
| auto *NewIndex = [&]() -> Value * { | ||
| if (LocalSize == 1) { | ||
| return Builder.getInt64(0); | ||
| } else { |
There was a problem hiding this comment.
LLVM Coding Standard: Don't use else after return
| auto *FK = dyn_cast<MDString>(MDOp.get()); | ||
| assert(FK && "Kernel should be given by its name as MDString"); |
There was a problem hiding this comment.
No need to use dyn_cast if you don't check the result. cast already contains an assert statement for you
| if (StubFunction.hasMetadata(ParameterMDKind)) { | ||
| llvm::MDNode *ParamMD = StubFunction.getMetadata(ParameterMDKind); | ||
| for (const auto &Op : ParamMD->operands()) { | ||
| auto *Tuple = dyn_cast<MDNode>(Op.get()); |
There was a problem hiding this comment.
Once again, dyn_cast -> cast
| auto *ConstantMD = dyn_cast<ConstantAsMetadata>(MD); | ||
| assert(ConstantMD && "Metadata not constant"); |
| SmallVector<Value *> Indices; | ||
| Indices.push_back(Builder.getInt32(0)); |
There was a problem hiding this comment.
ArrayRef is constructible from a single element: we should use that instead of performing an allocation on heap for a single pointer.
| if (RetCode) { | ||
| return std::move(RetCode); | ||
| } |
|
@AlexeySachkov: Thank you for your feedback, I addressed your post-commit comments in #7719. |
Fix tests for kernel fusion passes in static builds by using correct library, loadable by `opt`, for tests. Also fixes post-commit feedback from @AlexeySachkov in #7661. Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com>
This is the fourth patch in a series of patches to add an implementation of the kernel fusion extension. We have split the implementation into multiple patches to make them more easy to review.
This patch adds the LLVM passes that perform the kernel fusion and related optimizations:
The type of memory to use is currently specified by the user in the runtime.
The information is propagated from the SYCL runtime to the passes via LLVM metadata inserted by the JIT compiler frontend.
After and between the fusion passes, some standard LLVM optimization and transformation passes are executed to enable passes and optimize the fused kernel.
Co-authored-by: Lukas Sommer lukas.sommer@codeplay.com
Co-authored-by: Victor Perez victor.perez@codeplay.com
Signed-off-by: Lukas Sommer lukas.sommer@codeplay.com