[OpenMP] Fix stack corruption due to argument mismatch#96386
[OpenMP] Fix stack corruption due to argument mismatch#96386
Conversation
|
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-offload Author: Sushant Gokhale (sushgokh) ChangesWhile lowering (#pragma omp target update from), clang's generated .omp_task_entry. is setting up 9 arguments while calling __tgt_target_data_update_nowait_mapper. At the same time, in __tgt_target_data_update_nowait_mapper, call to targetData<TaskAsyncInfoWrapperTy>() is converted to a sibcall assuming it has the argument count listed in the signature. AARCH64 asm sequence for this is as follows (removed unrelated insns): ` __tgt_target_data_update_nowait_mapper: ldp x29, x30, sp, #16 // 16-byte Folded Reload On AArch64, call to __tgt_target_data_update_nowait_mapper in .omp_task_entry. sets up only single space on stack and this results in ovewriting $fp and subsequent stack corruption. This issue can be credited to discrepancy of __tgt_target_data_update_nowait_mapper signature in openmp/libomptarget/include/omptarget.h taking 13 arguments while clang/lib/CodeGen/CGOpenMPRuntime.cpp and This patch modifies __tgt_target_data_update_nowait_mapper signature to match .omp_task_entry usage(and other 2 files mentioned above). Full diff: https://github.com/llvm/llvm-project/pull/96386.diff 2 Files Affected:
diff --git a/offload/include/omptarget.h b/offload/include/omptarget.h
index 323dee41630f2..968589e866334 100644
--- a/offload/include/omptarget.h
+++ b/offload/include/omptarget.h
@@ -388,8 +388,7 @@ void __tgt_target_data_update_mapper(ident_t *Loc, int64_t DeviceId,
void __tgt_target_data_update_nowait_mapper(
ident_t *Loc, int64_t DeviceId, int32_t ArgNum, void **ArgsBase,
void **Args, int64_t *ArgSizes, int64_t *ArgTypes, map_var_info_t *ArgNames,
- void **ArgMappers, int32_t DepNum, void *DepList, int32_t NoAliasDepNum,
- void *NoAliasDepList);
+ void **ArgMappers);
// Performs the same actions as data_begin in case ArgNum is non-zero
// and initiates run of offloaded region on target platform; if ArgNum
diff --git a/offload/src/interface.cpp b/offload/src/interface.cpp
index 763b051cc6d77..27562388c2f11 100644
--- a/offload/src/interface.cpp
+++ b/offload/src/interface.cpp
@@ -207,8 +207,7 @@ EXTERN void __tgt_target_data_update_mapper(ident_t *Loc, int64_t DeviceId,
EXTERN void __tgt_target_data_update_nowait_mapper(
ident_t *Loc, int64_t DeviceId, int32_t ArgNum, void **ArgsBase,
void **Args, int64_t *ArgSizes, int64_t *ArgTypes, map_var_info_t *ArgNames,
- void **ArgMappers, int32_t DepNum, void *DepList, int32_t NoAliasDepNum,
- void *NoAliasDepList) {
+ void **ArgMappers) {
OMPT_IF_BUILT(ReturnAddressSetterRAII RA(__builtin_return_address(0)));
targetData<TaskAsyncInfoWrapperTy>(
Loc, DeviceId, ArgNum, ArgsBase, Args, ArgSizes, ArgTypes, ArgNames,
|
offload/src/interface.cpp
Outdated
| void **Args, int64_t *ArgSizes, int64_t *ArgTypes, map_var_info_t *ArgNames, | ||
| void **ArgMappers, int32_t DepNum, void *DepList, int32_t NoAliasDepNum, | ||
| void *NoAliasDepList) { | ||
| void **ArgMappers) { |
There was a problem hiding this comment.
These are all unused? Seems weird that we're not using them at all.
There was a problem hiding this comment.
The original design was from libomp task interface, but we never use it.
However, I think we might want to keep them, as I definitely had a patch long time ago that uses them. A better fix would be to modify the front end to emit null value correspondingly.
There was a problem hiding this comment.
However, I think we might want to keep them, as I definitely had a patch long time ago that uses them. A better fix would be to modify the front end to emit null value correspondingly.
All the nowait calls, listed below, suffer from the same discrepancy
__tgt_target_data_begin_nowait_mapper
__tgt_target_data_end_nowait_mapper
__tgt_target_data_update_nowait_mapper
while their counterparts, derived here, only have 9 arguments
So, I am thinking of modifying all the nowait calls while addressing this specific issue.
@shiltian If you still think that modifying the front-end would be better here, could you tell me what changes are exactly required in the OffloadingArgs here ? I hope this is what you are suggesting.
There was a problem hiding this comment.
Yes. If it is nowait, push extra null values.
shiltian
left a comment
There was a problem hiding this comment.
Besides, don't you need to update clang tests?
While lowering (#pragma omp target update from), clang's generated .omp_task_entry. is setting up 9 arguments while calling __tgt_target_data_update_nowait_mapper. At the same time, in __tgt_target_data_update_nowait_mapper, call to targetData<TaskAsyncInfoWrapperTy>() is converted to a sibcall assuming it has the argument count listed in the signature. AARCH64 asm sequence for this is as follows (removed unrelated insns): .omp_task_entry..108: sub sp, sp, llvm#32 stp x29, x30, sp, llvm#16 // 16-byte Folded Spill add x29, sp, llvm#16 str x8, sp, llvm#8. // stack canary str xzr, [sp] bl __tgt_target_data_update_nowait_mapper __tgt_target_data_update_nowait_mapper: sub sp, sp, llvm#32 stp x29, x30, sp, llvm#16 // 16-byte Folded Spill add x29, sp, llvm#16 str x8, sp, llvm#8 // stack canary // Sibcall argument setup adrp x8, :got:_Z16targetDataUpdateP7ident_tR8DeviceTyiPPvS4_PlS5_S4_S4_R11AsyncInfoTyb ldr x8, [x8, :got_lo12:_Z16targetDataUpdateP7ident_tR8DeviceTyiPPvS4_PlS5_S4_S4_R11AsyncInfoTyb] stp x9, x8, x29, llvm#16 adrp x8, .L.str.8 add x8, x8, :lo12:.L.str.8 str x8, x29, llvm#32. <==. This is the insn that erases $fp ldp x29, x30, sp, llvm#16 // 16-byte Folded Reload add sp, sp, llvm#32 // Sibcall b ZL10targetDataI22TaskAsyncInfoWrapperTyEvP7ident_tliPPvS4_PlS5_S4_S4_PFiS2_R8DeviceTyiS4_S4_S5_S5_S4_S4_R11AsyncInfoTybEPKcSD On AArch64, call to __tgt_target_data_update_nowait_mapper in .omp_task_entry. sets up only single space on stack and this results in ovewriting $fp and subsequent stack corruption. This issue can be credited to discrepancy of __tgt_target_data_update_nowait_mapper signature in openmp/libomptarget/include/omptarget.h taking 13 arguments while clang/lib/CodeGen/CGOpenMPRuntime.cpp and llvm/include/llvm/Frontend/OpenMP/OMPKinds.def taking only 9 arguments. This patch modifies __tgt_target_data_update_nowait_mapper signature to match .omp_task_entry usage(and other 2 files mentioned above). Co-authored-by: Kugan Vivekanandarajah <kvivekananda@nvidia.com>
While lowering (#pragma omp target update from), clang's generated .omp_task_entry. is setting up 9 arguments while calling __tgt_target_data_update_nowait_mapper.
At the same time, in __tgt_target_data_update_nowait_mapper, call to targetData() is converted to a sibcall assuming it has the argument count listed in the signature.
AARCH64 asm sequence for this is as follows (removed unrelated insns):
`
.omp_task_entry..108:
sub sp, sp, #32
stp x29, x30, sp, #16 // 16-byte Folded Spill
add x29, sp, #16
str x8, sp, #8. // stack canary
str xzr, [sp]
bl __tgt_target_data_update_nowait_mapper
__tgt_target_data_update_nowait_mapper:
sub sp, sp, #32
stp x29, x30, sp, #16 // 16-byte Folded Spill
add x29, sp, #16
str x8, sp, #8 // stack canary
// Sibcall argument setup
adrp x8, :got:_Z16targetDataUpdateP7ident_tR8DeviceTyiPPvS4_PlS5_S4_S4_R11AsyncInfoTyb
ldr x8, [x8, :got_lo12:_Z16targetDataUpdateP7ident_tR8DeviceTyiPPvS4_PlS5_S4_S4_R11AsyncInfoTyb]
stp x9, x8, x29, #16
adrp x8, .L.str.8
add x8, x8, :lo12:.L.str.8
str x8, x29, #32. <==. This is the insn that erases $fp
ldp x29, x30, sp, #16 // 16-byte Folded Reload
add sp, sp, #32
// Sibcall
b ZL10targetDataI22TaskAsyncInfoWrapperTyEvP7ident_tliPPvS4_PlS5_S4_S4_PFiS2_R8DeviceTyiS4_S4_S5_S5_S4_S4_R11AsyncInfoTybEPKcSD
`
On AArch64, call to __tgt_target_data_update_nowait_mapper in .omp_task_entry. sets up only single space on stack and this results in ovewriting $fp and subsequent stack corruption. This issue can be credited to discrepancy of __tgt_target_data_update_nowait_mapper signature in openmp/libomptarget/include/omptarget.h taking 13 arguments while clang/lib/CodeGen/CGOpenMPRuntime.cpp and
llvm/include/llvm/Frontend/OpenMP/OMPKinds.def taking only 9 arguments.
This patch modifies __tgt_target_data_update_nowait_mapper signature to match .omp_task_entry usage(and other 2 files mentioned above).