…from being marked as parial maps
The following test was triggering a runtime crash **on the host before launching the kernel**:
```fortran
program test_omp_target_map_bug_v5
implicit none
type nested_type
real, allocatable :: alloc_field(:)
end type nested_type
type nesting_type
integer :: int_field
type(nested_type) :: derived_field
end type nesting_type
type(nesting_type) :: config
allocate(config%derived_field%alloc_field(1))
!$OMP TARGET ENTER DATA MAP(TO:config, config%derived_field%alloc_field)
!$OMP TARGET
config%derived_field%alloc_field(1) = 1.0
!$OMP END TARGET
deallocate(config%derived_field%alloc_field)
end program test_omp_target_map_bug_v5
```
In particular, the runtime produced a segmentation fault when the test was compiled with any optimization level > 0; compiled with `-O0`, the sample ran fine.
After debugging the runtime, it turned out the crash happened where the runtime calls the default mapper emitted by the compiler for `nesting_type`, specifically at this point in the runtime: https://github.com/llvm/llvm-project/blob/c62cd2877cc25a0d708ad22a70c2a57590449c4d/offload/libomptarget/omptarget.cpp#L307.
Bisecting the optimization pipeline using `-mllvm -opt-bisect-limit=N` showed that the first pass that triggered the issue at `O1` was the `instcombine` pass. Debugging this further, the issue narrows down to canonicalizing `getelementptr` instructions from using struct types (in this case the `nesting_type` in the sample above) to using `i8`. In particular, at `O0`, you would see something like this:
```llvm
define internal void @.omp_mapper._QQFnesting_type_omp_default_mapper(ptr noundef %0, ptr noundef %1, ptr noundef %2, i64 noundef %3, i64 noundef %4, ptr noundef %5) #6 {
entry:
%6 = udiv exact i64 %3, 56
%7 = getelementptr %_QFTnesting_type, ptr %2, i64 %6
....
}
```
whereas after `instcombine` runs (at `O1`), you would see something like this:
```llvm
define internal void @.omp_mapper._QQFnesting_type_omp_default_mapper(ptr noundef %0, ptr noundef %1, ptr noundef %2, i64 noundef %3, i64 noundef %4, ptr noundef %5) #6 {
entry:
%6 = getelementptr i8, ptr %2, i64 %3
....
}
```
The `udiv exact` instruction emitted by the OMP IR Builder (see: https://github.com/llvm/llvm-project/blob/c62cd2877cc25a0d708ad22a70c2a57590449c4d/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp#L9154) allows `instcombine` to assume that `%3` is divisible by the struct size (here `56`) and, therefore, to replace the division followed by a struct-typed GEP with a direct GEP on `i8`.
However, the runtime was calling `@.omp_mapper._QQFnesting_type_omp_default_mapper` not with `56` (the proper struct size) but with `48`!
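To make the mismatch concrete, here is a small C++ sketch of the two address computations. It is only a model: `endUnoptimized` and `endOptimized` are hypothetical names, addresses are plain integers, and truncating integer division stands in for `udiv exact` (which in LLVM yields poison when the division is inexact, rather than truncating).

```cpp
#include <cassert>
#include <cstdint>

// Size of nesting_type as it appears in the IR above.
constexpr std::uint64_t kStructSize = 56;

// Models the O0 form: divide the byte size by the struct size, then step
// over that many whole structs (a GEP on the struct type).
constexpr std::uint64_t endUnoptimized(std::uint64_t base,
                                       std::uint64_t sizeInBytes) {
  // Stand-in for `udiv exact`; real LLVM semantics give poison if inexact.
  const std::uint64_t count = sizeInBytes / kStructSize;
  return base + count * kStructSize;
}

// Models the instcombine form: add the byte size directly (a GEP on i8).
constexpr std::uint64_t endOptimized(std::uint64_t base,
                                     std::uint64_t sizeInBytes) {
  return base + sizeInBytes;
}

// With the proper size (56) both forms agree.
static_assert(endUnoptimized(0x1000, 56) == endOptimized(0x1000, 56),
              "forms agree when the size is a multiple of the struct size");
// With the size the runtime actually passed (48) they diverge.
static_assert(endUnoptimized(0x1000, 48) != endOptimized(0x1000, 48),
              "forms diverge when the size is not a multiple");
```

In this model, a size of `48` gives an end address of `base + 0` in the unoptimized form and `base + 48` in the optimized form, which is consistent with the crash only showing up once `instcombine` has run.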
Debugging this further, I found that the size computed for the `omp.map.info` operation to which the default mapper is attached is `48` because we set the map to partial (see: https://github.com/llvm/llvm-project/blob/c62cd2877cc25a0d708ad22a70c2a57590449c4d/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp#L1146 and https://github.com/llvm/llvm-project/blob/c62cd2877cc25a0d708ad22a70c2a57590449c4d/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp#L4501-L4512).
However, I think this is incorrect since the emitted mapper (like user-defined mappers in general) is defined on the whole struct type, so a map that uses a mapper should never be marked as partial. Hence, the fix in this PR.
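One way to see where `56` and `48` could come from is to model the layout in C++. This is a sketch under assumptions, not a statement of how the translation computes the size: `DescriptorModel`, `NestedModel`, and `NestingModel` are hypothetical stand-ins, the descriptor layout is assumed to be a 24-byte header plus one 24-byte dimension entry, and the partial size is assumed to span from the first mapped member to the end of the struct.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a rank-1 allocatable array descriptor.
struct DescriptorModel {
  std::uint64_t baseAddr;                     // address of the data
  std::uint64_t elemLen;                      // element size in bytes
  std::int32_t version;
  std::uint8_t rank, type, attribute, extra;
  std::int64_t dim[3];                        // lower bound, extent, stride
};

// Stand-in for nested_type: a single allocatable field.
struct NestedModel {
  DescriptorModel allocField;
};

// Stand-in for nesting_type: an integer followed by the nested type.
struct NestingModel {
  std::int32_t intField;     // 4 bytes, then padding up to 8-byte alignment
  NestedModel derivedField;
};

// Size of the whole struct, which is what the mapper expects.
constexpr std::size_t kWholeSize = sizeof(NestingModel);

// Assumed partial size: only the bytes from derivedField to the end.
constexpr std::size_t kPartialSize =
    sizeof(NestingModel) - offsetof(NestingModel, derivedField);
```

On a typical 64-bit ABI with 8-byte alignment for 64-bit integers, `kWholeSize` is 56 and `kPartialSize` is 48, matching the two values in the description.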
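The intent of the fix can be sketched in isolation as follows. `MapInfoModel` and `finalizeImplicitMembers` are hypothetical names used only for illustration; the real change is the one-line `setPartialMap` update in `MapInfoFinalization.cpp`.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the parts of an omp.map.info op that matter here.
struct MapInfoModel {
  std::optional<std::string> mapperId; // set when a mapper is attached
  bool partialMap = false;
};

// After implicit members have been added to the parent map, mark it as
// partial only when no mapper is attached, since a mapper is defined on
// the whole struct type.
void finalizeImplicitMembers(MapInfoModel &op) {
  op.partialMap = !op.mapperId.has_value();
}
```

A map without a mapper still ends up with `partialMap == true`, as before the fix, while a map with a mapper attached keeps `partialMap == false`.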
@llvm/pr-subscribers-offload
@llvm/pr-subscribers-flang-openmp
Author: Kareem Ergawy (ergawy)
Full diff: https://github.com/llvm/llvm-project/pull/175133.diff
3 Files Affected:
diff --git a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
index 3fe133d63d24d..a60960e739d24 100644
--- a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
+++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
@@ -1143,7 +1143,8 @@ class MapInfoFinalizationPass
newMemberIndices.emplace_back(path);
op.setMembersIndexAttr(builder.create2DI64ArrayAttr(newMemberIndices));
- op.setPartialMap(true);
+ // Set to partial map only if there is no user-defined mapper.
+ op.setPartialMap(op.getMapperIdAttr() == nullptr);
return mlir::WalkResult::advance();
});
diff --git a/flang/test/Transforms/omp-map-info-finalization-implicit-field.fir b/flang/test/Transforms/omp-map-info-finalization-implicit-field.fir
index 632525b4b43c9..d3e8125d2ee3d 100644
--- a/flang/test/Transforms/omp-map-info-finalization-implicit-field.fir
+++ b/flang/test/Transforms/omp-map-info-finalization-implicit-field.fir
@@ -15,13 +15,24 @@ fir.global internal @_QFEdst_record : !record_t {
fir.has_value %0 : !record_t
}
+omp.declare_mapper @record_mapper : !record_t {
+^bb0(%arg0: !fir.ref<!record_t>):
+ %0 = omp.map.info var_ptr(%arg0: !fir.ref<!record_t>, !record_t) map_clauses(implicit, tofrom) capture(ByRef) -> !fir.ref<!record_t>
+ omp.declare_mapper.info map_entries(%0: !fir.ref<!record_t>)
+}
+
func.func @_QQmain() {
%6 = fir.address_of(@_QFEdst_record) : !fir.ref<!record_t>
%7:2 = hlfir.declare %6 {uniq_name = "_QFEdst_record"} : (!fir.ref<!record_t>) -> (!fir.ref<!record_t>, !fir.ref<!record_t>)
%16 = omp.map.info var_ptr(%7#1 : !fir.ref<!record_t>, !record_t) map_clauses(implicit, tofrom) capture(ByRef) -> !fir.ref<!record_t> {name = "dst_record"}
- omp.target map_entries(%16 -> %arg0 : !fir.ref<!record_t>) {
+ %17 = omp.map.info var_ptr(%7#1 : !fir.ref<!record_t>, !record_t) map_clauses(implicit, tofrom) capture(ByRef) mapper(@record_mapper) -> !fir.ref<!record_t> {name = "dst_record_with_mapper"}
+ omp.target map_entries(%16 -> %arg0, %17 -> %arg1 : !fir.ref<!record_t>, !fir.ref<!record_t>) {
%20:2 = hlfir.declare %arg0 {uniq_name = "_QFEdst_record"} : (!fir.ref<!record_t>) -> (!fir.ref<!record_t>, !fir.ref<!record_t>)
+ %21:2 = hlfir.declare %arg1 {uniq_name = "_QFEdst_record"} : (!fir.ref<!record_t>) -> (!fir.ref<!record_t>, !fir.ref<!record_t>)
+
%23 = hlfir.designate %20#0{"to_implicitly_map"} {fortran_attrs = #fir.var_attrs<allocatable>} : (!fir.ref<!record_t>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
+
+ %24 = hlfir.designate %21#0{"to_implicitly_map"} {fortran_attrs = #fir.var_attrs<allocatable>} : (!fir.ref<!record_t>) -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
omp.terminator
}
return
@@ -56,6 +67,19 @@ func.func @_QQmain() {
// CHECK-SAME: [1], [1, 0] : {{.*}}) -> {{.*}}> {name =
// CHECK-SAME: "dst_record", partial_map = true}
+// Verify map ops when using a mapper:
+// Implicit field mapping is the same as for the non-mapper case.
+// CHECK: omp.map.info
+// CHECK: omp.map.info
+
+// Verify that partial-map is not set if the map info op uses a user-defined (or
+// compiler-emitted) mapper.
+// CHECK: %[[RECORD_MAP_MAPPER:.*]] = omp.map.info var_ptr(
+// CHECK-SAME: %[[RECORD_DECL]]#1 : {{.*}}) map_clauses(
+// CHECK-SAME: implicit, tofrom) capture(ByRef) mapper(@record_mapper)
+// CHECK-SAME: members(%{{.*}}, %{{.*}} : [1], [1, 0] : {{.*}}) -> {{.*}}> {name =
+// CHECK-SAME: "dst_record_with_mapper"}
+
// CHECK: omp.target map_entries(
// CHECK-SAME: %[[RECORD_MAP]] -> %{{[^[:space:]]+}},
// CHECK-SAME: %[[FIELD_MAP]] -> %{{[^[:space:]]+}},
diff --git a/offload/test/offloading/fortran/default-mapper-nested-derived-type.f90 b/offload/test/offloading/fortran/default-mapper-nested-derived-type.f90
new file mode 100644
index 0000000000000..5d69fa072fd63
--- /dev/null
+++ b/offload/test/offloading/fortran/default-mapper-nested-derived-type.f90
@@ -0,0 +1,34 @@
+! Regression test for default mappers emitted for nested derived types. Some
+! optimization passes and instrumentation callbacks cause crashes in emitted
+! mappers and this test guards against such crashes.
+
+! REQUIRES: flang, amdgpu
+
+! RUN: %libomptarget-compile-fortran-generic
+! RUN: env LIBOMPTARGET_INFO=16 %libomptarget-run-generic 2>&1 | %fcheck-generic
+
+program test_omp_target_map_bug_v5
+ implicit none
+ type nested_type
+ real, allocatable :: alloc_field(:)
+ end type nested_type
+
+ type nesting_type
+ integer :: int_field
+ type(nested_type) :: derived_field
+ end type nesting_type
+
+ type(nesting_type) :: config
+
+ allocate(config%derived_field%alloc_field(1))
+
+ !$OMP TARGET ENTER DATA MAP(TO:config, config%derived_field%alloc_field)
+
+ !$OMP TARGET
+ config%derived_field%alloc_field(1) = 1.0
+ !$OMP END TARGET
+
+ deallocate(config%derived_field%alloc_field)
+end program test_omp_target_map_bug_v5
+
+! CHECK: "PluginInterface" device {{[0-9]+}} info: Launching kernel {{.*}}
@llvm/pr-subscribers-flang-fir-hlfir
agozillon
left a comment
LGTM, thank you for the fix!
TIFitis
left a comment
Ty for the fix.
Nit: Please correct the typo in the PR title.
Title changed from "omp.map.info ops with user-defined mappers from being marked as parial maps" to "omp.map.info ops with user-defined mappers from being marked as partial maps".
…from being marked as parial maps (llvm#175133)