[DirectX] Denote dx.resource.getpointer with IntrInaccessibleMemOnly and IntrReadMem #193593
…ly` and `IntrReadMem`

`IntrConvergent` was originally added to `dx.resource.getpointer` to prevent optimization passes (SimplifyCFG, GVN) from sinking the intrinsic out of control flow branches, which would create phi nodes on the returned pointer.

Using `IntrInaccessibleMemOnly` and `IntrReadMem` semantics still prevents passes from merging or sinking identical calls across branches. However, it allows the call to be moved within a single control flow path, which enables further legal optimizations.

Updates relevant tests and adds a new test to demonstrate a potential optimization.
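To make the concern about phi nodes concrete, here is a minimal sketch of the kind of IR the description refers to. It is not taken from the patch: the function name `@write_one_of_two` is hypothetical, and the second `handlefrombinding` call differs from the first only in its second argument, which is assumed here to select a different binding. The intrinsic call forms are copied from the new test in this PR.

```llvm
; Illustrative only. Each branch obtains a pointer from a different handle.
define void @write_one_of_two(i1 %cond) {
entry:
  %a = call target("dx.RawBuffer", i32, 1, 0) @llvm.dx.resource.handlefrombinding.tdx.RawBuffer_i32_1_0t(i32 0, i32 0, i32 1, i32 0, ptr null)
  ; Assumption: a different second argument denotes a different binding.
  %b = call target("dx.RawBuffer", i32, 1, 0) @llvm.dx.resource.handlefrombinding.tdx.RawBuffer_i32_1_0t(i32 0, i32 1, i32 1, i32 0, ptr null)
  br i1 %cond, label %if.then, label %if.else

if.then:
  %pa = call ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_i32_1_0t.i32(target("dx.RawBuffer", i32, 1, 0) %a, i32 0)
  store i32 42, ptr %pa, align 4
  br label %if.end

if.else:
  %pb = call ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_i32_1_0t.i32(target("dx.RawBuffer", i32, 1, 0) %b, i32 0)
  store i32 42, ptr %pb, align 4
  br label %if.end

if.end:
  ; The shape the description wants to avoid would merge the two accesses
  ; here, which requires a phi on the returned pointers, for example:
  ;   %p = phi ptr [ %pa, %if.then ], [ %pb, %if.else ]
  ;   store i32 42, ptr %p, align 4
  ret void
}
```

The `no-sink-dxgetpointer.ll` tests updated by this PR cover the real cases; the sketch above only shows the general shape.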
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-backend-directx Author: Finn Plummer (inbelic) Changes
Updates relevant tests and adds a new test to demonstrate a now-legal potential optimization.
Assisted by: Claude Opus 4.6
Full diff: https://github.com/llvm/llvm-project/pull/193593.diff
4 Files Affected:
diff --git a/llvm/include/llvm/IR/IntrinsicsDirectX.td b/llvm/include/llvm/IR/IntrinsicsDirectX.td
index 13fd0e2dc6255..54e783b0ee337 100644
--- a/llvm/include/llvm/IR/IntrinsicsDirectX.td
+++ b/llvm/include/llvm/IR/IntrinsicsDirectX.td
@@ -38,7 +38,7 @@ def int_dx_resource_handlefromimplicitbinding
def int_dx_resource_getpointer
: DefaultAttrsIntrinsic<[llvm_anyptr_ty], [llvm_any_ty, llvm_any_ty],
- [IntrConvergent, IntrNoMem]>;
+ [IntrReadMem, IntrInaccessibleMemOnly]>;
def int_dx_resource_nonuniformindex
: DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty], [IntrNoMem]>;
diff --git a/llvm/test/Transforms/DirectX/getpointer-sink-behavior.ll b/llvm/test/Transforms/DirectX/getpointer-sink-behavior.ll
new file mode 100644
index 0000000000000..d560a2af4e3c5
--- /dev/null
+++ b/llvm/test/Transforms/DirectX/getpointer-sink-behavior.ll
@@ -0,0 +1,31 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
+; RUN: opt -passes=sink -S %s | FileCheck %s
+
+; Verify that dx.resource.getpointer can be sunk into a branch where it is
+; only used, now that it is no longer marked convergent.
+
+define void @can_sink_into_branch(i1 %cond) {
+; CHECK-LABEL: define void @can_sink_into_branch(
+; CHECK-SAME: i1 [[COND:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: br i1 [[COND]], label %[[IF_THEN:.*]], label %[[IF_END:.*]]
+; CHECK: [[IF_THEN]]:
+; CHECK-NEXT: [[BUF:%.*]] = call target("dx.RawBuffer", i32, 1, 0) @llvm.dx.resource.handlefrombinding.tdx.RawBuffer_i32_1_0t(i32 0, i32 0, i32 1, i32 0, ptr null)
+; CHECK-NEXT: [[PTR:%.*]] = call noundef nonnull align 4 dereferenceable(4) ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_i32_1_0t.i32(target("dx.RawBuffer", i32, 1, 0) [[BUF]], i32 0)
+; CHECK-NEXT: store i32 42, ptr [[PTR]], align 4
+; CHECK-NEXT: br label %[[IF_END]]
+; CHECK: [[IF_END]]:
+; CHECK-NEXT: ret void
+;
+entry:
+ %buf = call target("dx.RawBuffer", i32, 1, 0) @llvm.dx.resource.handlefrombinding.tdx.RawBuffer_i32_1_0t(i32 0, i32 0, i32 1, i32 0, ptr null)
+ %ptr = call noundef nonnull align 4 dereferenceable(4) ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_i32_1_0t.i32(target("dx.RawBuffer", i32, 1, 0) %buf, i32 0)
+ br i1 %cond, label %if.then, label %if.end
+
+if.then:
+ store i32 42, ptr %ptr, align 4
+ br label %if.end
+
+if.end:
+ ret void
+}
diff --git a/llvm/test/Transforms/GVN/no-sink-dxgetpointer.ll b/llvm/test/Transforms/GVN/no-sink-dxgetpointer.ll
index 2d5a07562f4a4..eeaead53f893a 100644
--- a/llvm/test/Transforms/GVN/no-sink-dxgetpointer.ll
+++ b/llvm/test/Transforms/GVN/no-sink-dxgetpointer.ll
@@ -1,8 +1,9 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
; RUN: opt -passes=gvn -passes=instnamer -S %s | FileCheck %s
-; This test ensures that given dx.resource.getpointer is marked convergent, the
-; GVN pass is prevented from sinking these intrinsics.
+; This test ensures that given dx.resource.getpointer reads inaccessible memory,
+; the GVN pass is prevented from sinking these intrinsics out of branches which
+; would create phi nodes on the returned ptr.
;
; NOTE: The following ir represents case F and G from:
; https://godbolt.org/z/cK4xh1P49.
diff --git a/llvm/test/Transforms/SimplifyCFG/DirectX/no-sink-dxgetpointer.ll b/llvm/test/Transforms/SimplifyCFG/DirectX/no-sink-dxgetpointer.ll
index 038b25d765d6f..576acce2b8dd2 100644
--- a/llvm/test/Transforms/SimplifyCFG/DirectX/no-sink-dxgetpointer.ll
+++ b/llvm/test/Transforms/SimplifyCFG/DirectX/no-sink-dxgetpointer.ll
@@ -1,9 +1,9 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
; RUN: opt -passes=simplifycfg -passes=instnamer -S %s | FileCheck %s
-; This test ensures that given dx.resource.getpointer is marked convergent, the
-; SimplifyCFG pass will be prevented from moving these intrinsics into the
-; branches required for sinking handle retrieve before resource access.
+; This test ensures that given dx.resource.getpointer reads inaccessible memory,
+; the SimplifyCFG pass will be prevented from sinking these intrinsics out of
+; branches which would create phi nodes on the returned ptr.
;
; NOTE: The following test ir is generated from:
; https://godbolt.org/z/1EdGTbscE.
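As a rough guide to what the one-line TableGen change means at the IR level, the sketch below shows the function attributes one would expect on the intrinsic declaration before and after. This is an assumption based on how `DefaultAttrsIntrinsic`, `IntrNoMem`, `IntrReadMem` and `IntrInaccessibleMemOnly` usually map to IR attributes, not output copied from the patch; the attribute group numbers are arbitrary and the old form is shown only as a comment.

```llvm
; Before (assumed): convergent, and no memory access at all.
;   attributes #0 = { convergent nocallback nofree nosync nounwind willreturn memory(none) }

; After (assumed): not convergent; may only read memory that is inaccessible to the module.
declare ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_i32_1_0t.i32(target("dx.RawBuffer", i32, 1, 0), i32) #1

attributes #1 = { nocallback nofree nosync nounwind willreturn memory(inaccessiblemem: read) }
```

Printing the declaration with `opt -S` on one of the tests in this PR would confirm the exact attribute set for a given LLVM revision.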
Co-authored-by: Justin Bogner <mail@justinbogner.com>
…ly` and `IntrReadMem` (llvm#193593)

`IntrConvergent` was originally added to `dx.resource.getpointer` to prevent optimization passes (`SimplifyCFG`, `GVN`) from sinking the intrinsic out of control flow branches, which would create phi nodes on the returned pointer.

Using `IntrInaccessibleMemOnly` and `IntrReadMem` semantics still prevents passes from merging or sinking identical calls across branches. However, it allows the call to be moved within a single control flow path.

Updates relevant tests and adds a new test to demonstrate a now-legal potential optimization.

This was discovered when llvm#188792 caused the following failure: https://github.com/llvm/llvm-project/actions/runs/24577221310/job/71865579618. When convergence control tokens are emitted, each resource access becomes a user of those tokens, which makes its use unnecessarily restrictive for optimizations and, in this case, would prevent a loop unroll from taking place.

Assisted by: Claude Opus 4.6
…g DirectX" (#194452)

The initial landing surfaced 3 somewhat orthogonal issues related to loop unrolling. These are addressed [here](#193592), [here](#193593) and [here](#193590).

These issues caused these [tests](https://github.com/llvm/llvm-project/actions/runs/24577221310/job/71865579618#step:8:87913) to fail in the offload test suite. We can verify that they now pass as expected (fixing any one of the 3 issues would resolve the failure and allow us to reland).

Some additional tests were added since the revert; they are now accounted for and updated in the reland fixes commit.

This relands #188792
…en targeting DirectX" (#194452)

The initial landing surfaced 3 somewhat orthogonal issues related to loop unrolling. These are addressed [here](llvm/llvm-project#193592), [here](llvm/llvm-project#193593) and [here](llvm/llvm-project#193590).

These issues caused these [tests](https://github.com/llvm/llvm-project/actions/runs/24577221310/job/71865579618#step:8:87913) to fail in the offload test suite. We can verify that they now pass as expected (fixing any one of the 3 issues would resolve the failure and allow us to reland).

Some additional tests were added since the revert; they are now accounted for and updated in the reland fixes commit.

This relands llvm/llvm-project#188792
`IntrConvergent` was originally added to `dx.resource.getpointer` to prevent optimization passes (`SimplifyCFG`, `GVN`) from sinking the intrinsic out of control flow branches, which would create phi nodes on the returned pointer.

Using `IntrInaccessibleMemOnly` and `IntrReadMem` semantics still prevents passes from merging or sinking identical calls across branches. However, it allows the call to be moved within a single control flow path.

Updates relevant tests and adds a new test to demonstrate a now-legal potential optimization.

This was discovered when #188792 caused the following failure: https://github.com/llvm/llvm-project/actions/runs/24577221310/job/71865579618. When convergence control tokens are emitted, each resource access becomes a user of those tokens, which makes its use unnecessarily restrictive for optimizations and, in this case, would prevent a loop unroll from taking place.
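To illustrate what "a user of the convergence control tokens" can look like, here is a minimal sketch of the situation before this change, when the intrinsic was still convergent. The function name `@uses_token` is hypothetical, and the IR is a hand-written assumption rather than output from #188792; it uses the `llvm.experimental.convergence.entry` intrinsic and the `convergencectrl` operand bundle from LLVM's convergence control scheme.

```llvm
; Sketch of the pre-change form only: the bundle below is meaningful while
; dx.resource.getpointer is treated as a convergent operation.
define void @uses_token() convergent {
entry:
  %tok = call token @llvm.experimental.convergence.entry()
  %buf = call target("dx.RawBuffer", i32, 1, 0) @llvm.dx.resource.handlefrombinding.tdx.RawBuffer_i32_1_0t(i32 0, i32 0, i32 1, i32 0, ptr null)
  ; The resource access is tied to the token, so it becomes one more user
  ; that transformations have to account for.
  %ptr = call ptr @llvm.dx.resource.getpointer.p0.tdx.RawBuffer_i32_1_0t.i32(target("dx.RawBuffer", i32, 1, 0) %buf, i32 0) [ "convergencectrl"(token %tok) ]
  store i32 42, ptr %ptr, align 4
  ret void
}
```

With the intrinsic no longer marked convergent, the access would presumably not carry such a bundle, which matches the description's point that the token dependency was more restrictive than needed.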
Assisted by: Claude Opus 4.6