[DSE] Consider the aliasing through global variable while checking clobber#120044
[DSE] Consider the aliasing through global variable while checking clobber#120044
Conversation
|
@llvm/pr-subscribers-llvm-transforms Author: Haopeng Liu (haopliu) ChangesWhile update the read clobber check for the "initializes" attr, we checked the aliasing among arguments, but didn't consider the aliasing through global variable. It causes problems in this example: We mistakenly removed Fix the issue by requiring the CallBase only access argmem or inaccessiblemem. Full diff: https://github.com/llvm/llvm-project/pull/120044.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index 4799640089fa9a..19f6bba583941b 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -1665,11 +1665,16 @@ struct DSEState {
// original MD. Stop walk.
// If KillingDef is a CallInst with "initializes" attribute, the reads in
// the callee would be dominated by initializations, so it should be safe.
+ // Note that in `getInitializesArgMemLoc`, we only check the aliasing
+ // among arguments. If aliasing through global variables, we consider it
+ // as read clobber.
bool IsKillingDefFromInitAttr = false;
if (IsInitializesAttrMemLoc) {
- if (KillingI == UseInst &&
- KillingUndObj == getUnderlyingObject(MaybeDeadLoc.Ptr))
- IsKillingDefFromInitAttr = true;
+ if (auto *CB = dyn_cast<CallBase>(UseInst))
+ IsKillingDefFromInitAttr =
+ KillingI == UseInst &&
+ KillingUndObj == getUnderlyingObject(MaybeDeadLoc.Ptr) &&
+ CB->onlyAccessesInaccessibleMemOrArgMem();
}
if (isReadClobber(MaybeDeadLoc, UseInst) && !IsKillingDefFromInitAttr) {
diff --git a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
index d93da9b6612b05..9c040d4d5bb380 100644
--- a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
+++ b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
@@ -2,12 +2,22 @@
; RUN: opt < %s -passes=dse -enable-dse-initializes-attr-improvement -S | FileCheck %s
declare void @p1_write_only(ptr nocapture noundef writeonly initializes((0, 2)) dead_on_unwind)
+
declare void @p1_write_then_read(ptr nocapture noundef initializes((0, 2)) dead_on_unwind)
+ memory(argmem: readwrite, inaccessiblemem: readwrite)
+
declare void @p1_clobber(ptr nocapture noundef)
+
declare void @p2_same_range(ptr nocapture noundef initializes((0, 2)) dead_on_unwind, ptr nocapture noundef initializes((0, 2)) dead_on_unwind)
+ memory(argmem: readwrite, inaccessiblemem: readwrite)
+
declare void @p2_no_init(ptr nocapture noundef initializes((0, 2)) dead_on_unwind, ptr nocapture noundef dead_on_unwind)
+
declare void @p2_no_dead_on_unwind(ptr nocapture noundef initializes((0, 2)) dead_on_unwind, ptr nocapture noundef initializes((0, 2)))
+ memory(argmem: readwrite, inaccessiblemem: readwrite)
+
declare void @p2_no_dead_on_unwind_but_nounwind(ptr nocapture noundef initializes((0, 2)) dead_on_unwind, ptr nocapture noundef initializes((0, 2))) nounwind
+ memory(argmem: readwrite, inaccessiblemem: readwrite)
; Function Attrs: mustprogress nounwind uwtable
define i16 @p1_write_only_caller() {
@@ -215,8 +225,12 @@ define i16 @p2_no_dead_on_unwind_but_nounwind_alias_caller() {
}
declare void @llvm.memset.p0.i64(ptr nocapture, i8, i64, i1) nounwind
+
declare void @large_p1(ptr nocapture noundef initializes((0, 200))) nounwind
+ memory(argmem: readwrite, inaccessiblemem: readwrite)
+
declare void @large_p2(ptr nocapture noundef initializes((0, 200)), ptr nocapture noundef initializes((0, 100))) nounwind
+ memory(argmem: readwrite, inaccessiblemem: readwrite)
; Function Attrs: mustprogress nounwind uwtable
define i16 @large_p1_caller() {
@@ -299,3 +313,23 @@ define i16 @large_p2_may_or_partial_alias_caller2(ptr %base1, ptr %base2) {
ret i16 %l
}
+@g = global i16 123, align 2
+
+declare void @read_global(ptr nocapture noundef initializes((0, 2)))
+ memory(read, argmem: write, inaccessiblemem: none) nounwind
+
+define i16 @global_var_alias() {
+; CHECK-LABEL: @global_var_alias(
+; CHECK-NEXT: store i32 0, ptr @g, align 4
+; CHECK-NEXT: [[G_ADDR:%.*]] = getelementptr i32, ptr @g, i64 1
+; CHECK-NEXT: call void @read_global(ptr [[G_ADDR]])
+; CHECK-NEXT: [[L:%.*]] = load i16, ptr @g, align 2
+; CHECK-NEXT: ret i16 [[L]]
+;
+ store i32 0, ptr @g, align 4
+ %g_addr = getelementptr i32, ptr @g, i64 1
+ call void @read_global(ptr %g_addr)
+ %l = load i16, ptr @g
+ ret i16 %l
+}
+
|
| IsKillingDefFromInitAttr = | ||
| KillingI == UseInst && | ||
| KillingUndObj == getUnderlyingObject(MaybeDeadLoc.Ptr) && | ||
| CB->onlyAccessesInaccessibleMemOrArgMem(); |
There was a problem hiding this comment.
Would it make more sense to handle this inside getInitializesArgMemLoc?
There was a problem hiding this comment.
Moved this check to inside getInitializesArgMemLoc. Thanks!
aeubanks
left a comment
There was a problem hiding this comment.
I think it's unfortunate to require onlyAccessesInaccessibleMemOrArgMem() for all of the cases, it'll probably prevent a lot of valid DSE. can we be more precise on when we need onlyAccessesInaccessibleMemOrArgMem()? if the pointer hasn't escaped yet then we don't need this restriction
| ; CHECK-NEXT: [[L:%.*]] = load i16, ptr @g, align 2 | ||
| ; CHECK-NEXT: ret i16 [[L]] | ||
| ; | ||
| store i32 0, ptr @g, align 4 |
There was a problem hiding this comment.
isn't this storing an i32 value to an i16 global? and the gep is out of bounds?
There was a problem hiding this comment.
Ah, fixed the mistake. Thanks.
Updated the check to require either the argument is from Alloca that doesn't escape or PTAL, thanks. |
|
The check for alloca should be a check for isIdentifiedFunctionLocal and your ValidFromAlloca function should be PointerMayBeCaptured -- though it probably makes sense to query it via EarliestEscapeAnalysis, which is a bit more precise and cached. |
|
Thanks! Updated the check to use |
| // Return true if "Arg" is function local and isn't captured before "CB" or | ||
| // if "Arg" is GEP whose base pointer is function local and isn't captured | ||
| // before "CB". | ||
| bool IsFuncLocalAndNotCaptured(Value *Arg, const CallBase *CB, |
There was a problem hiding this comment.
| bool IsFuncLocalAndNotCaptured(Value *Arg, const CallBase *CB, | |
| bool isFuncLocalAndNotCaptured(Value *Arg, const CallBase *CB, |
| EarliestEscapeAnalysis &EA) { | ||
| if (isIdentifiedFunctionLocal(Arg)) | ||
| return EA.isNotCapturedBefore(Arg, CB, /*OrAt*/ true); | ||
| const auto *GEP = dyn_cast<GetElementPtrInst>(Arg); |
There was a problem hiding this comment.
Check the result of getUnderlyingObject instead, it will look through multiple GEPs.
| ; | ||
| store i16 0, ptr @g, align 4 | ||
| %g_addr = getelementptr i8, ptr @g, i64 1 | ||
| call void @read_global(ptr %g_addr) |
There was a problem hiding this comment.
it seems like this case would still go out of bounds for the i16 global?
%g_addr = 1 byte from the start
then @read_global initializes 2 bytes from there?
There was a problem hiding this comment.
Ooops, removed the GEP and directly read/write @g. Thanks.
jvoung
left a comment
There was a problem hiding this comment.
LGTM w.r.t. my comments
Nice recommendations for isIdentifiedFunctionLocal( and isNotCapturedBefore =)
| // either it's function local and isn't captured before or the "CB" only | ||
| // accesses arg or inaccessible mem. | ||
| if (!Inits.empty() && !isFuncLocalAndNotCaptured(CurArg, CB, EA) && | ||
| !CB->onlyAccessesInaccessibleMemOrArgMem()) |
There was a problem hiding this comment.
I'd check onlyAccessesInaccessibleMemOrArgMem before isFuncLocalAndNotCaptured as it's cheaper.
There was a problem hiding this comment.
Ah, thanks for the reminder! Done.
| } | ||
|
|
||
| declare void @fn_capture(ptr noundef) | ||
| ; Function Attrs: mustprogress nounwind uwtable |
There was a problem hiding this comment.
Drop inaccurate Function Attrs comment.
| ret i16 %l | ||
| } | ||
|
|
||
| declare void @fn_capture(ptr noundef) |
There was a problem hiding this comment.
Drop noundef, it's not relevant here.
(Retry) enable the initializes improvement in DSE. Initially enabled in #119116. Fix the aliasing issue through global variables in #120044. The compile-time comparison of this enabling (no meaningful diff): https://llvm-compile-time-tracker.com/compare.php?from=b46fcb9fa32f24660b1b8858d5c4cbdb76ef9d8b&to=33dc817b81f7898c87b052d1ddfd3d6e6f5b5dbd&stat=instructions%3Au
(Retry) enable the initializes improvement in DSE. Initially enabled in llvm/llvm-project#119116. Fix the aliasing issue through global variables in llvm/llvm-project#120044. The compile-time comparison of this enabling (no meaningful diff): https://llvm-compile-time-tracker.com/compare.php?from=b46fcb9fa32f24660b1b8858d5c4cbdb76ef9d8b&to=33dc817b81f7898c87b052d1ddfd3d6e6f5b5dbd&stat=instructions%3Au
While update the read clobber check for the "initializes" attr, we checked the aliasing among arguments, but didn't consider the aliasing through global variable. It causes problems in this example:
We mistakenly removed
g_var = 0;as a dead store.Fix the issue by requiring the CallBase only access argmem or inaccessiblemem.