DeadBranchElim: Correctly handle loops in dead branches.#773
DeadBranchElim: Correctly handle loops in dead branches.#773greg-lunarg wants to merge 5 commits intoKhronosGroup:masterfrom
Conversation
|
Note that this option can generate invalid code until this fix is applied. |
|
This is still not right. It does not mark orphaned merge blocks as live if their parent is live. Need to add code for that. |
|
This code now fixes the original problem without known regressions. It also contains two new tests to make sure we remove loops in dead branches and this pass can handle orphan'd selection merge blocks. |
dneto0
left a comment
There was a problem hiding this comment.
This is an improvement, but I think still has a hole.
Your new tests are good.
| uint32_t dLabId = (*dbi)->id(); | ||
| while (dLabId != mergeLabId) { | ||
| if (!HasNonPhiRef(dLabId)) { | ||
| if (liveLabIds.find(dLabId) == liveLabIds.end()) { |
There was a problem hiding this comment.
I think this is not quite right, for subtle reasons.
The problem case arises when the structured order could visit a block too early. Here's a case, roughly corresponding to:
if (A) { B } else { C} D ; E
CFG edges are:
A -> B
A -> C
B -> D
C -> D
D -> E
But by some fluke of some previous set of optimizations we've done, A's declared merge block happens to be E rather than normal D.
Then a valid structured ordering is: A D B C E.
And with that ordering we visit D "too early", i.e. before its live predecessors, and erroneously delete D.
I think the better algorithm is to compute the live set of blocks before entering the loop over the structured order.
Basically, I'm suggesting a loop based on a worklist, and then following successors and merge blocks and continue targets, right up at line 223 or so.
There was a problem hiding this comment.
You might be be conflating StructuredOrder and StructuredSuccessors, used to compute StructuredOrder.
StructuredOrder uses reverse DFS, but just using CFG successors doesn't quite work because merge blocks might be orphaned. So we created StructuredSuccessors which adds the merge block of a header as the first successor of the header so DFS works "correctly".
It is possible (and now seems very likely) that my definition of StructuredOrder is under-specified, but I am fairly certain in this case that D will not precede B or C in StructuredOrder by the rules of (reversed) DFS and the actual code we are using. If you think that D can precede B or C by our current code, please let me know.
I think I can also add to my definition of StructuredOrder that blocks will always succeed blocks that they post-dominate. Perhaps I should add that?
There was a problem hiding this comment.
I agree that the merge target and continue targets are needed in the orderings.
I agree the sticking point is that structured order is slightly underspecified. Structured order uses revers post-order on the appropriate CFG graph augmented with the merge-block and continue-target arcs. By my calculation the order in my example would be: A C D E B. So indeed D wouldn't get orphaned like I describe. Waving my hands, I could see that reverse post-order always places at a block after at least one of its predecessors, and so I can believe that I can't make another pathological case to break the algorithm.
Maybe I'm arguing over the head of a pin then. I approve this code change, but let's agree that we ought to update the structured order definition to say it really is reverse post-order of that augmented graph.
I'm just concerned that this bug is the second case where subtle orderings of things exposed holes in the optimizer and I was seeking an easy fix to make it more robust. I'm trying to get to a place where we write the more robust versions of algorithms when they have the same conceptual complexity.
|
Rebased, squashed, and pushed into master as 7c3de19 |
|
Sorry not to address your algorithm idea. Here is one problem I saw: what if one of the live blocks branches out of the structured construct of interest eg. a break out of selection construct to the merge block of the containing loop. We still would require some way to bound this analysis, so we would be back to relying on something like StructuredOrder to enumerate the blocks of interest. I do agree that the current algo might be slightly more fragile due to its reliance on order, but it may be a little more efficient and use less code. So maybe we are even? :) |
This does not appear to be standard practice. The results of these checks would only change if the compiler itself changes, so checking it on everyrun is unnecessary. Fixes KhronosGroup#773
The backedge to a dead loop header was causing the algorithm to think the header was live. This new solution is identical to the previous solution except that it also ignores refs from blocks that have not been marked live, such as blocks that contain backedges.