Skip to content

BranchFolder not debug invariant #43309

@bjope

Description

@bjope
Bugzilla Link 43964
Resolution FIXED
Resolved on Nov 21, 2019 10:26
Version trunk
OS Windows NT
Blocks #37076
CC @adrian-prantl,@jmorse,@mikaelholmen,@pogo59,@vedantk
Fixed by commit(s) rG898de302919b

Extended Description

Unfortunately I see a lot more test cases failing due to codegen not being debug invariant after the fix of #41483 (see https://reviews.llvm.org/D66467).

Main reason is that the loops after SkipTopCFIAndReturn: now undo the work done just before the SkipTopCFIAndReturn label. In the past we have moved the iterators backwards, past any DBG_VALUE instructions. Now we do that, but directly afterward we move the iterators forward again, past both DBG_VALUE and CFI directives. Notice that main purpose with ComputeCommonTailLength is to count number of instructions that are common in the tails, and that number is used by ProfitableToMerge to answer the question if we should attempt to do a tail merge or not.
The answer given by ProfitableToMerge depends on the result of comparing the iterators with MBB->begin(). To make those checks debug invariant we need to move the iterators backward past any debug instructions! (or we would need to modify the checks in some way)

While investigating this, and when using some more or less hand-crafted MIR-tests, it turns out that this is a rather complicated problem. And afaict BranchFolding isn't that good at being debug invariant. And when it is, it isn't that smart when it comes to dealing with any debug instructions or CFI directives that isn't really part of the tail.

One thing that I'm not 100% sure about is if CFI-directives should be allowed to impact codegen or not. It seems like CFI directives, at least sometimes, are generated even without -g. If CFI directives shouldn't impact codegen, then what should the rules be for passes like BranchFolding when merging tails with different CFI directives in the involved basic blocks. CFI directive could both follow an earlier instruction, or it can be put in the beginning of a BB.

I'm afraid that if we simply move the iterators backward until just before a "real" instruction that isn't common (or until we reach beginning of BB). Then at least the transform done by BranchFolding is likely to be debug invariant. But the cost might be that we currently drop lots of DBG/CFI instructions, which at least when it comes to CFI instructions could be bad.

Maybe BranchFolder should execute before PrologEpilogInserter (assuming CFI directives are first introduced by PEI?). Or is BranchFolder typicallly used to merge epilogues, so that is why it is executed after PEI.

Here is a test case showing at least one problem with BranchFolding. The test1 and test2 functions only differ by DBG_VALUE and CFI_INSTRUCTION instructions. So I guess we want to see the same real instructions being generated for both functions.

#################################################################################

RUN: llc -mtriple=x86_64-- -run-pass branch-folder -O3 -o - %s | FileCheck %s


name: test1
body: |
bb.0:
TEST8rr killed renamable $al, renamable $al, implicit-def $eflags
JCC_1 %bb.2, 5, implicit killed $eflags

bb.1:
MOV8mi $r12, 1, $noreg, 0, $noreg, 0
MOV8mi $r13, 1, $noreg, 0, $noreg, 0
RET 0

bb.2:
MOV8mi $r13, 1, $noreg, 0, $noreg, 0
RET 0
...

CHECK-LABEL: name: test1

CHECK: bb.0:

CHECK: TEST8rr

CHECK: JCC_1

CHECK: bb.1:

CHECK: successors: %bb.2

CHECK: MOV8mi $r12

CHECK: bb.2:

CHECK: MOV8mi $r13

CHECK: RET 0


name: test2
body: |
bb.0:
TEST8rr killed renamable $al, renamable $al, implicit-def $eflags
JCC_1 %bb.2, 5, implicit killed $eflags

bb.1:
MOV8mi $r12, 1, $noreg, 0, $noreg, 0
MOV8mi $r13, 1, $noreg, 0, $noreg, 0
RET 0

bb.2:
DBG_VALUE
DBG_VALUE
CFI_INSTRUCTION def_cfa_offset 8
MOV8mi $r13, 1, $noreg, 0, $noreg, 0
RET 0
...

CHECK-LABEL: name: test2

CHECK: bb.0:

CHECK: TEST8rr

CHECK: JCC_1

CHECK: bb.1:

CHECK: successors: %bb.2

CHECK: MOV8mi $r12

CHECK: bb.2:

CHECK: MOV8mi $r13

CHECK: RET 0

#################################################################################

The result is two different function, since we do not :

####################################################################

name: test1
bb.0:
successors: %bb.2(0x40000000), %bb.1(0x40000000)
TEST8rr killed renamable $al, renamable $al, implicit-def $eflags
JCC_1 %bb.2, 5, implicit $eflags
bb.1:
successors: %bb.2(0x80000000)
MOV8mi $r12, 1, $noreg, 0, $noreg, 0
bb.2:
MOV8mi $r13, 1, $noreg, 0, $noreg, 0
RET 0

name: test2
bb.0:
successors: %bb.2(0x40000000), %bb.1(0x40000000)
TEST8rr killed renamable $al, renamable $al, implicit-def $eflags
JCC_1 %bb.2, 5, implicit killed $eflags
bb.1:
MOV8mi $r12, 1, $noreg, 0, $noreg, 0
MOV8mi $r13, 1, $noreg, 0, $noreg, 0
RET 0
bb.2:
DBG_VALUE
DBG_VALUE
CFI_INSTRUCTION def_cfa_offset 8
MOV8mi $r13, 1, $noreg, 0, $noreg, 0
RET 0

####################################################################

Metadata

Metadata

Assignees

Labels

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions