inliner: fails to move OpLoopMerge to top of loop when inlining multi-block function into loop header #800
Closed
Description
This is like #787, but it applies to loops that are not single-block loops. It happens in general.
Example:
OpCapability Shader
OpMemoryModel Logical GLSL450
OpEntryPoint GLCompute %main "main"
OpSource OpenCL_C 120
%bool = OpTypeBool
%true = OpConstantTrue %bool
%int = OpTypeInt 32 1
%int_1 = OpConstant %int 1 ; these constants are used as markers
%int_2 = OpConstant %int 2
%int_3 = OpConstant %int 3
%int_4 = OpConstant %int 4
%int_5 = OpConstant %int 5
%void = OpTypeVoid
%voidfn = OpTypeFunction %void
%foo = OpFunction %void None %voidfn
%fooentry = OpLabel
%c1 = OpCopyObject %int %int_1
OpBranch %fooexit
%fooexit = OpLabel
%c2 = OpCopyObject %int %int_2
OpReturn
OpFunctionEnd
%main = OpFunction %void None %voidfn
%entry = OpLabel
OpBranch %loop
%loop = OpLabel
%c3 = OpCopyObject %int %int_3
%nil = OpFunctionCall %void %foo
%c4 = OpCopyObject %int %int_4
OpLoopMerge %merge %body None
OpBranchConditional %true %body %merge
%body = OpLabel
%c5 = OpCopyObject %int %int_5
OpBranchConditional %true %loop %merge
%merge = OpLabel
OpReturn
OpFunctionEnd
The salient bits are:
- The callee function foo has two blocks, so inlining it generates multiple blocks.
- The call to foo occurs in the loop header block, between its OpLabel and OpLoopMerge instructions.
- This example differs from #787 (Inliner: Generates invalid structured control flow when function call is in a single-block loop, and callee has control flow) because here the loop header branches forward into a new body block rather than just looping back to itself.
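The triggering condition in the list above can be checked mechanically. The following is a minimal sketch that scans textual SPIR-V assembly, one instruction per line, and reports calls that sit in a loop header block, i.e. between an OpLabel and the OpLoopMerge of the same block. The function name `calls_in_loop_headers` is hypothetical and this is not part of SPIRV-Tools; it also assumes that `;` only ever starts a comment.

```python
def calls_in_loop_headers(asm_lines):
    """Return callee ids of OpFunctionCall instructions found in loop header blocks."""
    found = []
    pending = []  # calls seen since the most recent OpLabel
    for raw in asm_lines:
        line = raw.split(";")[0].strip()  # drop assembly comments (assumption: no ';' in strings)
        tokens = line.split()
        if not tokens:
            continue
        # Result-producing instructions look like "%id = OpXxx ..."
        has_result = len(tokens) > 2 and tokens[1] == "="
        opcode = tokens[2] if has_result else tokens[0]
        if opcode == "OpLabel":
            pending = []  # a new block starts
        elif opcode == "OpFunctionCall":
            # "%nil = OpFunctionCall %void %foo": the callee follows the result type
            pending.append(tokens[4])
        elif opcode == "OpLoopMerge":
            # The current block is a loop header, so its calls qualify
            found.extend(pending)
            pending = []
    return found


# The body of %main from the example in this issue
main_body = [
    "%main = OpFunction %void None %voidfn",
    "%entry = OpLabel",
    "OpBranch %loop",
    "%loop = OpLabel",
    "%c3 = OpCopyObject %int %int_3",
    "%nil = OpFunctionCall %void %foo",
    "%c4 = OpCopyObject %int %int_4",
    "OpLoopMerge %merge %body None",
    "OpBranchConditional %true %body %merge",
    "%body = OpLabel",
    "%c5 = OpCopyObject %int %int_5",
    "OpBranchConditional %true %loop %merge",
    "%merge = OpLabel",
    "OpReturn",
    "OpFunctionEnd",
]

print(calls_in_loop_headers(main_body))  # ['%foo']
```

A call reported here only triggers the bug when the callee has more than one block; the sketch does not check that.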
Inlining produces:
OpCapability Shader
OpMemoryModel Logical GLSL450
OpEntryPoint GLCompute %1 "main"
OpSource OpenCL_C 120
%bool = OpTypeBool
%true = OpConstantTrue %bool
%int = OpTypeInt 32 1
%int_1 = OpConstant %int 1
%int_2 = OpConstant %int 2
%int_3 = OpConstant %int 3
%int_4 = OpConstant %int 4
%int_5 = OpConstant %int 5
%void = OpTypeVoid
%11 = OpTypeFunction %void
%12 = OpFunction %void None %11
%13 = OpLabel
%14 = OpCopyObject %int %int_1
OpBranch %15
%15 = OpLabel
%16 = OpCopyObject %int %int_2
OpReturn
OpFunctionEnd
%1 = OpFunction %void None %11
%17 = OpLabel
OpBranch %18
%18 = OpLabel
%19 = OpCopyObject %int %int_3
%25 = OpCopyObject %int %int_1
OpBranch %26
%26 = OpLabel
%27 = OpCopyObject %int %int_2
%21 = OpCopyObject %int %int_4
OpLoopMerge %22 %23 None
OpBranchConditional %true %23 %22
%23 = OpLabel
%24 = OpCopyObject %int %int_5
OpBranchConditional %true %18 %22
%22 = OpLabel
OpReturn
OpFunctionEnd
The problem is that the OpLoopMerge occurs too late: it is no longer in what should be the loop header block (%18) and instead ends up in block %26. We should move it to just after the definition of %25, immediately before OpBranch %26.
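The proposed fix-up can be sketched on the textual form of the inlined blocks. This is a minimal illustration, not the SPIRV-Tools implementation: the function name `move_loop_merge_to_header` is hypothetical, and it assumes the blocks that replaced the original loop header are given in order, each as a list of instruction strings ending in a terminator.

```python
def move_loop_merge_to_header(blocks):
    """Move OpLoopMerge from the last replacement block to the first one.

    blocks: the blocks that replaced the original loop header, in order.
    Returns new lists; the input is not modified.
    """
    fixed = [list(b) for b in blocks]
    if len(fixed) < 2:
        return fixed  # header was not split, nothing to move
    last = fixed[-1]
    merge_idx = next(
        (i for i, inst in enumerate(last) if inst.split()[:1] == ["OpLoopMerge"]),
        None,
    )
    if merge_idx is None:
        return fixed  # not a loop header
    merge = last.pop(merge_idx)
    first = fixed[0]
    # Insert just before the terminator of the first block
    first.insert(len(first) - 1, merge)
    return fixed


# Blocks %18 and %26 from the inlined output in this issue
inlined = [
    [
        "%18 = OpLabel",
        "%19 = OpCopyObject %int %int_3",
        "%25 = OpCopyObject %int %int_1",
        "OpBranch %26",
    ],
    [
        "%26 = OpLabel",
        "%27 = OpCopyObject %int %int_2",
        "%21 = OpCopyObject %int %int_4",
        "OpLoopMerge %22 %23 None",
        "OpBranchConditional %true %23 %22",
    ],
]

for block in move_loop_merge_to_header(inlined):
    print("\n".join(block))
```

With this change %18 ends in OpLoopMerge %22 %23 None followed by OpBranch %26, so it is again the loop header, and %26 keeps only the conditional branch to the continue target %23 and the merge block %22.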