-
Notifications
You must be signed in to change notification settings - Fork 664
Inliner: Generates invalid structured control flow when function call is in a single-block loop, and callee has control flow #787
Description
Here's a somewhat reduced example:
Original OpenCL C source:
int bar(global int *A, int n) {
if (n < 19) {
A[n] += 1;
}
return A[n];
}
kernel void foo(global int* A, int n) {
for (int i = 0; i < n; i++) {
A[i] = bar(A, n);
}
}
Compling with https://github.com/google/clspv we get:
OpCapability Shader
OpCapability VariablePointers
OpExtension "SPV_KHR_storage_buffer_storage_class"
OpExtension "SPV_KHR_variable_pointers"
OpMemoryModel Logical GLSL450
OpEntryPoint GLCompute %35 "foo"
OpSource OpenCL_C 120
OpDecorate %17 SpecId 0
OpDecorate %18 SpecId 1
OpDecorate %19 SpecId 2
OpDecorate %_runtimearr_uint ArrayStride 4
OpMemberDecorate %_struct_4 0 Offset 0
OpDecorate %_struct_4 Block
OpMemberDecorate %_struct_6 0 Offset 0
OpDecorate %_struct_6 Block
OpDecorate %gl_WorkGroupSize BuiltIn WorkgroupSize
OpDecorate %22 DescriptorSet 0
OpDecorate %22 Binding 0
OpDecorate %23 DescriptorSet 0
OpDecorate %23 Binding 1
OpDecorate %_ptr_StorageBuffer_uint ArrayStride 4
%uint = OpTypeInt 32 0
%_ptr_StorageBuffer_uint = OpTypePointer StorageBuffer %uint
%_runtimearr_uint = OpTypeRuntimeArray %uint
%_struct_4 = OpTypeStruct %_runtimearr_uint
%_ptr_StorageBuffer__struct_4 = OpTypePointer StorageBuffer %_struct_4
%_struct_6 = OpTypeStruct %uint
%_ptr_StorageBuffer__struct_6 = OpTypePointer StorageBuffer %_struct_6
%void = OpTypeVoid
%9 = OpTypeFunction %void
%bool = OpTypeBool
%11 = OpTypeFunction %uint %_ptr_StorageBuffer_uint %uint
%v3uint = OpTypeVector %uint 3
%_ptr_Private_v3uint = OpTypePointer Private %v3uint
%uint_0 = OpConstant %uint 0
%uint_1 = OpConstant %uint 1
%uint_19 = OpConstant %uint 19
%17 = OpSpecConstant %uint 1
%18 = OpSpecConstant %uint 1
%19 = OpSpecConstant %uint 1
%gl_WorkGroupSize = OpSpecConstantComposite %v3uint %17 %18 %19
%21 = OpVariable %_ptr_Private_v3uint Private %gl_WorkGroupSize
%22 = OpVariable %_ptr_StorageBuffer__struct_4 StorageBuffer
%23 = OpVariable %_ptr_StorageBuffer__struct_6 StorageBuffer
%24 = OpFunction %uint None %11
%25 = OpFunctionParameter %_ptr_StorageBuffer_uint
%26 = OpFunctionParameter %uint
%27 = OpLabel
%28 = OpSLessThan %bool %26 %uint_19
%29 = OpPtrAccessChain %_ptr_StorageBuffer_uint %25 %26
%30 = OpLoad %uint %29
OpSelectionMerge %33 None
OpBranchConditional %28 %31 %33
%31 = OpLabel
%32 = OpIAdd %uint %30 %uint_1
OpStore %29 %32
OpBranch %33
%33 = OpLabel
%34 = OpPhi %uint %30 %27 %32 %31
OpReturnValue %34
OpFunctionEnd
%35 = OpFunction %void None %9
%36 = OpLabel
%37 = OpAccessChain %_ptr_StorageBuffer_uint %22 %uint_0 %uint_0
%38 = OpAccessChain %_ptr_StorageBuffer_uint %23 %uint_0
%39 = OpLoad %uint %38
%40 = OpSGreaterThan %bool %39 %uint_0
OpSelectionMerge %50 None
OpBranchConditional %40 %41 %50
%41 = OpLabel
OpBranch %42
%42 = OpLabel
%43 = OpPhi %uint %46 %42 %uint_0 %41
%44 = OpFunctionCall %uint %24 %37 %39
%45 = OpAccessChain %_ptr_StorageBuffer_uint %22 %uint_0 %43
OpStore %45 %44
%46 = OpIAdd %uint %43 %uint_1
%47 = OpSLessThan %bool %46 %39
%48 = OpLogicalNot %bool %47
OpLoopMerge %49 %42 None
OpBranchConditional %48 %49 %42
%49 = OpLabel
OpBranch %50
%50 = OpLabel
OpBranch %51
%51 = OpLabel
OpReturn
OpFunctionEnd
The 'bar' function starts at block with id 27, and has a simple selection structure.
The interesting bit is the loop with a single block with id 42 in the entry point 'foo'.
After inlining, I get this result:

I've attached the SPIR-V:
b.spv.txt
The problem is that the old block 42 has been split with head 42 the callee's body has been inserted as the last part of 42 and with new block 58 and 57. Unfortunately, the OpLoopMerge originally in 42 is kept as the last instruction in block 57, when it really should have been kept as the second last instruction in 42.
Basically, the structured control flow properties are corrupted. The validator catches this, saying:
error: 102: Back-edges (57 -> 42) can only be formed between a block and a loop header.
It detects the back-edge from 57 to 42, and expects the OpLoopMerge to be in block 42. It should be there, but isn't, and so it complains that block 42 is not a loop header. It's right.
