inlining: relax finalizer inlining control-flow restriction#46651
inlining: relax finalizer inlining control-flow restriction#46651
Conversation
test/compiler/inline.jl
Outdated
| @test count(isnew, src.code) == 0 | ||
| # FIXME | ||
| @test_broken count(isnew, src.code) == 0 && count(iscall(DoAllocNoEscapeSparam), src.code) == 0 |
There was a problem hiding this comment.
This is not because of this PR, but I found that this test case has been broken.
On master:
julia> src = code_typed1(Tuple{Any}) do x
for i = 1:1000
DoAllocNoEscapeSparam(x)
end
end
CodeInfo(
1 ─ goto #7 if not true
2 ┄ %2 = φ (#1 => 1, #6 => %9)::Int64
│ Main.DoAllocNoEscapeSparam(x)::Any
│ %4 = (%2 === 1000)::Bool
└── goto #4 if not %4
3 ─ goto #5
4 ─ %7 = Base.add_int(%2, 1)::Int64
└── goto #5
5 ┄ %9 = φ (#4 => %7)::Int64
│ %10 = φ (#3 => true, #4 => false)::Bool
│ %11 = Base.not_int(%10)::Bool
└── goto #7 if not %11
6 ─ goto #2
7 ┄ return nothing
)There was a problem hiding this comment.
Sure thing, I will take a look today
b112600 to
28f12c5
Compare
base/compiler/ssair/passes.jl
Outdated
| all(check_defuse, defuse.defs) || return nothing | ||
|
|
||
| # For now: Require all statements in the basic block range to be nothrow. | ||
| all(minval:maxval) do idx::Int |
There was a problem hiding this comment.
This should probably only look at the statements in blocks on the domination path, but we can do that in the followup. I'd also like a error_is_fatal optimizer setting that assumes that any errors are fatal and thus finalizers don't need to run after errors, implying that we can skip this legality check.
|
With this restriction lifted, it's no longer sufficient to just inline the finalizer after the numerically last SSA value that uses it. Something needs to ensure the finalizer happens after the last use on any path to the exit: |
Keno
left a comment
There was a problem hiding this comment.
I think this is the right direction overall, but see the correctness issue mentioned in the main issue body.
bb2e304 to
252b159
Compare
c742a32 to
7dcc2b5
Compare
|
Just FYI, multiple windows buildbots hardlocked on this PR, and I had to manually reboot them. I'm not sure if that's a coincidence or not, but you can see the builds here:
So perhaps something on this branch causes things to jam up in a bad way? Of course, it is a failing of the current windows buildbots that they do not more forcefully kill the windows processes and get stuck like this, but that's something we are working to fix in the Buildkite era. |
7dcc2b5 to
b01d02b
Compare
6d143ca to
18bd85e
Compare
|
I've implemented the post-dominator analysis and updated this. I think this should be reasonably complete now. @aviatesk take a look, please. |
|
Looks great, thanks sooo much for implementing the post-domination analysis, that is very exciting itself. const FINALIZATION_COUNT = Ref(0)
init_finalization_count!() = FINALIZATION_COUNT[] = 0
get_finalization_count() = FINALIZATION_COUNT[]
@noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x
@noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...))
@test Core.Compiler.is_finalizer_inlineable(Base.infer_effects(add_finalization_count!, (Int,)))
mutable struct DoAllocWithFieldInter
x::Int
end
function register_finalizer!(obj::DoAllocWithFieldInter)
finalizer(obj) do this
add_finalization_count!(this.x)
end
end
function cfg_finalization6(io)
for i = -999:1000
o = DoAllocWithFieldInter(0)
register_finalizer!(o)
if i == 1000
o.x = i # with `setfield!`
elseif i > 0
safeprint(io, o.x, '\n')
end
# <= shouldn't the finalizer be inlined here?
end
end
let src = code_typed1(cfg_finalization6, (IO,))
@test count(isinvoke(:add_finalization_count!), src.code) == 1
end
let
init_finalization_count!()
cfg_finalization6(IOBuffer())
@test get_finalization_count() == 1000 # this fails
endCurrently the finalizer is inlined after the allocation site, so can't observe the field value changed by julia> src = code_typed1(cfg_finalization6, (IO,))
CodeInfo(
1 ── goto #11 if not true
2 ┄─ %2 = φ (#1 => -999, #10 => %20)::Int64
│ %3 = φ (#1 => -999, #10 => %21)::Int64
│ %4 = %new(Main.DoAllocWithFieldInter, 0)::DoAllocWithFieldInter
│ %5 = Base.getfield(%4, :x)::Int64
│ invoke Main.add_finalization_count!(%5::Int64)::Int64
│ %7 = (%2 === 1000)::Bool
└─── goto #4 if not %7
3 ── Base.setfield!(%4, :x, %2)::Int64
└─── goto #6
4 ── %11 = Base.slt_int(0, %2)::Bool
└─── goto #6 if not %11
5 ── %13 = Base.getfield(%4, :x)::Int64
└─── invoke Main.safeprint(io::IO, %13::Any, '\n'::Vararg{Any})::Any
6 ┄─ %15 = (%3 === 1000)::Bool
└─── goto #8 if not %15
7 ── goto #9
8 ── %18 = Base.add_int(%3, 1)::Int64
└─── goto #9
9 ┄─ %20 = φ (#8 => %18)::Int64
│ %21 = φ (#8 => %18)::Int64
│ %22 = φ (#7 => true, #8 => false)::Bool
│ %23 = Base.not_int(%22)::Bool
└─── goto #11 if not %23
10 ─ goto #2
11 ┄ return nothing
) |
aviatesk
left a comment
There was a problem hiding this comment.
Just minor style nits.
Should be fixed by the following, if you want to apply that and add the test Plus an appropriate adjustment to the other use of reachable_blocks: |
bd8e4a1 to
56c8a9c
Compare
base/compiler/ssair/passes.jl
Outdated
| function check_range_nothrow(range) | ||
| return all(finalizer_idx:maxval) do sidx::Int |
There was a problem hiding this comment.
This function seems to do nothing actually (maxval is actually set to 0). I will give a try to enable it.
There was a problem hiding this comment.
Good catch, thanks. It should obviously check the range
Eager `finalizer` inlining (#45272) currently has a restriction that requires all the def/uses to be in a same basic block. This commit relaxes that restriction a bit by allowing def/uses to involve control flow when all of them are dominated by a `finalizer` call to be inlined, since in that case it is safe to insert the body of `finalizer` at the end of all the def/uses, e.g. ```julia const FINALIZATION_COUNT = Ref(0) init_finalization_count!() = FINALIZATION_COUNT[] = 0 get_finalization_count() = FINALIZATION_COUNT[] @noinline add_finalization_count!(x) = FINALIZATION_COUNT[] += x @noinline Base.@assume_effects :nothrow safeprint(io::IO, x...) = (@nospecialize; print(io, x...)) mutable struct DoAllocWithFieldInter x::Int end function register_finalizer!(obj::DoAllocWithFieldInter) finalizer(obj) do this add_finalization_count!(this.x) end end function cfg_finalization3(io) for i = -999:1000 o = DoAllocWithFieldInter(i) register_finalizer!(o) if i == 1000 safeprint(io, o.x, '\n') elseif i > 0 safeprint(io, o.x) end end end let src = code_typed1(cfg_finalization3, (IO,)) @test count(isinvoke(:add_finalization_count!), src.code) == 1 end let init_finalization_count!() cfg_finalization3(IOBuffer()) @test get_finalization_count() == 1000 end ``` To support this transformation, the domtree code also gains the ability to represent post-dominator trees, which is generally useful.
56c8a9c to
64f7e99
Compare
ianatol
left a comment
There was a problem hiding this comment.
I don't fully understand all of the domination logic here, but in general it seems to make sense. LGTM!
| length(D::DFSTree) = length(D.from_pre) | ||
|
|
||
| function DFS!(D::DFSTree, blocks::Vector{BasicBlock}) | ||
| function DFS!(D::DFSTree, blocks::Vector{BasicBlock}, is_post_dominator::Bool) |
There was a problem hiding this comment.
| function DFS!(D::DFSTree, blocks::Vector{BasicBlock}, is_post_dominator::Bool) | |
| function DFS!(D::DFSTree, blocks::Vector{BasicBlock}, is_postdominator::Bool) |
Per #46651 (comment)
There was a problem hiding this comment.
Fair callout, but I don't like this variable name that much anyway, so rather than going through a new CI cycle for this, I'll probably just rename this on the next refactor. DFS is really independent of domination, so this should probably just be is_reversed.
Found while testing bigger examples.
Found while testing bigger examples.
Found while testing bigger examples.
Found while testing bigger examples. Fixes the logic introduced in #46651
Eager
finalizerinlining (#45272) currently has a restriction thatrequires all the def/uses to be in a same basic block.
This commit relaxes that restriction a bit by allowing def/uses to
involve control flow when all of them are dominated by a
finalizercall to be inlined, since in that case it is safe to insert the body of
finalizerat the end of all the def/uses, e.g.There is a room for further improvement though -- if we have a way to
compute the post-domination, we can also inline a
finalizercallin a case when a
finalizerblock post-dominateds all the def/uses e.g.