optimizer: enhance SROA, handle partially-initialized allocations#42834
optimizer: enhance SROA, handle partially-initialized allocations#42834
Conversation
7ccb600 to
a9ae9f2
Compare
a9ae9f2 to
72715ee
Compare
| break | ||
| for use in du.uses | ||
| def = find_def_for_use(ir, domtree, allblocks, du, use)[1] | ||
| (def == 0 || def == idx) && @goto skip |
There was a problem hiding this comment.
If def == 0 here, doesn't that mean that this field is truly unitialized? maybe I am misunderstanding
There was a problem hiding this comment.
def is the index where the object field is defined, so it could be:
- object allocation site:
def >= 1 setfield!call:def >= 1- PhiNode (i.e. the object field can be defined in different control flow, and different possibilities are merged at PhiNode):
def == 0
So def == 0 case here corresponds to something like:
let src = code_typed((Bool,)) do cond
r = Ref{Any}()
if cond
r[] = 42
else
r[] = 32
end
return r[] # ::PhiNode(32, 42)
end |> only |> first
@test_broken !any(src.code) do @nospecialize x
Meta.isexpr(x, :new)
end
endSince traversing such PhiNode could be complicated (I think?), I just leave it as TODO.
Maybe this is worth to be left as comment ?
| # Everything accounted for. Go field by field and perform idf | ||
| for (fidx, du) in pairs(fielddefuse) | ||
| for fidx in 1:ndefuse | ||
| du = fielddefuse[fidx] |
There was a problem hiding this comment.
A bit nitpicky, but is there any difference between the original and this change?
There was a problem hiding this comment.
Ah, no difference. If fielddefuse is type-unstable and it can contain arbitrary type objects (like user-given expressions fielddefuse = x.args where x::Expr), then iterators like pairs(fielddefuse) may produce tuple object whose type is very dynamic (which may cause lots of dispatch with almost no benefit), but for this case we don't need to care that since fielddefuse::Vector{SSADefUse}.
|
lgtm --- basically move doing idf back so as to include structs with potentially uninitialized fields, then handle those structs by making sure that any field we think could be uninitialized was defined somewhere |
16b1154 to
3c6dfeb
Compare
|
Will this break the ability to use |
72715ee to
a2742c9
Compare
|
Maybe ? If the definition is available in the analysis scope where |
Yea, the core change is here: julia/base/compiler/ssair/passes.jl Lines 826 to 829 in a2742c9 Previously we give up anytime when allocation contains uninitialized field ( fidx + 1 > length(defexpr.args)), but this PR tries to check if there is a "definitive" definition there (!(def == 0 || def == idx)) and then tries scalar replace.
|
|
Can you post results of |
|
ah, |
|
On this PR: julia> @btime (1+1)
0.032 ns (0 allocations: 0 bytes)
2
julia> @btime 1+Ref(1)[]
1.235 ns (0 allocations: 0 bytes)
2The essential difference is that julia> code_typed() do
a = 1 + Ref(1)[]
b = 1 + 1
a, b
end
1-element Vector{Any}:
CodeInfo(
1 ─ %1 = Base.add_int(1, 1)::Int64
│ %2 = Core.tuple(%1, 2)::Tuple{Int64, Int64}
└── return %2
) => Tuple{Int64, Int64} |
|
@nanosoldier |
|
It seems odd those aren't the same. We should fix that. The benchmarks are supposed to interpolate that Ref object to avoid the optimizer removing it. |
|
Hm I don't see the oddness. Whether or not |
@oscardssmith the correct invocation is |
|
I don't think that is the key difference, since: But maybe something with the effect-free computation interacting with no-inline? |
This is what @Roger-luo observed in Yao. Constant propagation occurs during type-inference, but optimizations can form constant expressions, which will then be optimized by LLVM. Which is okay in the normal case, but cause us to write a const-propagation pass on SSAIR for YaoCompiler. |
I think the actual difference in performance stems from: julia> f() = 1+1; g() = 1+Ref(1)[]; f(); g();
julia> Core.Compiler.invoke_api(only(methods(f)).specializations[1].cache) # with use constant calling convention
2
julia> Core.Compiler.invoke_api(only(methods(g)).specializations[1].cache) # just a usual function call
-1So the question might be what Valentin pointed out: constant-folding at optimization time (rather, re-inference to maximize general optimization opportunities?). Or introduce mutation-analysis into inference and constant-fold more. |
I don't think re-running inference is possible (opens up a whole bag of convergence questions). Doing constant-folding at optimization time is likely beneficial and cheap, and can enable other optimizations.
That might be the way to go, but I am cautious over the cost-benefit here. I would rather enable this as SSAIR optimization so that we can do it after inlining as well. |
|
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. |
0b5f426 to
2520aa2
Compare
|
@nanosoldier |
|
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. |
During adding more test cases for our SROA pass, I found our SROA doesn't handle allocation sites with uninitialized fields at all. This commit is based on #42833 and tries to handle such "unsafe" allocations, if there are safe `setfield!` definitions. For example, this commit allows the allocation `r = Ref{Int}()` to be eliminated in the following example (adapted from <https://hackmd.io/bZz8k6SHQQuNUW-Vs7rqfw?view>): ```julia julia> code_typed() do r = Ref{Int}() r[] = 42 b = sin(r[]) return b end |> only ``` This commit comes with a plenty of basic test cases for our SROA pass also.
2520aa2 to
b228e48
Compare
…liaLang#42834) During adding more test cases for our SROA pass, I found our SROA doesn't handle allocation sites with uninitialized fields at all. This commit is based on JuliaLang#42833 and tries to handle such "unsafe" allocations, if there are safe `setfield!` definitions. For example, this commit allows the allocation `r = Ref{Int}()` to be eliminated in the following example (adapted from <https://hackmd.io/bZz8k6SHQQuNUW-Vs7rqfw?view>): ```julia julia> code_typed() do r = Ref{Int}() r[] = 42 b = sin(r[]) return b end |> only ``` This commit comes with a plenty of basic test cases for our SROA pass also.
…liaLang#42834) During adding more test cases for our SROA pass, I found our SROA doesn't handle allocation sites with uninitialized fields at all. This commit is based on JuliaLang#42833 and tries to handle such "unsafe" allocations, if there are safe `setfield!` definitions. For example, this commit allows the allocation `r = Ref{Int}()` to be eliminated in the following example (adapted from <https://hackmd.io/bZz8k6SHQQuNUW-Vs7rqfw?view>): ```julia julia> code_typed() do r = Ref{Int}() r[] = 42 b = sin(r[]) return b end |> only ``` This commit comes with a plenty of basic test cases for our SROA pass also.
During adding more test cases for our SROA pass, I found our SROA doesn't
handle allocation sites with uninitialized fields at all.
This commit is built on top of #42833 and tries to handle such "unsafe" allocations,
if there are safe
setfield!definitions.For example, this commit allows the allocation
r = Ref{Int}()to beeliminated in the following example (adapted from https://hackmd.io/bZz8k6SHQQuNUW-Vs7rqfw?view):
This commit comes with a plenty of basic test cases for our SROA pass also.
Two test cases depend on #42831 as well.