Conversation
base/compiler/ssair/inlining.jl
Outdated

```diff
  # This assumes the caller has verified that all arguments to the _apply call are Tuples.
- function rewrite_apply_exprargs!(ir::IRCode, idx::Int, argexprs::Vector{Any}, atypes::Vector{Any}, arg_start::Int)
+ function rewrite_apply_exprargs!(ir::IRCode, todo, idx::Int, argexprs::Vector{Any}, atypes::Vector{Any}, arginfos::Vector{Any}, arg_start::Int, sv)
```
base/compiler/ssair/inlining.jl
Outdated

```diff
  else
      state = Core.svec()
      for i = 1:length(thisarginfo.each)
          mthd = thisarginfo.each[i]
```

Suggested change:

```diff
- mthd = thisarginfo.each[i]
+ meth = thisarginfo.each[i]
```
Don't think we've ever used this mangling before
base/compiler/ssair/inlining.jl
Outdated

```diff
  end

- function inline_apply!(ir::IRCode, idx::Int, sig::Signature, params::OptimizationParams)
+ function inline_apply!(ir::IRCode, todo, idx::Int, sig::Signature, params::OptimizationParams, sv)
```
base/compiler/ssair/inlining.jl
Outdated

```diff
  if i != length(thisarginfo.each)
      valT = getfield_tfunc(T, Const(1))
      val_extracted = insert_node!(ir, idx, valT,
          Expr(:call, Core.getfield, state1, 1))
```

Suggested change:

```diff
- Expr(:call, Core.getfield, state1, 1))
+ Expr(:call, GlobalRef(Core, :getfield), state1, 1))
```
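The suggested change reflects a general rule in lowered IR: functions are better referenced through a `GlobalRef` binding (module plus name) than by splicing in the function object itself. A minimal sketch of what that looks like; the `:state1` symbol here is just a stand-in for the SSA value used in the surrounding code:

```julia
# Reference Core.getfield by (module, name) rather than embedding the
# function object directly, as the suggestion above does.
ref = GlobalRef(Core, :getfield)

# A GlobalRef still resolves to the same function object.
@assert getfield(ref.mod, ref.name) === Core.getfield

# Build the call expression as it would appear in lowered code;
# :state1 stands in for the SSA value in the real code.
ex = Expr(:call, ref, :state1, 1)
@assert ex.head === :call && ex.args[1] === ref
```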
```diff
  # this method does not access the method table or otherwise process generic
  # functions.
- function process_simple!(ir::IRCode, idx::Int, params::OptimizationParams, world::UInt)
+ function process_simple!(ir::IRCode, todo, idx::Int, params::OptimizationParams, world::UInt, sv)
```
base/compiler/ssair/inlining.jl
Outdated

```diff
      MethodMatchInfo(meth, ambig)
  end

+ function analyze_single_call!(ir, todo, idx, stmt, sig, calltype, infos, sv)
```

```diff
  meth = info.applicable
  if meth === false || info.ambig
      # Too many applicable methods
      # Or there is a (partial?) ambiguity
```
What happened to this? (the comment specifically, the rest I know just moved)
Looks like it got lost in rebase. Will fix.
base/compiler/ssair/inlining.jl
Outdated

```diff
  end
  for match in meth::Vector{Any}
      (metharg, methsp, method) = (match[1]::Type, match[2]::SimpleVector, match[3]::Method)
      # TODO: This could be better
```
```diff
  r = process_simple!(ir, todo, idx, sv.params, sv.world, sv)
  r === nothing && continue

  stmt = ir.stmts[idx][:inst]
```
In theory, the purpose of ir.stmts[idx] existing as a type now is so that we don't need to pass idx, stmt, type, and so on as separate arguments in new code
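The pattern the reviewer describes can be sketched as follows; the names here are hypothetical stand-ins for the compiler's actual types, not the real `Core.Compiler` API:

```julia
# Hypothetical sketch of the "instruction view" pattern: a lightweight
# wrapper over parallel statement arrays, so a single object can be
# passed around instead of idx, stmt, type, ... as separate arguments.
struct InstructionStream
    inst::Vector{Any}   # the statements themselves
    type::Vector{Any}   # their inferred types
end

struct Instruction
    data::InstructionStream
    idx::Int
end

# Indexing the stream yields a view; indexing the view by field name
# reads the corresponding column at that position.
Base.getindex(s::InstructionStream, idx::Int) = Instruction(s, idx)
Base.getindex(i::Instruction, f::Symbol) = getfield(i.data, f)[i.idx]

stmts = InstructionStream(Any[:(x + 1), :(x * 2)], Any[Int, Int])
@assert stmts[1][:inst] == :(x + 1)
@assert stmts[2][:type] === Int
```

With this shape, `stmts[idx]` carries everything a callee needs, which is the point the reviewer is making about `ir.stmts[idx]` above.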
```diff
  end
  length(cases) == 0 && return
  push!(todo, UnionSplit(idx, fully_covered, sig.atype, cases))
  end
```

Suggested change:

```diff
  end
+ nothing
  end
```
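The suggested trailing `nothing` follows a common Julia convention: a function called only for its side effects should return `nothing` explicitly, so callers cannot come to depend on whatever its last expression happens to evaluate to. A small sketch, with a hypothetical `record!` helper:

```julia
# Functions mutating their arguments are conventionally named with a
# trailing `!` and return `nothing` explicitly; without the trailing
# `nothing`, this function would leak the return value of push!
# (the vector itself) to every caller.
function record!(todo::Vector{Any}, item)
    push!(todo, item)
    nothing
end

todo = Any[]
@assert record!(todo, :case) === nothing
@assert todo == [:case]
```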
```diff
      matches::Vector{MethodMatchInfo}
  end

+ struct AbstractIterationInfo
```
Can you add docs describing when each of these is legal to appear as the stmtinfo and what it means elsewhere to discover them on a statement?
This supersedes #36169. Rather than re-implementing the iteration analysis as done there, this uses the new stmtinfo infrastructure to propagate all the analysis done during inference all the way to inlining. As a result, it applies not only to splats of singletons, but also to splats of any other short iterable that inference can analyze. E.g.:

```julia
f(x) = (x...,)
@code_typed f(1=>2)
@benchmark f(1=>2)
```

Before:

```
julia> @code_typed f(1=>2)
CodeInfo(
1 ─ %1 = Core._apply_iterate(Base.iterate, Core.tuple, x)::Tuple{Int64,Int64}
└── return %1
) => Tuple{Int64,Int64}

julia> @benchmark f(1=>2)
BenchmarkTools.Trial:
  memory estimate:  96 bytes
  allocs estimate:  3
  --------------
  minimum time:     242.659 ns (0.00% GC)
  median time:      246.904 ns (0.00% GC)
  mean time:        255.390 ns (1.08% GC)
  maximum time:     4.415 μs (93.94% GC)
  --------------
  samples:          10000
  evals/sample:     405
```

After:

```
julia> @code_typed f(1=>2)
CodeInfo(
1 ─ %1 = Base.getfield(x, 1)::Int64
│   %2 = Base.getfield(x, 2)::Int64
│   %3 = Core.tuple(%1, %2)::Tuple{Int64,Int64}
└── return %3
) => Tuple{Int64,Int64}

julia> @benchmark f(1=>2)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.701 ns (0.00% GC)
  median time:      1.925 ns (0.00% GC)
  mean time:        1.904 ns (0.00% GC)
  maximum time:     6.941 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000
```

I also implemented the TODO I had left in #36169 to inline the iterate calls themselves, which gives another 3x improvement over the solution in that PR:

```
julia> @code_typed f(1)
CodeInfo(
1 ─ %1 = Core.tuple(x)::Tuple{Int64}
└── return %1
) => Tuple{Int64}

julia> @benchmark f(1)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.696 ns (0.00% GC)
  median time:      1.699 ns (0.00% GC)
  mean time:        1.702 ns (0.00% GC)
  maximum time:     5.389 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000
```

Fixes #36087
Fixes #29114
Note to self that we can revert #29060 when this merges.
Well, I didn't get time to look at this in any detail, but absent any meaningful technical contribution I thought I'd leave a note of appreciation instead: thanks Keno for fixing this! Although we toyed with other ways to fix this in #29114, improving the optimizer does seem like the right thing.
PR JuliaLang#36684 changed `iterate(IncrementalCompact)` to return an extra index, but left its arguments unchanged. However, the PR decremented the index argument in a particular recursive call to `iterate`. This caused `iterate` not to recognise that it was done when `allow_cfg_transforms` was turned on.
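The failure mode described above can be illustrated with a toy iterate-style function; this is a hypothetical sketch, not the actual `IncrementalCompact` code:

```julia
# Toy iterate-style protocol: returns (element, next_index), or nothing
# once the collection is exhausted. Entries equal to :skip are filtered
# out by recursing.
function toy_iterate(xs, idx)
    idx > length(xs) && return nothing   # the "done" check
    x = xs[idx]
    if x === :skip
        # Recursing with `idx - 1` here (the bug class described above)
        # would step backwards past elements already produced and never
        # reach the done check; advancing with `idx + 1` terminates.
        return toy_iterate(xs, idx + 1)
    end
    return (x, idx + 1)
end

@assert toy_iterate(Any[1, :skip, 3], 2) == (3, 4)
@assert toy_iterate(Any[1, 2], 3) === nothing
```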