Add AbstractInterpreter to parameterize compilation pipeline#33955
Note that I don't think this is actually the API we want to expose to the user. Rather it's something of a private API between the compiler and some other package (potentially a standard lib) that will provide more user-friendly functionality on top of it.

This is an "onboarding" project for me to get up to speed on some more compiler internals, and also hopefully build some good capabilities into the compiler in the process. :)

Looks like there's one additional call to typeinf_type that got missed. Also, the |
|
Also, there's a number of places that pull the interpreter out of the |
@Keno here's my sketch of what we discussed earlier today; it works (you can try it out with the code samples given above) but I seem to have somehow regressed a few tests in the |
base/compiler/typeinfer.jl
Outdated
| @timeit function typeinf_ext(linfo::MethodInstance, world::UInt) | ||
| # This is a bridge for the C code calling `jl_typinf_func()` | ||
| typeinf_ext(mi::MethodInstance, world::UInt) = typeinf_ext(NativeInterpreter(world), mi, world) | ||
| function typeinf_ext(interp::AbstractInterpreter, linfo::MethodInstance, world::UInt) |
same comment here, I don't think I want this to necessarily be an argument.
How should we disambiguate this typeinf_ext(::AbstractInterpreter, ::MethodInstance) from the one above?
I'm a bit confused what the difference between these is supposed to be @JeffBezanson?
Tests are passing! All that remains now is for @JeffBezanson to illuminate to us the difference between the two |
Great; in that case it makes sense to me to rename it to |
I've rebased this. Of course this doesn't do much by itself, but I think it's a nice cleanup, so I'd like to get it merged and build on top of it in future PRs.

I'm still concerned this "Abstract" type is really a concretion, and is nonsensical to "abstract" over, as we don't want this API to be stable.

Yes, I understand this API isn't fully formed, but we need to start somewhere to enable exploration and I think this goes in the right direction. The point here is to be able to hook into inference at various points that are interesting (right now primarily the caching logic). As we go along, we can figure out if there is a certain number of interesting cases (e.g. fully static, partially static, etc.) and provide them in Base, or if we need to provide something semi-stable externally, but let's start here.

On further reflection, let's wait until after the 1.5 branch, so this API has a chance to mature at least a little bit during the 1.6 cycle, since I know there are a fair number of external consumers that are reaching directly into these internal APIs (which is part of what this line of work is designed to help with by giving saner, higher-level entry points).
This allows selective overriding of the compilation pipeline through
multiple dispatch, enabling projects like `XLA.jl` to maintain separate
inference caches, inference algorithms or heuristic algorithms while
inferring and lowering code. In particular, it defines a new type,
`AbstractInterpreter`, that represents an abstract interpretation
pipeline. This `AbstractInterpreter` has a single defined concrete
subtype, `NativeInterpreter`, that represents the native Julia
compilation pipeline. The `NativeInterpreter` contains within it all
the compiler parameters previously contained within `Params`, split into
two pieces: `InferenceParams` and `OptimizationParams`, used within type
inference and optimization, respectively. The interpreter object is
then threaded throughout most of the type inference pipeline, and allows
for straightforward prototyping and replacement of the compiler
internals.
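To make the dispatch pattern described above concrete, here is a minimal, self-contained sketch. It does not use `Core.Compiler` at all: every name in it (`ToyInterpreter`, `ToyNative`, `ToyNoInline`, the parameter structs and their fields) is hypothetical and only mirrors the shape of the design, namely an abstract interpreter type, a native subtype that owns the split parameter groups, and a custom subtype that forwards most queries while overriding one.

```julia
# Toy model only: none of these names exist in Core.Compiler.
abstract type ToyInterpreter end

# Parameters split into two groups, mirroring the InferenceParams /
# OptimizationParams split described above (the fields are made up).
struct ToyInferenceParams
    max_methods::Int
end
struct ToyOptimizationParams
    inlining::Bool
end

# The "native" pipeline owns the default parameters and a world age.
struct ToyNative <: ToyInterpreter
    inf_params::ToyInferenceParams
    opt_params::ToyOptimizationParams
    world::UInt
end
ToyNative(world::UInt=UInt(1)) =
    ToyNative(ToyInferenceParams(4), ToyOptimizationParams(true), world)

# Accessors that pipeline stages call instead of reading a global params object.
inference_params(interp::ToyNative) = interp.inf_params
optimization_params(interp::ToyNative) = interp.opt_params
world_age(interp::ToyNative) = interp.world

# A pipeline stage, written once against the abstract type.
describe(interp::ToyInterpreter) = string(
    "max_methods=", inference_params(interp).max_methods,
    ", inlining=", optimization_params(interp).inlining)

# A custom interpreter forwards most queries and overrides just one.
struct ToyNoInline <: ToyInterpreter
    native::ToyNative
end
inference_params(interp::ToyNoInline) = inference_params(interp.native)
optimization_params(::ToyNoInline) = ToyOptimizationParams(false)
world_age(interp::ToyNoInline) = world_age(interp.native)
```

Calling `describe(ToyNative())` and `describe(ToyNoInline(ToyNative()))` runs the same stage with different behavior, selected purely by the type of the interpreter argument.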
As a simple example of the kind of workflow this enables, I include here
a simple testing script showing how to count the number of methods
inferred during type inference by overriding just two functions within
the compiler. First, I will define some simple methods to make working
with inference a bit easier:
```julia
using Core.Compiler
import Core.Compiler: InferenceParams, OptimizationParams, get_world_counter, get_inference_cache
"""
    @infer_function interp foo(1, 2) [show_steps=true] [show_ir=false]

Infer a function call using the given interpreter object, return
the inference object. Set keyword arguments to modify verbosity:

* Set `show_steps` to `true` to see the `InferenceResult` step by step.
* Set `show_ir` to `true` to see the final type-inferred Julia IR.
"""
macro infer_function(interp, func_call, kwarg_exs...)
    if !isa(func_call, Expr) || func_call.head != :call
        error("@infer_function requires a function call")
    end

    local func = func_call.args[1]
    local args = func_call.args[2:end]
    kwargs = []
    for ex in kwarg_exs
        if ex isa Expr && ex.head === :(=) && ex.args[1] isa Symbol
            push!(kwargs, first(ex.args) => last(ex.args))
        else
            error("Invalid @infer_function kwarg $(ex)")
        end
    end
    return quote
        infer_function($(esc(interp)), $(esc(func)), typeof.(($(args)...,)); $(esc(kwargs))...)
    end
end

function infer_function(interp, f, tt; show_steps::Bool=false, show_ir::Bool=false)
    # Find all methods that are applicable to these types
    fms = methods(f, tt)
    if length(fms) != 1
        error("Unable to find single applicable method for $f with types $tt")
    end

    # Take the first applicable method
    method = first(fms)

    # Build argument tuple
    method_args = Tuple{typeof(f), tt...}

    # Grab the appropriate method instance for these types
    mi = Core.Compiler.specialize_method(method, method_args, Core.svec())

    # Construct InferenceResult to hold the result
    result = Core.Compiler.InferenceResult(mi)
    if show_steps
        @info("Initial result, before inference: ", result)
    end

    # Create an InferenceState to begin inference, give it a world that is always newest
    world = Core.Compiler.get_world_counter()
    frame = Core.Compiler.InferenceState(result, #=cached=# true, interp)

    # Run type inference on this frame, passing the interpreter explicitly
    # alongside the InferenceState.
    Core.Compiler.typeinf_local(interp, frame)
    if show_steps
        @info("Ending result, post-inference: ", result)
    end
    if show_ir
        @info("Inferred source: ", result.result.src)
    end

    # Give the result back
    return result
end
```
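The helper above derives three things from the macro call before it touches the compiler: the callee and arguments from the call expression, a tuple of argument types, and a full method signature. The following sketch walks through those steps using only ordinary reflection (`methods`, `typeof`, expression fields). The function `bar` is a hypothetical stand-in for the function being inferred.

```julia
bar(x, y) = x + y * x   # hypothetical example function

# 1. What the macro sees: a call expression and `name=value` expressions.
call_ex = :(bar(1.0, 2.0))
callee, call_args = call_ex.args[1], call_ex.args[2:end]
kw_ex = :(show_ir = true)
kw_pair = first(kw_ex.args) => last(kw_ex.args)

# 2. Literal argument values become a tuple of argument types.
tt = typeof.((call_args...,))

# 3. The type tuple selects the applicable methods, and the full signature
#    prepends the type of the function itself.
matching = methods(bar, tt)
signature = Tuple{typeof(bar), tt...}
```

Note that step 2 works here because the arguments are literals. As written, the macro splices the argument expressions in unescaped, so it is best suited to literal arguments like the ones used in this PR's examples.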
Next, we define a simple function and pass it through:
```julia
function foo(x, y)
return x + y * x
end
native_interpreter = Core.Compiler.NativeInterpreter()
inferred = @infer_function native_interpreter foo(1.0, 2.0) show_steps=true show_ir=true
```
This gives a nice output such as the following:
```julia-repl
┌ Info: Initial result, before inference:
└ result = foo(::Float64, ::Float64) => Any
┌ Info: Ending result, post-inference:
└ result = foo(::Float64, ::Float64) => Float64
┌ Info: Inferred source:
│ result.result.src =
│ CodeInfo(
│ @ REPL[1]:3 within `foo'
│ 1 ─ %1 = (y * x)::Float64
│ │ %2 = (x + %1)::Float64
│ └── return %2
└ )
```
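As a sanity check, the inferred return type reported above can be compared with what the standard reflection functions report for the default pipeline. This sketch uses only `Base.code_typed` and `Base.return_types`. It redefines `foo` so the block is self-contained, and it does not exercise `AbstractInterpreter` in any way.

```julia
foo(x, y) = x + y * x   # same body as the example function above

# `code_typed` returns one `CodeInfo => return type` pair per matching method.
typed = Base.code_typed(foo, (Float64, Float64))
code, rettype = first(typed)

# `return_types` reports only the inferred return types.
rts = Base.return_types(foo, (Int, Int))
```

Here `rettype` should agree with the `=> Float64` shown in the post-inference log line, and `first(rts)` with the integer case shown later in this description.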
We can then define a custom `AbstractInterpreter` subtype that will
override two specific pieces of the compilation process, namely those
that manage the runtime inference cache. While it will transparently pass
all information through to a bundled `NativeInterpreter`, it has the
ability to force cache misses in order to re-infer things so that we can
easily see how many methods (and which) would be inferred to compile a
certain method:
```julia
struct CountingInterpreter <: Compiler.AbstractInterpreter
    visited_methods::Set{Core.Compiler.MethodInstance}
    methods_inferred::Ref{UInt64}

    # Keep around a native interpreter so that we can sub off to "super" functions
    native_interpreter::Core.Compiler.NativeInterpreter
end
CountingInterpreter() = CountingInterpreter(
    Set{Core.Compiler.MethodInstance}(),
    Ref(UInt64(0)),
    Core.Compiler.NativeInterpreter(),
)

InferenceParams(ci::CountingInterpreter) = InferenceParams(ci.native_interpreter)
OptimizationParams(ci::CountingInterpreter) = OptimizationParams(ci.native_interpreter)
get_world_counter(ci::CountingInterpreter) = get_world_counter(ci.native_interpreter)
get_inference_cache(ci::CountingInterpreter) = get_inference_cache(ci.native_interpreter)

function Core.Compiler.inf_for_methodinstance(interp::CountingInterpreter, mi::Core.Compiler.MethodInstance, min_world::UInt, max_world::UInt=min_world)
    # Hit our own cache; if it exists, pass on to the main runtime
    if mi in interp.visited_methods
        return Core.Compiler.inf_for_methodinstance(interp.native_interpreter, mi, min_world, max_world)
    end

    # Otherwise, we return `nothing`, forcing a cache miss
    return nothing
end

function Core.Compiler.cache_result(interp::CountingInterpreter, result::Core.Compiler.InferenceResult, min_valid::UInt, max_valid::UInt)
    push!(interp.visited_methods, result.linfo)
    interp.methods_inferred[] += 1
    return Core.Compiler.cache_result(interp.native_interpreter, result, min_valid, max_valid)
end

function reset!(interp::CountingInterpreter)
    empty!(interp.visited_methods)
    interp.methods_inferred[] = 0
    return nothing
end
```
Running it on our testing function:
```julia
counting_interpreter = CountingInterpreter()
inferred = @infer_function counting_interpreter foo(1.0, 2.0)
@info("Cumulative number of methods inferred: $(counting_interpreter.methods_inferred[])")
inferred = @infer_function counting_interpreter foo(1, 2) show_ir=true
@info("Cumulative number of methods inferred: $(counting_interpreter.methods_inferred[])")
inferred = @infer_function counting_interpreter foo(1.0, 2.0)
@info("Cumulative number of methods inferred: $(counting_interpreter.methods_inferred[])")
reset!(counting_interpreter)
@info("Cumulative number of methods inferred: $(counting_interpreter.methods_inferred[])")
inferred = @infer_function counting_interpreter foo(1.0, 2.0)
@info("Cumulative number of methods inferred: $(counting_interpreter.methods_inferred[])")
```
This also gives us a nice result:
```
[ Info: Cumulative number of methods inferred: 2
┌ Info: Inferred source:
│ result.result.src =
│ CodeInfo(
│ @ /Users/sabae/src/julia-compilerhack/AbstractInterpreterTest.jl:81 within `foo'
│ 1 ─ %1 = (y * x)::Int64
│ │ %2 = (x + %1)::Int64
│ └── return %2
└ )
[ Info: Cumulative number of methods inferred: 4
[ Info: Cumulative number of methods inferred: 4
[ Info: Cumulative number of methods inferred: 0
[ Info: Cumulative number of methods inferred: 2
```
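The mechanism behind these counts (hide cache entries until this interpreter has stored them itself, and count every store) can be isolated in a toy model that does not depend on compiler internals. Everything below is hypothetical: `MiniInterp`, `lookup`, `store!`, and `infer!` merely play the roles of the interpreter, `inf_for_methodinstance`, `cache_result`, and the inference driver, and the "work" is a placeholder computation.

```julia
# Toy model: a "compiler" whose only job is to memoize work per key.
abstract type MiniInterp end

struct MiniNative <: MiniInterp
    cache::Dict{Symbol,Int}
end
MiniNative() = MiniNative(Dict{Symbol,Int}())

lookup(interp::MiniNative, key::Symbol) = get(interp.cache, key, nothing)
store!(interp::MiniNative, key::Symbol, val::Int) = (interp.cache[key] = val)

# Generic driver: consult the cache, do the work only on a miss.
function infer!(interp::MiniInterp, key::Symbol)
    cached = lookup(interp, key)
    cached === nothing || return cached
    return store!(interp, key, length(String(key)))  # stand-in for real work
end

# Wrapper that hides entries it has not stored itself and counts stores.
struct MiniCounting <: MiniInterp
    visited::Set{Symbol}
    count::Ref{Int}
    native::MiniNative
end
MiniCounting(native::MiniNative) = MiniCounting(Set{Symbol}(), Ref(0), native)

lookup(interp::MiniCounting, key::Symbol) =
    key in interp.visited ? lookup(interp.native, key) : nothing

function store!(interp::MiniCounting, key::Symbol, val::Int)
    push!(interp.visited, key)
    interp.count[] += 1
    return store!(interp.native, key, val)
end

# Usage: the native cache is already warm, yet the wrapper still counts a miss.
native = MiniNative()
infer!(native, :foo)
counting = MiniCounting(native)
infer!(counting, :foo)   # forced miss: counted
infer!(counting, :foo)   # now visited: served from the shared cache
infer!(counting, :quux)  # new key: counted
```

The design point is that the wrapper never duplicates the cache. It only decides when the underlying cache is allowed to answer, which is why resetting the visited set is enough to start counting again.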
This disambiguates the two methods, allowing us to eliminate the redundant `world::UInt` parameter.
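The redundancy mentioned here can be illustrated with a small sketch, assuming (as the `typeinf_ext` bridge quoted in the review suggests) that the interpreter is constructed from a world age and therefore already carries it. All names below are hypothetical stand-ins, not the real compiler functions.

```julia
# Hypothetical stand-ins; not the real compiler types.
abstract type SketchInterpreter end
struct SketchNative <: SketchInterpreter
    world::UInt
end
world_of(interp::SketchNative) = interp.world

# Before: the world is passed twice, once inside `interp` and once by itself,
# and nothing prevents the two from disagreeing.
infer_before(interp::SketchInterpreter, name::Symbol, world::UInt) =
    (name, world)

# After: the interpreter is the single source of truth for the world.
infer_after(interp::SketchInterpreter, name::Symbol) =
    (name, world_of(interp))

# Bridge for callers that only have a raw world age, analogous in spirit to
# the `typeinf_ext(mi, world)` bridge for the C entry point.
infer_after(name::Symbol, world::UInt) = infer_after(SketchNative(world), name)
```

With the extra argument gone, the two entry points differ in their first argument type, so dispatch can tell them apart without any ambiguity.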
This is the next step in the line of work started by #33955, though a lot of enabling work towards this was previously done by Jameson in his codegen-norecursion branch. The basic thrust here is to allow external packages to manage their own cache of compiled code that may have been generated using entirely different inference or compiler options. The GPU compilers are one such example, but there are several others, including generating code using offload compilers, such as XLA or compilers for secure computation. A lot of this is just moving code around to make it clear exactly which parts of the code are accessing the internal code cache (which is now its own type to make it obvious when it's being accessed), as well as providing clear extension points for custom cache implementations. The second part is to refactor CodeInstance construction to separate construction and insertion into the internal cache (so it can be inserted into an external cache instead if desired). The last part of the change is to give cgparams another hook that lets the caller replace the cache lookup to be used by codegen.
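The two ideas in this description, a dedicated type for the internal cache and a construction step that is separate from insertion, can be sketched without any compiler internals. The types and functions below (`ToyCodeInstance`, `InternalCache`, `ExternalCache`, `build`, `compile!`) are invented for illustration and are not the actual types added to `base/compiler`.

```julia
# Toy stand-ins for the ideas above; none of these are compiler types.
struct ToyCodeInstance
    name::Symbol
    options::String   # which (hypothetical) compiler options produced this code
end

# The internal cache gets its own wrapper type so every access is explicit.
struct InternalCache
    entries::Dict{Symbol,ToyCodeInstance}
end
InternalCache() = InternalCache(Dict{Symbol,ToyCodeInstance}())

# An external package can supply any type that supports the same methods.
struct ExternalCache
    entries::Dict{Symbol,ToyCodeInstance}
    owner::String
end
ExternalCache(owner::String) = ExternalCache(Dict{Symbol,ToyCodeInstance}(), owner)

const AnyCache = Union{InternalCache,ExternalCache}
Base.haskey(c::AnyCache, name::Symbol) = haskey(c.entries, name)
Base.getindex(c::AnyCache, name::Symbol) = c.entries[name]
Base.setindex!(c::AnyCache, ci::ToyCodeInstance, name::Symbol) = (c.entries[name] = ci; c)

# Construction is a separate step from insertion...
build(name::Symbol, options::String) = ToyCodeInstance(name, options)

# ...so the caller decides which cache receives the result, and the lookup
# used by later stages is whatever cache the caller hands in.
function compile!(cache, name::Symbol, options::String)
    haskey(cache, name) && return cache[name]
    ci = build(name, options)
    cache[name] = ci
    return ci
end

internal = InternalCache()
gpu = ExternalCache("gpu")
compile!(internal, :foo, "native")
compile!(gpu, :foo, "gpu-opts")
```

Because the two caches never share entries, the same function name can map to code built with different options, which is the situation the GPU and XLA use cases call for.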
|
As I said before in other channels, and in the unaddressed comment just before you merged this, I still think this may significantly misrepresent the design and stability of the existing code, as there's very little that seems to me to be "abstract" about this PR. |
|
|
||
| An abstract base class that allows multiple dispatch to determine the method of | ||
| executing Julia code. The native Julia LLVM pipeline is enabled by using the | ||
| `TypeInference` concrete instantiatoin of this abstract class, others can be |
| end | ||
|
|
||
| function show(io::IO, ::Core.Compiler.NativeInterpreter) | ||
| print(io, "Core.Compiler.NativeInterpreter") |
Core.Compiler.NativeInterpreter(...)
| # constants # | ||
| ############# | ||
|
|
||
| const DEFAULT_INTERPRETER = NativeInterpreter(UInt(0)) |
* Refactor cache logic for easy replacement

This is the next step in the line of work started by #33955, though a lot of enabling work towards this was previously done by Jameson in his codegen-norecursion branch. The basic thrust here is to allow external packages to manage their own cache of compiled code that may have been generated using entirely different inference or compiler options. The GPU compilers are one such example, but there are several others, including generating code using offload compilers, such as XLA or compilers for secure computation. A lot of this is just moving code around to make it clear exactly which parts of the code are accessing the internal code cache (which is now its own type to make it obvious when it's being accessed), as well as providing clear extension points for custom cache implementations. The second part is to refactor CodeInstance construction to separate construction and insertion into the internal cache (so it can be inserted into an external cache instead if desired). The last part of the change is to give cgparams another hook that lets the caller replace the cache lookup to be used by codegen.

* Update base/compiler/cicache.jl

Co-authored-by: Tim Besard <tim.besard@gmail.com>

* Apply suggestions from code review

Co-authored-by: Jameson Nash <vtjnash@gmail.com>

* Rename always_cache_tree -> !allow_discard_tree

Co-authored-by: Tim Besard <tim.besard@gmail.com>
Co-authored-by: Jameson Nash <vtjnash@gmail.com>