Allow PrecompileTools to see MI's inferred by foreign abstract interpreters#54069
Merged
Allow PrecompileTools to see MI's inferred by foreign abstract interpreters#54069
Conversation
Member
Author
|
I locally ported this to diff --git a/Project.toml b/Project.toml
index db564395a..47217fe27 100644
--- a/Project.toml
+++ b/Project.toml
@@ -23,6 +23,7 @@ Libdl = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
NVTX = "5da4648a-3479-48b8-97b9-01cb529c0a1f"
+PrecompileTools = "aea7be01-6a6a-4083-8856-8a6e6704d82a"
Preferences = "21216c6a-2e73-6563-6e65-726566657250"
PrettyTables = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"
Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
@@ -65,6 +66,7 @@ Libdl = "1"
LinearAlgebra = "1"
Logging = "1"
NVTX = "0.3.2"
+PrecompileTools = "1.2.1"
Preferences = "1"
PrettyTables = "2"
Printf = "1"
diff --git a/src/precompile.jl b/src/precompile.jl
index dc1b2ce2e..ddbe26dd2 100644
--- a/src/precompile.jl
+++ b/src/precompile.jl
@@ -14,3 +14,15 @@ precompile(run_and_collect, (Cmd,))
precompile(cudaconvert, (Function,))
precompile(Core.kwfunc(cudacall), (NamedTuple{(:threads, :blocks), Tuple{Int64, Int64}},typeof(cudacall),CuFunction,Type{Tuple{}}))
precompile(Core.kwfunc(launch), (NamedTuple{(:threads, :blocks), Tuple{Int64, Int64}},typeof(launch),CuFunction))
+
+using PrecompileTools: @setup_workload, @compile_workload
+@setup_workload let
+ @compile_workload begin
+ target = PTXCompilerTarget(; cap=v"7.5")
+ params = CUDACompilerParams(; cap=v"7.5", ptx=v"7.5")
+ config = CompilerConfig(target, params)
+ mi = GPUCompiler.methodinstance(typeof(identity), Tuple{Nothing})
+ job = CompilerJob(mi, config)
+ GPUCompiler.code_native(devnull, job)
+ end
+endThis works correctly now. |
Member
|
Does that mean external absints using the integrated (internal) cache is now able to persist its codeinst caches across sessions? |
Member
Author
|
They were already able to do so (after #52233), but the integration with PrecompileTools didn't work. E.g. this test https://github.com/JuliaGPU/GPUCompiler.jl/blob/9f90e9d8c7b1049932083847cffcd442699c81c0/test/native_tests.jl#L594 was not working yet. |
KristofferC
pushed a commit
that referenced
this pull request
Apr 17, 2024
…reters (#54069) Partially reverts #49391 PrecompileTools uses the timing infrastructure to snoop on the inference process. The reason for #49391 was that this could lead to accidental pollution of the caches with foreign results (JuliaDebug/SnoopCompile.jl#338) After #52233 and especially #53336 we now filter results by cache owner and don't try to cache foreign code using the native pipeline. Motivated by JuliaGPU/GPUCompiler.jl#567 which demonstrated that a foreign code instance would not be cached without PrecompileTools. (cherry picked from commit c0611e8)
59 tasks
KristofferC
added a commit
that referenced
this pull request
May 28, 2024
Backported PRs: - [x] #53665 <!-- use afoldl instead of tail recursion for tuples --> - [x] #53976 <!-- LinearAlgebra: LazyString in interpolated error messages --> - [x] #54005 <!-- make `view(::Memory, ::Colon)` produce a Vector --> - [x] #54010 <!-- Overload `Base.literal_pow` for `AbstractQ` --> - [x] #54069 <!-- Allow PrecompileTools to see MI's inferred by foreign abstract interpreters --> - [x] #53750 <!-- inference correctness: fields and globals can revert to undef --> - [x] #53984 <!-- Profile: fix heap snapshot is valid char check --> - [x] #54102 <!-- Explicitly compute stride in unaliascopy for SubArray --> - [x] #54070 <!-- Fix integer overflow in `skip(s::IOBuffer, typemax(Int64))` --> - [x] #54013 <!-- Support case-changes to Annotated{String,Char}s --> - [x] #53941 <!-- Fix writing of AnnotatedChars to AnnotatedIOBuffer --> - [x] #54137 <!-- Fix typo in docs for `partialsortperm` --> - [x] #54129 <!-- use correct size when creating output data from an IOBuffer --> - [x] #54153 <!-- Fixup IdSet docstring --> - [x] #54143 <!-- Fix `make install` from tarballs --> - [x] #54151 <!-- LinearAlgebra: Correct zero element in `_generic_matvecmul!` for block adj/trans --> - [x] #54213 <!-- Add `public` statement to `Base.GC` --> - [x] #54222 <!-- Utilize correct tbaa when emitting stores of unions. --> - [x] #54233 <!-- set MAX_OS_WRITE on unix --> - [x] #54255 <!-- fix `_checked_mul_dims` in the presence of 0s and overflow. --> - [x] #54259 <!-- Fix typo in `readuntil` --> - [x] #54251 <!-- fix typo in gc_mark_memory8 when chunking a large array --> - [x] #54276 <!-- Fix solve for complex `Hermitian` with non-vanishing imaginary part on diagonal --> - [x] #54248 <!-- ensure package callbacks are invoked when no valid precompile file exists for an "auto loaded" stdlib --> - [x] #54308 <!-- Implement eval-able AnnotatedString 2-arg show --> - [x] #54302 <!-- Specialised substring equality for annotated strs --> - [x] #54243 <!-- prevent `package_callbacks` to run multiple time for a single package --> - [x] #54350 <!-- add a precompile signature to Artifacts code that is used by JLLs --> - [x] #54331 <!-- correctly track freed bytes in jl_genericmemory_to_string --> - [x] #53509 <!-- revert moving "creating packages" from Pkg.jl --> - [x] #54335 <!-- When accessing the data pointer for an array, first decay it to a Derived Pointer --> - [x] #54239 <!-- Make sure `fieldcount` constant-folds for `Tuple{...}` --> - [x] #54288 - [x] #54067 - [x] #53715 <!-- Add read/write specialisation for IOContext{AnnIO} --> - [x] #54289 <!-- Rework annotation ordering/optimisations --> - [x] #53815 <!-- create phantom task for GC threads --> - [x] #54130 <!-- inference: handle `LimitedAccuracy` in `handle_global_assignment!` --> - [x] #54428 <!-- Move ConsoleLogging.jl into Base --> - [x] #54332 <!-- Revert "add unsetindex support to more copyto methods (#51760)" --> - [x] #53826 <!-- Make all command-line options documented in all related files --> - [x] #54465 <!-- typeintersect: conservative typevar subtitution during `finish_unionall` --> - [x] #54514 <!-- typeintersect: followup cleanup for the nothrow path of type instantiation --> - [x] #54499 <!-- make `@doc x` work without REPL loaded --> - [x] #54210 <!-- attach finalizer in `mmap` to the correct object --> - [x] #54359 <!-- Pkg REPL: cache `pkg_mode` lookup --> Non-merged PRs with backport label: - [ ] #54471 <!-- Actually setup jit targets when compiling packageimages instead of targeting only one --> - [ ] #54457 <!-- Make `String(::Memory)` copy --> - [ ] #54323 <!-- inference: fix too conservative effects for recursive cycles --> - [ ] #54322 <!-- effects: add new `@consistent_overlay` macro --> - [ ] #54191 <!-- make `AbstractPipe` public --> - [ ] #53957 <!-- tweak how filtering is done for what packages should be precompiled --> - [ ] #53882 <!-- Warn about cycles in extension precompilation --> - [ ] #53707 <!-- Make ScopedValue public --> - [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} --> - [ ] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` --> - [ ] #53286 <!-- Raise an error when using `include_dependency` with non-existent file or directory --> - [ ] #52694 <!-- Reinstate similar for AbstractQ for backward compatibility --> - [ ] #51479 <!-- prevent code loading from lookin in the versioned environment when building Julia -->
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Partially reverts #49391
PrecompileTools uses the timing infrastructure to snoop on the inference process.
The reason for #49391 was that this could lead to accidental pollution of the caches
with foreign results (JuliaDebug/SnoopCompile.jl#338)
After #52233 and especially #53336 we now filter results by cache owner and
don't try to cache foreign code using the native pipeline.
Motivated by JuliaGPU/GPUCompiler.jl#567 which demonstrated
that a foreign code instance would not be cached without PrecompileTools.