Allow for querying of build_id from objects #53943
Merged
Julia objects can be perma-allocated in package images, and each package image has a corresponding top module. We can currently map Julia objects to the loaded image they belong to, but we don't keep track of the corresponding top module. Tracking it is useful for asking for the build_id of the package image we are using. Non-perma-allocated objects are mapped to `Main`.
vchuravy commented Apr 9, 2024
vchuravy added a commit that referenced this pull request Apr 19, 2024
For GPUCompiler we would like to support a native on-disk cache of LLVM IR. One of the longstanding issues has been the invalidation of such an on-disk cache. With #52233 we now have an integrated cache for the inference results and we can rely on `CodeInstance` to be stable across sessions. Due to #52119 we can also rely on the `objectid` to be stable. My initial thought was to key the native disk cache in GPUCompiler on the objectid of the corresponding CodeInstance (+ some compilation parameters). While discussing this with @rayegun yesterday we noted that having a CodeInstance with the same objectid might not be enough provenance. E.g., we are not guaranteed that the CodeInstance is from the same build artifact and the same precise source code. For the package images we are tracking this during loading and validate all contents at once, and we explicitly keep track of the provenance chain. This PR adds a lookup table where we map from "external_blobs", e.g. loaded images, to the corresponding top module of each image, and uses this to determine the build_id of the package image. (cherry picked from commit d47cbf6)
KristofferC pushed a commit that referenced this pull request May 27, 2024
This reverts commit fcad4b9.
The tests for this fail on the 1.11 branch (https://buildkite.com/julialang/julia-release-1-dot-11/builds/80#018fa5f5-1b9e-41e8-8330-0cfca9f5128c), so I have reverted this PR on it.
For GPUCompiler we would like to support a native on-disk cache of LLVM IR.
One of the longstanding issues has been the invalidation of such an on-disk cache.

With #52233 we now have an integrated cache for the inference results and we can rely
on `CodeInstance` to be stable across sessions. Due to #52119 we can also rely on the
`objectid` to be stable.

My initial thought was to key the native disk cache in GPUCompiler on the objectid of
the corresponding CodeInstance (+ some compilation parameters).

While discussing this with @rayegun yesterday we noted that having a CodeInstance with
the same objectid might not be enough provenance. E.g., we are not guaranteed that the
CodeInstance is from the same build artifact and the same precise source code.

For the package images we are tracking this during loading and validate all contents
at once, and we explicitly keep track of the provenance chain.

This PR adds a lookup table where we map from "external_blobs", e.g. loaded images,
to the corresponding top module of each image, and uses this to determine the
build_id of the package image.
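
To make the provenance argument concrete, here is a minimal sketch of a disk-cache key that combines the object id, the package image's build_id, and the compilation parameters. It is a toy model in Python, not GPUCompiler code: the function name `disk_cache_key`, the use of SHA-256, and the choice to skip caching when no build_id is available are all assumptions made for illustration.

```python
import hashlib
from typing import Optional


def disk_cache_key(object_id: int, build_id: Optional[int], params: tuple) -> Optional[str]:
    """Hypothetical key for one on-disk cache entry.

    object_id stands in for the objectid of a CodeInstance, build_id for the
    build_id of the package image it came from, and params for the
    compilation parameters. Returns None when there is no build_id
    (assumption: without provenance we do not cache across sessions).
    """
    if build_id is None:
        return None
    # Including build_id means the same object_id from a different build
    # artifact produces a different key.
    payload = repr((object_id, build_id, params)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

With a key like this, two sessions only share a cache entry when the object id, the image build, and the parameters all match.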
Objects that are not perma-allocated are mapped to `Main`. `Main` itself is a bit weird.
So `Main` is itself perma-allocated through `Base` and thus the sysimage, but `Main` itself is not a topmod. So maybe it is not the correct module to return for runtime-allocated objects.

Edit: I changed this to return `nothing` for runtime-allocated objects.
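
The lookup described above can be sketched as a toy model: each loaded image covers an address range and records its top module and build_id, and an object's address is resolved to the image that contains it. This is Python for illustration only, not the runtime's implementation. The names `ImageBlob` and `BlobTable`, the integer addresses, and the assumption that image ranges never overlap are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ImageBlob:
    start: int        # first address covered by the image
    end: int          # one past the last address covered
    top_module: str   # name of the image's top module
    build_id: int     # build_id recorded for that top module


class BlobTable:
    """Maps addresses to the loaded image (and top module) containing them."""

    def __init__(self, blobs: List[ImageBlob]):
        # Sort by start address so lookups can use binary search.
        # Assumption: ranges do not overlap.
        self._blobs = sorted(blobs, key=lambda b: b.start)
        self._starts = [b.start for b in self._blobs]

    def find(self, addr: int) -> Optional[ImageBlob]:
        # Last blob whose start is <= addr, if any.
        i = bisect_right(self._starts, addr) - 1
        if i >= 0 and addr < self._blobs[i].end:
            return self._blobs[i]
        return None  # runtime-allocated: not inside any image

    def build_id(self, addr: int) -> Optional[int]:
        blob = self.find(addr)
        return None if blob is None else blob.build_id
```

Returning `None` for addresses outside every image mirrors the edit above, where runtime-allocated objects yield `nothing` rather than `Main`.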