Download BazelRegistryJson only once per registry #19292
fmeum wants to merge 1 commit into bazelbuild:master
By caching `BazelRegistryJson` in `IndexRegistry` and caching `IndexRegistry` instances per registry URL, `bazel_registry.json` is only downloaded once per registry instead of once for each module in the final dependency graph in `computeFinalDepGraph`. On my local machine, this shaves 4s off of the time spent on module resolution for Bazel itself.
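The two levels of caching described above can be illustrated with a minimal sketch. The class names `SketchRegistry` and `SketchRegistryFactory`, the `downloader` function, and the use of a plain `String` for the JSON contents are all hypothetical stand-ins chosen for illustration; this is not the code from this PR.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for an index registry; not Bazel's actual IndexRegistry.
final class SketchRegistry {
  private final String url;
  // Assumption: maps a file URL to its contents, or empty if the file is absent.
  private final Function<String, Optional<String>> downloader;
  // Cached contents of bazel_registry.json; null means "not downloaded yet".
  private volatile Optional<String> registryJson;

  SketchRegistry(String url, Function<String, Optional<String>> downloader) {
    this.url = url;
    this.downloader = downloader;
  }

  Optional<String> getRegistryJson() {
    Optional<String> result = registryJson;
    if (result == null) {
      // Not synchronized: a race costs at most one duplicate download.
      result = downloader.apply(url + "/bazel_registry.json");
      registryJson = result;
    }
    return result;
  }
}

// Hypothetical factory that hands out exactly one SketchRegistry per URL.
final class SketchRegistryFactory {
  private final Map<String, SketchRegistry> registries = new ConcurrentHashMap<>();
  private final Function<String, Optional<String>> downloader;

  SketchRegistryFactory(Function<String, Optional<String>> downloader) {
    this.downloader = downloader;
  }

  SketchRegistry getRegistryWithUrl(String url) {
    // computeIfAbsent creates the instance once and reuses it afterwards.
    return registries.computeIfAbsent(url, u -> new SketchRegistry(u, downloader));
  }
}
```

With this shape, every module that resolves against the same registry URL shares one instance, so the cached JSON is reused no matter how many modules are in the dependency graph.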
d4346e6 to 8fe04f3
    private Optional<BazelRegistryJson> getBazelRegistryJson(ExtendedEventHandler eventHandler)
        throws IOException, InterruptedException {
      if (bazelRegistryJson == null) {
        synchronized (this) {
I don't think the synchronization is necessary here? At worst we'll just fetch it again.
I'm usually wary of `synchronized (this)` because if someone else does `synchronized (yourObject)`, you're suddenly potentially deadlocked.
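One common way to keep the "fetch only once" guarantee without exposing the monitor to outside code is to lock on a private object. The sketch below uses double-checked locking for this; `LazyValue` and `loader` are hypothetical names, and it is not the approach the PR necessarily settled on.

```java
import java.util.function.Supplier;

// Hypothetical helper: computes a value at most once, guarded by a private lock.
final class LazyValue<T> {
  // Private lock object: no outside code can synchronize on it.
  private final Object lock = new Object();
  // Assumption: the loader never returns null (null marks "not loaded yet").
  private final Supplier<T> loader;
  private volatile T value;

  LazyValue(Supplier<T> loader) {
    this.loader = loader;
  }

  T get() {
    T result = value;
    if (result == null) {
      synchronized (lock) {
        // Re-check under the lock: another thread may have loaded it already.
        result = value;
        if (result == null) {
          result = loader.get();
          value = result;
        }
      }
    }
    return result;
  }
}
```

Dropping the `synchronized` block entirely would still be correct for an idempotent download; the only cost would be that concurrent callers may each run the loader once.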
That's true, although every repeated fetch slows down the overall fetch. This currently doesn't matter since all fetches are sequential (see the profile screenshot in the other PR), but it might become relevant when we change that. Not sure.
On second thought, after looking at the profile graph you posted in #19291 (comment), it looks like we might be trying to fetch the registry JSON file with many threads at around the same time. So synchronizing might be faster.
Ah, I think I read that graph wrong. The MODULE.bazel files are fetched mostly concurrently, right? It's just all the "repo spec" fetches that are sequential. But why are there two "download file:" blocks in each MODULE.bazel fetch?
That's for reading yanked info.
Ah, the second download is for the metadata.json, to fetch the yanked versions. There's a bit of room for improvement there (though not much) since we'd be fetching the same metadata.json file for all versions of the same module.
The much bigger thing is the repo spec fetching, which used to be lazy until we introduced the lockfile. We should definitely parallelize those fetches. Exactly how, I'm not sure yet (we could use the Skyframe threads, or maybe just create a separate thread pool).
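As a rough illustration of the "separate thread pool" option only, independent fetches could be submitted to a fixed-size pool and collected in order. Everything here is hypothetical (`ParallelFetchSketch`, `fetchAll`, `fetchOne`, the pool size); it says nothing about how Bazel would actually do this, and the Skyframe alternative is not shown.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: run independent fetches on a dedicated thread pool.
final class ParallelFetchSketch {
  static <K, V> Map<K, V> fetchAll(List<K> keys, Function<K, V> fetchOne, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Submit every fetch first so they can run concurrently.
      Map<K, Future<V>> pending = new LinkedHashMap<>();
      for (K key : keys) {
        Callable<V> task = () -> fetchOne.apply(key);
        pending.put(key, pool.submit(task));
      }
      // Then wait for the results, preserving the order of the keys.
      Map<K, V> results = new LinkedHashMap<>();
      for (Map.Entry<K, Future<V>> entry : pending.entrySet()) {
        results.put(entry.getKey(), entry.getValue().get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while fetching", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("fetch failed", e.getCause());
    } finally {
      pool.shutdownNow();
    }
  }
}
```

A call such as `ParallelFetchSketch.fetchAll(moduleKeys, key -> fetchRepoSpec(key), 8)` would then replace a sequential loop, where `moduleKeys` and `fetchRepoSpec` are again placeholders.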
Wow, thanks for catching this!
src/main/java/com/google/devtools/build/lib/bazel/bzlmod/RegistryFactoryImpl.java
@bazel-io flag
@bazel-io fork 6.4.0
By caching `BazelRegistryJson` in `IndexRegistry` and caching `IndexRegistry` instances per registry URL, `bazel_registry.json` is only downloaded once per registry instead of once for each module in the final dependency graph in `computeFinalDepGraph`. On my local machine, this shaves 4s off of the time spent on module resolution for Bazel itself.

Closes bazelbuild#19292.

PiperOrigin-RevId: 558940780
Change-Id: I89b03a4c246b10f39b89a79852c922a6504f00bf
By caching `BazelRegistryJson` in `IndexRegistry` and caching `IndexRegistry` instances per registry URL, `bazel_registry.json` is only downloaded once per registry instead of once for each module in the final dependency graph in `computeFinalDepGraph`. On my local machine, this shaves 4s off of the time spent on module resolution for Bazel itself.

Closes #19292.

Commit 8337dd7

PiperOrigin-RevId: 558940780
Change-Id: I89b03a4c246b10f39b89a79852c922a6504f00bf

Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>
The changes in this PR have been included in Bazel 6.4.0 RC1. Please test out the release candidate and report any issues as soon as possible. If you're using Bazelisk, you can point to the latest RC by setting `USE_BAZEL_VERSION=last_rc`.