Conversation
|
I benchmarked against an earlier version of this code. Running across all of crates.io (including solana) went from |
8979c6c to
0ff9d29
Compare
|
I tested |
|
Good to know! I will have to rerun when I get a chance. The most likely explanation is a random hiccup on my computer, that's pretty likely when carefully measuring time of a process that runs for an hour. |
I ran |
0ff9d29 to
ee74105
Compare
|
For |
|
A more targeted run, that only looked at crates containing the word |
| type Activations = im_rc::HashMap< | ||
| ActivationKey, | ||
| (Summary, ContextAge), | ||
| nohash_hasher::BuildNoHashHasher<ActivationKey>, |
There was a problem hiding this comment.
Note to self: This PR does a number of different things, and I'm trying to understand which ones are critical to the performance win. I tried switching this one nohash_hasher::BuildNoHashHasher back to rustc_hash::FxBuildHasher, and the Solana only benchmark I was running yesterday runs in 162.53min (or 164.63min there seems to be a surprising amount of run to run variability).
There was a problem hiding this comment.
So it seems that rustc-hash is already very optimized when hashing a single u64:
https://github.com/rust-lang/rustc-hash/blob/master/src/lib.rs#L99.
In this case the performance increase may only be due to fact that we don't hash the full ActivationKeyInner but only its address.
|
I wanted to spend some time teasing apart exactly what part of the old hash was slow. The old hash but modified to hash the pointer of the With the holiday I will probably not work on this till Monday. Sorry for the delay. |
|
I think we should also define a custom |
|
So I was able to get about 50% of benefit was much less churn. I change the order so that we only do the #[derive(Clone, PartialEq, Eq, Debug)]
pub struct ActivationsKey(InternedString, SemverCompatibility, SourceId);
impl std::hash::Hash for ActivationsKey {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
std::ptr::NonNull::from(self.0.as_str()).hash(state);
self.1.hash(state);
// self.2.hash(state); // Packages that only differ by SourceId are rare enough to not be worth hashing
}
}So the other half the performance is because despite being interned So that leads to two questions about maintainability. Pinging @epage for strong and useful opinions about cleaning up our code base:
|
|
☔ The latest upstream changes (presumably #14692) made this pull request unmergeable. Please resolve the merge conflicts. |
|
If we can get the performance benefit without storing the activation key inside of a If there are cheap things we can do, like re-order fields, that seems like an easy win. Cleaning up |
This is a perf improvement I suggested #14665 (comment) I mostly want this landed to make it easier to compare the cost and benefits of more complicated changes. As this is the thing to compare against.
|
☔ The latest upstream changes (possibly 081d7ba) made this pull request unmergeable. Please resolve the merge conflicts. |
### What does this PR try to resolve? Despite being interned `SourceId::Eq` is not a `ptr::eq`. Which in turn is because `SourceId`s concept of identity is a complete mess. The code relies on having to IDs that are `Eq` but do not have the same values for their fields. As one measure of this `SourceId` has an `impl Hash` which does something different from `fn full_hash` and `fn stable_hash`. Separately `SourceIdInner` has a different implementation. Similar levels of complexity exist for `Eq`. Every one of these `impl`s was added due to a real bug/issue we've had that needs to stay fixed. Not all of witch are reproducible enough to have made it into our test suite. I [have some ideas](#14665 (comment)) for how to reorganize the types so that this is easier to reason about and faster. But given the history and the complexity I want to move extremely carefully. ### How should we test and review this PR? The test pass, and it's a one line change, but this still needs careful review. ### Additional information r? @ehuss I remember you and Alex working very hard to track down most of these bugs.
|
With #14800 and #14915 being merged and this being stale, I'm going to go ahead and close. If someone wants to resume similar work, I recommend checking out #14665 (comment) and #15649 |
What does this PR try to resolve?
Follow-up of #14663 which has been splitted.
How should we test and review this PR?
Commit 1 enables the
clippy::derive_ord_xor_partial_ordlint to be sure thatOrdandPartialOrdare implemented correctly everywhere in the repository.Commit 2 moves the
ActivationKeytype incore.Commit 3 adds interning for the resolver activation keys so they can be used with
nohash-hasher, decreasing resolving time by another 30%. This is possible since the interning is implemented with a staticHashSet, so each interned element can be reduced to its address.Performance comparison with the test added in the previous PR by setting
LAST_CRATE_VERSION_COUNT = 100:Firefox profiles, can be read with https://profiler.firefox.com:
perf.tar.gz
r? Eh2406