I have an IdDict with SimpleVector keys, I see a lot of time is spent calling jl_object_id_ to hash the keys, specifically int64hash, which is called to hash any 64 bit elements in the key and it's also called in bitmix.
It seems jl_object_id_ is mainly used for hashtables like IdDict and IdSet in which case would it be fine to switch to something like FxHasher (used in the rust compiler) to replace both bitmix and int64hash?
Replacing bitmix in hash_svec with FxHasher gives a 10-100x improvement on @benchmark hash(x) setup=x=Core.svec(rand(Int64, 10^n)...) for n in 1 to 6.
However, FxHasher for 64 bit ints is just a multiplication by a constant, though if objectid is only used in internal hashtables, like IdDict, I think this should be fine. Whilst objectid is used to hash general objects on the Julia side, its passed through a hash so I don't think it should affect hash in Julia too much.
Alternatively, we could try using the non AES intrinsic version of ahash, this should still be much faster than what we have currently and higher quality than FxHasher. Again, ahash says it's specifically for hashtables.
Alternatively, to replace only int64hash, Squirrel3 is quite simple and seems to be reasonably fast (about 1.5-2x faster than int64hash) and should have higher quality hashes than FxHasher.
Does it seem reasonable to make a change along these lines? I see that currently int64hash is invertible, do we care about this? I don't see it being used anywhere.
This would also speed up my robinhood implementation of IdDict.
I have an
IdDictwithSimpleVectorkeys, I see a lot of time is spent callingjl_object_id_to hash the keys, specifically int64hash, which is called to hash any 64 bit elements in the key and it's also called in bitmix.It seems
jl_object_id_is mainly used for hashtables likeIdDictandIdSetin which case would it be fine to switch to something like FxHasher (used in the rust compiler) to replace bothbitmixandint64hash?Replacing
bitmixinhash_svecwith FxHasher gives a 10-100x improvement on@benchmark hash(x) setup=x=Core.svec(rand(Int64, 10^n)...)for n in 1 to 6.However, FxHasher for 64 bit ints is just a multiplication by a constant, though if objectid is only used in internal hashtables, like
IdDict, I think this should be fine. Whilst objectid is used to hash general objects on the Julia side, its passed through a hash so I don't think it should affecthashin Julia too much.Alternatively, we could try using the non AES intrinsic version of ahash, this should still be much faster than what we have currently and higher quality than FxHasher. Again, ahash says it's specifically for hashtables.
Alternatively, to replace only
int64hash, Squirrel3 is quite simple and seems to be reasonably fast (about 1.5-2x faster than int64hash) and should have higher quality hashes than FxHasher.Does it seem reasonable to make a change along these lines? I see that currently
int64hashis invertible, do we care about this? I don't see it being used anywhere.This would also speed up my robinhood implementation of
IdDict.