Skip to content

Conversation

@dolio
Copy link
Contributor

@dolio dolio commented Sep 19, 2025

This implements a new builtin that does a murmur hash more directly on a value, and omitting several Reference occurrences that are just for type information. The hope is that this will be quicker than serializing to bytes and then hashing the bytes.

It's possible that even using the type information could be faster with some of the new machinery, because References are lifted out when you create a Value now, and I use that to pass around arrays of Hash64 -> Hash64 functions that augment the hash with the reference information. But, I think the way the API is structured doesn't allow it to precompute a small value to add for each reference, and instead it's just a closure that does the full work every time.

@dolio
Copy link
Contributor Author

dolio commented Sep 19, 2025

So, obviously it depends what you're hashing. But some quick tests suggest this is in fact a lot faster. Like, If you hash

ns = List.map (n -> (n, (1,2,3))) (Nat.range 0 1000)

The old murmurHash takes ~30ms and the new one takes ~650μs.

Copy link
Member

@pchiusano pchiusano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!!

@dolio
Copy link
Contributor Author

dolio commented Sep 23, 2025

Okay, I think this is finally in a mergeable state.

@dolio dolio marked this pull request as ready for review September 23, 2025 15:47
@pchiusano pchiusano merged commit a1e5359 into trunk Sep 23, 2025
32 checks passed
@pchiusano pchiusano deleted the topic/direct-murmur branch September 23, 2025 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants