Conversation
timvisee
approved these changes
Nov 20, 2024
Member
timvisee
left a comment
There was a problem hiding this comment.
I'm not confident I understand all the nitty-gritty. However, I'm assuming that you've already extensively tested this. Based on a quick skim I don't have any remarks, and I like that you split it into a sub crate.
generall
reviewed
Nov 22, 2024
lib/common/common/src/bitpacking.rs
Outdated
| } | ||
|
|
||
| #[inline] | ||
| pub fn push(&mut self, value: u32, bits: u8) { |
Member
There was a problem hiding this comment.
Could you please provide some examples of how to use this function in real-life scenario. For instance, where bits is coming from? I see in the provided tests, that bits are used to generate value. Does it means we can derive bits from value inside the push function?
generall
approved these changes
Nov 25, 2024
Member
generall
left a comment
There was a problem hiding this comment.
ToDo: comment that the Bitpacker abstraction expects some external entity to keep track of value bit lengths
generall
approved these changes
Nov 26, 2024
d23a0c1 to
69ef2fb
Compare
69ef2fb to
87daebc
Compare
timvisee
pushed a commit
that referenced
this pull request
Dec 9, 2024
Co-authored-by: generall <generall@thinkstation.fritz.box>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR adds a new module
common::bitpackingwith two structs:BitPackerandBitUnpacker. They would be used for the HNSW graph links compression in subsequent PRs.Comparison with other implementations:
bitpacking: it's fast, SIMD-optimized, but application is limited: blocks should be of size of 32/128/256 integers, all stored integers within a block should have the same bitsize.bitstream_io: the idea is somewhat similar to this PR, but this crate resulted to be slower for links compression/in-memory use-case.tantivy_bitpacker: optimized for a random access reads within a packed array.The most important function is
BitUnpacker::read()as it would be used in a hot loop. The inner buffer size is 8 bytes (a singleu64) and it's optimized to read by whole 8 bytes at once.The intended use-case for this module would look like the following (example numbers are arbitrary):
Total size of the example: