Bit-sliced format for vectors #77088
Description
Company or project name
ClickHouse
Use case
Adjusting the precision and speed of vector search at the query level by reading quantized data, stored in a format where reading different subsets of the stored bytes yields different levels of quantization.
This will speed up vector search without the need to build an index and without the added complexity of data updates.
Describe the solution you'd like
Let's suppose you have a 1024-dimensional vector of BFloat16 values.
Take the most significant bit from every dimension of the vector: that gives 1024 bits, or 128 bytes, per row; store these 128 bytes for every row in one stream. Take the second most significant bit from every dimension and store it in a second stream, and so on.
At the query level, you can choose the desired quantization as a number of bits via a function, which can be optimized to read only the corresponding subcolumns. The function reconstructs approximate floating-point values from the available bits.
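The layout and reconstruction described above can be sketched as follows. This is a minimal illustration in Python, not ClickHouse's actual stream format; the function names and the fixed 16-plane layout are assumptions for the example:

```python
import struct

def to_bf16_bits(x: float) -> int:
    """Truncate a float to BFloat16: keep the top 16 bits of its float32 form."""
    (u32,) = struct.unpack("<I", struct.pack("<f", x))
    return u32 >> 16

def from_bf16_bits(b: int) -> float:
    """Expand 16 BFloat16 bits back to a float32 value (low bits zeroed)."""
    (f,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return f

def bit_slice(vector):
    """Split a vector of BFloat16 values into 16 bit-planes ("streams").
    streams[0] holds the most significant bit of every dimension,
    streams[1] the second most significant bit, and so on."""
    bits = [to_bf16_bits(x) for x in vector]
    return [[(b >> (15 - plane)) & 1 for b in bits] for plane in range(16)]

def reconstruct(streams, k):
    """Rebuild approximate values from the first k bit-planes only;
    the missing low-order bits are taken as zero."""
    dims = len(streams[0])
    out = []
    for d in range(dims):
        b = 0
        for plane in range(k):
            b |= streams[plane][d] << (15 - plane)
        out.append(from_bf16_bits(b))
    return out
```

With all 16 planes, values that are exactly representable in BFloat16 round-trip losslessly; reading only the first 9 planes (sign plus the 8 exponent bits) collapses every value to a signed power of two, which is the precision/speed trade-off the proposal exposes per query.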
This bit-slicing can be performed as-is: the first bit of BFloat16 designates the sign, followed by 8 exponent bits and 7 mantissa bits. The downside of this approach is that the leading exponent bits on their own are often not usable, so the information loss is higher. At the same time, the streams of "low-information" bits will compress extremely well, so performance when reading more bits will still be good.
Alternatively, bit-slicing can be performed with estimated quantiles. For example, for every block of data we store 15 (= 16 - 1) quantiles in a dictionary stream (similar to the LowCardinality and JSON dictionaries); the first 4 bits then designate the corresponding inter-quantile range, and the subsequent bits split that range uniformly.
Describe alternatives you've considered
No response
Additional context
This isn't an indexed search; its cost is still O(number of records).