Try to allow f16 vectors in storage #3333
Description
Is your feature request related to a problem? Please describe.
Qdrant currently hard-codes f32 as the data type for all vectors (except quantized ones).
In many applications, however, storing f32 is more expensive than necessary, because the extra precision is redundant.
It would be interesting to eventually migrate to a dynamic switch between f16 and f32 precision.
Describe the solution you'd like
As a first step of the migration, we'd like to experiment with a compile-time switch.
To do this, we would need to introduce a feature flag that switches the vector element type from f32 to f16.
We then need to make sure that all Qdrant functionality works properly with the f16 type enabled, including:
- quantization
- mmaps
- querying
- indexing
- ...
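The compile-time switch could be sketched roughly as below. This is only an illustration under stated assumptions, not the actual implementation: the Cargo feature name `f16`, the `half` crate as an optional dependency, and the helper functions `to_stored`, `from_stored`, `store_vector`, `load_vector` and `bytes_per_vector` are all hypothetical. The `VectorElementType` alias name is likewise used here for illustration.

```rust
// Assumption: a Cargo feature named "f16" and an optional dependency on the
// `half` crate, which provides a 16-bit float type `half::f16`.
#[cfg(feature = "f16")]
pub type VectorElementType = half::f16;
#[cfg(not(feature = "f16"))]
pub type VectorElementType = f32;

/// Convert an API-level f32 value into the stored element type.
#[cfg(feature = "f16")]
pub fn to_stored(x: f32) -> VectorElementType {
    half::f16::from_f32(x)
}
#[cfg(not(feature = "f16"))]
pub fn to_stored(x: f32) -> VectorElementType {
    x
}

/// Convert a stored element back into f32 for scoring or API responses.
#[cfg(feature = "f16")]
pub fn from_stored(x: VectorElementType) -> f32 {
    x.to_f32()
}
#[cfg(not(feature = "f16"))]
pub fn from_stored(x: VectorElementType) -> f32 {
    x
}

/// Convert a whole f32 vector into the storage representation.
pub fn store_vector(v: &[f32]) -> Vec<VectorElementType> {
    v.iter().map(|&x| to_stored(x)).collect()
}

/// Convert a stored vector back into f32 values.
pub fn load_vector(v: &[VectorElementType]) -> Vec<f32> {
    v.iter().map(|&x| from_stored(x)).collect()
}

/// Raw payload size of one vector; halves when the feature is enabled.
pub fn bytes_per_vector(dim: usize) -> usize {
    dim * std::mem::size_of::<VectorElementType>()
}
```

With the feature disabled the helpers are identity functions, so the default build behaves exactly as before. Code paths such as quantization, mmap storage, querying and indexing would go through the alias and the conversion helpers rather than naming `f32` directly.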
Describe alternatives you've considered
Make the storage type generic. However, this could easily be too much effort for a single PR, so it is better to postpone it to a later iteration.
Additional context
We would need to pay extra attention to query types, as the interface (REST and gRPC) should not change once we switch to f16.
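One way to keep the interface unchanged is to convert at the storage boundary, so that callers only ever see f32. The sketch below illustrates the idea with a hand-written, simplified binary16 conversion so that it has no dependencies. The names `Half` and `HalfVectorStorage` are hypothetical, subnormal inputs are flushed to zero on encoding, and a real implementation would more likely rely on an existing half-precision library.

```rust
/// Simplified software half-precision value (raw IEEE 754 binary16 bits).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Half(u16);

impl Half {
    pub fn from_f32(x: f32) -> Half {
        let bits = x.to_bits();
        let sign = ((bits >> 16) & 0x8000) as u16;
        let exp = ((bits >> 23) & 0xff) as i32;
        let mant = bits & 0x007f_ffff;
        if exp == 0xff {
            // Infinity or NaN: keep NaN-ness by setting a mantissa bit.
            let nan_bit = if mant != 0 { 0x0200 } else { 0 };
            return Half(sign | 0x7c00 | nan_bit);
        }
        let e = exp - 127 + 15; // re-bias the exponent for binary16
        if e >= 0x1f {
            return Half(sign | 0x7c00); // too large: overflow to infinity
        }
        if e <= 0 {
            return Half(sign); // simplification: flush tiny values to zero
        }
        // Round to nearest, ties to even, on the 13 dropped mantissa bits.
        let mut h = ((e as u32) << 10) | (mant >> 13);
        let rem = mant & 0x1fff;
        if rem > 0x1000 || (rem == 0x1000 && (h & 1) == 1) {
            h += 1; // a carry into the exponent still yields the right value
        }
        Half(sign | h as u16)
    }

    pub fn to_f32(self) -> f32 {
        let sign = ((self.0 as u32) & 0x8000) << 16;
        let exp = ((self.0 >> 10) & 0x1f) as u32;
        let mant = (self.0 & 0x03ff) as u32;
        let bits = match (exp, mant) {
            (0, 0) => sign, // signed zero
            (0, m) => {
                // Subnormal half: value is m * 2^-24.
                let v = (m as f32) * (1.0 / 16_777_216.0);
                return if sign != 0 { -v } else { v };
            }
            (0x1f, m) => sign | 0x7f80_0000 | (m << 13), // infinity or NaN
            (e, m) => sign | ((e + 127 - 15) << 23) | (m << 13),
        };
        f32::from_bits(bits)
    }
}

/// Hypothetical storage: f16 inside, f32 at every public method.
pub struct HalfVectorStorage {
    dim: usize,
    data: Vec<Half>,
}

impl HalfVectorStorage {
    pub fn new(dim: usize) -> Self {
        HalfVectorStorage { dim, data: Vec::new() }
    }

    /// Accepts f32 (as the REST/gRPC layer would) and returns the new vector id.
    pub fn insert(&mut self, vector: &[f32]) -> usize {
        assert_eq!(vector.len(), self.dim, "dimension mismatch");
        self.data.extend(vector.iter().map(|&x| Half::from_f32(x)));
        self.data.len() / self.dim - 1
    }

    /// Returns the stored vector widened back to f32.
    pub fn get(&self, id: usize) -> Vec<f32> {
        let start = id * self.dim;
        self.data[start..start + self.dim].iter().map(|h| h.to_f32()).collect()
    }

    /// Dot product against an f32 query, accumulating in f32.
    pub fn dot(&self, id: usize, query: &[f32]) -> f32 {
        self.get(id).iter().zip(query).map(|(a, b)| a * b).sum()
    }
}
```

Values that are exactly representable in f16 round-trip unchanged, while others (for example 0.1) come back with a small rounding error. Clients should therefore expect slightly different scores, but no change in request or response types.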
Note for contributors: Please consider this a tracking issue. If you think it would be beneficial to split the task into multiple smaller PRs, you are welcome to do so. A bounty will be awarded for each PR independently.