Try to allow f16 vectors in storage #3333

@generall

Is your feature request related to a problem? Please describe.

Currently, qdrant hard-codes f32 as the data type for all vectors (except quantized ones).
But in many applications, storing f32 is too expensive and the extra precision is redundant.

It would be interesting to eventually migrate to a dynamic switch between f16 and f32 precision.

Describe the solution you'd like

As a first step of the migration, we'd like to experiment with a compile-time switch.

To do this, we would need to introduce a feature flag that switches the vector element type from f32 to f16.
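A minimal sketch of what such a compile-time switch could look like. The feature name `f16-vectors`, the alias name `VectorElementType`, and the helper `vector_size_bytes` are assumptions made for illustration, and the f16 branch assumes the third-party `half` crate is added as an optional dependency enabled by that feature.

```rust
// Sketch only: the feature name and the alias name are illustrative
// assumptions, not confirmed against qdrant's sources.

// With the (hypothetical) feature enabled, elements are half-precision.
// This branch assumes the third-party `half` crate is a dependency.
#[cfg(feature = "f16-vectors")]
pub type VectorElementType = half::f16;

// Default build: elements stay f32, exactly as today.
#[cfg(not(feature = "f16-vectors"))]
pub type VectorElementType = f32;

/// Bytes needed to store one dense vector of `dim` elements
/// under the currently selected element type.
pub fn vector_size_bytes(dim: usize) -> usize {
    dim * std::mem::size_of::<VectorElementType>()
}
```

In a default build, `vector_size_bytes(768)` is 3072; with the feature enabled it would be half of that, since an f16 element occupies 2 bytes. Code that names `f32` directly instead of the alias would not pick up the switch, which is why every subsystem has to be checked.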

Make sure that all qdrant functionality works properly with the f16 type enabled, such as:

  • quantization
  • mmaps
  • querying
  • indexing
  • ...

Describe alternatives you've considered

Making the storage type generic. This could easily be too much effort for a single PR, so it is better to postpone it to another iteration.
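For comparison, the generic alternative could be sketched roughly as follows. The trait `VectorElement` and the struct `DenseStorage` are hypothetical names used only for this illustration and do not describe qdrant's actual storage layer. Only the f32 implementation is shown, to keep the sketch free of external dependencies.

```rust
/// Hypothetical trait for element types a vector storage could hold.
pub trait VectorElement: Copy {
    fn from_f32(value: f32) -> Self;
    fn to_f32(self) -> f32;
}

impl VectorElement for f32 {
    fn from_f32(value: f32) -> Self {
        value
    }
    fn to_f32(self) -> f32 {
        self
    }
}

// An f16 implementation would be added here, backed by a half-precision
// type from an external crate. It is omitted in this sketch.

/// Hypothetical flat in-memory storage, generic over the element type.
pub struct DenseStorage<T: VectorElement> {
    dim: usize,
    data: Vec<T>,
}

impl<T: VectorElement> DenseStorage<T> {
    pub fn new(dim: usize) -> Self {
        assert!(dim > 0, "dimension must be positive");
        Self { dim, data: Vec::new() }
    }

    /// Appends a vector given as f32 and returns its index.
    pub fn push(&mut self, vector: &[f32]) -> usize {
        assert_eq!(vector.len(), self.dim, "dimension mismatch");
        self.data.extend(vector.iter().map(|&v| T::from_f32(v)));
        self.data.len() / self.dim - 1
    }

    /// Reads a stored vector back as f32.
    pub fn get(&self, index: usize) -> Vec<f32> {
        let start = index * self.dim;
        self.data[start..start + self.dim]
            .iter()
            .map(|v| v.to_f32())
            .collect()
    }
}
```

The cost of this approach is that the type parameter propagates into every structure that owns or reads a storage, which is the main reason it is a larger change than a single alias.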

Additional context

We would need to pay extra attention to query types, as the interface (REST and gRPC) should not change once we switch to f16.
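One way to keep the interface stable is to convert at the boundary: requests and responses stay f32, and only the internal representation changes. The sketch below assumes a hypothetical feature `f16-vectors` and hypothetical helpers `encode_query` and `decode_vector`. The f16 branches assume the third-party `half` crate and its `f16::from_f32` and `f16::to_f32` conversions.

```rust
// Sketch only: all names here are illustrative assumptions.

#[cfg(feature = "f16-vectors")]
type StoredElement = half::f16; // assumes the third-party `half` crate
#[cfg(not(feature = "f16-vectors"))]
type StoredElement = f32;

// Element-wise conversion from the API type (always f32) to the stored type.
#[cfg(feature = "f16-vectors")]
fn to_stored(value: f32) -> StoredElement {
    half::f16::from_f32(value)
}
#[cfg(not(feature = "f16-vectors"))]
fn to_stored(value: f32) -> StoredElement {
    value
}

// Element-wise conversion from the stored type back to the API type.
#[cfg(feature = "f16-vectors")]
fn to_api(value: StoredElement) -> f32 {
    value.to_f32()
}
#[cfg(not(feature = "f16-vectors"))]
fn to_api(value: StoredElement) -> f32 {
    value
}

/// Converts an incoming f32 query vector into the internal element type.
pub fn encode_query(query: &[f32]) -> Vec<StoredElement> {
    query.iter().map(|&v| to_stored(v)).collect()
}

/// Converts a stored vector back to f32 for the response payload.
pub fn decode_vector(stored: &[StoredElement]) -> Vec<f32> {
    stored.iter().map(|&v| to_api(v)).collect()
}
```

With f16 enabled, values that are not exactly representable in half precision would come back slightly different from what the client sent, so returned vectors and scores may differ from an f32 build even though the payload types are unchanged.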


Note for contributors: Please consider this a tracking issue. If you think it would be beneficial to split the task into multiple smaller PRs, you are welcome to do so. A bounty will be awarded for each PR independently.
