Add enable_hnsw option for payload field schema#7887

Merged
generall merged 3 commits into qdrant:dev from TY0909:add_enable_hnsw_for_payload_schema
Jan 9, 2026

TY0909 commented Jan 9, 2026

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
  3. Have you checked your code using cargo clippy --workspace --all-features command?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?

Description

This PR adds an optional enable_hnsw parameter to all payload field index types, providing fine-grained control over HNSW graph construction for individual indexed fields.

Changes

Core Data Types

  • Added enable_hnsw: Option<bool> field to all 8 payload index parameter types:
    • KeywordIndexParams
    • IntegerIndexParams
    • FloatIndexParams
    • GeoIndexParams
    • TextIndexParams
    • BoolIndexParams
    • DatetimeIndexParams
    • UuidIndexParams
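
To illustrate why an `Option<bool>` field keeps older callers working, here is a minimal sketch. It assumes a trimmed-down stand-in for `KeywordIndexParams` with only the three fields shown in the gRPC example of this PR; the real struct may carry more fields and derives, and the two constructor functions are hypothetical.

```rust
/// Simplified stand-in for one of the eight index parameter types.
/// Only `enable_hnsw` is new in this PR; the rest is trimmed for illustration.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct KeywordIndexParams {
    pub is_tenant: Option<bool>,
    pub on_disk: Option<bool>,
    /// `None` means "not specified", which is treated the same as `Some(true)`.
    pub enable_hnsw: Option<bool>,
}

/// Params as an older client would build them: the new field is simply absent.
pub fn legacy_params() -> KeywordIndexParams {
    KeywordIndexParams {
        on_disk: Some(false),
        ..Default::default()
    }
}

/// Params that explicitly opt the field out of additional HNSW links.
pub fn opt_out_params() -> KeywordIndexParams {
    KeywordIndexParams {
        enable_hnsw: Some(false),
        ..Default::default()
    }
}
```

Because the field is optional, a request that never mentions `enable_hnsw` produces `None`, and only an explicit `Some(false)` changes behavior.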

API Layer

  • gRPC: Updated proto definitions in collections.proto with documentation
  • gRPC Conversions: Updated all From/TryFrom implementations in conversions.rs to handle the new field
  • OpenAPI: Added field definition to all index param schemas in openapi.json
  • Generated Code: Regenerated qdrant.rs with new field definitions
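
The conversion updates listed above follow a common pattern, sketched below. The `grpc` and `internal` modules and their struct layouts are assumptions made for illustration and are not the actual contents of conversions.rs or qdrant.rs.

```rust
// Hypothetical wire-level type, standing in for the generated gRPC struct.
mod grpc {
    #[derive(Debug, Clone, Default, PartialEq)]
    pub struct KeywordIndexParams {
        pub is_tenant: Option<bool>,
        pub on_disk: Option<bool>,
        pub enable_hnsw: Option<bool>,
    }
}

// Hypothetical internal type used by the storage layer.
mod internal {
    #[derive(Debug, Clone, Default, PartialEq)]
    pub struct KeywordIndexParams {
        pub is_tenant: Option<bool>,
        pub on_disk: Option<bool>,
        pub enable_hnsw: Option<bool>,
    }
}

impl From<grpc::KeywordIndexParams> for internal::KeywordIndexParams {
    fn from(params: grpc::KeywordIndexParams) -> Self {
        // Destructuring without `..` makes the compiler reject this function
        // if a field is added to the source struct but not forwarded here.
        let grpc::KeywordIndexParams { is_tenant, on_disk, enable_hnsw } = params;
        internal::KeywordIndexParams { is_tenant, on_disk, enable_hnsw }
    }
}

impl From<internal::KeywordIndexParams> for grpc::KeywordIndexParams {
    fn from(params: internal::KeywordIndexParams) -> Self {
        let internal::KeywordIndexParams { is_tenant, on_disk, enable_hnsw } = params;
        grpc::KeywordIndexParams { is_tenant, on_disk, enable_hnsw }
    }
}
```

Passing the `Option<bool>` through unchanged in both directions preserves the difference between "not specified" and an explicit `true`.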

Core Logic

  • Modified HNSW index builder in hnsw.rs to filter fields based on enable_hnsw flag
  • Added helper methods enable_hnsw() to PayloadSchemaParams and PayloadFieldSchema types
  • Fields with enable_hnsw=false are skipped during additional HNSW links construction
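
The filtering step described above could look roughly like the following sketch. The two-variant enum and the `fields_for_additional_links` function are hypothetical simplifications; the real `PayloadSchemaParams` has one variant per index type and the real builder in hnsw.rs does considerably more.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the schema params enum (two variants instead of eight).
#[derive(Debug, Clone)]
pub enum PayloadSchemaParams {
    Keyword { enable_hnsw: Option<bool> },
    Integer { enable_hnsw: Option<bool> },
}

impl PayloadSchemaParams {
    /// Helper in the spirit of the one added by this PR:
    /// an unset flag counts as enabled.
    pub fn enable_hnsw(&self) -> bool {
        match self {
            PayloadSchemaParams::Keyword { enable_hnsw }
            | PayloadSchemaParams::Integer { enable_hnsw } => enable_hnsw.unwrap_or(true),
        }
    }
}

/// Returns the names of indexed fields that should get additional HNSW links.
/// Hypothetical function name; fields with `enable_hnsw = false` are left out.
pub fn fields_for_additional_links(
    indexed: &BTreeMap<String, PayloadSchemaParams>,
) -> Vec<&str> {
    indexed
        .iter()
        .filter(|(_, schema)| schema.enable_hnsw())
        .map(|(name, _)| name.as_str())
        .collect()
}
```

A field that is filtered out here keeps its regular payload index and only misses the extra graph links.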

Tests

  • Updated all existing tests to include enable_hnsw: None for backward compatibility
  • Modified tests in:
    • payload_index_test.rs
    • segment_on_disk_snapshot.rs
    • mutable_text_index.rs
    • full_text_index/tests/mod.rs
    • tokenizers/mod.rs
    • test_payload_indexing.py

Motivation

When payload_m > 0 is configured in HNSW settings, Qdrant builds additional graph links for all indexed payload fields to improve filtered search performance.

However, when a collection contains both dense and sparse vectors, payload indexes intended only for sparse vector filtering still affect the dense vector's HNSW graph: Qdrant adds links to the dense vector's HNSW index that no query will ever use. This causes the following problems:

  1. Increased indexing time: Building HNSW links for every indexed field can significantly slow down indexing
  2. Higher memory usage: Additional links consume memory
  3. Not always beneficial: Some fields may not benefit from HNSW links depending on query patterns

This feature allows users to:

  • Selectively disable HNSW graph building for specific fields that don't need it
  • Reduce resource consumption in large-scale deployments
  • Fine-tune performance based on actual query patterns
  • Index fields for filtering without the overhead of HNSW link construction

Behavior

  • Default value: true (maintains full backward compatibility)
  • When enable_hnsw=false: The field will still be indexed normally for filtering, but won't have additional HNSW links built
  • Requirement: Only takes effect when the collection's HNSW config has payload_m > 0
  • Log output: When disabled, logs: enable_hnsw=false. Skip building additional index for field {field_name}
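
The rules above can be condensed into one decision function, sketched here. `LinkDecision` and `decide_links` are hypothetical names, and checking `payload_m` before the flag is an assumption about ordering; only the log text is taken from this description.

```rust
/// Outcome for a single indexed payload field (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum LinkDecision {
    /// Additional HNSW links are built for the field.
    Build,
    /// `payload_m` is 0, so no payload links are built for any field.
    PayloadLinksOff,
    /// The field opted out; carries the log line described above.
    Skip(String),
}

/// Combines the collection-level `payload_m` with the per-field flag.
pub fn decide_links(field_name: &str, payload_m: usize, enable_hnsw: Option<bool>) -> LinkDecision {
    if payload_m == 0 {
        // The flag has no effect unless payload_m > 0.
        return LinkDecision::PayloadLinksOff;
    }
    if enable_hnsw.unwrap_or(true) {
        // Default (None) behaves like Some(true).
        LinkDecision::Build
    } else {
        LinkDecision::Skip(format!(
            "enable_hnsw=false. Skip building additional index for field {field_name}"
        ))
    }
}
```

In every branch the field's ordinary payload index is unaffected; the function only decides whether extra graph links are built.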

Example Usage

REST API

PUT /collections/{collection_name}/index
{
  "field_name": "category",
  "field_schema": {
    "type": "keyword",
    "enable_hnsw": false
  }
}

gRPC

KeywordIndexParams {
  is_tenant: Some(false),
  on_disk: Some(false),
  enable_hnsw: Some(false), // Disable additional HNSW links for this field
}

Backward Compatibility

Fully backward compatible

  • Default value is true, preserving existing behavior
  • Existing collections and indices work without modification
  • Optional field in all APIs (gRPC, REST, OpenAPI)

EC2 Default User added 2 commits January 9, 2026 01:28
Add optional enable_hnsw parameter to all payload index types to control
whether additional HNSW graph links are built for each indexed field.

- Add enable_hnsw field to all 8 payload index param types
- Update gRPC proto definitions and conversions
- Update OpenAPI schema
- Modify HNSW graph builder to respect enable_hnsw flag
- Add enable_hnsw() helper methods to PayloadSchemaParams and PayloadFieldSchema
- Update all tests to include new field (default: None)

When enable_hnsw is true and payload_m > 0, additional HNSW links will
be built for the payload field. Default value is true for backward compatibility.
coderabbitai[bot]

This comment was marked as resolved.

generall commented Jan 9, 2026

Thanks for the PR! This feature was in our backlog with high priority, so we'll try to review and merge it ASAP.

generall requested review from generall and xzfc January 9, 2026 08:25
qdrant deleted a comment from coderabbitai bot Jan 9, 2026
timvisee left a comment

Thank you very much! This is exactly how I imagined it.

How about we rename this property to enrich_hnsw, or is enable_hnsw better? Wdyt?

generall commented Jan 9, 2026

I would keep enable_hnsw

generall merged commit a494452 into qdrant:dev Jan 9, 2026
15 checks passed
TY0909 deleted the add_enable_hnsw_for_payload_schema branch January 12, 2026 01:17
generall pushed a commit that referenced this pull request Feb 9, 2026
* feat: Add enable_hnsw option for payload field indexes

* Fix Some format problems

* fix: address comment problem

---------

Co-authored-by: EC2 Default User <ec2-user@ip-10-78-171-148.ec2.internal>
timvisee mentioned this pull request Feb 17, 2026
5 tasks