Skip to content

Refactor: merge _search_on_level into search_on_level#7403

Merged
xzfc merged 3 commits intodevfrom
no-_search_on_level
Oct 16, 2025
Merged

Refactor: merge _search_on_level into search_on_level#7403
xzfc merged 3 commits intodevfrom
no-_search_on_level

Conversation

@xzfc
Copy link
Member

@xzfc xzfc commented Oct 14, 2025

This PR merges _search_on_level into search_on_level and removes the dead code that used it.


The current dev has two related functions: GraphLayersBase::search_on_level and GraphLayersBase::_search_on_level.

The former is high-level, and the latter is a bit more flexible as it allows reusing the SearchContext and VisitedListHandle after calling it.

But the only place where these are reused is inside the following dead code:

for existing_link in existing_links.iter() {
if !visited_list.check(existing_link) && ready_list[existing_link as usize] {
search_context.process_candidate(ScoredPointOffset {
idx: existing_link,
score: points_scorer.score_point(existing_link),
});
}
}

This code is dead because existing_links is always empty:

  • link_new_point/link_new_point_on_level or add_new_point1 is called only once per point, so it shouldn't be filled by them before.
  • It can't be filled by the backlinks mechanism because this point can't appear in search until marked as ready.
  • Perhaps it was in use early on for the additional links feature, but as of now, the additional links are built in a separate scratch graph.

This is confirmed by codecov:
https://app.codecov.io/gh/qdrant/qdrant/blob/dev/lib%2Fsegment%2Fsrc%2Findex%2Fhnsw_index%2Fgraph_layers_builder.rs#L524

Footnotes

  1. add_new_point is from incremental HNSW

@xzfc xzfc added the chore Nice to have improvement label Oct 14, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 14, 2025

📝 Walkthrough

Walkthrough

Reworks HNSW per-level search to be self-contained: GraphLayersBase::_search_on_level → GraphLayersBase::search_on_level and GraphLayersWithVectors::_search_on_level_with_vectors → GraphLayersWithVectors::search_on_level_with_vectors. New methods accept a level_entry and ef, allocate their own VisitedListHandle and local SearchContext(s), run the search loop internally, and return a FixedLengthPriorityQueue with results. GraphLayersBuilder is updated to call the new APIs and to drive heuristic and non‑heuristic linking using the returned nearest queue (nearest.iter_unsorted(), nearest.max(), etc.). Method signatures for the two traits changed accordingly.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • HNSW healing #6543 — Directly touches GraphLayers::search_on_level and SearchContext usage; changes to FixedLengthPriorityQueue handling are strongly related.
  • Heuristic optimization #6501 — Updates GraphLayersBuilder per-level search and heuristic neighbor-selection flow to consume nearest queues; closely connected to these builder changes.
  • Refactor GraphLayersBuilder::link_new_point #6161 — Modifies linking/heuristic paths in GraphLayersBuilder; related to replacing SearchContext/VisitedList with FixedLengthPriorityQueue.

Suggested reviewers

  • timvisee
  • generall
  • IvanPleshkov

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly and accurately describes the main refactoring change by naming the two methods involved and indicating that one is being merged into the other, making it clear and specific without unnecessary detail.
Description Check ✅ Passed The description directly explains that the PR merges _search_on_level into search_on_level and removes dead code, provides context on why the code is dead, and even links to supporting evidence, making it clearly related to the changes in the pull request.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch no-_search_on_level

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c763cf and 29ea7dc.

📒 Files selected for processing (1)
  • lib/segment/src/index/hnsw_index/graph_layers.rs (4 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/segment/src/index/hnsw_index/graph_layers.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/segment/src/index/hnsw_index/graph_layers.rs
🧬 Code graph analysis (1)
lib/segment/src/index/hnsw_index/graph_layers.rs (3)
lib/common/common/src/fixed_length_priority_queue.rs (1)
  • new (34-42)
lib/segment/src/index/hnsw_index/search_context.rs (1)
  • new (16-21)
lib/segment/src/common/operation_error.rs (1)
  • check_process_stopped (248-253)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: lint
  • GitHub Check: test-consistency
  • GitHub Check: e2e-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: integration-tests
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
🔇 Additional comments (2)
lib/segment/src/index/hnsw_index/graph_layers.rs (2)

64-104: LGTM! Self-contained search implementation.

The refactoring successfully internalizes state management (VisitedListHandle, SearchContext) while preserving the original search logic. The method now:

  • Manages its own lifecycle for temporary data structures
  • Returns results directly via FixedLengthPriorityQueue<ScoredPointOffset>
  • Maintains proper cancellation handling and visited-node tracking

This improves maintainability by eliminating external state dependencies.


192-244: LGTM! Dual-phase search correctly implemented.

The refactoring properly implements the two-phase search strategy for vector compression:

  1. Links phase: Fast approximate search using compressed vectors (links_search_context)
  2. Base phase: Refinement scoring using original vectors (base_search_context)

Key aspects validated:

  • level_entry is scored in base space when processed from the candidate queue
  • Early termination (lines 214-220) correctly scores the final candidate before breaking
  • Both search contexts are self-contained and properly scoped
  • Cancellation handling is in place

The internalized state management eliminates external dependencies while preserving the quantized search semantics.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot]

This comment was marked as resolved.

{
let ready_list = self.ready_list.read();
for existing_link in existing_links.iter() {
if !visited_list.check(existing_link) && ready_list[existing_link as usize] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great that you proved that this code is unreachable. We also need to remove this logic from gpu implementation in the future

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xzfc xzfc merged commit e640a78 into dev Oct 16, 2025
15 checks passed
@xzfc xzfc deleted the no-_search_on_level branch October 16, 2025 09:30
@coderabbitai coderabbitai bot mentioned this pull request Oct 16, 2025
timvisee pushed a commit that referenced this pull request Nov 14, 2025
* Call search_on_level instead of _search_on_level

* Inline _search_on_level into search_on_level

* Inline _search_on_level_with_vectors into search_on_level_with_vectors
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

chore Nice to have improvement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants