Refactor: merge `_search_on_level` into `search_on_level` by xzfc · Pull Request #7403 · qdrant/qdrant

xzfc · 2025-10-14T15:10:41Z

This PR merges _search_on_level into search_on_level and removes the dead code that used it.

The current dev has two related functions: GraphLayersBase::search_on_level and GraphLayersBase::_search_on_level.

The former is high-level, and the latter is a bit more flexible as it allows reusing the SearchContext and VisitedListHandle after calling it.

But the only place where these are reused is inside the following dead code:

qdrant/lib/segment/src/index/hnsw_index/graph_layers_builder.rs

Lines 523 to 530 in 02de826

    
           for existing_link in existing_links.iter() { 
        
               if !visited_list.check(existing_link) && ready_list[existing_link as usize] { 
        
                   search_context.process_candidate(ScoredPointOffset { 
        
                       idx: existing_link, 
        
                       score: points_scorer.score_point(existing_link), 
        
                   }); 
        
               } 
        
           }

This code is dead because existing_links is always empty:

link_new_point/link_new_point_on_level or add_new_point¹ is called only once per point, so it shouldn't be filled by them before.
It can't be filled by the backlinks mechanism because this point can't appear in search until marked as ready.
Perhaps it was in use early on for the additional links feature, but as of now, the additional links are built in a separate scratch graph.

This is confirmed by codecov:
https://app.codecov.io/gh/qdrant/qdrant/blob/dev/lib%2Fsegment%2Fsrc%2Findex%2Fhnsw_index%2Fgraph_layers_builder.rs#L524

add_new_point is from incremental HNSW ↩

coderabbitai · 2025-10-14T15:19:34Z

📝 Walkthrough

Walkthrough

Reworks HNSW per-level search to be self-contained: GraphLayersBase::_search_on_level → GraphLayersBase::search_on_level and GraphLayersWithVectors::_search_on_level_with_vectors → GraphLayersWithVectors::search_on_level_with_vectors. New methods accept a level_entry and ef, allocate their own VisitedListHandle and local SearchContext(s), run the search loop internally, and return a FixedLengthPriorityQueue with results. GraphLayersBuilder is updated to call the new APIs and to drive heuristic and non‑heuristic linking using the returned nearest queue (nearest.iter_unsorted(), nearest.max(), etc.). Method signatures for the two traits changed accordingly.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

HNSW healing #6543 — Directly touches GraphLayers::search_on_level and SearchContext usage; changes to FixedLengthPriorityQueue handling are strongly related.
Heuristic optimization #6501 — Updates GraphLayersBuilder per-level search and heuristic neighbor-selection flow to consume nearest queues; closely connected to these builder changes.
Refactor GraphLayersBuilder::link_new_point #6161 — Modifies linking/heuristic paths in GraphLayersBuilder; related to replacing SearchContext/VisitedList with FixedLengthPriorityQueue.

Suggested reviewers

timvisee
generall
IvanPleshkov

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title succinctly and accurately describes the main refactoring change by naming the two methods involved and indicating that one is being merged into the other, making it clear and specific without unnecessary detail.
Description Check	✅ Passed	The description directly explains that the PR merges `_search_on_level` into `search_on_level` and removes dead code, provides context on why the code is dead, and even links to supporting evidence, making it clearly related to the changes in the pull request.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch no-_search_on_level

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c763cf and 29ea7dc.

📒 Files selected for processing (1)

lib/segment/src/index/hnsw_index/graph_layers.rs (4 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

lib/segment/src/index/hnsw_index/graph_layers.rs

**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

lib/segment/src/index/hnsw_index/graph_layers.rs

🧬 Code graph analysis (1)

lib/segment/src/index/hnsw_index/graph_layers.rs (3)

lib/common/common/src/fixed_length_priority_queue.rs (1)

new (34-42)

lib/segment/src/index/hnsw_index/search_context.rs (1)

new (16-21)

lib/segment/src/common/operation_error.rs (1)

check_process_stopped (248-253)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)

GitHub Check: lint
GitHub Check: test-consistency
GitHub Check: e2e-tests
GitHub Check: test-consensus-compose
GitHub Check: test-shard-snapshot-api-s3-minio
GitHub Check: integration-tests
GitHub Check: integration-tests-consensus
GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
GitHub Check: rust-tests (macos-latest)
GitHub Check: rust-tests (windows-latest)
GitHub Check: rust-tests (ubuntu-latest)

🔇 Additional comments (2)

lib/segment/src/index/hnsw_index/graph_layers.rs (2)

64-104: LGTM! Self-contained search implementation.

The refactoring successfully internalizes state management (VisitedListHandle, SearchContext) while preserving the original search logic. The method now:

Manages its own lifecycle for temporary data structures

Returns results directly via FixedLengthPriorityQueue<ScoredPointOffset>

Maintains proper cancellation handling and visited-node tracking

This improves maintainability by eliminating external state dependencies.

192-244: LGTM! Dual-phase search correctly implemented.

The refactoring properly implements the two-phase search strategy for vector compression:

Links phase: Fast approximate search using compressed vectors (links_search_context)

Base phase: Refinement scoring using original vectors (base_search_context)

Key aspects validated:

level_entry is scored in base space when processed from the candidate queue

Early termination (lines 214-220) correctly scores the final candidate before breaking

Both search contexts are self-contained and properly scoped

Cancellation handling is in place

The internalized state management eliminates external dependencies while preserving the quantized search semantics.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

IvanPleshkov · 2025-10-15T09:16:20Z

lib/segment/src/index/hnsw_index/graph_layers_builder.rs

-            {
-                let ready_list = self.ready_list.read();
-                for existing_link in existing_links.iter() {
-                    if !visited_list.check(existing_link) && ready_list[existing_link as usize] {


Great that you proved that this code is unreachable. We also need to remove this logic from gpu implementation in the future

I've tried to find where this logic is implemented for GPU, but can't find it. I don't see it there:

https://github.com/qdrant/qdrant/blob/969cc82018dd722262b93d003175b3d69961a135/lib/segment/src/index/hnsw_index/gpu/shaders/run_insert_vector.comp

https://github.com/qdrant/qdrant/blob/969cc82018dd722262b93d003175b3d69961a135/lib/segment/src/index/hnsw_index/gpu/shaders/search_context.comp

(not that I want to remove it right now, just curious)

* Call search_on_level instead of _search_on_level * Inline _search_on_level into search_on_level * Inline _search_on_level_with_vectors into search_on_level_with_vectors

xzfc added 2 commits October 14, 2025 14:37

Call search_on_level instead of _search_on_level

51faea5

Inline _search_on_level into search_on_level

6c763cf

xzfc requested review from IvanPleshkov, generall and timvisee October 14, 2025 15:10

xzfc added the chore Nice to have improvement label Oct 14, 2025

This comment was marked as resolved.

Sign in to view

Inline _search_on_level_with_vectors into search_on_level_with_vectors

29ea7dc

IvanPleshkov approved these changes Oct 15, 2025

View reviewed changes

timvisee approved these changes Oct 15, 2025

View reviewed changes

xzfc merged commit e640a78 into dev Oct 16, 2025
15 checks passed

xzfc deleted the no-_search_on_level branch October 16, 2025 09:30

coderabbitai bot mentioned this pull request Oct 16, 2025

Add ACORN-1 search #7414

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor: merge `_search_on_level` into `search_on_level`#7403

Refactor: merge `_search_on_level` into `search_on_level`#7403
xzfc merged 3 commits intodevfrom
no-_search_on_level

xzfc commented Oct 14, 2025

Uh oh!

coderabbitai bot commented Oct 14, 2025 •

edited

Loading

Walkthrough

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

This comment was marked as resolved.

Uh oh!

IvanPleshkov Oct 15, 2025

Uh oh!

xzfc Oct 16, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	for existing_link in existing_links.iter() {
	if !visited_list.check(existing_link) && ready_list[existing_link as usize] {
	search_context.process_candidate(ScoredPointOffset {
	idx: existing_link,
	score: points_scorer.score_point(existing_link),
	});
	}
	}

Conversation

xzfc commented Oct 14, 2025

Footnotes

Uh oh!

coderabbitai bot commented Oct 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated code review effort

Possibly related PRs

Suggested reviewers

Pre-merge checks and finishing touches

Uh oh!

This comment was marked as resolved.

Uh oh!

IvanPleshkov Oct 15, 2025

Choose a reason for hiding this comment

Uh oh!

xzfc Oct 16, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

coderabbitai bot commented Oct 14, 2025 •

edited

Loading