Ensure vector queries handle advanceShallow correctly by benwtrent · Pull Request #14858 · apache/lucene

benwtrent · 2025-06-27T19:07:43Z

jpountz

does getMaxScore need fixing as well? I see that it adds context.docBase too.

jpountz · 2025-06-27T20:07:07Z

lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java

            public int advanceShallow(int docid) {
+              if (docid == NO_MORE_DOCS) {
+                return NO_MORE_DOCS;
+              }


Should it instead be something like below?

    if (docid >= context.reader().maxDoc()) {
      // out of range
      return NO_MORE_DOCS;
    }

Otherwise we may be computing blocks based on hits that belong to other segments?

Ah yeah! Let me get that
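To make the exchange above concrete, here is a minimal, self-contained sketch of why a range check matters. It is not the code from AbstractKnnVectorQuery.java: the class, the method names, and the layout (one sorted array of global doc IDs, with `lower`/`upper` marking one segment's slice) are assumptions for illustration. It shows one plausible failure mode: adding a positive `docBase` to `NO_MORE_DOCS` overflows `int`, so the binary search lands on a valid hit again.

```java
import java.util.Arrays;

// Hypothetical sketch, not the actual Lucene implementation.
final class ShallowAdvanceSketch {
  // Same value as Lucene's DocIdSetIterator.NO_MORE_DOCS.
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  // Unguarded lookup: target + docBase can overflow when target is NO_MORE_DOCS.
  static int unguarded(int[] globalDocs, int lower, int upper, int docBase, int target) {
    int idx = Arrays.binarySearch(globalDocs, lower, upper, target + docBase);
    if (idx < 0) {
      idx = -1 - idx; // convert "not found" into the insertion point
    }
    // Return the segment-local ID of the next hit, or NO_MORE_DOCS past the slice.
    return idx >= upper ? NO_MORE_DOCS : globalDocs[idx] - docBase;
  }

  // Guarded lookup: any target outside the segment is treated as exhausted.
  static int guarded(
      int[] globalDocs, int lower, int upper, int docBase, int maxDoc, int target) {
    if (target >= maxDoc) {
      return NO_MORE_DOCS; // out of range for this segment
    }
    return unguarded(globalDocs, lower, upper, docBase, target);
  }

  public static void main(String[] args) {
    // Two assumed segments: hits {3, 7} in segment 0, {105, 120} in segment 1 (docBase 100).
    int[] hits = {3, 7, 105, 120};
    System.out.println(unguarded(hits, 2, 4, 100, NO_MORE_DOCS)); // prints 5
    System.out.println(guarded(hits, 2, 4, 100, 50, NO_MORE_DOCS)); // prints 2147483647
  }
}
```

The `target >= maxDoc` check covers the `NO_MORE_DOCS` case as well as any other out-of-range target, which is why it is broader than an equality test.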

vigyasharma · 2025-06-28T07:16:42Z

> does getMaxScore need fixing as well? I see that it adds context.docBase too.

I suppose this gets handled by the idx < upper check on L427, which would ensure we stop at segment boundary?
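As a rough illustration of the bound being discussed, the sketch below scans one segment's slice of a global hit array and stops at `upper`, so scores from the next segment are never considered. The class and method names and the array layout are hypothetical, and the widening to `long` is an assumption added here to keep `upTo + docBase` from overflowing; it is not taken from the Lucene source.

```java
// Hypothetical sketch, not the actual Lucene implementation.
final class MaxScoreSketch {
  // Same value as Lucene's DocIdSetIterator.NO_MORE_DOCS.
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  // Max score among this segment's hits whose segment-local ID is <= upTo.
  static float maxScoreUpTo(
      int[] globalDocs, float[] scores, int lower, int upper, int docBase, int upTo) {
    // Widen before adding so that NO_MORE_DOCS + docBase cannot wrap around.
    long globalUpTo = (long) upTo + docBase;
    float max = 0f;
    // idx < upper keeps the scan inside this segment's slice of the arrays.
    for (int idx = lower; idx < upper && globalDocs[idx] <= globalUpTo; idx++) {
      max = Math.max(max, scores[idx]);
    }
    return max;
  }

  public static void main(String[] args) {
    int[] hits = {3, 7, 105, 120};
    float[] scores = {0.2f, 0.4f, 0.5f, 0.9f};
    // Segment 0 owns indices [0, 2); the 0.9 hit in segment 1 is never visited.
    System.out.println(maxScoreUpTo(hits, scores, 0, 2, 0, NO_MORE_DOCS)); // prints 0.4
    // Segment 1 owns indices [2, 4) with docBase 100; local doc 10 maps to global 110.
    System.out.println(maxScoreUpTo(hits, scores, 2, 4, 100, 10)); // prints 0.5
  }
}
```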

vigyasharma · 2025-06-28T07:18:17Z

Sneaky bug! Thanks for fixing @benwtrent

benwtrent · 2025-06-28T12:24:10Z

@jpountz what do you think if we just backport: #12146 ?

That handles the issue and greatly simplifies things. I am actually not sure why it wasn't backported to begin with.

jpountz · 2025-06-28T20:24:04Z

This looks like a safe change to backport to me.

benwtrent · 2025-06-30T12:35:51Z

@jpountz @vigyasharma I changed this PR to just be a backport of #12146

It fixes both the getMaxScore and advanceShallow bugs while greatly simplifying the code.

Basically, advanceShallow for knn queries can flip back from hitting NO_MORE_DOCS to a valid doc ID again. This can cause higher-level search queries to flip back and forth, breaking their assumptions and locking up a CPU core indefinitely (until server reboot). Applies the fix provided here, but requires gathering score docs again: apache/lucene#14858 closes: #130239
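The invariant described here (once exhausted, stay exhausted) can be expressed as a small check. The helper below is a hypothetical sketch, not part of Lucene or of this PR: it feeds increasing targets to any `int -> int` function standing in for advanceShallow and reports the first target at which a valid doc ID reappears after `NO_MORE_DOCS`. The `flipping` and `stable` functions are made-up stand-ins.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical sketch, not part of Lucene.
final class ExhaustionInvariantSketch {
  // Same value as Lucene's DocIdSetIterator.NO_MORE_DOCS.
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  // Returns the first target that yields a valid doc after exhaustion, or -1 if none does.
  static int firstFlipBack(IntUnaryOperator advanceShallow, int[] increasingTargets) {
    boolean exhausted = false;
    for (int target : increasingTargets) {
      int result = advanceShallow.applyAsInt(target);
      if (exhausted && result != NO_MORE_DOCS) {
        return target; // invariant violated: a valid doc came back after NO_MORE_DOCS
      }
      if (result == NO_MORE_DOCS) {
        exhausted = true;
      }
    }
    return -1;
  }

  // Made-up stand-in that wrongly returns doc 5 when asked about NO_MORE_DOCS.
  static int flipping(int target) {
    if (target == NO_MORE_DOCS) {
      return 5;
    }
    return target > 20 ? NO_MORE_DOCS : 20;
  }

  // Made-up stand-in that stays exhausted once past its last hit.
  static int stable(int target) {
    return target > 20 ? NO_MORE_DOCS : 20;
  }

  public static void main(String[] args) {
    int[] targets = {0, 30, NO_MORE_DOCS};
    System.out.println(firstFlipBack(ExhaustionInvariantSketch::flipping, targets)); // prints 2147483647
    System.out.println(firstFlipBack(ExhaustionInvariantSketch::stable, targets)); // prints -1
  }
}
```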

Ensure vector queries handle advanceShallow correctly

eb01b55

github-project-automation bot added this to OpenSearch Lucene & Core Performance Tracking Jun 27, 2025

github-project-automation bot moved this to Open in OpenSearch Lucene & Core Performance Tracking Jun 27, 2025

adding changes

a46c84c

benwtrent mentioned this pull request Jun 27, 2025

Patch for Lucene bug 14857 elastic/elasticsearch#130254

Merged

jpountz reviewed Jun 27, 2025

Adjusting to just be a backport of apache#12146

edc4271

jpountz approved these changes Jun 30, 2025

benwtrent merged commit 16b9a87 into apache:branch_9_12 Jun 30, 2025
2 checks passed

github-project-automation bot moved this from Open to Merged in OpenSearch Lucene & Core Performance Tracking Jun 30, 2025

benwtrent deleted the bugfix/14857 branch June 30, 2025 13:05

benwtrent merged 3 commits into apache:branch_9_12 from benwtrent:bugfix/14857