Allow a slight difference in rescored docs by thecoop · Pull Request #139931 · elastic/elasticsearch

thecoop · 2025-12-23T09:53:30Z

With #139769 and recent changes to the new native scorers, the rescored values may differ slightly from the original scores. Update the tests to allow this.

Fixes #139912
Fixes #139869
Fixes #139859
Fixes #140014
Fixes #140078
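
For illustration, a minimal sketch of what comparing scores with a tolerance rather than exactly looks like in plain Java. The class name `ScoreToleranceSketch` and the helper `scoresMatch` are hypothetical and not taken from this PR; the tolerance value mirrors the `DELTA` constant quoted later in the review thread.

```java
public final class ScoreToleranceSketch {
    // Illustrative tolerance, same magnitude as the DELTA constant discussed in the review.
    public static final float DELTA = 1e-6f;

    // Hypothetical helper: true when the two scores differ by at most delta.
    public static boolean scoresMatch(float expected, float actual, float delta) {
        return Math.abs(expected - actual) <= delta;
    }

    public static void main(String[] args) {
        float original = 0.75f;
        // The next representable float above 0.75f, standing in for a rescored value.
        float rescored = original + Math.ulp(original);

        System.out.println(original == rescored);                   // false: exact comparison fails
        System.out.println(scoresMatch(original, rescored, DELTA)); // true: within tolerance
    }
}
```

In a real test the same idea is usually expressed through the assertion library's own tolerance matcher rather than a hand-written helper.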

elasticsearchmachine · 2025-12-23T09:53:54Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

benwtrent · 2025-12-23T12:55:06Z

My concern is that the delta might be because either we didn't rescore at all (and thus the score is the same as the quantized score), or we are actually returning the wrong scores for docs (e.g. the score for doc 1 returned for doc 2).

We should assert on doc order (if we don't already) and that the scores are different than the quantized scores.

thecoop · 2026-01-02T11:08:42Z

We already assert on doc order. I've added a check that rescoring changes the scores compared to the original query.
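
As a rough sketch of the two checks discussed above (doc order matches the expected results, and rescoring actually changes scores relative to the quantized query), assuming a hypothetical `Hit` class as a stand-in for a scored document. None of these names come from the PR, and the actual test may structure the checks differently.

```java
import java.util.HashMap;
import java.util.Map;

public final class RescoreCheckSketch {
    /** Hypothetical stand-in for a scored hit: a doc id plus its score. */
    public static final class Hit {
        final int doc;
        final float score;

        public Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /** True when both lists hold the same doc ids in the same order. */
    public static boolean sameDocOrder(Hit[] expected, Hit[] actual) {
        if (expected.length != actual.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (expected[i].doc != actual[i].doc) {
                return false;
            }
        }
        return true;
    }

    /** True when at least one doc present in both lists has a different score. */
    public static boolean anyScoreChanged(Hit[] quantized, Hit[] rescored) {
        Map<Integer, Float> before = new HashMap<>();
        for (Hit h : quantized) {
            before.put(h.doc, h.score);
        }
        for (Hit h : rescored) {
            Float old = before.get(h.doc);
            if (old != null && Float.compare(old, h.score) != 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up scores, purely for illustration.
        Hit[] quantized = { new Hit(2, 0.90f), new Hit(1, 0.88f) };
        Hit[] exact = { new Hit(1, 0.91f), new Hit(2, 0.89f) };
        Hit[] rescored = { new Hit(1, 0.91f), new Hit(2, 0.89f) };

        System.out.println(sameDocOrder(exact, rescored));        // order matches the exact results
        System.out.println(anyScoreChanged(quantized, rescored)); // rescoring changed the scores
    }
}
```

The second check guards against the failure mode where rescoring silently does nothing and the quantized scores pass straight through.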

thecoop · 2026-01-02T14:29:56Z

server/src/test/java/org/elasticsearch/search/vectors/RescoreKnnVectorQueryTests.java

-                    assertThat(rescoredDocs.scoreDocs.length, equalTo(k));
+                    assertThat(rescoredDocs.scoreDocs, arrayWithSize(k));
+
+                    if (innerQuery instanceof KnnFloatVectorQuery) {


The boolean query doesn't cause any differences, as it's not actually doing a kNN search over quantized data.

benwtrent · 2026-01-05T15:09:53Z

server/src/test/java/org/elasticsearch/search/vectors/RescoreKnnVectorQueryTests.java

+    /*
+     * Original KNN scoring and rescoring can use slightly different calculation methods,
+     * so there may be a very slight difference in the scores after rescoring.
+     */
+    private static final float DELTA = 1e-6f;


Is this difference due to script execution paths vs utilizing native/bulk scoring?

If this is due to script vs. bulk rescoring, could we adjust the calculation to use a flat index and a regular knn query instead of the script? I would assume the flat index would execute through the bulk off-heap scoring stuff rather than the script.

I am worried about these slight differences...

#139769 has got similar changes to this, so I'm pretty sure these changes are due to that PR

thecoop · 2026-01-06T14:37:21Z

Looking closer, this might actually be a result of #139769, which uses Panama in test classes and could cause some FP differences.
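
One plausible way such FP differences arise, sketched under the assumption that a vectorized path accumulates partial sums in separate lanes while a scalar path adds left to right. Float addition is not associative, so the two orders can round differently. The code below is plain Java that only mimics lane-wise accumulation; it is not Panama Vector API code, and the class and method names are hypothetical.

```java
import java.util.Random;

public final class SummationOrderSketch {
    /** Plain left-to-right dot product with a single accumulator. */
    public static float scalarDot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Dot product with four independent accumulators, mimicking SIMD lanes. */
    public static float lanedDot(float[] a, float[] b) {
        float s0 = 0f, s1 = 0f, s2 = 0f, s3 = 0f;
        int i = 0;
        for (; i + 3 < a.length; i += 4) {
            s0 += a[i] * b[i];
            s1 += a[i + 1] * b[i + 1];
            s2 += a[i + 2] * b[i + 2];
            s3 += a[i + 3] * b[i + 3];
        }
        // Reduce the lanes, then handle any leftover tail elements.
        float sum = (s0 + s1) + (s2 + s3);
        for (; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        float[] a = new float[128];
        float[] b = new float[128];
        for (int i = 0; i < a.length; i++) {
            a[i] = random.nextFloat() * 2 - 1;
            b[i] = random.nextFloat() * 2 - 1;
        }
        float scalar = scalarDot(a, b);
        float laned = lanedDot(a, b);
        // The two results are mathematically equal but may differ in the last few bits.
        System.out.println("scalar=" + scalar + " laned=" + laned + " diff=" + Math.abs(scalar - laned));
    }
}
```

Whether the difference in this PR comes from accumulation order specifically is not established in the thread; the sketch only shows why two correct implementations need not agree bit for bit.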


thecoop added the >test and :Search Relevance/Vectors labels Dec 23, 2025

elasticsearchmachine added the v9.4.0 and Team:Search Relevance labels Dec 23, 2025

thecoop added 2 commits January 2, 2026 10:12

Allow a slight difference in rescored docs

689a2e9

Add check that rescoring has happened

4e06af1

thecoop force-pushed the rescoring-delta-fix branch from bbaa6e3 to 4e06af1 January 2, 2026 11:08

thecoop commented Jan 2, 2026


benwtrent reviewed Jan 5, 2026


benwtrent approved these changes Jan 5, 2026


Merge branch 'main' into rescoring-delta-fix

3154413

thecoop requested a review from ChrisHegarty January 6, 2026 14:39

Some tidying

1f83934

thecoop merged commit 49f163e into elastic:main Jan 7, 2026
36 checks passed

thecoop deleted the rescoring-delta-fix branch January 7, 2026 09:10

ywangd pushed a commit to ywangd/elasticsearch that referenced this pull request Jan 7, 2026

Allow a slight difference in rescored docs (elastic#139931)

0c5c6c2

Follow on from elastic#139769 to update some more tests for FP differences

sidosera pushed a commit to sidosera/elasticsearch that referenced this pull request Jan 7, 2026

Allow a slight difference in rescored docs (elastic#139931)

0ede722

Follow on from elastic#139769 to update some more tests for FP differences

thecoop merged 4 commits into elastic:main from thecoop:rescoring-delta-fix
