Explain Function Score Query by john-wagster · Pull Request #111807 · elastic/elasticsearch

john-wagster · 2024-08-12T17:02:35Z

Addressing an NPE found when building a custom plugin leveraging a script.
#109177

This occurs when calling execute on a ScoreScript like this:

public double execute(ExplanationHolder explanation) {
    explanation.set("An example optional custom description to explain details for this script's execution; we'll provide a default one if you leave this out.");
    ...
}
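To make the failure mode concrete, here is a minimal, self-contained sketch of why the call above could fail. It assumes, as the diff later in this thread shows, that the scoring path called execute with a null holder. `NpeRepro` and its nested `Holder` are hypothetical stand-ins for illustration, not the real Elasticsearch classes.

```java
// Illustrative sketch only: Holder is a hypothetical stand-in for
// ScoreScript.ExplanationHolder, not the real Elasticsearch class.
final class NpeRepro {
    static final class Holder {
        private String description;

        void set(String description) {
            this.description = description;
        }

        String get() {
            return description;
        }
    }

    // Mirrors the shape of the plugin's execute method: the holder is used unconditionally.
    static double execute(Holder explanation) {
        explanation.set("custom description");
        return 3.0;
    }

    // Returns true if calling execute with a null holder throws a NullPointerException.
    static boolean throwsNpeOnNullHolder() {
        try {
            execute(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE on null holder: " + throwsNpeOnNullHolder());
    }
}
```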

Example Returned Hit with the Explanation Customized

{
  "_shard" : "[test][0]",
  "_node" : "6aOP4_Q5SOGIyr4oIxj0TQ",
  "_index" : "test",
  "_id" : "2",
  "_score" : 0.5685853,
  "_source" : {
    "important_field" : "foo foo foo"
  },
  "_explanation" : {
    "value" : 0.5685853,
    "description" : "function score, product of:",
    "details" : [
      ...
      {
        "value" : 3.0,
        "description" : "min of:",
        "details" : [
          {
            "value" : 3.0,
            "description" : "An example optional custom description to explain details for this script's execution; we'll provide a default one if you leave this out.",
            "details" : [
              {
                "value" : 0.18952844,
                "description" : "_score: ",
                "details" : [
                  {
                    "value" : 0.18952844,
                    "description" : "weight(important_field:foo in 1) [PerFieldSimilarity], result of:",
                    "details" : [
                      {
                        "value" : 0.18952844,
                        "description" : "score(freq=3.0), computed as boost * idf * tf from:",
                        "details" : [
                          {
                            "value" : 2.2,
                            "description" : "boost",
                            "details" : [ ]
                          },
                          {
                            "value" : 0.13353139,
                            "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                            "details" : [
                              {
                                "value" : 3,
                                "description" : "n, number of documents containing term",
                                "details" : [ ]
                              },
                              {
                                "value" : 3,
                                "description" : "N, total number of documents with field",
                                "details" : [ ]
                              }
                            ]
                          },
                          {
                            "value" : 0.6451613,
                            "description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                            "details" : [
                              {
                                "value" : 3.0,
                                "description" : "freq, occurrences of term within document",
                                "details" : [ ]
                              },
                              {
                                "value" : 1.2,
                                "description" : "k1, term saturation parameter",
                                "details" : [ ]
                              },
                              {
                                "value" : 0.75,
                                "description" : "b, length normalization parameter",
                                "details" : [ ]
                              },
                              {
                                "value" : 3.0,
                                "description" : "dl, length of field",
                                "details" : [ ]
                              },
                              {
                                "value" : 2.0,
                                "description" : "avgdl, average length of field",
                                "details" : [ ]
                              }
                            ]
                          }
                        ]
                      }
                    ]
                  }
                ]
              }
            ]
          },
          {
            "value" : 3.4028235E38,
            "description" : "maxBoost",
            "details" : [ ]
          }
        ]
      }
    ]
  }
}

What the Hit Would Normally Look Like Without the Custom Explanation

{
  "_shard" : "[test][0]",
  "_node" : "Fg3ne_PXSi2ljhqljA6QOQ",
  "_index" : "test",
  "_id" : "2",
  "_score" : 0.5685853,
  "_source" : {
    "important_field" : "foo foo foo"
  },
  "_explanation" : {
    "value" : 0.5685853,
    "description" : "function score, product of:",
    "details" : [
      ...
      {
        "value" : 3.0,
        "description" : "min of:",
        "details" : [
          {
            "value" : 3.0,
            "description" : "script score function, computed with script:\"Script{type=inline, lang='expert_scripts', idOrCode='pure_df', options={}, params={field=important_field, term=foo}}\"",
            "details" : [
              {
                "value" : 0.18952844,
                "description" : "_score: ",
                "details" : [
                  {
                    "value" : 0.18952844,
                    "description" : "weight(important_field:foo in 1) [PerFieldSimilarity], result of:",
                    "details" : [
                      {
                        "value" : 0.18952844,
                        "description" : "score(freq=3.0), computed as boost * idf * tf from:",
                        "details" : [
                          {
                            "value" : 2.2,
                            "description" : "boost",
                            "details" : [ ]
                          },
                          {
                            "value" : 0.13353139,
                            "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                            "details" : [
                              {
                                "value" : 3,
                                "description" : "n, number of documents containing term",
                                "details" : [ ]
                              },
                              {
                                "value" : 3,
                                "description" : "N, total number of documents with field",
                                "details" : [ ]
                              }
                            ]
                          },
                          {
                            "value" : 0.6451613,
                            "description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                            "details" : [
                              {
                                "value" : 3.0,
                                "description" : "freq, occurrences of term within document",
                                "details" : [ ]
                              },
                              {
                                "value" : 1.2,
                                "description" : "k1, term saturation parameter",
                                "details" : [ ]
                              },
                              {
                                "value" : 0.75,
                                "description" : "b, length normalization parameter",
                                "details" : [ ]
                              },
                              {
                                "value" : 3.0,
                                "description" : "dl, length of field",
                                "details" : [ ]
                              },
                              {
                                "value" : 2.0,
                                "description" : "avgdl, average length of field",
                                "details" : [ ]
                              }
                            ]
                          }
                        ]
                      }
                    ]
                  }
                ]
              }
            ]
          },
          {
            "value" : 3.4028235E38,
            "description" : "maxBoost",
            "details" : [ ]
          }
        ]
      }
    ]
  }
}

elasticsearchmachine · 2024-08-12T17:03:01Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

elasticsearchmachine · 2024-08-12T17:05:13Z

Hi @john-wagster, I've created a changelog YAML for you.

jdconrad · 2024-08-12T17:13:32Z

server/src/main/java/org/elasticsearch/common/lucene/search/function/ScriptScoreFunction.java

        leafScript._setIndexName(indexName);
        leafScript._setShard(shardId);
        return new LeafScoreFunction() {
+            ScoreScript.ExplanationHolder explanation = new ScoreScript.ExplanationHolder();


We probably only want to create a new ExplanationHolder if explain is true for this query.

makes sense to me; I'll poke around to see if I can figure out how to get that param down to that level

So one problem with this is that if the plugin attempts to set an explanation even if explain isn't set to true then we will still throw an NPE. So it may be problematic to conditionally create an ExplanationHolder. I could create a dummy one at a higher level and pass that in, which would just be a no-op. Thoughts on this are welcome.

Pushed up a change for this; I would love some feedback on whether what I've done, both passing down that explain param and providing a dummy placeholder, is reasonable.

I believe with the refactoring I've just done this is no longer a concern; let me know if you disagree for any reason @jdconrad

You're correct that it requires a null check in the script. I should have added a bit more context. In our docs, our example has a null check on the explanation holder for this reason. It's not ideal to have the user do this, but it seemed like the better trade off at the time rather than creating possibly millions of additional objects for a script that won't use them.
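As a rough illustration of the null check described above, the following sketch guards every use of the holder so that no description is built when explain is off. `NullCheckedScript`, `Holder`, `buildDescription`, and the `descriptionsBuilt` counter are all hypothetical names introduced for this example; the real plugin code may look different.

```java
// Illustrative sketch only: Holder is a hypothetical stand-in for
// ScoreScript.ExplanationHolder, and the counter exists purely for demonstration.
final class NullCheckedScript {
    static final class Holder {
        private String description;

        void set(String description) {
            this.description = description;
        }

        String get() {
            return description;
        }
    }

    // Counts how often the (potentially expensive) description is built.
    static int descriptionsBuilt = 0;

    static String buildDescription(double score) {
        descriptionsBuilt++;
        return "custom explanation for score " + score;
    }

    // The holder is null when the request is not being explained, so guard every use of it.
    static double execute(Holder explanation) {
        double score = 3.0;
        if (explanation != null) {
            explanation.set(buildDescription(score));
        }
        return score;
    }

    public static void main(String[] args) {
        System.out.println(execute(null));
        Holder holder = new Holder();
        System.out.println(execute(holder) + " / " + holder.get());
    }
}
```

With this shape, the scoring path pays nothing for the explanation, which is the trade-off described in the comment above.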

Yes, that's my miss. The docs mentioned this as did Ben in the issue raised and I missed that on my initial pass at this.

john-wagster · 2024-08-13T00:38:45Z

server/src/main/java/org/elasticsearch/search/DefaultSearchContext.java

            );
            queryBoost = request.indexBoost();
            this.lowLevelCancellation = lowLevelCancellation;
+            SearchSourceBuilder builder = request.source();


Not sure whether it is reasonable to expect that source will be available when we check for the explain flag.

john-wagster · 2024-08-13T00:39:14Z

server/src/main/java/org/elasticsearch/search/DefaultSearchContext.java

            this.lowLevelCancellation = lowLevelCancellation;
+            SearchSourceBuilder builder = request.source();
+            if (builder != null) {
+                Boolean requestExplain = request.source().explain();


This seemed like the most logical way to get explain from the original query, but it would be good to validate this.

benwtrent

I think this is overcomplicated. It can be simplified with minimal changes.

benwtrent · 2024-08-13T11:48:58Z

...-expert-scoring/src/main/java/org/elasticsearch/example/expertscript/ExpertScriptPlugin.java

                        public double execute(
                            ExplanationHolder explanation
                        ) {
+                            explanation.set("An example optional custom description to explain details for this script's execution; we'll provide a default one if you leave this out.");


These should check for null. The holder will be null when explain: false.

benwtrent · 2024-08-13T11:50:11Z

server/src/main/java/org/elasticsearch/common/lucene/search/function/ScriptScoreFunction.java


 public class ScriptScoreFunction extends ScoreFunction {

+    private static final ScoreScript.ExplanationHolder DUMMY_EXPLAIN_HOLDER = new ScoreScript.ExplanationHolder();


The underlying scripts should check for null. Passing this in implies that explain is occurring. Building the plugin script's explanation might take compute power, so we need to be able to signal that explain actually isn't occurring.

So, we shouldn't have this dummy holder.

benwtrent · 2024-08-13T11:51:37Z

server/src/main/java/org/elasticsearch/common/lucene/search/function/ScriptScoreFunction.java

                scorer.docid = docId;
                scorer.score = subQueryScore;
-                double result = leafScript.execute(null);
+                double result;


I don't know why we are updating the score here at all. Doesn't this work with explainScore? And if explainScore is being called, we know we are "explaining" and thus should create a ScoreScript.ExplanationHolder?

That makes a lot of sense; I appreciate all your comments / direction. I'll come back with changes around all of your comments.

I believe all of your concerns have been addressed in the latest push; let me know if you have additional concerns at this point @benwtrent

benwtrent · 2024-08-13T11:55:23Z

server/src/main/java/org/elasticsearch/common/lucene/search/function/ScriptScoreFunction.java

@@ -85,9 +110,13 @@ public Explanation explainScore(int docId, Explanation subQueryScore) throws IOE
                } else {
                    double score = score(docId, subQueryScore.getValue().floatValue());


Since we know explainScore means we are doing an explain call, we should handle the logic for creating a ScoreScript.ExplanationHolder here.

We should add a new method, public double score(int docId, float subQueryScore, ExplanationHolder holder) throws IOException, that both score and explainScore use, where explainScore passes in a non-null holder and score passes in null. This way we only create this object if we are explaining.
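The suggestion above could be sketched roughly as follows. This is a simplified, self-contained illustration rather than the actual ScriptScoreFunction code: `SharedScoreSketch`, `Holder`, `Script`, and the `holdersCreated` counter are hypothetical, and explainScore returns a plain String here instead of a Lucene Explanation.

```java
// Illustrative sketch only: all types here are hypothetical stand-ins,
// not the real Elasticsearch or Lucene classes.
final class SharedScoreSketch {
    /** Stand-in for ScoreScript.ExplanationHolder. */
    static final class Holder {
        private String description;

        void set(String description) {
            this.description = description;
        }

        String get() {
            return description;
        }
    }

    /** Stand-in for the compiled script; the holder may be null. */
    interface Script {
        double execute(float subQueryScore, Holder explanation);
    }

    private final Script script;
    int holdersCreated = 0; // instrumentation for this sketch only

    SharedScoreSketch(Script script) {
        this.script = script;
    }

    // Hot scoring path: never allocates a holder.
    double score(int docId, float subQueryScore) {
        return score(docId, subQueryScore, null);
    }

    // Shared implementation used by both the scoring and the explain path.
    private double score(int docId, float subQueryScore, Holder holder) {
        return script.execute(subQueryScore, holder);
    }

    // Explain path: the only place a holder is created.
    String explainScore(int docId, float subQueryScore) {
        Holder holder = new Holder();
        holdersCreated++;
        double result = score(docId, subQueryScore, holder);
        String description = holder.get() != null
            ? holder.get()
            : "script score function (default description)";
        return description + " = " + result;
    }

    public static void main(String[] args) {
        SharedScoreSketch fn = new SharedScoreSketch((sub, h) -> {
            if (h != null) {
                h.set("custom description");
            }
            return sub * 2;
        });
        System.out.println(fn.score(0, 1.5f));
        System.out.println(fn.explainScore(0, 1.5f));
    }
}
```

Because the holder is only allocated inside explainScore, a null holder doubles as the signal to the script that no explanation is being requested.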

benwtrent · 2024-08-13T11:55:49Z

server/src/main/java/org/elasticsearch/index/query/SearchExecutionContext.java

+    public void explain(boolean explain) {
+        this.explain = explain;


We don't need all this. This logic is already handled; we just need to take advantage of it. Let's remove all these context updates.

benwtrent · 2024-08-13T11:56:03Z

...er/src/main/java/org/elasticsearch/index/query/functionscore/ScriptScoreFunctionBuilder.java

            ScoreScript.Factory factory = context.compile(script, ScoreScript.CONTEXT);
            SearchLookup lookup = context.lookup();
            ScoreScript.LeafFactory searchScript = factory.newFactory(script.getParams(), lookup);
-            return new ScriptScoreFunction(script, searchScript, lookup, context.index().getName(), context.getShardId());


again, not necessary.

benwtrent · 2024-08-13T11:56:35Z

server/src/main/java/org/elasticsearch/search/DefaultSearchContext.java

            );
            queryBoost = request.indexBoost();
            this.lowLevelCancellation = lowLevelCancellation;
+            SearchSourceBuilder builder = request.source();


not necessary. explainScore is called already, which tells us to explain things. Thus we should use that as a signal to do so.

benwtrent · 2024-08-13T11:56:42Z

server/src/main/java/org/elasticsearch/search/SearchService.java

            QueryBuilder rewrittenForInnerHits = Rewriteable.rewrite(query, innerHitsRewriteContext, true);
            InnerHitContextBuilder.extractInnerHits(rewrittenForInnerHits, innerHitBuilders);
            searchExecutionContext.setAliasFilter(context.request().getAliasFilter().getQueryBuilder());
+            searchExecutionContext.explain(context.explain());


unnecessary.

benwtrent · 2024-08-13T19:19:07Z

server/src/main/java/org/elasticsearch/common/lucene/search/function/ScriptScoreFunction.java

                if (leafScript instanceof ExplainableScoreScript) {
-                    leafScript.setDocument(docId);
-                    scorer.docid = docId;
-                    scorer.score = subQueryScore.getValue().floatValue();
                    exp = ((ExplainableScoreScript) leafScript).explain(subQueryScore);


I am not sure why this was removed, but it should be added back.

Ah yes, you just beat me to it; that's why the build was failing. Misunderstanding on my part. It's added back now.

benwtrent · 2024-08-13T19:27:36Z

...ert-scoring/src/yamlRestTest/resources/rest-api-spec/test/script_expert_scoring/20_score.yml

        rest_total_hits_as_int: true
        index: test
        body:
+          explain: true


Sorry for not mentioning this before, but we should have two tests here. One with explain and the other without.

The one with explain and your new assertion will likely need to skip old cluster versions (cluster_features: gte_v8.16.0, or something to that effect).

Then, when this is backported to 8.15.1, that skip logic will need to be updated to account for 8.15.1.

Then once the backport is merged, main will need to be updated so that it tests against 8.15.1 as well.

Makes sense; I'll update the test and track this as it's backported. Thanks!

allowing for a custom explanation to be passed through as part of supporting building a plugin with a custom script score; previously threw an npe

john-wagster · 2024-08-13T23:02:06Z

PR to backport to 8.15.1: #111864

* Explain Function Score Query (#111807): allowing for a custom explanation to be passed through as part of supporting building a plugin with a custom script score; previously threw an npe
* updated test for 8.15.1

john-wagster · 2024-08-15T13:58:34Z

ported the test update for 8.15.1 forward into main: #111929

allowing for a custom explanation to be passed through as part of supporting building a plugin with a custom script score; previously threw an npe

john-wagster added 2 commits August 12, 2024 08:20

allowing for a custom explanation to be passed through as part of sup…

d950453

…porting building a plugin with a custom script score; previously threw an npe

Merge branch 'main' into explain-function-score-query

9614013

john-wagster added v8.15.1 :Search Relevance/Search Catch all for Search Relevance labels Aug 12, 2024

elasticsearchmachine added v8.16.0 Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch labels Aug 12, 2024

john-wagster added the >bug label Aug 12, 2024

Update docs/changelog/111807.yaml

d9805c8

jdconrad reviewed Aug 12, 2024

benwtrent reviewed Aug 12, 2024

john-wagster added 5 commits August 12, 2024 15:50

only create explain holder if necessary and fixed how we return back …

ca2dc1f

…the explanation so we maintain the subquery

spelling

4382cac

source is not always available

4bd8934

explanationHolder is not always available

b63d64b

Merge branch 'main' into explain-function-score-query

29effaa

john-wagster commented Aug 13, 2024

benwtrent reviewed Aug 13, 2024

john-wagster added 4 commits August 13, 2024 08:50

Merge branch 'main' into explain-function-score-query

bb5be44

refactored so an explain flag is unnecessary

15fa612

refactored so an explain flag is unnecessary

caaea63

making this private because it's not clear we need to extend LeafScor…

e53cb4a

…eFunction to support it right now

benwtrent reviewed Aug 13, 2024

accidentally remove necessary bits

bcefc4c

benwtrent reviewed Aug 13, 2024

updated the tests to deal with backwards compatibility

6e6cee6

benwtrent reviewed Aug 13, 2024

Update plugins/examples/script-expert-scoring/src/yamlRestTest/resour…

9eae302

…ces/rest-api-spec/test/script_expert_scoring/20_score.yml Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>

benwtrent approved these changes Aug 13, 2024

john-wagster merged commit 935c0e4 into elastic:main Aug 13, 2024

john-wagster mentioned this pull request Aug 15, 2024

Updated Function Score Query Test with Explain Fixes for 8.15.1 #111929

Merged

