Explain Function Score Query#111807
Conversation
…porting building a plugin with a custom script score; previously threw an npe
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
|
Hi @john-wagster, I've created a changelog YAML for you. |
| leafScript._setIndexName(indexName); | ||
| leafScript._setShard(shardId); | ||
| return new LeafScoreFunction() { | ||
| ScoreScript.ExplanationHolder explanation = new ScoreScript.ExplanationHolder(); |
There was a problem hiding this comment.
We probably only want to create a new ExplanationHolder if explain is true for this query.
There was a problem hiding this comment.
makes sense to me; I'll poke around to see if I can figure out how to get that param down to that level
There was a problem hiding this comment.
So one problem with this is that if the plugin attempts to set an explanation even if explain isn't set to true then we will still throw an NPE. So it may be problematic to conditionally create an ExplanationHolder. I could create a dummy one at a higher level and pass that in, which would just be a no-op. Thoughts on this are welcome.
There was a problem hiding this comment.
pushed up a change for this; would love some feedback on whether what I've done to both pass down that explain param and provide a dummy placeholder are reasonable.
There was a problem hiding this comment.
I believe with the refactoring I've just done this is no longer a concern; let me know if you disagree for any reason @jdconrad
There was a problem hiding this comment.
You're correct that it requires a null check in the script. I should have added a bit more context. In our docs, our example has a null check on the explanation holder for this reason. It's not ideal to have the user do this, but it seemed like the better trade off at the time rather than creating possibly millions of additional objects for a script that won't use them.
There was a problem hiding this comment.
Yes, that's my miss. The docs mentioned this as did Ben in the issue raised and I missed that on my initial pass at this.
server/src/main/java/org/elasticsearch/common/lucene/search/function/ScriptScoreFunction.java
Show resolved
Hide resolved
| ); | ||
| queryBoost = request.indexBoost(); | ||
| this.lowLevelCancellation = lowLevelCancellation; | ||
| SearchSourceBuilder builder = request.source(); |
There was a problem hiding this comment.
Not sure if this is a reasonable expectation that source will be available when we expect to check for the explain flag.
| this.lowLevelCancellation = lowLevelCancellation; | ||
| SearchSourceBuilder builder = request.source(); | ||
| if (builder != null) { | ||
| Boolean requestExplain = request.source().explain(); |
There was a problem hiding this comment.
This seemed like the most logic way to get explain from the original query but would be good to validate this.
benwtrent
left a comment
There was a problem hiding this comment.
I think this is overcomplicated. It can be simplified with minimal changes.
| public double execute( | ||
| ExplanationHolder explanation | ||
| ) { | ||
| explanation.set("An example optional custom description to explain details for this script's execution; we'll provide a default one if you leave this out."); |
There was a problem hiding this comment.
these should check for null. These will be null when explain: false
|
|
||
| public class ScriptScoreFunction extends ScoreFunction { | ||
|
|
||
| private static final ScoreScript.ExplanationHolder DUMMY_EXPLAIN_HOLDER = new ScoreScript.ExplanationHolder(); |
There was a problem hiding this comment.
underlying folks should check for null. Passing this in implies that explain is occurring. It might be the script's plugin explanation takes compute power, we need to be able to signal that explain actually isn't occurring.
There was a problem hiding this comment.
So, we shouldn't have this dummy holder.
| scorer.docid = docId; | ||
| scorer.score = subQueryScore; | ||
| double result = leafScript.execute(null); | ||
| double result; |
There was a problem hiding this comment.
I don't know why we are updating the score here at all. Doesn't this work with explainScore ? And if explainScore is being called, we know we are "explaining" and thus should create a ScoreScript.ExplanationHolder ?
There was a problem hiding this comment.
That makes a lot of sense; I appreciate all your comments / direction. I'll come back with changes around all of your comments.
There was a problem hiding this comment.
I believe all of your concerns have been addressed in the lastest push; let me know if you have additional concerns at this point @benwtrent
| @@ -85,9 +110,13 @@ public Explanation explainScore(int docId, Explanation subQueryScore) throws IOE | |||
| } else { | |||
| double score = score(docId, subQueryScore.getValue().floatValue()); | |||
There was a problem hiding this comment.
What we should do here, is since we know explainScore means we are doing an explain call, we should handle the logic for creating a ScoreScript.ExplanationHolder here.
We should add a new public double score(int docId, float subQueryScore, ExplanationHolder holder) throws IOException { that both score and explainScore use where explainScore passes in a non-null holder, and scorer passes in null. This way we only create this object if we are explaining.
| public void explain(boolean explain) { | ||
| this.explain = explain; |
There was a problem hiding this comment.
we don't need all this. All this logic is already handled, we just need to take advantage of it. Let's remove all these context updates.
| ScoreScript.Factory factory = context.compile(script, ScoreScript.CONTEXT); | ||
| SearchLookup lookup = context.lookup(); | ||
| ScoreScript.LeafFactory searchScript = factory.newFactory(script.getParams(), lookup); | ||
| return new ScriptScoreFunction(script, searchScript, lookup, context.index().getName(), context.getShardId()); |
| ); | ||
| queryBoost = request.indexBoost(); | ||
| this.lowLevelCancellation = lowLevelCancellation; | ||
| SearchSourceBuilder builder = request.source(); |
There was a problem hiding this comment.
not necessary. explainScore is called already, which tells us to explain things. Thus we should use that as a signal to do so.
| QueryBuilder rewrittenForInnerHits = Rewriteable.rewrite(query, innerHitsRewriteContext, true); | ||
| InnerHitContextBuilder.extractInnerHits(rewrittenForInnerHits, innerHitBuilders); | ||
| searchExecutionContext.setAliasFilter(context.request().getAliasFilter().getQueryBuilder()); | ||
| searchExecutionContext.explain(context.explain()); |
| if (leafScript instanceof ExplainableScoreScript) { | ||
| leafScript.setDocument(docId); | ||
| scorer.docid = docId; | ||
| scorer.score = subQueryScore.getValue().floatValue(); | ||
| exp = ((ExplainableScoreScript) leafScript).explain(subQueryScore); |
There was a problem hiding this comment.
I am not sure why this was removed, but it should be added back.
There was a problem hiding this comment.
Ah yes, you just beat me to it; that's why the build was failing. Misunderstanding on my part. It's added back now.
| rest_total_hits_as_int: true | ||
| index: test | ||
| body: | ||
| explain: true |
There was a problem hiding this comment.
Sorry for not mentioning this before, but we should have two tests here. One with explain and the other without.
The one with explain & your new assertion will likely need to skip old cluster versions (cluster_features: gte_v8.16.0` (or something to that effect).).
Then when backport to 8.15.1, that skip logic will need to be updated to account for 8.15.1.
Then once the backport is merged, main will need to be updated so that it tests against 8.15.1 as well.
There was a problem hiding this comment.
makes sense; i'll update the test and track this as it's backported. Thanks!
...ert-scoring/src/yamlRestTest/resources/rest-api-spec/test/script_expert_scoring/20_score.yml
Outdated
Show resolved
Hide resolved
…ces/rest-api-spec/test/script_expert_scoring/20_score.yml Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
allowing for a custom explanation to be passed through as part of supporting building a plugin with a custom script score; previously threw an npe
|
PR to backport to 8.15.1: #111864 |
* Explain Function Score Query (#111807) allowing for a custom explanation to be passed through as part of supporting building a plugin with a custom script score; previously threw an npe * updated test for 8.15.1
|
ported the test update for 8.15.1 forward into main: #111929 |
allowing for a custom explanation to be passed through as part of supporting building a plugin with a custom script score; previously threw an npe
allowing for a custom explanation to be passed through as part of supporting building a plugin with a custom script score; previously threw an npe
Addressing an NPE found when building a custom plugin leveraging a script.
#109177
This occurs when calling execute on a ScoreScript like this:
Example Returned Hit with the Explanation Customized
{ "_shard" : "[test][0]", "_node" : "6aOP4_Q5SOGIyr4oIxj0TQ", "_index" : "test", "_id" : "2", "_score" : 0.5685853, "_source" : { "important_field" : "foo foo foo" }, "_explanation" : { "value" : 0.5685853, "description" : "function score, product of:", "details" : [ ... { "value" : 3.0, "description" : "min of:", "details" : [ { "value" : 3.0, "description" : "An example optional custom description to explain details for this script's execution; we'll provide a default one if you leave this out.", "details" : [ { "value" : 0.18952844, "description" : "_score: ", "details" : [ { "value" : 0.18952844, "description" : "weight(important_field:foo in 1) [PerFieldSimilarity], result of:", "details" : [ { "value" : 0.18952844, "description" : "score(freq=3.0), computed as boost * idf * tf from:", "details" : [ { "value" : 2.2, "description" : "boost", "details" : [ ] }, { "value" : 0.13353139, "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:", "details" : [ { "value" : 3, "description" : "n, number of documents containing term", "details" : [ ] }, { "value" : 3, "description" : "N, total number of documents with field", "details" : [ ] } ] }, { "value" : 0.6451613, "description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:", "details" : [ { "value" : 3.0, "description" : "freq, occurrences of term within document", "details" : [ ] }, { "value" : 1.2, "description" : "k1, term saturation parameter", "details" : [ ] }, { "value" : 0.75, "description" : "b, length normalization parameter", "details" : [ ] }, { "value" : 3.0, "description" : "dl, length of field", "details" : [ ] }, { "value" : 2.0, "description" : "avgdl, average length of field", "details" : [ ] } ] } ] } ] } ] } ] }, { "value" : 3.4028235E38, "description" : "maxBoost", "details" : [ ] } ] } ] } }What the Hit Would Normally Look Like Without the Custom Explanation
{ "_shard" : "[test][0]", "_node" : "Fg3ne_PXSi2ljhqljA6QOQ", "_index" : "test", "_id" : "2", "_score" : 0.5685853, "_source" : { "important_field" : "foo foo foo" }, "_explanation" : { "value" : 0.5685853, "description" : "function score, product of:", "details" : [ ... { "value" : 3.0, "description" : "min of:", "details" : [ { "value" : 3.0, "description" : "script score function, computed with script:\"Script{type=inline, lang='expert_scripts', idOrCode='pure_df', options={}, params={field=important_field, term=foo}}\"", "details" : [ { "value" : 0.18952844, "description" : "_score: ", "details" : [ { "value" : 0.18952844, "description" : "weight(important_field:foo in 1) [PerFieldSimilarity], result of:", "details" : [ { "value" : 0.18952844, "description" : "score(freq=3.0), computed as boost * idf * tf from:", "details" : [ { "value" : 2.2, "description" : "boost", "details" : [ ] }, { "value" : 0.13353139, "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:", "details" : [ { "value" : 3, "description" : "n, number of documents containing term", "details" : [ ] }, { "value" : 3, "description" : "N, total number of documents with field", "details" : [ ] } ] }, { "value" : 0.6451613, "description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:", "details" : [ { "value" : 3.0, "description" : "freq, occurrences of term within document", "details" : [ ] }, { "value" : 1.2, "description" : "k1, term saturation parameter", "details" : [ ] }, { "value" : 0.75, "description" : "b, length normalization parameter", "details" : [ ] }, { "value" : 3.0, "description" : "dl, length of field", "details" : [ ] }, { "value" : 2.0, "description" : "avgdl, average length of field", "details" : [ ] } ] } ] } ] } ] } ] }, { "value" : 3.4028235E38, "description" : "maxBoost", "details" : [ ] } ] } ] } }