Add fielddata accessors (.value/.values/.distance()/etc) by rmuir · Pull Request #18169 · elastic/elasticsearch

rmuir · 2016-05-05T21:15:20Z

Currently fields in painless are recommended to be accessed as input.doc['field'].0.

Instead we should work closer to the current syntax input.doc['field'].value. To do this, we have to add ScriptDocValues to the whitelist so we can see its getValue() method.

This has other benefits, it means we expose more of the scripting api (see https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-groovy.html#_doc_value_properties_and_methods)

So input.doc['field_name'].lat and doc['field_name'].arcDistance(34, 56) work too and so on, and geo fields become usable.

I TODO'd whitelisting any joda-time stuff to be a separate change.

Field loads for this common .value case with some temporary cheating. This is a 467% performance improvement for scripts accessing document fields from painless, so I think we should do it for now. We can do fancy stuff later and remove this, but it means at least common use cases are much faster.

I cutover the integration tests and docs to use this syntax.

jdconrad · 2016-05-05T21:25:26Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/Definition.java

+        copyStruct("Strings", "List<String>", "Collection<String>", "Object");
+        copyStruct("Longs", "List<Object>", "Collection<Object>", "Object");
+        copyStruct("Doubles", "List<Object>", "Collection<Object>", "Object");
+        copyStruct("GeoPoints", "List<Object>", "Collection<Object>", "Object");


I think for Longs, Doubles, and GeoPoints it's safer to use List (List < def > ) so as an example Longs would be

copyStruct("Longs", "List", "Collection", "Object");

The reason being is that if anyone ever casts to one of these then we probably want to be able to handle the Object coming out of it without requiring a second cast.

((Longs)longField).get(0) is now a def type instead of an Object type. Really rare that this would ever happen, but I think for consistency this could be important.

jdconrad · 2016-05-05T21:26:14Z

Looks great! Just one comment.

rjernst · 2016-05-05T21:35:35Z

LGMT, awesome speedup!

rmuir · 2016-05-05T22:04:42Z

@jdconrad i cut these over to <Def>

jdconrad · 2016-05-05T22:26:26Z

LGTM. Thanks!

rmuir · 2016-05-05T22:30:18Z

thanks for reviewing. there is a lot to followup with for this one, I will take care of it.

clintongormley · 2016-05-06T08:48:45Z

@rmuir should i move these properties/methods (https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-groovy.html#_doc_value_properties_and_methods) back to the general scripting section instead of leaving them in groovy? or perhaps add them to the painless docs?

rmuir · 2016-05-06T10:46:30Z

@clintongormley I don't know what to do here!

These properties/methods, lets call it "the scripting api", is really exposed to all script engines via Java. So today groovy, javascript, python, and painless support it.

Expressions basically exposes what looks like a subset of this API to users, but its a lie: it in fact bypasses the whole thing entirely, and the expressions API is inconsistent in random annoying ways: e.g. doc['field'].count() vs doc['field'].size(). So not really a proper subset :(

On one hand, I think painless needs to support this API, to get people off of groovy. On the other hand, I'm not sure if we want to declare it front-and-center "the scripting api", because its not very good. I don't mean this in an offensive way, the API is both slow and bad, its just a fact. Expressions is 2x faster than anything else because it bypasses it completely.

One reason it is cumbersome and slow is that it exposes a document's fields as always being multivalued lists, but its not clear to me user's even care about multi-valuedness whatsoever: (4ddf916)

There are also tons and tons of geo functions, but many of these look very obscure. I think its ok to continue supporting them, to not break anyones scripts, but do we really have to document all the obscure stuff?

I think it makes things overwhelming. An easy win might be to try to drop some of these obscure methods from the documentation. I think all the multi-valued methods (e.g. .values, .lats, .lons) should probably be dropped from the documentation too. It would be better to instead have a footnote at the bottom that says "each field is really a list so to get access to the other values you can use .get(1)/.size()/etc"

jdconrad · 2016-05-06T16:34:44Z

I really like the proposal to drop the more obscure stuff other than as a footnote. On the differences in API between Expressions and others -- maybe we can deprecate the ones that don't match and add ones that do match?

s1monw · 2016-05-09T08:03:36Z

On one hand, I think painless needs to support this API, to get people off of groovy. On the other hand, I'm not sure if we want to declare it front-and-center "the scripting api", because its not very good. I don't mean this in an offensive way, the API is both slow and bad, its just a fact. Expressions is 2x faster than anything else because it bypasses it completely.

I do understand that we need to provide an API which is similar in it's functionality but putting ourself into the position of building painless on top of an already known problem sounds weird to me. I think we can build a new API that allows for good performance based on a subset of the current API and deprecate the current one. All other scripting languages can potentially have access to both such that folks can move gradually but we can build a new and fast API? Does this make sense and / or is this feasible at all?

rmuir added 2 commits May 5, 2016 16:47

Painless: add fielddata accessors (.value/.values/.distance()/etc)

f8e38c0

add fast-path for docvalues field loads

9edb155

rmuir added >enhancement :Plugin Lang Painless v5.0.0-alpha3 labels May 5, 2016

jdconrad reviewed May 5, 2016
View reviewed changes

cutover to <Def> instead of <Object> here

ec9f24b

rmuir closed this in e3ce6c9 May 5, 2016

jdconrad mentioned this pull request May 6, 2016

Ongoing Painless Improvements #17992

Closed

18 tasks

rmuir mentioned this pull request May 8, 2016

Switch painless dynamic calls to invokedynamic, remove perf hack/cheat #18201

Closed

clintongormley changed the title ~~Painless: add fielddata accessors (.value/.values/.distance()/etc)~~ Add fielddata accessors (.value/.values/.distance()/etc) May 17, 2016

clintongormley added :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache and removed :Plugin Lang Painless labels Feb 14, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add fielddata accessors (.value/.values/.distance()/etc)#18169

Add fielddata accessors (.value/.values/.distance()/etc)#18169
rmuir wants to merge 3 commits intoelastic:masterfrom
rmuir:expand_painless

rmuir commented May 5, 2016

Uh oh!

jdconrad May 5, 2016 •

edited

Loading

Uh oh!

jdconrad commented May 5, 2016

Uh oh!

rjernst commented May 5, 2016

Uh oh!

rmuir commented May 5, 2016 •

edited

Loading

Uh oh!

jdconrad commented May 5, 2016

Uh oh!

rmuir commented May 5, 2016

Uh oh!

clintongormley commented May 6, 2016

Uh oh!

rmuir commented May 6, 2016

Uh oh!

jdconrad commented May 6, 2016 •

edited

Loading

Uh oh!

s1monw commented May 9, 2016 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

rmuir commented May 5, 2016

Uh oh!

jdconrad May 5, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jdconrad commented May 5, 2016

Uh oh!

rjernst commented May 5, 2016

Uh oh!

rmuir commented May 5, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jdconrad commented May 5, 2016

Uh oh!

rmuir commented May 5, 2016

Uh oh!

clintongormley commented May 6, 2016

Uh oh!

rmuir commented May 6, 2016

Uh oh!

jdconrad commented May 6, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

s1monw commented May 9, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

jdconrad May 5, 2016 •

edited

Loading

rmuir commented May 5, 2016 •

edited

Loading

jdconrad commented May 6, 2016 •

edited

Loading

s1monw commented May 9, 2016 •

edited

Loading