Script: Write Field API with basic path resolution#89738
Script: Write Field API with basic path resolution#89738stu-elastic merged 13 commits intoelastic:mainfrom
Conversation
|
Pinging @elastic/es-core-infra (Team:Core/Infra) |
|
Hi @stu-elastic, I've created a changelog YAML for you. |
| } | ||
|
|
||
| public void testIsExists() { | ||
| Map<String, Object> a = new HashMap<>(); |
There was a problem hiding this comment.
Can we please add some exists() tests for the null and missing container value?
There was a problem hiding this comment.
I thought this was going to be changed to exists and we weren't going to worry about painless shortcuts.
There was a problem hiding this comment.
Ahh, I missed the test, will rename.
| List<Object> values; | ||
| if (v == null) { | ||
| values = new ArrayList<>(4); | ||
| } else if (v instanceof List<?> list) { |
There was a problem hiding this comment.
Can someone supply a unmodifiable list? Should we copy the original list for unexpected side-effects?
There was a problem hiding this comment.
Yeah, someone could pass an unmodifiable list, this came up during my testing, in fact.
Should we copy the original list for unexpected side-effects?
My thinking is that would violate expectations, if we attempt to append to an UnmodifiableList then we should propagate the UnsupportedOperationException.
There was a problem hiding this comment.
It might be worth specializing here with a good error message as these things can be hard to track down in scripting.
There was a problem hiding this comment.
As indicated by @grcevski's comments below, there are a number of places we'd have to specialize to handle UnmodifiableList (or UnmodifiableRandomAccessList) and, to be consistent, we may want to also do the same for unmodifiable Maps.
That's a lot of code to provide a marginally better error message. Do you think it's worth it?
There was a problem hiding this comment.
Probably not worth it especially because there's no reasonable way to do this in a performant way.
| } | ||
|
|
||
| if (value instanceof List<?> list) { | ||
| // Assume modifiable list |
There was a problem hiding this comment.
Should this be an assert here, and perhaps related to my earlier comment, should we consider always making sure we have mutable list.
There was a problem hiding this comment.
An assert is going to be turned off in production. The primary reason for using an UnmodifiableList is to error on modifications.
There was a problem hiding this comment.
Is it worth having a better error to let the user know what happened here as well?
| } | ||
|
|
||
| if (value instanceof List<?> list) { | ||
| list.removeIf(filter); |
There was a problem hiding this comment.
assert for mutable list here too.
There was a problem hiding this comment.
My thinking is we should let it fail in the standard way, what do you think?
| } | ||
|
|
||
| if (value instanceof List<?> list) { | ||
| if (index < list.size()) { |
There was a problem hiding this comment.
As above, an actual assert would be turned off and any attempt to modify behavior based on an UnmodifiableList works around the scripts intent. Thoughts?
jdconrad
left a comment
There was a problem hiding this comment.
This looks good to me as I already posted my major concern in the WIP PR about possible issues with out-of-sync write fields due to lack of a consistent backing structure. On the topic of an immutable list -- would it ever make sense for a user to set the value to one that already exists? An advanced user may work around this, so I guess the options are:
- assert as suggested by @grcevski
- better error message for the user
- do nothing and let users encounter it as the built in error
- copy values if we detect an immutable list though this probably fragile
Since scripting users do not have access to
The thinking is there's no easy way to detect unmodifiable lists since they are implemented in multiple classes and not part of the java stdlib API contract. That would mean sprinkling If users are tripped up by the errors, that is something we can add later. |
If they don't have List.of then I think it's OK to assume the list will be mutable. 3. looks like a reasonable choice. |
Adds `WriteField` path manipulation APIs: * `WriteField` `move(String)`: Moves this path to the destination path. Throws an error if destination path exists. * `WriteField` `overwrite(String)`: Same as `move(String)` but overwrites the destination path if it exists; * `void` `remove()`: Remove the leaf value from this path. See also: #89738 Refs: #79155
Adds
WriteFieldto the ingest context viafield(<path>).WriteFieldimplements APIs:StringgetName(): The pathbooleanexists(): Does the path exist in the document?WriteFieldset(def): Set the value at the path, creates nested path elements if necessaryWriteFieldappend(def): Appends value to the path, creates nested path elements if necessary, the value at path is always a List after this call?booleanisEmpty(): Does the path contain a value?intsize(): How many elements does the path contain?Iteratoriterator(): Iterate over all elements at the path.defget(def): Get the value at the path if it exists, otherwise return the given defaultdefget(int, def): Get the value at the path and index if it exists, otherwise return the given defaultbooleanhasValue(Predicate): Is there a value at the path that passes the filter?WriteFieldtransform(Function): Change all values at the pathWriteFielddeduplicate(): Remove duplicates from the pathWriteFieldremoveValuesIf(Predicate): Remove a values from the path that pass the filterWriteFieldremoveValue(int): Remove the index from the path, if it existsSome APIs remain unimplemented:
voidmove(String)voidoverwrite(String)voidremove()This PR does not handle equivalent paths, which are paths that differ in the source but flatten to the same field in the indexed document.
Path resolution is basic, each path element is assumed to be a key in the current container, starting with the root. If there is not an entry in the map at a given level, the algorithm checks to see if the remaining path exists as a flat key. The nested then flat nature algorithm handles the common case of some or none nesting followed by flat keys.
Refs: #79155