ESQL: Implement LOOKUP, an "inline" enrich by nik9000 · Pull Request #107987 · elastic/elasticsearch

nik9000 · 2024-04-28T11:53:13Z

This adds support for LOOKUP, a command that implements a sort of
inline ENRICH, using data that is passed in the request:

$ curl -uelastic:password -HContent-Type:application/json -XPOST \
    'localhost:9200/_query?error_trace&pretty&format=txt' \
-d'{
    "query": "ROW a=1::LONG | LOOKUP t ON a",
    "tables": {
        "t": {
            "a:long":     [    1,     4,     2],
            "v1:integer": [   10,    11,    12],
            "v2:keyword": ["cat", "dog", "wow"]
        }
    },
    "version": "2024.04.01"
}'
      v1       |      v2       |       a       
---------------+---------------+---------------
10             |cat            |1

This required these PRs:

* #107624
* #107634
* #107701
* #107762

Closes #107306
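
As a rough illustration of what LOOKUP does with the table passed in the request, here is a minimal sketch in plain Java. It indexes the key column `a` of table `t` by value and, for an input row, returns the values stored at the matching position in the other columns. The names `LookupSketch`, `buildIndex` and `lookup` are hypothetical, and the handling of duplicate keys and of rows without a match is an assumption of the sketch. This is not the PR's operator code; it only mirrors the example request above.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of an "inline enrich"; not the real ESQL operators.
final class LookupSketch {
    /** Map each value of the match column to its row position in the table. */
    static Map<Long, Integer> buildIndex(List<Long> keyColumn) {
        Map<Long, Integer> index = new HashMap<>();
        for (int row = 0; row < keyColumn.size(); row++) {
            // Assumption of this sketch: on duplicate keys, keep the first row.
            index.putIfAbsent(keyColumn.get(row), row);
        }
        return index;
    }

    /** Return the value columns for the row matching key. */
    static Map<String, Object> lookup(Map<Long, Integer> index, Map<String, List<?>> valueColumns, long key) {
        Integer row = index.get(key);
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<?>> column : valueColumns.entrySet()) {
            // Assumption of this sketch: a key without a match yields nulls.
            result.put(column.getKey(), row == null ? null : column.getValue().get(row));
        }
        return result;
    }

    public static void main(String[] args) {
        // Same data as table "t" in the request above.
        Map<Long, Integer> index = buildIndex(List.of(1L, 4L, 2L));
        Map<String, List<?>> values = new LinkedHashMap<>();
        values.put("v1", List.of(10, 11, 12));
        values.put("v2", List.of("cat", "dog", "wow"));
        System.out.println(lookup(index, values, 1L)); // {v1=10, v2=cat}
    }
}
```

Building the index once and probing it per input row is why the cost of hashing matters for a command like this.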

elasticsearchmachine · 2024-04-28T11:53:36Z

Hi @nik9000, I've created a changelog YAML for you.

nik9000 · 2024-04-28T11:54:26Z

I've opened this as draft because I've not added any real tests. And because stuff like:

curl -uelastic:password -HContent-Type:application/json -XPOST 'localhost:9200/_query?error_trace&pretty' -d'{
    "query": "ROW a=1::LONG | LOOKUP t1 ON a | LOOKUP t2 ON v1",
    "tables": {
        "t1": {
            "a:long":     [1, 4, 2],
            "v1:long":    [5, 8, 9]
        },
        "t2": {
            "v1:long":    [    5,     8,     9],
            "v2:keyword": ["cat", "dog", "wow"]
        }
    },
    "version": "2024.04.01"
}'

fails with array index out of bounds exceptions that don't make sense to me yet.

elasticsearchmachine · 2024-04-28T12:31:09Z

Hi @nik9000, I've updated the changelog YAML for you.

nik9000 · 2024-04-28T19:13:43Z

So I ran a little test:

curl -uelastic:password -XDELETE localhost:9200/test
for a in {0..99}; do
   rm -f /tmp/evil
   for b in {0..999}; do
      echo '{"index": {}}' >> /tmp/evil
      echo '{"a": 1}' >> /tmp/evil
      echo '{"index": {}}' >> /tmp/evil
      echo '{"a": 4}' >> /tmp/evil
      echo '{"index": {}}' >> /tmp/evil
      echo '{"a": 2}' >> /tmp/evil
   done
   echo >> /tmp/evil
   echo -n "$a:  "
   curl -s -HContent-Type:application/json -uelastic:password -XPOST localhost:9200/test/_bulk?pretty --data-binary @/tmp/evil | grep \"errors\"
done

curl -HContent-Type:application/json -uelastic:password -XPOST localhost:9200/test/_forcemerge?max_num_segments=1
curl -HContent-Type:application/json -uelastic:password -XPOST localhost:9200/test/_refresh

Then I ran these two a bunch:

curl -uelastic:password -HContent-Type:application/json -XPOST \
    'localhost:9200/_query?error_trace&pretty' \
-d'{
    "query": "FROM test | EVAL v1=a+1 | STATS COUNT(v1)",
    "profile": true,
    "version": "2024.04.01"
}'
curl -uelastic:password -HContent-Type:application/json -XPOST \
    'localhost:9200/_query?error_trace&pretty' \
-d'{
    "query": "FROM test | LOOKUP t ON a | STATS COUNT(v1)",
    "profile": true,
    "tables": {
        "t": {
            "a:long":     [    1,     4,     2],
            "v1:integer": [   10,    11,    12]
        }
    },
    "version": "2024.04.01"
}'

That's 300,000 documents - not many, but that keeps the test fast on my laptop. From the profile, EVAL v1=a + 1 took 1ms, or about 3 nanoseconds per value. The actual addition is likely less than 0.3ns per operation, but I guess the block accounting slows us down. LOOKUP t ON a took 10ms, or about 33ns per value. That's not scientific, but it's believable. About 28ns of those 33ns is hashing, which isn't a huge surprise: we know we're using our slowest hash implementation. We have faster hash implementations. We just have to flip a few things around to plug them in properly.
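
The per-value figures above follow from dividing the profiled time by the document count. A small sketch of that arithmetic, where `PerValueCost` and `nanosPerValue` are hypothetical names and the 1ms and 10ms totals are the ones quoted from the profile:

```java
// Back-of-the-envelope check of the per-value costs quoted above.
final class PerValueCost {
    /** Convert a total time in milliseconds into nanoseconds per value. */
    static double nanosPerValue(double totalMillis, long values) {
        return totalMillis * 1_000_000.0 / values;
    }

    public static void main(String[] args) {
        // 100 bulk requests x 1000 loop iterations x 3 documents per iteration.
        long docs = 100L * 1000L * 3L;
        System.out.println("docs = " + docs);
        System.out.println("EVAL ns/value   = " + nanosPerValue(1, docs));
        System.out.println("LOOKUP ns/value = " + nanosPerValue(10, docs));
    }
}
```

This gives roughly 3.3ns per value for EVAL and 33.3ns for LOOKUP, matching the rounded numbers in the comment.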

nik9000 · 2024-04-28T19:38:36Z

My last push seems to have fixed double-layered lookup:

$ curl -uelastic:password -HContent-Type:application/json -XPOST 'localhost:9200/_query?error_trace&pretty' -d'{
    "query": "ROW a=1::LONG | LOOKUP t1 ON a | LOOKUP t2 ON v1",
    "tables": {
        "t1": {
            "a:long":     [1, 4, 2],
            "v1:long":    [5, 8, 9]
        },
        "t2": {
            "v1:long":    [    5,     8,     9],
            "v2:keyword": ["cat", "dog", "wow"]
        }
    },
    "version": "2024.04.01"
}'
{
  "columns" : [
    {
      "name" : "a",
      "type" : "long"
    },
    {
      "name" : "v1",
      "type" : "long"
    },
    {
      "name" : "v2",
      "type" : "keyword"
    }
  ],
  "values" : [
    [
      1,
      5,
      "cat"
    ]
  ]
}

nik9000 · 2024-04-29T15:28:05Z

This needs a bunch more tests and docs.

This is a feature that'll likely stay experimental for a few versions, regardless of how difficult it is for us to settle the external-facing stuff.

costin · 2024-06-05T00:13:50Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/lookup.csv-spec

+
+intIntByKeywordKeyword
+required_capability: lookup
+ROW aa="foo", ab="zoo"


Please add a test where aa and ab (as lookup keys) have the same value - `row aa = "foo", ab="foo" | lookup big on aa, ab`.
Followed by variations:

* lookup on field with empty value
* lookup on function over field:
  * right now it can throw a validation error
  * moving forward we should be able to support it (by extracting it into a synthetic eval)

  can be a follow-up.

"lookup on function over field:" - you mean like LOOKUP foo ON CONCAT(a, b)?

costin

LGTM - thanks Nik for making this happen and incorporating the various feedback received.

costin · 2024-06-05T05:18:36Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java

+            return new Lookup(source, lookup.child(), tableNameExpression, lookup.matchFields(), localRelation);
+        }
+
+        private LocalRelation tableMapAsRelation(Source source, Map<String, Column> mapTable) {


Sorry, just saw this message now: this resolution of tables can be done in the PreAnalyze phase, similar to how we do it for enrich or the rest of the indices.

Added a task for it in #109353

costin · 2024-06-05T05:21:39Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java

+                EsField field = new EsField(name, column.type(), Map.of(), false, false);
+                attributes.add(new FieldAttribute(source, null, name, field));


The source of the attribute should be decoupled from the Attribute itself. MetadataAttribute itself is stretching the concept since it's just a special FieldAttribute (whose name starts with _).

costin · 2024-06-05T05:26:15Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/join/Join.java

+    }
+
+    public boolean duplicatesResolved() {
+        return left().outputSet().intersect(right().outputSet()).isEmpty();


Then let's remove it.

costin · 2024-06-05T05:27:36Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/join/JoinType.java

+            case 2 -> RIGHT;
+            case 4 -> FULL;
+            case 5 -> CROSS;
+            default -> throw new IllegalArgumentException("unsupported join [" + id + "]");


We have EsqlIllegalArgumentException to differentiate between exceptions thrown by ESQL code vs other packages/JDK.
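
A minimal sketch of what that suggestion could look like in the default branch. To keep the example compilable on its own, `EsqlIllegalArgumentException` and `JoinType` here are local stand-ins (the superclass of the stand-in exception is an assumption), the `fromId` name is hypothetical, and only the ids visible in the diff excerpt above are mapped.

```java
// Hypothetical stand-ins so the sketch compiles on its own; the real
// EsqlIllegalArgumentException and JoinType live in the ESQL plugin.
final class JoinTypeSketch {
    /** Stand-in for the ESQL-specific exception; its superclass is an assumption. */
    static class EsqlIllegalArgumentException extends RuntimeException {
        EsqlIllegalArgumentException(String message) {
            super(message);
        }
    }

    /** Only the join types visible in the diff excerpt are listed. */
    enum JoinType { RIGHT, FULL, CROSS }

    /** Resolve a join type from its id, failing with the ESQL-specific exception. */
    static JoinType fromId(int id) {
        return switch (id) {
            case 2 -> JoinType.RIGHT;
            case 4 -> JoinType.FULL;
            case 5 -> JoinType.CROSS;
            // Was IllegalArgumentException in the reviewed diff.
            default -> throw new EsqlIllegalArgumentException("unsupported join [" + id + "]");
        };
    }

    public static void main(String[] args) {
        System.out.println(fromId(4)); // FULL
        try {
            fromId(99);
        } catch (EsqlIllegalArgumentException e) {
            System.out.println(e.getMessage()); // unsupported join [99]
        }
    }
}
```

With a dedicated exception type, a caller can tell a failure raised by ESQL code apart from an `IllegalArgumentException` thrown by the JDK or another package.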

costin · 2024-06-05T05:28:51Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/planner/Mapper.java

+            if (left instanceof FragmentExec) {
+                if (right instanceof FragmentExec) {
+                    throw new EsqlIllegalArgumentException("can't plan binary [" + p.nodeName() + "]");
+                }
+                // in case of a fragment, push to it any current streaming operator
+                return new FragmentExec(p);
+            }
+            if (right instanceof FragmentExec) {
+                // in case of a fragment, push to it any current streaming operator
+                return new FragmentExec(p);
+            }


we won't have general binary plans, because we cannot map them.

We don't have binary plans yet, but this PR is the first step in the right direction. See #99841.

costin · 2024-06-05T05:34:46Z

cc @astefan and @bpintea for awareness

nik9000 · 2024-06-05T13:48:21Z

OK! I've pushed the last update from a comment. I think we've extracted all of the unfinished comments into our lovely followup meta issue (#109353). I'm marking this auto-merge and beginning to land it. Thanks so much friends. 186 comments, 96 commits. Over a little more than a month. And 16 pre-PRs that had to land before that. And followups. This is a big one!

nik9000 · 2024-06-07T00:42:54Z

Oh darn. I thought the bwc failures were because of a release, but that doesn't make sense - we didn't have a branch cut. They were real. I fixed them and pushed.

nik9000 added 5 commits April 26, 2024 16:52

WIP

443a597

WIP

51f9cec

WIP2

49530c8

Merge branch 'main' into lookup_real___

88c8f4d

Links

67d4749

nik9000 added >enhancement :Analytics/ES|QL AKA ESQL v8.15.0 labels Apr 28, 2024

Update docs/changelog/107987.yaml

2ac5403

Rename

19c3d01

nik9000 added 4 commits April 28, 2024 07:56

doc

5e8a787

Merge remote-tracking branch 'nik9000/lookup_real___' into lookup_real___

80cc1dd

More docs

210fdc1

Explain more

7ddc829

Update docs/changelog/107987.yaml

b778756

nik9000 added 2 commits April 28, 2024 14:52

Fix a silly

71e41d0

Merge remote-tracking branch 'nik9000/lookup_real___' into lookup_real___

1a0370e

nik9000 added 2 commits April 29, 2024 13:46

Merge branch 'main' into lookup_real___

6336056

Tests

09b8f2e

nik9000 added 2 commits June 4, 2024 12:30

From review

a8cdbf1

Last one?

2b001ca

costin reviewed Jun 5, 2024

costin approved these changes Jun 5, 2024

Merge branch 'main' into lookup_real___

02e2b32

astefan mentioned this pull request Jun 5, 2024

LOOKUP shouldn't duplicate the output if the same field was already present in the input #109392

Closed

nik9000 added 4 commits June 5, 2024 08:56

On function

aa3bfe8

WIP

3a7de6e

don't dupe

4c63877

more

3a02b66

nik9000 added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jun 5, 2024

nik9000 added 9 commits June 5, 2024 12:08

Docs

0284cf5

Merge branch 'main' into lookup_real___

1b4b19f

one more test

f56899a

Merge branch 'main' into lookup_real___

4260d60

fix mergre

c9be288

Merge branch 'main' into lookup_real___

379f774

Merge branch 'main' into lookup_real___

508e538

Merge branch 'main' into lookup_real___

6eecd48

Fix tests

93176bf

elasticsearchmachine merged commit 7916e6a into elastic:main Jun 7, 2024

nik9000 deleted the lookup_real___ branch June 7, 2024 01:39

astefan mentioned this pull request Aug 20, 2024

ES|QL: support name qualifiers #112016

Closed

3 tasks

5 participants
