ES|QL: Fix KQL/QSTR with unmapped fields in NULLIFY mode#143399
ES|QL: Fix KQL/QSTR with unmapped fields in NULLIFY mode#143399luigidellaquila merged 13 commits intoelastic:mainfrom
Conversation
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Hi @luigidellaquila, I've created a changelog YAML for you. |
|
Without having actually reviewed,
Update: please disregard, this is not about type compatibility, but about outright rejecting |
|
Had a look at the approach. Still a little unsure. It's a pretty neat idea, though, and I think it's growing on me. Injecting an Creating That said, @luigidellaquila , can we leave this open until the following are merged?
|
|
Thanks for the feedback @alex-spies
Actually, this is exactly what we do today with index patterns, a node/cluster could receive attributes that don't exist in the local indices, and the local planner just ignores them and emits a null. That's the main reason that makes me consider this approach (probably) safe
I thought about this and I don't have a very strong opinion. It could be useful for testing, but in practice this is just a null value, so I'm not sure.
Yes, absolutely! More tests = catching more potential corner cases |
What's important to check is if A simpler approach might be to ensure the added |
I tend to put correctness on top of performance, but I think in this case we are lucky. It works because
I also tried this path, but synthetic attributes are not supposed to be returned as a result, so I got other failures. |
Oh, that's a nice surprise. Thanks for checking! In this case, this looks like a nice improvement over the original approach and I think we should go for it, provided that we are a little more paranoid about testing things that could break. |
|
Important Review skippedAuto reviews are limited based on label configuration. 🏷️ Required labels (at least one) (2)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
bpintea
left a comment
There was a problem hiding this comment.
LGTM.
Left two nits, but not worth going through CI if these will be the only left things.
I think I kind of like this solution better. 👍
| // For non-EsRelation sources (Row, LocalRelation): insert Eval nodes with null assignments | ||
| // This handles cases like: ROW x = 1 | EVAL y = unmapped_field | ||
| transformed = transformed.transformUp( | ||
| n -> n instanceof UnaryPlan unary && unary.child() instanceof LeafPlan leaf && (leaf instanceof EsRelation == false), |
| List<LogicalPlan> newChildren = new ArrayList<>(nAry.children().size()); | ||
| boolean changed = false; | ||
| for (var child : nAry.children()) { | ||
| if (child instanceof LeafPlan source && (source instanceof EsRelation == false)) { |
There was a problem hiding this comment.
Also here, ( ) are redundant.
|
Thanks @bpintea! |
|
Sorry, ignore all the above, I don't know what I was looking at. |
Fixing full text functions (KQL, QSTR) usage in queries with
unmapped_fields=NULLIFY.When using
SET unmapped_fields="nullify", KQL and QSTR functions would fail validation due toEvalnodes added right afterEsRelation.Here we propose a different approach: instead of adding an EVAL, we add unmapped fields directly to
EsRelation's output withDataType.NULL.ReplaceFieldWithConstantOrNull(local planner) rule will replace them with null literals anyway.This should be fine, this is the same thing we do with non-existing fields locally (with index patterns).
And it's conceptually the same thing we do with LOAD strategy.
For non-FROM queries (eg. ROW) we still fall back to adding an EVAL.
I also evaluated two other solutions, but they don't seem to work:
EVAL non_existing = nullmanually, and it would be hard to distinguish.Fixes: #142968
Fixes: #142959