PPL fillnull command enhancement#4421
Conversation
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/ppl/FillNullCommandIT.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/ppl/FillNullCommandIT.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/ppl/FillNullCommandIT.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/ppl/FillNullCommandIT.java
Outdated
Show resolved
Hide resolved
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteFillNullCommandIT.java
Outdated
Show resolved
Hide resolved
| // Track if value= syntax was used (added in 3.4). Only needed to distinguish from with...in | ||
| // since both apply same value to all fields. using syntax is detected by checking if all | ||
| // replacement values are the same. | ||
| private final boolean useValueSyntax; |
There was a problem hiding this comment.
Is this only for anonymizer?
There was a problem hiding this comment.
Yes, I think this change is debatable, you can refer to this comment: #4421 (comment)
There was a problem hiding this comment.
I agree with you, we should just have the anonymizer output a consistent format if the underlying parse rule is irrecoverable without special tracking.
Personally I'd actually prefer the anonymizer defaulted to value= syntax, since it's consistent and more in line with other command syntax.
There was a problem hiding this comment.
I don't have strong preference on this. I think anonimized query log is needed for operator to check the input query in case of trouble shooting(I suppose we don't use it for analyzing use case statistics, but not quite sure), while we don't want to log sensitive information.
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteFillNullCommandIT.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteFillNullCommandIT.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Kai Huang <ahkcs@amazon.com> # Conflicts: # integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
1a6898a to
4712ec2
Compare
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
|
||
| // Check if the replacement type is compatible with the field type | ||
| // Allow NULL type family as it's compatible with any type | ||
| if (fieldFamily != replacementFamily |
There was a problem hiding this comment.
question: what are the semantics of family equality? Will it automatically down-cast?
Wondering what would happen if you tried to fillnull an int with 8589934592 (2^33)
There was a problem hiding this comment.
if the replacement value is 2^33, then the replacement value would be LONG, which will throw error if the field type is INT
Let me add an IT test for this case
There was a problem hiding this comment.
The type family equality check allows it. Both INT and BIGINT belong to SqlTypeFamily.NUMERIC, so the validation passes. Calcite's COALESCE function then handles the type widening automatically - it will widen the INT field to BIGINT to accommodate the larger value. No error is thrown; The result type becomes BIGINT.
I added an IT test (testFillNullWithLargeIntegerOnIntField) that confirms this behavior works correctly at runtime.
dai-chen
left a comment
There was a problem hiding this comment.
Thanks for the changes!
|
The backport to To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4421-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e1a1bd88dfd92d9fe4c954387583c52c0bf6c29b
# Push it to GitHub
git push --set-upstream origin backport/backport-4421-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-devThen, create a pull request where the |
Signed-off-by: Kai Huang <ahkcs@amazon.com> (cherry picked from commit e1a1bd8)
* PPL `fillnull` command enhancement (#4421) Signed-off-by: Kai Huang <ahkcs@amazon.com> (cherry picked from commit e1a1bd8) * fix Signed-off-by: Kai Huang <ahkcs@amazon.com> * fixes Signed-off-by: Kai Huang <ahkcs@amazon.com> * fix IT Signed-off-by: Kai Huang <ahkcs@amazon.com> * fix anonymizer Signed-off-by: Kai Huang <ahkcs@amazon.com> --------- Signed-off-by: Kai Huang <ahkcs@amazon.com>
* main-apple: (218 commits) Add ignorePrometheus Flag for integTest and docTest (opensearch-project#4442) Create fab-radar.yml PPL `fillnull` command enhancement (opensearch-project#4421) reverting to _doc + _id (opensearch-project#4435) Support `multisearch` command in calcite (opensearch-project#4332) Add 3.3 release notes (opensearch-project#4422) (opensearch-project#4423) [SQL/PPL] Fix the `count(*)` and `dc(field)` to be capped at MAX_INTEGER opensearch-project#4416 (opensearch-project#4418) Change the default search sort tiebreaker to `_shard_doc` for PIT search (opensearch-project#4378) [Enhancement] Add error handling for known limitation of sql `JOIN` (opensearch-project#4344) Bugfix: SQL type mapping for legacy JDBC output (opensearch-project#3613) Version bump: 3.3 (opensearch-project#4417) Add max/min eval functions (opensearch-project#4333) Support time modifiers in search command (opensearch-project#4224) Fix numbered token bug and make it optional output in patterns command (opensearch-project#4402) refactor span (opensearch-project#4334) Move release notes categories (opensearch-project#3818) [Doc] Enable doctest with Calcite (opensearch-project#4379) Mod function should return decimal instead of float when handle the operands are decimal literal (opensearch-project#4407) Scale of decimal literal should always be positive in Calcite (opensearch-project#4401) Enable Calcite by default and implicit fallback the unsupported commands (opensearch-project#4372) ...
What does this PR do?
Adds an alternative, option-style form for
fillnull(available since 3.4) alongside existingwith/usingsyntax. Semantics remain strict and schema-stable (no permissive casting in this PR).New syntax (option-style):
Existing syntax (still supported):
Notes / Limitations (intentional in this PR)
valueis mandatory (no default value yet).Future Work
Related Issues
Partially resolves #4419
Check List
--signoffor-s.