Skip to content

Change compare logical when comparing date related fields with string literal#3798

Merged
qianheng-aws merged 14 commits intoopensearch-project:mainfrom
xinyual:ignoreUselessTimestampCast
Jul 2, 2025
Merged

Change compare logical when comparing date related fields with string literal#3798
qianheng-aws merged 14 commits intoopensearch-project:mainfrom
xinyual:ignoreUselessTimestampCast

Conversation

@xinyual
Copy link
Copy Markdown
Contributor

@xinyual xinyual commented Jun 19, 2025

Description

The PR change the logical when compare with date/time/timestamp with a string literal to align with V2. Now we will cast the string literal to date/time/timestamp according to the compare target so it can be pushed down.
For example, previous, if we compare date < '2020-10-20 12:00:00', we will cast date to timestamp and append 00:00:00 to the date so they can compare. Now we cast the string literal to date as '2020-10-20'. Same as time.

For ppl like

source=XXX | where timestamp_field < '<timestamp string>'

for example,

source=XXX | where birthdate < '2020-10-20 00:00:00'

The calcite physical plan change from

EnumerableCalc(expr#0..12=[{inputs}], expr#13=[TIMESTAMP($t3)], expr#14=['2016-12-08 00:00:00':EXPR_TIMESTAMP VARCHAR], expr#15=[>($t13, $t14)], expr#16=['2018-11-09 00:00:00':EXPR_TIMESTAMP VARCHAR], expr#17=[<($t13, $t16)], expr#18=[AND($t15, $t17)], proj#0..12=[{exprs}], $condition=[$t18])\n  CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_bank]], PushDownContext=[[PROJECT->[account_number, firstname, address, birthdate, gender, city, lastname, balance, employer, state, age, email, male]], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"timeout\":\"1m\",\"_source\":{\"includes\":[\"account_number\",\"firstname\",\"address\",\"birthdate\",\"gender\",\"city\",\"lastname\",\"balance\",\"employer\",\"state\",\"age\",\"email\",\"male\"],\"excludes\":[]}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])

to

CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_bank]], PushDownContext=[[PROJECT->[account_number, firstname, address, birthdate, gender, city, lastname, balance, employer, state, age, email, male], FILTER->>($3, '2016-12-08 00:00:00'), FILTER-><($3, '2018-11-09 00:00:00')], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"timeout\":\"1m\",\"query\":{\"bool\":{\"filter\":[{\"range\":{\"birthdate\":{\"from\":\"2016-12-08T00:00:00.000Z\",\"to\":null,\"include_lower\":false,\"include_upper\":true,\"boost\":1.0}}},{\"range\":{\"birthdate\":{\"from\":null,\"to\":\"2018-11-09T00:00:00.000Z\",\"include_lower\":true,\"include_upper\":false,\"boost\":1.0}}}],\"adjust_pure_negative\":true,\"boost\":1.0}},\"_source\":{\"includes\":[\"account_number\",\"firstname\",\"address\",\"birthdate\",\"gender\",\"city\",\"lastname\",\"balance\",\"employer\",\"state\",\"age\",\"email\",\"male\"],\"excludes\":[]},\"sort\":[{\"_doc\":{\"order\":\"asc\"}}]}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])"

same as time/date,
for date, the physical plan of PPL

source=opensearch-sql_test_index_date_formats | fields yyyy-MM-dd
| where yyyy-MM-dd > '2016-12-08 00:00:00.123456789' 
| where yyyy-MM-dd < '2018-11-09 00:00:00.000000000' 

from

EnumerableCalc(expr#0=[{inputs}], expr#1=[TIMESTAMP($t0)], expr#2=['2016-12-08 00:00:00.123456789':EXPR_TIMESTAMP VARCHAR], expr#3=[>($t1, $t2)], expr#4=['2018-11-09 00:00:00':EXPR_TIMESTAMP VARCHAR], expr#5=[<($t1, $t4)], expr#6=[AND($t3, $t5)], yyyy-MM-dd=[$t0], $condition=[$t6])\n  CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_date_formats]], PushDownContext=[[PROJECT->[yyyy-MM-dd]], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"timeout\":\"1m\",\"_source\":{\"includes\":[\"yyyy-MM-dd\"],\"excludes\":[]}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])

to

CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_date_formats]], PushDownContext=[[PROJECT->[yyyy-MM-dd], FILTER->>($0, '2016-12-08'), FILTER-><($0, '2018-11-09')], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"timeout\":\"1m\",\"query\":{\"bool\":{\"filter\":[{\"range\":{\"yyyy-MM-dd\":{\"from\":\"2016-12-08\",\"to\":null,\"include_lower\":false,\"include_upper\":true,\"boost\":1.0}}},{\"range\":{\"yyyy-MM-dd\":{\"from\":null,\"to\":\"2018-11-09\",\"include_lower\":true,\"include_upper\":false,\"boost\":1.0}}}],\"adjust_pure_negative\":true,\"boost\":1.0}},\"_source\":{\"includes\":[\"yyyy-MM-dd\"],\"excludes\":[]},\"sort\":[{\"_doc\":{\"order\":\"asc\"}}]}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])

for time, the physical plan of PPL

source=opensearch-sql_test_index_date_formats | fields custom_time
| where custom_time > '2016-12-08 12:00:00.123456789' 
| where custom_time < '2018-11-09 19:00:00.123456789'

from

EnumerableCalc(expr#0=[{inputs}], expr#1=[TIMESTAMP($t0)], expr#2=['2016-12-08 12:00:00.123456789':EXPR_TIMESTAMP VARCHAR], expr#3=[>($t1, $t2)], expr#4=['2018-11-09 19:00:00.123456789':EXPR_TIMESTAMP VARCHAR], expr#5=[<($t1, $t4)], expr#6=[AND($t3, $t5)], custom_time=[$t0], $condition=[$t6])\n  CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_date_formats]], PushDownContext=[[PROJECT->[custom_time]], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"timeout\":\"1m\",\"_source\":{\"includes\":[\"custom_time\"],\"excludes\":[]}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])

to

CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_date_formats]], PushDownContext=[[PROJECT->[custom_time], FILTER->>($0, '12:00:00.123456789'), FILTER-><($0, '19:00:00.123456789')], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"timeout\":\"1m\",\"query\":{\"bool\":{\"filter\":[{\"range\":{\"custom_time\":{\"from\":\"12:00:00.123456789\",\"to\":null,\"include_lower\":false,\"include_upper\":true,\"boost\":1.0}}},{\"range\":{\"custom_time\":{\"from\":null,\"to\":\"19:00:00.123456789\",\"include_lower\":true,\"include_upper\":false,\"boost\":1.0}}}],\"adjust_pure_negative\":true,\"boost\":1.0}},\"_source\":{\"includes\":[\"custom_time\"],\"excludes\":[]},\"sort\":[{\"_doc\":{\"order\":\"asc\"}}]}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])

Related Issues

Resolves #[Issue number to be closed when this PR is merged]
#3710

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@LantaoJin
Copy link
Copy Markdown
Member

Please add a test case in CalciteExplainIT.

@xinyual
Copy link
Copy Markdown
Contributor Author

xinyual commented Jun 20, 2025

Please add a test case in CalciteExplainIT.

Already add one in CalcitePPLExplainIT

@LantaoJin
Copy link
Copy Markdown
Member

LantaoJin commented Jun 20, 2025

Please add a test case in CalciteExplainIT.

Already add one in CalcitePPLExplainIT

Please use CalciteExplainIT, they are for different purposes.
CalcitePPLExplainIT is an IT for explain command

@LantaoJin
Copy link
Copy Markdown
Member

Please update the PR description to add the Calcite physical plans before and after this patching.

return doubleValue();
} else if (isBoolean()) {
return booleanValue();
} else if (isTimestamp()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if date/string comparing in a filter?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Date still needs to use timestamp to convert since we need to align with V2. We need to support something like date('2020-10-20') = '2020-10-20 00:00:00' since v2 will do the implicit conversion. So compare date/string cannot be pushed down.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean date/string comparing in a filter can pushdown in v2, not cannot in v3?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already change and now date/string, time/string can be pushed down as V2.

var result =
explainQueryToString(
String.format(
"source=%s | where birthdate < '2017-11-20 00:00:00'",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source = big5 | where `@timestamp` >= '2023-01-01 00:00:00.000000000' and `@timestamp` < '2023-01-03 00:00:00.000000000'

Here is an example of big5 query, can you test with similar usage?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Already add IT like this one.

Copy link
Copy Markdown
Member

@LantaoJin LantaoJin Jun 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, the case I listed was diff with the one you added. Please try where birthdate < '2018-11-09 00:00:00.000000000'

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Already change it with nano second.

@LantaoJin LantaoJin added pushdown pushdown related issues calcite calcite migration releated labels Jun 20, 2025
xinyual added 8 commits June 23, 2025 11:02
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
This reverts commit f3b53af.

Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
@xinyual xinyual force-pushed the ignoreUselessTimestampCast branch from be17d6d to 8720c74 Compare June 23, 2025 03:02
@xinyual
Copy link
Copy Markdown
Contributor Author

xinyual commented Jun 23, 2025

Force push to resolve dco problem

@xinyual
Copy link
Copy Markdown
Contributor Author

xinyual commented Jun 23, 2025

Please update the PR description to add the Calcite physical plans before and after this patching.

Already done. Please check it.

@xinyual
Copy link
Copy Markdown
Contributor Author

xinyual commented Jun 23, 2025

Please add a test case in CalciteExplainIT.

Already add one in CalcitePPLExplainIT

Please use CalciteExplainIT, they are for different purposes. CalcitePPLExplainIT is an IT for explain command

Already put it in correct place. Please check it.

@xinyual xinyual marked this pull request as ready for review June 23, 2025 04:19
Copy link
Copy Markdown
Member

@LantaoJin LantaoJin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xinyual added 2 commits June 23, 2025 14:16
Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
LantaoJin
LantaoJin previously approved these changes Jun 23, 2025
@LantaoJin
Copy link
Copy Markdown
Member

ping @qianheng-aws

Signed-off-by: xinyual <xinyual@amazon.com>
@xinyual xinyual changed the title Ignore useless timestamp cast Change compare logical when comparing date related fileds with string literal Jun 24, 2025
Signed-off-by: xinyual <xinyual@amazon.com>
@xinyual xinyual changed the title Change compare logical when comparing date related fileds with string literal Change compare logical when comparing date related fields with string literal Jun 25, 2025
@yuancu yuancu mentioned this pull request Jun 25, 2025
7 tasks
@Swiddis Swiddis added the enhancement New feature or request label Jun 26, 2025
qianheng-aws
qianheng-aws previously approved these changes Jun 27, 2025
}
}

public static RexNode transferCompareForDateRelated(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add javadoc for public static methods

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return candidate;
}

public static SqlTypeName findCastType(RexNode left, RexNode right) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return leftType == null ? rightType : leftType;
}

public static SqlTypeName returnCorrespondingSqlType(RexNode node) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Signed-off-by: xinyual <xinyual@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
@qianheng-aws qianheng-aws merged commit 22a88d1 into opensearch-project:main Jul 2, 2025
26 checks passed
@opensearch-trigger-bot
Copy link
Copy Markdown
Contributor

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-3798-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 22a88d105ce75ce740476610b9e5a8dacf4f71be
# Push it to GitHub
git push --set-upstream origin backport/backport-3798-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-3798-to-2.19-dev.

@LantaoJin
Copy link
Copy Markdown
Member

@xinyual could you manually backport it to 2.19-dev?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport 2.19-dev backport-failed backport-manually Filed a PR to backport manually. calcite calcite migration releated enhancement New feature or request pushdown pushdown related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants