Skip to content

Conversation

@Veeupup
Copy link
Contributor

@Veeupup Veeupup commented Dec 11, 2023

Which issue does this PR close?

Closes #8483 .

Rationale for this change

What changes are included in this PR?

Are these changes tested?

sqllogictest

Are there any user-facing changes?

Signed-off-by: veeupup <code@tanweime.com>
@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates sqllogictest SQL Logic Tests (.slt) labels Dec 11, 2023
Copy link
Member

@Weijun-H Weijun-H left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @Veeupup

}
And => match left_data_type {
DataType::Boolean => boolean_op!(&left, &right, and_kleene),
DataType::List(_field) => array_intersect(&[left, right]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DataType::List(_field) => array_intersect(&[left, right]),
DataType::List(..) => array_intersect(&[left, right]),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

And | Or => if matches!((lhs, rhs), (Boolean | Null, Boolean | Null)) {
// Logical binary boolean operators can only be evaluated for
// boolean or null arguments.
// boolean or null arguments.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// boolean or null arguments.
// boolean or null arguments.

BinaryOperator::PGBitwiseShiftRight => Ok(Operator::BitwiseShiftRight),
BinaryOperator::PGBitwiseShiftLeft => Ok(Operator::BitwiseShiftLeft),
BinaryOperator::StringConcat => Ok(Operator::StringConcat),
BinaryOperator::PGOverlap => Ok(Operator::And),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Postgres array overlap (&&) returns a boolean, which is not same as array_intersect/list_intersect in Duckdb. But looks like you are treating And (&&) as an alias like Duckdb. Maybe you should not parse a PGOverlap to And.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viirya I think PGOverlap is just a parsed token. What does this token do (return boolean or list_intersect) should be define latter on. It is fine to convert it to Operator::And.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw PGOverlap does not seem pg specific, although the comment said. It is also parsed in GenericDialect. Is the comment misleading or what? 🤔

https://github.com/sqlparser-rs/sqlparser-rs/blob/8d97330d42fa9acf63d8bf6412b762948e025496/src/parser/mod.rs#L2106-L2107

Copy link
Member

@viirya viirya Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No...this PR makes AND as an alias operator of array_intersect function when it is applied with List parameters. So this line actually means an BinaryOperator::PGOverlap will be treated as AND operator in DataFusion (i.e., AND in DataFusion is compatible with Postgres' PGOverlap operator).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think PGOverlap is just a parsed token. What does this token do (return boolean or list_intersect) should be define latter on. It is fine to convert it to Operator::And.

Doesn't this PR already define AND as an alias of array_intersect?

This comment was marked as outdated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Postgres array overlap is the same as array intersect. Return common elements in two array

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure?

https://www.postgresql.org/docs/9.6/functions-array.html

&& | overlap (have elements in common) | ARRAY[1,4,3] && ARRAY[2,1] | t

postgres=# select array[1, 2] && array[3, 4];
 ?column? 
----------
 f
(1 row)

postgres=# select array[1, 2, 3] && array[3, 4];
 ?column? 
----------
 t
(1 row)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the return value is different, both find if there is common between two.

However, whether we return array or boolean, we can still convert PGOverlap to Operator::Overlap for this operator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the behavior like Duckdb between we can still get the follow on information based on the return value, but we can't find out the common in arrays from boolean value.

@viirya viirya changed the title Support List Intersect Operator Support && operator as alias of array intersect function Dec 11, 2023
@viirya viirya changed the title Support && operator as alias of array intersect function Support AND operator as alias of array intersect function Dec 11, 2023
----
[2, 3] [] [aa, cc] [true] [2.2, 3.3] [[2, 2], [3, 3]]

query ??????
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
query ??????
# AND operator as alias of array_intersect
query ??????

Comment on lines 2819 to 2824
SELECT [1,2,3] && [2,3,4],
[1,3,5] && [2,4,6],
['aa','bb','cc'] && ['cc','aa','dd'],
[true, false] && [true],
[1.1, 2.2, 3.3] && [2.2, 3.3, 4.4],
[[1, 1], [2, 2], [3, 3]] && [[2, 2], [3, 3], [4, 4]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it is an alias, it is better to add array_intersect alongside to compare its value.

Suggested change
SELECT [1,2,3] && [2,3,4],
[1,3,5] && [2,4,6],
['aa','bb','cc'] && ['cc','aa','dd'],
[true, false] && [true],
[1.1, 2.2, 3.3] && [2.2, 3.3, 4.4],
[[1, 1], [2, 2], [3, 3]] && [[2, 2], [3, 3], [4, 4]]
SELECT [1,2,3] && [2,3,4], array_intersect([1,2,3], [2,3,4]),
[1,3,5] && [2,4,6], array_intersect([1,3,5], [2,4,6]),
['aa','bb','cc'] && ['cc','aa','dd'], array_intersect(['aa','bb','cc'], ['cc','aa','dd']),
[true, false] && [true], array_intersect([true, false], [true]),
[1.1, 2.2, 3.3] && [2.2, 3.3, 4.4], array_intersect([1.1, 2.2, 3.3], [2.2, 3.3, 4.4]),
[[1, 1], [2, 2], [3, 3]] && [[2, 2], [3, 3], [4, 4]], array_intersect([[1, 1], [2, 2], [3, 3]], [[2, 2], [3, 3], [4, 4]],),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you! test has been fixed!

Comment on lines 92 to 93
} else if matches!((lhs, rhs), (List(_), List(_))) {
Ok(Signature::uniform(lhs.clone()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to make sure left and right List have same element type?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the long term, we should have the same signature as BuiltinFunction::ArrayIntersect

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reuse array_coercion to resolve the data type~

}
And => match left_data_type {
DataType::Boolean => boolean_op!(&left, &right, and_kleene),
DataType::List(_field) => array_intersect(&[left, right]),
Copy link
Member

@viirya viirya Dec 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if DataFusion already has some stuff to handle operator -> function aliasing. At least, we should parse/convert the AND operator to array_intersect function call at early stage, instead of silently turning to call array_intersect in binary physical operator. If we are going to support more aliasing operators like this, we will soon find these aliasing logic scatters everywhere in codebase and hard to track.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good suggestion. Like what we have done for MakeArry in sql_array_literal. I think we can do the similar conversion for other List operator in sql to logicExpr stage. Including array_concat, append prepend, array_has

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about we make this change in the following PR? it seems that we need to refactor array_concat, append prepend, array_has too

Signed-off-by: veeupup <code@tanweime.com>
@alamb
Copy link
Contributor

alamb commented Jan 31, 2024

What is the status of this PR?
I don't think this PR is waiting on feedback, so marking it as a draft. Please mark it as ready for review when it is ready for another look 🙏

@alamb alamb marked this pull request as draft January 31, 2024 21:01
@github-actions
Copy link

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale PR has not had any activity for some time label Apr 15, 2024
@github-actions github-actions bot closed this Apr 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates sql SQL Planner sqllogictest SQL Logic Tests (.slt) Stale PR has not had any activity for some time

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support AND operator as alias of list intersect/array intersect function like Duckdb

5 participants