@PsiACE (Member) commented Aug 13, 2024

Which issue does this PR close?

Closes #11909.

Rationale for this change

Ref: #11942

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) and functions (Changes to functions implementation) labels Aug 13, 2024
PsiACE added 3 commits August 14, 2024 00:51
Signed-off-by: Chojan Shang <psiace@apache.org>
Signed-off-by: Chojan Shang <psiace@apache.org>
Signed-off-by: Chojan Shang <psiace@apache.org>
-01)Projection: overlay(CAST(test.column1_utf8view AS Utf8), Utf8("foo"), Int64(2)) AS c1
+01)Projection: overlay(test.column1_utf8view, Utf8View("foo"), Int64(2)) AS c1
 02)--TableScan: test projection=[column1_utf8view]

Contributor

Likewise, could we please also add tests that actually run these queries? For example:

query
SELECT OVERLAY(column1_utf8view PLACING 'foo' FROM 2) as c1

Member Author

Done! 907e27e

Signed-off-by: Chojan Shang <psiace@apache.org>
Contributor

@alamb left a comment

Looks good to me -- thank you @PsiACE

}
}

macro_rules! process_overlay {
Contributor

It would be nice to have some sort of trait in arrow-rs that allowed us to write this as a generic function.

arrow-rs actually does have one:
https://github.com/apache/arrow-rs/blob/2461a16c19ee5032531b1c05dd7e7192bc842e0f/arrow-string/src/like.rs#L158-L161

But it is not public.

@XiangpengHao do you know of anything that is pub?

We could also implement such a trait for DataFusion's convenience, and then propose upstreaming it 🤔

Contributor

I was just behind -- it seems that @Omega359 did exactly this with the StringArrayType trait in #11941:

trait StringArrayType<'a>: ArrayAccessor<Item = &'a str> + Sized {
    fn iter(&self) -> ArrayIter<Self>;
}

impl<'a, T: OffsetSizeTrait> StringArrayType<'a> for &'a GenericStringArray<T> {
    fn iter(&self) -> ArrayIter<Self> {
        GenericStringArray::<T>::iter(self)
    }
}

impl<'a> StringArrayType<'a> for &'a StringViewArray {
    fn iter(&self) -> ArrayIter<Self> {
        StringViewArray::iter(self)
    }
}

Maybe we can start to pull that trait into its own module and start reusing it across the string functions 🤔

Also, there is the ArrayAccessor pattern used elegantly by @devanbenz in #11967 🤔
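
To illustrate the idea of replacing a per-type macro with one generic function, here is a minimal, std-only sketch. It does not use arrow-rs: PlainStrings, ViewStrings, StringArrayLike, overlay_one and overlay_all are all hypothetical stand-ins invented for this example, and the overlay helper is a simplified character-based version, not DataFusion's implementation.

```rust
/// Hypothetical stand-in for a contiguous string array (NOT arrow's StringArray).
struct PlainStrings(Vec<Option<String>>);

/// Hypothetical stand-in for a view-style array: each value is an
/// (offset, length) window into one shared buffer (NOT arrow's StringViewArray).
struct ViewStrings {
    buffer: String,
    views: Vec<Option<(usize, usize)>>,
}

/// Simplified analogue of the trait quoted above: a single way to iterate
/// over nullable string values, whatever the physical layout is.
trait StringArrayLike<'a> {
    fn iter(&self) -> Box<dyn Iterator<Item = Option<&'a str>> + 'a>;
}

impl<'a> StringArrayLike<'a> for &'a PlainStrings {
    fn iter(&self) -> Box<dyn Iterator<Item = Option<&'a str>> + 'a> {
        let this: &'a PlainStrings = *self;
        Box::new(this.0.iter().map(|v| v.as_deref()))
    }
}

impl<'a> StringArrayLike<'a> for &'a ViewStrings {
    fn iter(&self) -> Box<dyn Iterator<Item = Option<&'a str>> + 'a> {
        let this: &'a ViewStrings = *self;
        let buffer: &'a str = &this.buffer;
        Box::new(this.views.iter().map(move |view| match *view {
            // Byte offsets are assumed to fall on character boundaries.
            Some((off, len)) => Some(&buffer[off..off + len]),
            None => None,
        }))
    }
}

/// Simplified overlay: replace as many characters as `replacement` has,
/// starting at the 1-based position `start`. Out-of-range values are clamped.
fn overlay_one(s: &str, replacement: &str, start: usize) -> String {
    let chars: Vec<char> = s.chars().collect();
    let prefix_end = start.saturating_sub(1).min(chars.len());
    let suffix_start = (prefix_end + replacement.chars().count()).min(chars.len());
    let mut out = String::with_capacity(s.len() + replacement.len());
    out.extend(chars[..prefix_end].iter());
    out.push_str(replacement);
    out.extend(chars[suffix_start..].iter());
    out
}

/// One generic function serves both layouts; nulls pass through unchanged.
fn overlay_all<'a, A: StringArrayLike<'a>>(
    input: A,
    replacement: &str,
    start: usize,
) -> Vec<Option<String>> {
    input
        .iter()
        .map(|v| v.map(|s| overlay_one(s, replacement, start)))
        .collect()
}
```

With this shape, calling overlay_all(&plain, "foo", 2) and overlay_all(&view, "foo", 2) go through the same code path, which is the property the macro was approximating by expanding once per array type.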

Member Author

The current style is inspired by rpad in #11942. I'll rewrite it with ArrayAccessor, which I've used in other PRs; it's a much more elegant approach.

@alamb merged commit ea2e7ab into apache:main Aug 15, 2024

Contributor

alamb commented Aug 15, 2024

Thanks @PsiACE

@PsiACE deleted the overlay branch August 15, 2024 11:23
samuelcolvin pushed a commit to pydantic/datafusion that referenced this pull request Aug 15, 2024
* Implement native support StringView for overlay

Signed-off-by: Chojan Shang <psiace@apache.org>

* Re-write impl of overlay

Signed-off-by: Chojan Shang <psiace@apache.org>

* Minor update

Signed-off-by: Chojan Shang <psiace@apache.org>

* Add more tests

Signed-off-by: Chojan Shang <psiace@apache.org>

---------

Signed-off-by: Chojan Shang <psiace@apache.org>

Development

Successfully merging this pull request may close these issues.

Update OVERLAY scalar function to support Utf8View
