Skip to content

[VL] Support monotonically_increasing_id function#10097

Closed
liujiayi771 wants to merge 2 commits intoapache:mainfrom
liujiayi771:monotonically_increasing_id
Closed

[VL] Support monotonically_increasing_id function#10097
liujiayi771 wants to merge 2 commits intoapache:mainfrom
liujiayi771:monotonically_increasing_id

Conversation

@liujiayi771
Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

Support monotonically_increasing_id function in Velox backend. Velox had implemented it in facebookincubator/velox#9007 early on.

How was this patch tested?

Add a test case in ScalarFunctionsValidateSuite.

@github-actions github-actions bot added CORE works for Gluten Core VELOX CLICKHOUSE DOCS labels Jul 1, 2025
@github-actions
Copy link
Copy Markdown

github-actions bot commented Jul 1, 2025

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

@liujiayi771 liujiayi771 requested a review from zhli1142015 July 1, 2025 12:03
@github-actions
Copy link
Copy Markdown

github-actions bot commented Jul 1, 2025

Run Gluten Clickhouse CI on x86

@liujiayi771
Copy link
Copy Markdown
Contributor Author

@zhli1142015 @zhztheplayer Could you help to review?

@zhli1142015
Copy link
Copy Markdown
Contributor

I remembre that this function was disabled because of the issue below.
#7628
The discrepancy in results between Velox and Spark was due to the compiled expression being cached, resulting in a single expression instance whose ID kept incrementing.
https://github.com/facebookincubator/velox/blob/main/velox/expression/ExprCompiler.cpp#L376C29-L376C47
I thought we may need a fix in velox to allow marking this expression as “do not cache.”

@liujiayi771
Copy link
Copy Markdown
Contributor Author

liujiayi771 commented Jul 1, 2025

@zhli1142015 OK. I'll close this PR. But isn't this function itself non-deterministic anyway? As long as it generates an increasing non-repeating sequence, it shouldn't cause issues in practice—it just won't be perfectly consistent with Spark's results, right?

@liujiayi771
Copy link
Copy Markdown
Contributor Author

@zhli1142015 But the issue in #7628 is that calling monotonically_increasing_id twice in a single row is expected to yield the same result.

@liujiayi771 liujiayi771 closed this Jul 1, 2025
@zhli1142015
Copy link
Copy Markdown
Contributor

it shouldn't cause issues in practice?

It seems to me that this unit test verifies a non publicly documented behavior; but we re-enabled this function after we had fixed this issue internally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants