@rishvin rishvin commented Aug 2, 2025

Which issue does this PR close?

Closes #1820

Rationale for this change

Now that Comet's dependency has been updated to DF-49, the SHA2 fix from apache/datafusion#16350 is available to Comet. This PR removes Comet's own SHA2 implementation and uses DataFusion's implementation instead.

What changes are included in this PR?

  • Removes Comet's SHA2-related functionality in favor of DataFusion's.
  • Refactors the SHA2 serde logic to align with the new pattern.
  • Extends the related test case.

How are these changes tested?

  • Ran the hash functions test case in CometExpressionSuite.
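
For reference, here is a minimal standalone sketch of the Spark `sha2(expr, bitLength)` semantics that a hash function test would exercise. It assumes Spark's documented behavior (a bit length of 0 is treated as 256, and unsupported bit lengths yield null, modelled here as `None`). This is not Comet's or DataFusion's code: `Sha2Sketch`, `algorithmFor`, and `sha2Hex` are hypothetical names, and the only library used is the JDK's `MessageDigest`.

```scala
import java.security.MessageDigest

// Hypothetical reference sketch; not the Comet or DataFusion implementation.
object Sha2Sketch {
  // Map Spark's bit-length argument to a JCA algorithm name.
  // Assumption: 0 is an alias for 256; anything else unsupported maps to None.
  def algorithmFor(bitLength: Int): Option[String] = bitLength match {
    case 0 | 256 => Some("SHA-256")
    case 224     => Some("SHA-224")
    case 384     => Some("SHA-384")
    case 512     => Some("SHA-512")
    case _       => None
  }

  // Hash the input and render the digest as a lowercase hex string.
  // Returns None where Spark would return null.
  def sha2Hex(input: Array[Byte], bitLength: Int): Option[String] =
    algorithmFor(bitLength).map { algorithm =>
      MessageDigest
        .getInstance(algorithm)
        .digest(input)
        .map(b => "%02x".format(b))
        .mkString
    }
}
```

A test against the native implementation could compare its output with `Sha2Sketch.sha2Hex` for each supported bit length, plus one unsupported value.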

@rishvin rishvin changed the title from "Use Datafusion's Sha2 and remove Comet's implementation." to "chore: Use Datafusion's Sha2 and remove Comet's implementation." Aug 2, 2025

codecov-commenter commented Aug 2, 2025

Codecov Report

❌ Patch coverage is 54.54545% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.75%. Comparing base (f09f8af) to head (440afb1).
⚠️ Report is 343 commits behind head on main.

Files with missing lines | Patch % | Lines
...k/src/main/scala/org/apache/comet/serde/hash.scala | 50.00% | 3 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2063      +/-   ##
============================================
+ Coverage     56.12%   58.75%   +2.62%     
- Complexity      976     1253     +277     
============================================
  Files           119      137      +18     
  Lines         11743    13170    +1427     
  Branches       2251     2371     +120     
============================================
+ Hits           6591     7738    +1147     
- Misses         4012     4198     +186     
- Partials       1140     1234      +94     

rishvin commented Aug 4, 2025

Spark SQL Tests (ANSI mode) / spark-sql-sql/core-3/ubuntu-24.04/spark-4.0.0/java-17 (pull_request)

Looks like the test-setup failed, tests were not run.

@mbutrovich

> Spark SQL Tests (ANSI mode) / spark-sql-sql/core-3/ubuntu-24.04/spark-4.0.0/java-17 (pull_request)
>
> Looks like the test-setup failed, tests were not run.

Re-running.

@mbutrovich mbutrovich left a comment

Thanks for managing this task through DataFusion and back to Comet, @rishvin! Approved pending the CI pipeline that failed for an unrelated network issue.

    inputs: Seq[Attribute],
    binding: Boolean): Option[ExprOuterClass.Expr] = {
  if (!HashUtils.isSupportedType(expr)) {
    return None
Member

nit: It would be good to add a withInfo call to record the fallback reason.

Contributor Author

Thanks @andygrove for the review. We already call withInfo inside isSupportedType. Let me know if I'm missing something.
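
To illustrate the point, here is a rough standalone sketch of a support check that records the fallback reason itself, so the caller can return `None` without a second `withInfo` call. Everything in it is hypothetical and simplified for illustration: `Expr`, `fallbackReasons`, the set of supported types, and the message text are assumptions, and this `withInfo` is a stand-in, not Comet's actual helper.

```scala
import scala.collection.mutable

// Hypothetical sketch of the pattern; not Comet's real types or helpers.
object FallbackSketch {
  // Stand-in for an expression node that can carry fallback reasons.
  final case class Expr(name: String, dataType: String) {
    val fallbackReasons: mutable.ListBuffer[String] = mutable.ListBuffer.empty
  }

  // Stand-in for withInfo: attach a fallback reason to the expression.
  def withInfo(expr: Expr, reason: String): Unit = {
    expr.fallbackReasons += reason
    ()
  }

  // Assumed set of supported types, for illustration only.
  private val supportedTypes = Set("string", "binary", "int", "long")

  // The check records the reason on failure, so callers do not need to.
  def isSupportedType(expr: Expr): Boolean = {
    val supported = supportedTypes.contains(expr.dataType)
    if (!supported) {
      withInfo(expr, s"Unsupported datatype ${expr.dataType} for ${expr.name}")
    }
    supported
  }

  // Caller: None signals fallback; the reason is already attached.
  def convert(expr: Expr): Option[String] =
    if (!isSupportedType(expr)) None
    else Some(s"native:${expr.name}")
}
```

The trade-off is that the reason is recorded as a side effect of a predicate, which is easy to miss when reading only the call site.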

@andygrove andygrove left a comment

Thanks @rishvin

@mbutrovich mbutrovich merged commit e7cd746 into apache:main Aug 6, 2025
98 of 100 checks passed
@rishvin rishvin deleted the spark-sha2 branch August 6, 2025 19:20
// expression, however DataFusion does not support that yet.
val sha2Expr = expr.asInstanceOf[Sha2]
if (!sha2Expr.right.foldable) {
  withInfo(expr, "For Sha2, non-foldable right argument is not supported")
Contributor

The original info message was much more useful for the end user.

Contributor Author

Created #2193. Will fix this later this week.

Contributor Author

Opened a tiny PR to fix the message: #2213

coderfender pushed a commit to coderfender/datafusion-comet that referenced this pull request Dec 13, 2025

Development

Successfully merging this pull request may close these issues.

Use sha2 implementation from datafusion-spark crate

5 participants