Spark Runner: Change to use partitioner in GroupNonMergingWindowsFunctions#groupByKeyInGlobalWindow#32610
Conversation
|
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
|
Run Java PreCommit |
|
I think task fails is not related to this PR. |
|
Run Java PreCommit |
1 similar comment
|
Run Java PreCommit |
|
Assigning reviewers. If you would like to opt out of this review, comment R: @damccorm for label build. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
There was a problem hiding this comment.
LGTM overall, just one question from higher-level perspective. Given that the only partitioner that is ever passed to this method is returned by
private static @Nullable Partitioner getPartitioner(SparkTranslationContext context) {
Long bundleSize =
context.serializablePipelineOptions.get().as(SparkPipelineOptions.class).getBundleSize();
return (bundleSize > 0)
? null
: new HashPartitioner(context.getSparkContext().defaultParallelism());
}does this change have any practical impact? I would expect that the HashPartitioner with default parallelism is what would be used when the partitioner is unused, is this assumption wrong?
| "https://github.com/apache/beam/pull/31156": "noting that PR #31156 should run this test", | ||
| "https://github.com/apache/beam/pull/31798": "noting that PR #31798 should run this test" | ||
| "https://github.com/apache/beam/pull/31798": "noting that PR #31798 should run this test", | ||
| "https://github.com/apache/beam/pull/32610": "noting that PR #32610 should run this test" |
There was a problem hiding this comment.
I think we can remove these lines, the purpose of the file is to be just touched somehow for the tests to run. We might replace the old comment with a single one.
There was a problem hiding this comment.
@je-ik
Thanks! I rollbacked trigger files!
Yes, as you mentioned, following the code ultimately leads to using the HashPartitioner. |
kennknowles
left a comment
There was a problem hiding this comment.
Seems OK to me. I defer to Jan as well.
|
Thanks @twosom! |
…tions#groupByKeyInGlobalWindow (apache#32610) * change GroupNonMergingWindowsFunctions#groupByKeyInGlobalWindow to use partitioner
Please add a meaningful description for your change here
fixes #32608
This PR contains these changes
GroupNonMergingWindowsFunctions#groupByKeyInGlobalWindowto use partitionerThank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>instead.CHANGES.mdwith noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.