-
Notifications
You must be signed in to change notification settings - Fork 168
[MINOR] fix(spark): Fix NPE for ShuffleWriteClientImpl.unregisterShuffle #1367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
jerqi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @wForget
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #1367 +/- ##
============================================
+ Coverage 53.22% 54.12% +0.90%
Complexity 2715 2715
============================================
Files 418 398 -20
Lines 23932 21573 -2359
Branches 2043 2043
============================================
- Hits 12738 11677 -1061
+ Misses 10408 9181 -1227
+ Partials 786 715 -71 ☔ View full report in Codecov by Sentry. |
zuston
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for your contribution. @wForget
…fle (#1367) ### What changes were proposed in this pull request? Fix NPE for ShuffleWriteClientImpl.unregisterShuffle. ### Why are the changes needed? Since the `id` of `RssShuffleManager` in Executor is assigned in `getWriter`, this means that the `id` may be null before `getWriter`, so this may cause NPE to occur in `ShuffleWriteClientImpl.unregisterShuffle`. error log: ``` 23/12/12 14:25:47 ERROR RssShuffleManager: Errors on closing shuffle write client java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936) at org.apache.uniffle.client.impl.ShuffleWriteClientImpl.unregisterShuffle(ShuffleWriteClientImpl.java:751) at org.apache.spark.shuffle.RssShuffleManager.stop(RssShuffleManager.java:591) at org.apache.spark.SparkEnv.stop(SparkEnv.scala:90) at org.apache.spark.executor.Executor.stop(Executor.scala:335) at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1$$anon$1.run(CoarseGrainedExecutorBackend.scala:203) ``` ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? (cherry picked from commit f17e060)
* master: (23 commits) format use synchronized format modify to singleton format format format format format change Coordinator to singleton like ShuffleServerClientFactory.java [apache#1189] feat(server): Add netty used direct memory size metric (apache#1363) format format format rename file rename [apache#960] fix(dashboard): simplify dependency and correct the startup script (apache#1347) [apache#1348] improvement(metrics): Unify tags generation for shuffle-server metrics reporter (apache#1349) [MINOR] chore: fix kubernetes ci pipeline (apache#1368) [MINOR] fix(spark): Fix NPE for ShuffleWriteClientImpl.unregisterShuffle (apache#1367) ...
…re_1119_new_2023_12_23 * feature_1119_BUFFER_LIMIT_OF_HUGE_PARTITION: add file for commit [apache#1189] feat(server): Add netty used direct memory size metric (apache#1363) format format format rename file rename [apache#960] fix(dashboard): simplify dependency and correct the startup script (apache#1347) [apache#1348] improvement(metrics): Unify tags generation for shuffle-server metrics reporter (apache#1349) [MINOR] chore: fix kubernetes ci pipeline (apache#1368) [MINOR] fix(spark): Fix NPE for ShuffleWriteClientImpl.unregisterShuffle (apache#1367) format code format code 添加堆外内存监控的代码
…fle (apache#1367) ### What changes were proposed in this pull request? Fix NPE for ShuffleWriteClientImpl.unregisterShuffle. ### Why are the changes needed? Since the `id` of `RssShuffleManager` in Executor is assigned in `getWriter`, this means that the `id` may be null before `getWriter`, so this may cause NPE to occur in `ShuffleWriteClientImpl.unregisterShuffle`. error log: ``` 23/12/12 14:25:47 ERROR RssShuffleManager: Errors on closing shuffle write client java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936) at org.apache.uniffle.client.impl.ShuffleWriteClientImpl.unregisterShuffle(ShuffleWriteClientImpl.java:751) at org.apache.spark.shuffle.RssShuffleManager.stop(RssShuffleManager.java:591) at org.apache.spark.SparkEnv.stop(SparkEnv.scala:90) at org.apache.spark.executor.Executor.stop(Executor.scala:335) at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1$$anon$1.run(CoarseGrainedExecutorBackend.scala:203) ``` ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested?
What changes were proposed in this pull request?
Fix NPE for ShuffleWriteClientImpl.unregisterShuffle.
Why are the changes needed?
Since the
idofRssShuffleManagerin Executor is assigned ingetWriter, this means that theidmay be null beforegetWriter, so this may cause NPE to occur inShuffleWriteClientImpl.unregisterShuffle.error log:
Does this PR introduce any user-facing change?
No.
How was this patch tested?
(Please test your changes, and provide instructions on how to test it: