[VL] Add uniffle integration#3767
Conversation
|
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/oap-project/gluten/issues Then could you also rename commit message and pull request title in the following format? See also: |
|
Run Gluten Clickhouse CI |
2 similar comments
|
Run Gluten Clickhouse CI |
|
Run Gluten Clickhouse CI |
cpp/core/jni/JniWrapper.cc
Outdated
| // rename CelebornClient RssClient | ||
| std::shared_ptr<CelebornClient> celebornClient = | ||
| std::make_shared<CelebornClient>(vm, partitionPusher, unifflePushPartitionDataMethod); | ||
| partitionWriterCreator = std::make_shared<CelebornPartitionWriterCreator>(std::move(celebornClient)); |
There was a problem hiding this comment.
Would you explain a bit why Celeborn's code is reused here?
There was a problem hiding this comment.
CelebornClient wraps in the partitionWriterCreator mainly do the work related to take out partition data, when evictPartitions is called. That will be all the same with different rss framework
|
|
I'll look into the GHA action, add one in this patch later |
|
@summaryzb The org.apache.spark.shuffle.UniffleColumnarBatchSerializer class is missing in the commit? |
Is it possible to reuse |
|
Run Gluten Clickhouse CI |
Yes, it's almost the same. Change it to reuse |
|
Run Gluten Clickhouse CI |
|
Run Gluten Clickhouse CI |
2 similar comments
|
Run Gluten Clickhouse CI |
|
Run Gluten Clickhouse CI |
|
Run Gluten Clickhouse CI |
|
@PZD-CHINA @philo-he |
...fle/velox/src/main/java/org/apache/spark/shuffle/gluten/uniffle/GlutenRssShuffleManager.java
Outdated
Show resolved
Hide resolved
@summaryzb, suppose uniffle-0.8.0 should be pre-installed in CI docker. Please let us know if there is any other dependency required. Thanks! |
|
This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days. |
|
Run Gluten Clickhouse CI |
a1a5e24 to
e5310aa
Compare
|
Run Gluten Clickhouse CI |
1 similar comment
|
Run Gluten Clickhouse CI |
|
@philo-he PTAL |
|
Uniffle run tpcds with only one worker may encounter some stability issues occasionally,so this integration test only run with tpch |
|
@jackylee-ch PTAL |
philo-he
left a comment
There was a problem hiding this comment.
Thanks for your work! Please also respond to Hongze's comments. Thanks!
#3767 (comment)
| // rename CelebornClient RssClient | ||
| std::shared_ptr<CelebornClient> uniffleClient = | ||
| std::make_shared<CelebornClient>(vm, partitionPusher, unifflePushPartitionDataMethod); | ||
| partitionWriter = std::make_unique<CelebornPartitionWriter>( |
There was a problem hiding this comment.
Like the above comment, better to use a common name if they indeed can be shared by both celeborn & uniffle.
There was a problem hiding this comment.
Using a common name relate to all the logic of rss in jni layer, but not limit to this PartitionWriter, maybe it's better to resolve this in a separate pr
There was a problem hiding this comment.
@summaryzb, ok to me. Please create an issue to track this. Thanks!
...fle/velox/src/main/java/org/apache/spark/shuffle/gluten/uniffle/GlutenRssShuffleManager.java
Outdated
Show resolved
Hide resolved
...fle/velox/src/main/java/org/apache/spark/shuffle/gluten/uniffle/GlutenRssShuffleManager.java
Outdated
Show resolved
Hide resolved
...fle/velox/src/main/java/org/apache/spark/shuffle/gluten/uniffle/GlutenRssShuffleManager.java
Outdated
Show resolved
Hide resolved
|
Run Gluten Clickhouse CI |
|
Run Gluten Clickhouse CI |
|
Run Gluten Clickhouse CI |
|
@philo-he PTAL |
philo-he
left a comment
There was a problem hiding this comment.
@zhztheplayer, do you have any other comment?
| sparkConf.getSizeAsBytes( | ||
| RssSparkConfig.RSS_WRITER_BUFFER_SIZE.key(), | ||
| RssSparkConfig.RSS_WRITER_BUFFER_SIZE.defaultValue().get()); | ||
| compressionCodec = GlutenShuffleUtils.getCompressionCodec(sparkConf); |
There was a problem hiding this comment.
I note celeborn honors Spark's config for SHUFFLE_COMPRESS. Should uniffle do the same check? See e182d65.
|
Run Gluten Clickhouse CI |
I'm good if we have tests covering the feature. Please proceed if it's ready to merge. Thanks @summaryzb for this work! |
jackylee-ch
left a comment
There was a problem hiding this comment.
Since we have a follow-up issue #5335 to solve the conflicting code, there is no problem on my side. Thanks for your work.@summaryzb
|
This pr is ready to be merged on my side. Would leave some time to see if there are any more suggestions. |
zhztheplayer
left a comment
There was a problem hiding this comment.
@summaryzb Thank you for working on uniffle integration and I just left some late minor review comments.
| run: | | ||
| export MAVEN_HOME=/usr/lib/maven && \ | ||
| export PATH=${PATH}:${MAVEN_HOME}/bin && \ | ||
| export export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk && \ |
| tar xzf apache-uniffle-0.8.0-incubating-bin.tar.gz -C /opt/ && mv /opt/rss-0.8.0-hadoop2.8 /opt/uniffle && \ | ||
| wget -nv https://archive.apache.org/dist/hadoop/common/hadoop-2.8.5/hadoop-2.8.5.tar.gz && \ | ||
| tar xzf hadoop-2.8.5.tar.gz -C /opt/ | ||
| rm -f /opt/uniffle/jars/server/shuffle-server-0.8.0-SNAPSHOT.jar |
There was a problem hiding this comment.
Did you want to remove shuffle-server-0.8.0.jar instead of shuffle-server-0.8.0-SNAPSHOT.jar, was this a typo or something?
What changes were proposed in this pull request?
Add uniffle integration
How was this patch tested?
Integration test with uniffle manually
When uniffle 0.8.0 release in maven, add test in gluten action flow