[GLUTEN-8852][CORE] PART0: Adding Spark400 support#9768
zhouyuan merged 17 commits into apache:main
Run Gluten Clickhouse CI on x86
Is there active work going on in this PR?
@kapilks Yes, I'm still working on this. I'm waiting for some refactoring work on the main branch to land first.
@zjuwangg Cool. With this patch, Gluten passes the basic TPC-H tests. TPC-DS fails because the logic for handling dynamic partition pruning is missing. This can be included in the 1.5 release (targeting August).

We can adjust the goal of this PR. If the build passes for Spark 4.0 and there are no CI failures for the earlier Spark versions we support, I think we can merge the PR. The runtime issues can be fixed in subsequent PRs.
Signed-off-by: Yuan <yuanzhou@apache.org>
* Fix compilation
* Add reducer
* Fix
* Fix UI
* Remove isNullIntolerant (already done in upstream)
* Fix substrait module
* Fix newly found issues
* Fix UT
* Minor change
* Initial
* Refine
Force-pushed from c2f5de4 to 976645e
Force-pushed from ec9e2c7 to fcc1632
     * number of PlaceholderRows + the TerminalRow equates to the size of the original columnar batch.
     */
    sealed abstract class BatchCarrierRow extends InternalRow {
    abstract class BatchCarrierRowBase extends InternalRow {
We can avoid shimming this class by making BatchCarrierRow extend a mixin trait. I'll help update the PR for that.
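The mixin suggestion above could look roughly like the sketch below. It is only an illustration and not the code that landed in the PR: `InternalRowStandIn`, `BatchCarrierRowShim`, `shimVersion`, the `numFields` value, and the `batchSize` field are hypothetical stand-ins, used so the example compiles without Spark on the classpath.

```scala
// Simplified stand-in for Spark's InternalRow (the real class has many more members).
abstract class InternalRowStandIn {
  def numFields: Int
}

// Hypothetical shim trait. In this sketch, each Spark-version shim module would
// provide its own copy, holding only the members that differ between versions.
trait BatchCarrierRowShim { self: InternalRowStandIn =>
  def shimVersion: String = "spark-4.0"
}

// Common code: the class stays sealed and mixes in whichever trait the active
// shim module supplies, so the class itself is not duplicated per Spark version.
sealed abstract class BatchCarrierRow extends InternalRowStandIn with BatchCarrierRowShim {
  // Placeholder value for the sketch only.
  override def numFields: Int = 0
}

final class PlaceholderRow extends BatchCarrierRow
final class TerminalRow(val batchSize: Int) extends BatchCarrierRow

object MixinShimSketch {
  // BatchCarrierRow is still sealed, so the compiler checks this match for exhaustiveness.
  def describe(row: BatchCarrierRow): String = row match {
    case _: PlaceholderRow => s"placeholder (${row.shimVersion})"
    case t: TerminalRow    => s"terminal, batch of ${t.batchSize} rows (${row.shimVersion})"
  }
}
```

The point of this layout is that `BatchCarrierRow` keeps its `sealed` modifier in the shared module, while only the small trait varies per Spark version.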
     * @since 4.0.0
     */
    @Evolving
    public interface Reducer<I, O> {
This class duplicates the one from vanilla Spark. I think we need to place it in all 3.x shims instead, although that is more tedious than the current practice.
    <phase>generate-sources</phase>
    <configuration>
      <target>
        <replaceregexp file="src/main/scala/org/apache/spark/sql/execution/ui/GlutenAllExecutionsPage.scala"
It's a Scala class, so we may be able to add a type shim in the shim layers to avoid this kind of hack. E.g.,
    type HttpServletRequestShim = javax.servlet.http.HttpServletRequest
@zhztheplayer, I have created a PR to fix this: zhouyuan#33. Thanks.
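As a rough illustration of the type-shim idea in this thread, the sketch below shows shared code that refers only to an alias, so no source rewriting is needed at build time. It is not the code from zhouyuan#33: `LegacyServletStandIn`, `Spark3xShimTypes`, `SharedUiCode`, and `requestedPage` are hypothetical names, and the request class is a stand-in for `javax.servlet.http.HttpServletRequest` so the example runs without a servlet dependency.

```scala
// Stand-in for javax.servlet.http.HttpServletRequest (hypothetical, heavily simplified).
object LegacyServletStandIn {
  final class HttpServletRequest(params: Map[String, String]) {
    // Mirrors the servlet convention of returning null for a missing parameter.
    def getParameter(name: String): String = params.getOrElse(name, null)
  }
}

// Hypothetical shim object for a Spark 3.x module. A shim for another Spark
// version would define the same alias name, pointing at that version's request class.
object Spark3xShimTypes {
  type HttpServletRequestShim = LegacyServletStandIn.HttpServletRequest
}

// Shared code depends only on the alias, never on a concrete servlet package.
object SharedUiCode {
  import Spark3xShimTypes.HttpServletRequestShim

  def requestedPage(request: HttpServletRequestShim): Int =
    Option(request.getParameter("page")).map(_.toInt).getOrElse(1)
}
```

Under this assumption, switching Spark versions changes only which shim module is on the classpath. The shared page class compiles unchanged.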
@zhztheplayer @philo-he Thanks a lot for helping with this patch! I just merged this initial part so that it won't block community efforts on further improvements.
What changes were proposed in this pull request?
Here's the command to build package for Spark-400:
Note: Spark-400 turns ANSI mode on by default. However, Gluten automatically falls back to the JVM code path when ANSI mode is on, so for now it is recommended to test Gluten + Spark-400 with ANSI turned off:
    .config("spark.sql.ansi.enabled", "false")

co-authored-by: feilong.he@intel.com @philo-he
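For context on the ANSI note above, a minimal session setup might look like the sketch below. It assumes Spark and a Gluten bundle jar are already on the classpath. Only the `spark.sql.ansi.enabled` line comes from this PR; the application name, master, plugin setting, and off-heap settings are assumptions based on a typical Gluten setup and should be checked against the Gluten documentation for your build.

```scala
import org.apache.spark.sql.SparkSession

object AnsiOffSessionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("gluten-spark400-smoke-test") // hypothetical name
      .master("local[2]")
      // Assumption: plugin class and off-heap settings of a typical Gluten setup.
      .config("spark.plugins", "org.apache.gluten.GlutenPlugin")
      .config("spark.memory.offHeap.enabled", "true")
      .config("spark.memory.offHeap.size", "2g")
      // From the PR description: disable ANSI mode so Gluten does not fall back to the JVM path.
      .config("spark.sql.ansi.enabled", "false")
      .getOrCreate()

    // Inspect the physical plan to see whether operators were offloaded or fell back.
    spark.range(10).selectExpr("sum(id)").explain()
    spark.stop()
  }
}
```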
(Related: #8852)
How was this patch tested?
pass GHA