Fix Spark 4.0 sql tests #551

@kazuyukitanimura

Describe the bug

Regarding #537, there are 103 Spark 4.0 SQL tests failing.

  • sql-1: 91 tests failing
  • sql-2: 12 tests failing

Fix the Comet shims for the Spark 4.0 profile and remove IgnoreComet from those tests. Some of the tests may share the same root cause.

sql-1

Failing test (status columns: WIP, PR posted, Done)
SPARK-43402: FileSourceScanExec supports push down data filter with scalar subquery
[SPARK-43226] extra constant metadata fields with extractors
parquet widening conversion ShortType -> IntegerType
parquet widening conversion IntegerType -> ShortType
parquet widening conversion IntegerType -> LongType
parquet widening conversion ShortType -> DoubleType
parquet widening conversion IntegerType -> DoubleType
parquet widening conversion DateType -> TimestampNTZType
parquet widening conversion ByteType -> DecimalType(10,0)
parquet widening conversion ByteType -> DecimalType(20,0)
parquet widening conversion ShortType -> DecimalType(10,0)
parquet widening conversion ShortType -> DecimalType(20,0)
parquet widening conversion ShortType -> DecimalType(38,0)
parquet widening conversion IntegerType -> DecimalType(10,0)
parquet widening conversion IntegerType -> DecimalType(20,0)
parquet widening conversion IntegerType -> DecimalType(38,0)
parquet widening conversion LongType -> DecimalType(20,0)
parquet widening conversion LongType -> DecimalType(38,0)
parquet widening conversion ByteType -> DecimalType(11,1)
parquet widening conversion ShortType -> DecimalType(11,1)
parquet widening conversion IntegerType -> DecimalType(11,1)
parquet widening conversion LongType -> DecimalType(21,1)
unsupported parquet conversion ByteType -> DecimalType(1,0)
unsupported parquet conversion ByteType -> DecimalType(3,0)
unsupported parquet conversion ShortType -> DecimalType(3,0)
unsupported parquet conversion ShortType -> DecimalType(5,0)
unsupported parquet conversion IntegerType -> DecimalType(5,0)
unsupported parquet conversion ByteType -> DecimalType(4,1)
unsupported parquet conversion ShortType -> DecimalType(6,1)
unsupported parquet conversion LongType -> DecimalType(10,0)
unsupported parquet conversion ByteType -> DecimalType(2,0)
unsupported parquet conversion ShortType -> DecimalType(4,0)
unsupported parquet conversion IntegerType -> DecimalType(9,0)
unsupported parquet conversion LongType -> DecimalType(19,0)
unsupported parquet conversion ByteType -> DecimalType(3,1)
unsupported parquet conversion ShortType -> DecimalType(5,1)
unsupported parquet conversion IntegerType -> DecimalType(10,1)
unsupported parquet conversion LongType -> DecimalType(20,1)
unsupported parquet timestamp conversion TimestampType (TIMESTAMP_MICROS) -> DateType
unsupported parquet timestamp conversion TimestampType (TIMESTAMP_MILLIS) -> DateType
unsupported parquet timestamp conversion TimestampNTZType (INT96) -> DateType
unsupported parquet timestamp conversion TimestampNTZType (TIMESTAMP_MICROS) -> DateType
unsupported parquet timestamp conversion TimestampNTZType (TIMESTAMP_MILLIS) -> DateType
parquet decimal precision change Decimal(5, 2) -> Decimal(7, 2)
parquet decimal precision change Decimal(5, 2) -> Decimal(10, 2)
parquet decimal precision change Decimal(5, 2) -> Decimal(20, 2)
parquet decimal precision change Decimal(10, 2) -> Decimal(12, 2)
parquet decimal precision change Decimal(10, 2) -> Decimal(20, 2)
parquet decimal precision change Decimal(20, 2) -> Decimal(22, 2)
parquet decimal precision change Decimal(7, 2) -> Decimal(5, 2)
parquet decimal precision change Decimal(10, 2) -> Decimal(5, 2)
parquet decimal precision change Decimal(20, 2) -> Decimal(5, 2)
parquet decimal precision change Decimal(12, 2) -> Decimal(10, 2)
parquet decimal precision change Decimal(20, 2) -> Decimal(10, 2)
parquet decimal precision change Decimal(22, 2) -> Decimal(20, 2)
parquet decimal precision and scale change Decimal(5, 2) -> Decimal(7, 4)
parquet decimal precision and scale change Decimal(5, 2) -> Decimal(10, 7)
parquet decimal precision and scale change Decimal(5, 2) -> Decimal(20, 17)
parquet decimal precision and scale change Decimal(10, 2) -> Decimal(12, 4)
parquet decimal precision and scale change Decimal(10, 2) -> Decimal(20, 12)
parquet decimal precision and scale change Decimal(20, 2) -> Decimal(22, 4)
parquet decimal precision and scale change Decimal(7, 4) -> Decimal(5, 2)
parquet decimal precision and scale change Decimal(10, 7) -> Decimal(5, 2)
parquet decimal precision and scale change Decimal(20, 17) -> Decimal(5, 2)
parquet decimal precision and scale change Decimal(12, 4) -> Decimal(10, 2)
parquet decimal precision and scale change Decimal(20, 17) -> Decimal(10, 2)
parquet decimal precision and scale change Decimal(22, 4) -> Decimal(20, 2)
parquet decimal precision and scale change Decimal(10, 6) -> Decimal(12, 4)
parquet decimal precision and scale change Decimal(20, 7) -> Decimal(22, 5)
parquet decimal precision and scale change Decimal(12, 4) -> Decimal(10, 6)
parquet decimal precision and scale change Decimal(22, 5) -> Decimal(20, 7)
parquet decimal precision and scale change Decimal(5, 2) -> Decimal(6, 4)
parquet decimal precision and scale change Decimal(10, 4) -> Decimal(12, 7)
parquet decimal precision and scale change Decimal(20, 5) -> Decimal(22, 8)
parquet decimal type change Decimal(5, 2) -> Decimal(3, 2) overflows with parquet-mr
partition pruning in broadcast hash joins with aliases
partition pruning in broadcast hash joins
SPARK-32817: DPP throws error when the broadcast side is empty
SPARK-36444: Remove OptimizeSubqueries from batch of PartitionPruning
SPARK-38674: Remove useless deduplicate in SubqueryBroadcastExec
SPARK-39338: Remove dynamic pruning subquery if pruningKey's references is empty
SPARK-39217: Makes DPP support the pruning side has Union
partition pruning in broadcast hash joins with aliases
partition pruning in broadcast hash joins
different broadcast subqueries with identical children
SPARK-32817: DPP throws error when the broadcast side is empty
SPARK-36444: Remove OptimizeSubqueries from batch of PartitionPruning
SPARK-38674: Remove useless deduplicate in SubqueryBroadcastExec
SPARK-39338: Remove dynamic pruning subquery if pruningKey's references is empty
SPARK-39217: Makes DPP support the pruning side has Union
join with ordering requirement
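
Since some of these tests may share a root cause, one way to triage the sql-1 list is to group test names by their leading words. The snippet below is only a rough sketch of that idea: the prefix list is chosen by eye from the names above, and `bucket` and `tally` are hypothetical helpers, not part of Comet or Spark.

```python
from collections import Counter

# Hypothetical prefixes picked by eye from the sql-1 test names; adjust as needed.
PREFIXES = (
    "parquet widening conversion",
    "unsupported parquet conversion",
    "unsupported parquet timestamp conversion",
    "parquet decimal precision and scale change",
    "parquet decimal precision change",
    "parquet decimal type change",
    "partition pruning",
)


def bucket(test_name: str) -> str:
    """Return the first prefix the test name starts with, or 'other'."""
    for prefix in PREFIXES:
        if test_name.startswith(prefix):
            return prefix
    return "other"


def tally(test_names):
    """Count how many test names fall into each prefix bucket."""
    return Counter(bucket(name) for name in test_names)


# A few names copied from the list above, just to exercise the helpers.
sample = [
    "parquet widening conversion ShortType -> IntegerType",
    "parquet widening conversion IntegerType -> LongType",
    "unsupported parquet conversion ByteType -> DecimalType(1,0)",
    "parquet decimal precision change Decimal(5, 2) -> Decimal(7, 2)",
    "partition pruning in broadcast hash joins",
    "join with ordering requirement",
]

for group, count in tally(sample).most_common():
    print(f"{count:3d}  {group}")
```

Grouping by prefix only hints at where a common cause might be; tests in the same bucket can still fail for unrelated reasons.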

sql-2

Failing test (status columns: WIP, PR posted, Done)
collations.sql
SPARK-39166: Query context of binary arithmetic should be serialized to executors when WSCG is off (Requires ANSI overflow check for ints)
SPARK-39175: Query context of Cast should be serialized to executors when WSCG is off
SPARK-39190,SPARK-39208,SPARK-39210: Query context of decimal overflow error should be serialized to executors when WSCG is off
SPARK-40389: Don't eliminate a cast which can cause overflow
postgreSQL/float8.sql
postgreSQL/groupingsets.sql
postgreSQL/int4.sql
SPARK-47120: subquery literal filter pushdown
SPARK-47120: subquery literal filter pushdown
view-schema-binding-config.sql
view-schema-compensation.sql
postgreSQL/int8.sql
postgreSQL/select_having.sql

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response

Labels: bug, spark 4
