Currently the native Parquet scan does not support matching columns by field ID when reading, so the example below would fall back (with PR #2563 in place) or return null.
This issue tracks making the native Parquet scan match columns by field ID.
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{IntegerType, Metadata, MetadataBuilder, StringType, StructType}
import scala.collection.JavaConverters._

// Attach a Parquet field ID to a column through schema metadata.
val FIELD_ID_METADATA_KEY = "parquet.field.id"
def withId(id: Int): Metadata =
  new MetadataBuilder().putLong(FIELD_ID_METADATA_KEY, id).build()

// The read schema uses different column names than the write schema,
// so the columns can only be resolved through their field IDs.
val readSchema =
  new StructType()
    .add("a", StringType, true, withId(0))
    .add("b", IntegerType, true, withId(1))
val writeSchema =
  new StructType()
    .add("random", IntegerType, true, withId(0))
    .add("name", StringType, true, withId(1))

val writeData = Seq(Row(100, "text"), Row(200, "more"))
spark.createDataFrame(writeData.asJava, writeSchema)
  .write.mode("overwrite").parquet("/tmp/spark1/data")

// Field-ID matching in vanilla Spark is gated by
// spark.sql.parquet.fieldId.read.enabled (Spark 3.3+).
val df = spark.read.schema(readSchema).parquet("/tmp/spark1/data")
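The resolution semantics being tracked can be sketched independently of Spark: columns are matched by their field ID rather than by name, and a requested ID that is absent from the file yields nulls. A minimal illustrative sketch (the `FileColumn` type and `resolveByFieldId` helper are hypothetical, not the actual scan implementation):

```scala
// Illustrative stand-in for a column read from a Parquet file.
case class FileColumn(fieldId: Int, name: String, values: Seq[Any])

// Resolve each requested (fieldId, outputName) pair against the file's
// columns by ID, ignoring names entirely; an ID missing from the file
// produces a column of nulls.
def resolveByFieldId(
    fileColumns: Seq[FileColumn],
    requested: Seq[(Int, String)],
    numRows: Int): Map[String, Seq[Any]] = {
  val byId = fileColumns.map(c => c.fieldId -> c).toMap
  requested.map { case (id, outName) =>
    outName -> byId.get(id).map(_.values).getOrElse(Seq.fill(numRows)(null))
  }.toMap
}

// Mirrors the Spark example: the file holds ("random", id 0) and
// ("name", id 1), while the read schema asks for ("a", id 0) and ("b", id 1).
val fileCols = Seq(
  FileColumn(0, "random", Seq(100, 200)),
  FileColumn(1, "name", Seq("text", "more")))
val out = resolveByFieldId(fileCols, Seq(0 -> "a", 1 -> "b"), numRows = 2)
// out("a") == Seq(100, 200); out("b") == Seq("text", "more")
```

Despite the name mismatch ("a" vs. "random", "b" vs. "name"), the data resolves correctly because only the field IDs participate in matching, which is the behavior this issue asks the native scan to implement.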