This repository was archived by the owner on Mar 24, 2025. It is now read-only.

ignoreSurroundingSpaces is not working after upgrading to version 0.16.0 #636

@irajhedayati

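After upgrading spark-xml from 0.14.0 to 0.16.0, my tests started failing: with ignoreSurroundingSpaces set to true, leading and trailing spaces are no longer stripped from the values.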

Here is the output of the unit test:

== Results ==
!== Correct Answer - 4 ==   == Spark Answer - 4 ==
 struct<id:string>          struct<id:string>
![A]                        [ B]
![B]                        [ D ]
![C]                        [A]
![D]                        [C ]

Here is how I read the file:

    context.spark.read
      .format("com.databricks.spark.xml")
      .option("mode", "FAILFAST")
      .option("inferSchema", true)
      .option("rootTag", "feed")
      .option("rowTag", "entry")
      .option("treatEmptyValuesAsNulls", true)
      .option("ignoreNamespace", true)
      .option("ignoreSurroundingSpaces", true)
      .load("/path/to/file.xml")

And this is the input file:

<?xml version="1.0" encoding="UTF-8"?>
<feed>
    <entry>
        <id>A</id>
    </entry>
    <entry>
        <id> B</id>
    </entry>
    <entry>
        <id>C </id>
    </entry>
    <entry>
        <id> D </id>
    </entry>
</feed>
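
As a possible stopgap while the option misbehaves, the column can be trimmed explicitly after loading. The sketch below is not part of the original report. It assumes Spark is on the classpath and a local SparkSession is acceptable, and it uses a hand-built DataFrame (the hypothetical `loaded`) as a stand-in for the result of the read above, with values copied from the Spark Answer column. The object name `TrimWorkaround` is illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, trim}

object TrimWorkaround {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, only for this illustration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("trim-workaround")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the DataFrame loaded from the XML file;
    // values copied from the "Spark Answer" column of the test output.
    val loaded = Seq(" B", " D ", "A", "C ").toDF("id")

    // Trim the string column explicitly instead of relying on
    // ignoreSurroundingSpaces.
    val trimmed = loaded.withColumn("id", trim(col("id")))

    // The values the unit test expects, compared order-independently.
    val actual = trimmed.as[String].collect().sorted.toList
    assert(actual == List("A", "B", "C", "D"))

    spark.stop()
  }
}
```

This only masks the symptom for string columns known in advance, and it may not cover cases where surrounding whitespace affects schema inference.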
