[GLUTEN-11550][UT] Enable GlutenXmlExpressionsSuite for spark4x and exclude 'from_xml- invalid data' #11580

Merged
baibaichen merged 1 commit into apache:main from baibaichen:fix/xml-expressions-suite
Feb 28, 2026

@baibaichen
Contributor

What changes are proposed in this pull request?

Fixes #11550 (partial)

  • Enable GlutenXmlExpressionsSuite in VeloxTestSettings for both spark40 and spark41 (it was previously disabled with a TODO for spark41).
  • Fix mixin: GlutenTestsCommonTrait → GlutenTestsTrait. The prior PR (#11512, "[UT] Add missing Gluten test suites for Spark 4.0 and 4.1") added GlutenXmlExpressionsSuite with GlutenTestsCommonTrait, which does not enable Gluten execution for the test suite.
  • Exclude from_xml- invalid data: Gluten overrides checkEvaluation to execute expressions via a DataFrame (df.select().collect()), so the SparkException is thrown directly instead of being wrapped in a TestFailedException. This is the same pattern as from_json - invalid data.
  • Fix woodstox classpath conflict: exclude the hadoop-common transitive dependency of hive-llap-common in both gluten-ut/pom.xml and the Spark-specific pom files. Hadoop ships a shaded woodstox (org.apache.hadoop.shaded.com.ctc.wstx.*) whose property names are incompatible with the non-shaded woodstox used by Spark XML, which caused IllegalArgumentException: Unrecognized property in the to_xml tests.
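
To illustrate the wrapping mismatch behind the from_xml- invalid data exclusion, here is a minimal, self-contained sketch. It does not use Spark, ScalaTest, or Gluten code: EngineException and HarnessFailure are hypothetical stand-ins for SparkException and TestFailedException, and wrappingCheck / directCheck are assumed simplifications of the two checkEvaluation behaviors described above. It also assumes the upstream test asserts on the wrapper type and then inspects its cause.

```java
public class WrappedExceptionSketch {
    // Hypothetical stand-in for SparkException (not the real class).
    static class EngineException extends RuntimeException {
        EngineException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for TestFailedException (not the real class).
    static class HarnessFailure extends RuntimeException {
        HarnessFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // The expression under test: always fails, like parsing invalid input.
    static Object evaluate() {
        throw new EngineException("invalid data");
    }

    // Assumed shape of a harness that catches evaluation errors and
    // rethrows them wrapped in its own failure type.
    static Object wrappingCheck() {
        try {
            return evaluate();
        } catch (RuntimeException e) {
            throw new HarnessFailure("Exception evaluating expression", e);
        }
    }

    // Assumed shape of a DataFrame-style execution path: the engine
    // exception escapes as-is, with no wrapper around it.
    static Object directCheck() {
        return evaluate();
    }

    // A test written against the wrapping harness: it passes only if the
    // wrapper type is thrown and its cause is the engine exception.
    static boolean expectsWrapped(Runnable check) {
        try {
            check.run();
            return false; // nothing was thrown
        } catch (HarnessFailure e) {
            return e.getCause() instanceof EngineException;
        } catch (RuntimeException e) {
            return false; // thrown directly, so the wrapper assertion fails
        }
    }

    public static void main(String[] args) {
        System.out.println(expectsWrapped(WrappedExceptionSketch::wrappingCheck)); // prints "true"
        System.out.println(expectsWrapped(WrappedExceptionSketch::directCheck));   // prints "false"
    }
}
```

Under these assumptions the test body itself is unchanged; only the exception type reaching the assertion differs, which is why the test is excluded rather than rewritten here.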

How was this patch tested?

Ran GlutenXmlExpressionsSuite on both spark40 and spark41:

  • spark40: 28 passed, 1 ignored (from_xml- invalid data) ✅
  • spark41: 28 passed, 1 ignored (from_xml- invalid data) ✅

Compiled successfully with both spark-4.0 and spark-4.1 profiles.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: GitHub Copilot CLI

github-actions bot added the "CORE works for Gluten Core" label on Feb 6, 2026
@github-actions

github-actions bot commented Feb 6, 2026

Run Gluten Clickhouse CI on x86

@github-actions

github-actions bot commented Feb 7, 2026

Run Gluten Clickhouse CI on x86

@baibaichen force-pushed the fix/xml-expressions-suite branch from b6a2397 to e88091c on February 10, 2026 10:30
@github-actions

Run Gluten Clickhouse CI on x86

@baibaichen force-pushed the fix/xml-expressions-suite branch from e88091c to 4fb4dec on February 26, 2026 11:35
@github-actions

Run Gluten Clickhouse CI on x86

@baibaichen force-pushed the fix/xml-expressions-suite branch from 4fb4dec to e8e4904 on February 27, 2026 09:45
…ml- invalid data'

- Enable GlutenXmlExpressionsSuite in VeloxTestSettings (was TODO disabled)
- Fix mixin: GlutenTestsCommonTrait → GlutenTestsTrait. The prior PR (apache#11512)
  added GlutenXmlExpressionsSuite with GlutenTestsCommonTrait, which does not
  enable Gluten execution for the test suite.
- Exclude 'from_xml- invalid data': Gluten overrides checkEvaluation to execute
  expressions via DataFrame, which throws SparkException directly instead of
  wrapping it in TestFailedException. Same pattern as 'from_json - invalid data'.
@baibaichen force-pushed the fix/xml-expressions-suite branch from e8e4904 to 22b5cd8 on February 27, 2026 14:52
@baibaichen merged commit f22e21f into apache:main on Feb 28, 2026
106 of 107 checks passed
@baibaichen deleted the fix/xml-expressions-suite branch on February 28, 2026 02:25
baibaichen added a commit to baibaichen/gluten that referenced this pull request Mar 9, 2026
The woodstox classpath conflict that caused 10 failures was already fixed
by PR apache#11580. All 31 tests pass on both spark40 and spark41.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
baibaichen added a commit to baibaichen/gluten that referenced this pull request Mar 11, 2026
The woodstox classpath conflict that caused 10 failures was already fixed
by PR apache#11580. All 31 tests pass on both spark40 and spark41.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
baibaichen added a commit that referenced this pull request Mar 12, 2026
….1 (#11725)

The woodstox classpath conflict that caused 10 failures was already fixed
by PR #11580. All 31 tests pass on both spark40 and spark41.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
zhztheplayer pushed a commit to zhztheplayer/gluten that referenced this pull request Mar 15, 2026
….1 (apache#11725)

The woodstox classpath conflict that caused 10 failures was already fixed
by PR apache#11580. All 31 tests pass on both spark40 and spark41.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Linked issue (may be closed by merging this pull request): Spark 4.x: Tracking disabled test suites