Fix for xml expression to not parse arbitrary strings #679
srowen merged 1 commit into databricks:master
|
This project is now incorporated into Apache Spark. Could you open the pull request there? If it's accepted, I will back-port it here just in case. Can you give an example of what no longer works (and should not), and/or what didn't work before and does now? |
|
Thanks for the quick response @srowen! Looks like it's being added to Spark 4.0 (https://github.com/apache/spark/blame/29d077fbbd5464f64e0eeb495f7a955850915cc5/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L7146)? Reading that code path, it seems it isn't using the same constructor as this library, so I think we should be good to just back-port this here? I can't think of an example of something that doesn't work now that would have worked before... Cases that previously didn't work and do now: |
|
@srowen is there a way for us to tag a new version for this fix? |
|
Possibly, but does this block anything? It seems like the issues it avoids are things the caller can just not do, or do I misunderstand? |
|
I got around this by using the underlying Spark expression rather than using functions, so yes, there's a workaround, but it's causing unexpected errors at the moment when using string literals and when used within an array transform. |
|
I just released 0.18.0 with this change: https://github.com/databricks/spark-xml/releases/tag/v0.18.0 |
Previously, we took the string form of the column and parsed it to produce the column expression argument for the XML expression. This makes it possible to execute arbitrary Spark SQL if you pass the expression a string literal. It also means that string literals don't work with this expression, and that it doesn't work within an array transform expression either.
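
To illustrate the failure mode described above, here is a minimal toy model in plain Python. It is not Spark or spark-xml code: the classes `Literal`, `ColumnRef`, and `Call`, the `parse` function, and both `wrap_*` helpers are hypothetical names invented for this sketch. It assumes that the display form of a string literal omits its quoting, so re-parsing that text no longer yields a literal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Toy string literal; its display form drops the quotes (assumption)."""
    value: str

    def __str__(self) -> str:
        return self.value


@dataclass(frozen=True)
class ColumnRef:
    """Toy reference to a named column."""
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Call:
    """Toy function call with a single textual argument."""
    func: str
    arg: str

    def __str__(self) -> str:
        return f"{self.func}({self.arg})"


def parse(text: str):
    """Toy parser: 'f(x)' becomes a Call, anything else a ColumnRef."""
    text = text.strip()
    if text.endswith(")") and "(" in text:
        func, arg = text[:-1].split("(", 1)
        return Call(func, arg)
    return ColumnRef(text)


def wrap_by_reparsing(expr):
    """Stringify the expression, then parse the text (the pattern being removed)."""
    return parse(str(expr))


def wrap_directly(expr):
    """Pass the existing expression tree through unchanged (the safer pattern)."""
    return expr


user_input = Literal("secret_fn(payload)")
print(wrap_by_reparsing(user_input))        # the literal's text is now a Call node
print(type(wrap_directly(user_input)))      # still a Literal
print(type(wrap_by_reparsing(Literal("<a>1</a>"))))  # XML text mistaken for a column
```

In this sketch, round-tripping through a string lets the contents of a literal be reinterpreted as code, and turns an XML string literal into a reference to a column that does not exist, which mirrors the two symptoms described in the PR description. Passing the expression tree through untouched avoids both.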