XSD -> schema tool with a test #457
   * @param xsdFile XSD file
   * @return Spark-compatible schema
   */
  @Experimental
I added these experimental "API" methods.
      Constants.XSD_ANYTYPE =>
        StructField(baseName, StringType)
    }
  case _ => StructField(baseName, StringType)
This is about the only substantive change: it handles 'any' type fields with no further content restrictions by just treating them as strings. The other changes to the code are cosmetic simplifications (IMHO).
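To make the fallback concrete, here is a minimal sketch of the idea, not the PR's actual code. The object name `XsdFallbackSketch`, the helper `toField`, and the use of plain type-name strings are all hypothetical; the real code matches on the XML Schema library's `Constants.XSD_*` values. It assumes Spark SQL is on the classpath.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper for illustration only: maps the local name of an XSD
// type to a Spark field, falling back to StringType for anything it does
// not recognize.
object XsdFallbackSketch {
  def toField(baseName: String, xsdLocalName: String): StructField =
    xsdLocalName match {
      case "boolean" => StructField(baseName, BooleanType)
      case "int"     => StructField(baseName, IntegerType)
      case "long"    => StructField(baseName, LongType)
      case "double"  => StructField(baseName, DoubleType)
      // anyType with no further content restrictions, and every other
      // unhandled type, is simply read as a string.
      case _         => StructField(baseName, StringType)
    }
}
```

With a catch-all case like this, an unrestricted `anyType` field needs no special handling of its own: it ends up in the same string branch as any other unmapped type.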
  matchType match {
    case Constants.XSD_BOOLEAN => StructField(baseName, BooleanType)
    case Constants.XSD_BYTE => StructField(baseName, BinaryType)
    case Constants.XSD_DATE |
Looks like you removed Date/DateTime? You could release this with #448.
I just let them fall into the general string case. But yeah, it may be just as well to try them as a date or time type.
srowen left a comment
Let's get this in as a start. We can iterate as we get more tests that may exercise it better.
Can I pull this version and compile it myself, before 0.10 gets released?
Yes, you can just build it with SBT and try it. I should just make a 0.10 release soon.
Relates to #449 and @seddonm1's gist.
Adds a basic XSDToSchema utility that can derive a Spark schema from some XSDs: those defining a table-like schema with simple, complex, and sequence types.
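As a rough usage sketch: the snippet below assumes the utility is exposed as `com.databricks.spark.xml.util.XSDToSchema` with a `read` method that accepts a `java.nio.file.Path`, as in released spark-xml versions; the exact signature at the time of this PR may differ. The XSD content, the object name `XsdToSchemaUsageSketch`, and the `load` helper are made up for illustration.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import com.databricks.spark.xml.util.XSDToSchema
import org.apache.spark.sql.types.StructType

object XsdToSchemaUsageSketch {
  // Illustrative XSD: one element holding a sequence of two simple-typed children.
  val xsd: String =
    """<?xml version="1.0" encoding="UTF-8"?>
      |<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      |  <xs:element name="book">
      |    <xs:complexType>
      |      <xs:sequence>
      |        <xs:element name="title" type="xs:string"/>
      |        <xs:element name="pages" type="xs:int"/>
      |      </xs:sequence>
      |    </xs:complexType>
      |  </xs:element>
      |</xs:schema>""".stripMargin

  // Writes the XSD to a temporary file and asks the utility for a Spark schema.
  def load(): StructType = {
    val xsdPath = Files.createTempFile("book", ".xsd")
    try {
      Files.write(xsdPath, xsd.getBytes(StandardCharsets.UTF_8))
      XSDToSchema.read(xsdPath) // assumed signature: read(xsdFile: Path): StructType
    } finally {
      Files.deleteIfExists(xsdPath)
    }
  }

  def main(args: Array[String]): Unit =
    load().printTreeString()
}
```

The resulting `StructType` can then be passed to `spark.read.schema(...)` when loading XML, instead of relying on schema inference.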