This repository was archived by the owner on Mar 24, 2025. It is now read-only.
Hi.
I've created a simple DataFrame:
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, TimestampType
from datetime import datetime, timezone
spark = SparkSession.builder.config("spark.jars.packages", "com.databricks:spark-xml_2.12:0.17.0").getOrCreate()
schema = StructType([StructField("created-at", TimestampType())])
df = spark.createDataFrame([{"created-at": datetime.now(tz=timezone.utc)}], schema=schema)
df.show(10, False)
df.write.format("xml").option("timestampFormat", "yyyy-MM-dd HH:mm:ss.SSSXXX").mode("overwrite").save("2.xml")
+--------------------------+
|created-at                |
+--------------------------+
|2023-10-09 09:05:24.269352|
+--------------------------+
Then I saved it as XML:
df.repartition(1).write \
    .format("xml") \
    .mode("overwrite") \
    .option("compression", None) \
    .option("rowTag", "item") \
    .save("2.xml")
This is the content of the 2.xml folder:
> ls -la 2.xml
drwxr-xr-x 2 maxim maxim 84 окт 9 09:18 ./
drwxr-xr-x 19 maxim maxim 4096 окт 9 09:18 ../
-rw-r--r-- 1 maxim maxim 156 окт 9 09:18 part-00000
-rw-r--r-- 1 maxim maxim 12 окт 9 09:18 .part-00000.crc
-rw-r--r-- 1 maxim maxim 0 окт 9 09:18 _SUCCESS
-rw-r--r-- 1 maxim maxim 8 окт 9 09:18 ._SUCCESS.crc
File 2.xml/part-00000 has the following content:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ROWS>
    <item>
        <created-at>2023-10-09T09:05:24.269352Z</created-at>
    </item>
</ROWS>
But the file does not have an .xml extension. Is that expected behavior?
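Whether or not the missing extension is intended, one possible workaround on a local filesystem is to rename the part files after the write finishes. The sketch below is not part of spark-xml: `add_xml_extension` is a hypothetical helper, and it assumes the output directory is a plain local path laid out like the listing above, with the checksum sidecar named `.<file>.crc`.

```python
from pathlib import Path


def add_xml_extension(output_dir, extension=".xml"):
    """Rename extension-less part-* files in output_dir; return the new names.

    Hypothetical helper: assumes a local directory written by Spark,
    with optional Hadoop checksum sidecars named ".<file>.crc".
    """
    renamed = []
    for part in sorted(Path(output_dir).glob("part-*")):
        # Skip directories and files that already carry an extension.
        if not part.is_file() or part.suffix:
            continue
        target = part.with_name(part.name + extension)
        part.rename(target)
        # Rename the checksum sidecar too, so it still matches the data file.
        crc = part.with_name(f".{part.name}.crc")
        if crc.exists():
            crc.rename(part.with_name(f".{target.name}.crc"))
        renamed.append(target.name)
    return renamed
```

Calling `add_xml_extension("2.xml")` after `.save("2.xml")` would turn `part-00000` into `part-00000.xml`. For HDFS or object storage the rename would have to go through that storage system's own API instead of `pathlib`.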