I'm using the cobrix library to read an EBDCID binary file, obtained as a table extraction on DB2. The reading operation output is writed as CSV file on HDFS.
For parsing the binary file I'm using a copybook file present on HDFS.
Here is the spark reading command:
spark.read.
format (sourceFileFormat).
option ("copybook", copybookStagingHDFSPath).
option ("schema_retention_policy", "collapse_root").
option ("string_trimming_policy", "none").
load (sourceFileStagingHDFSPath + file)
The resulting output on CSV file does not allow the distinction between BLANK ("") and NULL cells but are both treated as empty strings.
Is there a way to treat BLANK ("") value and NULL value differently, in order to have an output corresponding to the data present on the DB2 source database?
I'm using the cobrix library to read an EBDCID binary file, obtained as a table extraction on DB2. The reading operation output is writed as CSV file on HDFS.
For parsing the binary file I'm using a copybook file present on HDFS.
Here is the spark reading command:
spark.read.
format (sourceFileFormat).
option ("copybook", copybookStagingHDFSPath).
option ("schema_retention_policy", "collapse_root").
option ("string_trimming_policy", "none").
load (sourceFileStagingHDFSPath + file)
The resulting output on CSV file does not allow the distinction between BLANK ("") and NULL cells but are both treated as empty strings.
Is there a way to treat BLANK ("") value and NULL value differently, in order to have an output corresponding to the data present on the DB2 source database?