Currently the `SparkIngestDriver` does not work when HDFS uses `file:///` as the protocol (which is helpful for testing when we don't want to rely on S3, etc.). The ingest driver also allocates objects such as `IngestFormatPluginOptions` once per partition, which is unnecessary. These allocations occur in `processInput` and should be pulled out and provided to Spark as configuration parameters somehow.
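To make the `file:///` case concrete, here is a minimal sketch of how a driver could tell a local input apart from an HDFS or S3 one before deciding how to open it. It uses only `java.net.URI` from the standard library. The class `InputSchemeSketch` and its methods `schemeOf` and `isLocal` are hypothetical names for illustration and are not part of the existing driver; treating a path with no scheme as local is also an assumption.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper, not part of SparkIngestDriver.
public class InputSchemeSketch {

    // Returns the lower-cased URI scheme of the input path.
    // Assumption: a path with no scheme (e.g. "/tmp/data") is treated as local.
    static String schemeOf(String input) {
        String scheme = URI.create(input).getScheme();
        return scheme == null ? "file" : scheme.toLowerCase(Locale.ROOT);
    }

    // True when the input should be read from the local filesystem.
    static boolean isLocal(String input) {
        return "file".equals(schemeOf(input));
    }
}
```

With a check like this, a test run could pass `file:///tmp/data` and take the local-filesystem branch, while `hdfs://` and `s3://` inputs keep their current handling.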