Closed
Description
What is the problem the feature request solves?
In CometBatchIterator we export the schema with each batch via Arrow FFI:
val arrowSchema = ArrowSchema.wrap(schemaAddrs(index))
val arrowArray = ArrowArray.wrap(arrayAddrs(index))
Data.exportVector(
  allocator,
  getFieldVector(valueVector, "export"),
  provider,
  arrowArray,
  arrowSchema)
Exporting the schema seems quite expensive since it involves string copies and memory allocation. It gets more expensive for complex schemas, especially when nested types are involved.
Internally in Data.exportVector, the schema is exported with:
exportField(allocator, vector.getField(), provider, outSchema);
I wonder if we could refactor CometBatchIterator to just export the schema once, with the first batch, and then have the native side re-use that schema for subsequent batches.
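A minimal sketch of the export-once idea, assuming the schema is identical for every batch produced by the iterator. Everything here (`Field`, `Exporter`, `CountingExporter`, `SchemaOnceIterator`) is a hypothetical stand-in and not Comet or Arrow API; it only models the control flow, with counters in place of real FFI calls.

```scala
// Hypothetical stand-in for an Arrow field; not the real Arrow class.
final case class Field(name: String, children: Seq[Field] = Nil)

// Hypothetical split of the export into a schema part and a data part.
trait Exporter {
  def exportSchema(field: Field): Unit   // models the costly schema export
  def exportData(batch: Seq[Int]): Unit  // models the per-batch array export
}

// Counts calls so the effect of the caching is observable.
final class CountingExporter extends Exporter {
  var schemaExports = 0
  var dataExports = 0
  override def exportSchema(field: Field): Unit = schemaExports += 1
  override def exportData(batch: Seq[Int]): Unit = dataExports += 1
}

// Exports the schema with the first batch only; later batches export data alone.
// Assumes the schema does not change between batches.
final class SchemaOnceIterator(
    schema: Field,
    batches: Iterator[Seq[Int]],
    exporter: Exporter) {

  private var schemaExported = false

  def hasNext: Boolean = batches.hasNext

  def next(): Unit = {
    val batch = batches.next()
    if (!schemaExported) {
      exporter.exportSchema(schema)
      schemaExported = true
    }
    exporter.exportData(batch)
  }
}
```

In a real implementation the native side would also need to keep the imported schema alive and re-use it for subsequent batches, and there would need to be a fallback if the schema can differ between batches.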
Describe the potential solution
No response
Additional context
No response