Closed
Description
Currently, we can change the batch size of a DataFrame in two ways:
- DataFrame::parallelize
- DataFrame::collect
The first one splits Rows into smaller batches; the second one merges small Rows into a bigger batch.
Even though both work fine, it is not always easy to determine which one to use, and
their names can be confusing to users who do not understand how DataFrame works.
We can solve both of those problems by adding:
DataFrame::batchSize(int $size) : self
It would split or merge the processed rows so that batches match the given size.
It should also be more intuitive for users.
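
To make the intended behaviour concrete, here is a minimal sketch in plain PHP of what "split or merge based on the size" could mean. `rebatch()` is a hypothetical helper written for this illustration only; it is not part of the library and not a proposed implementation, and it assumes a batch can be modelled as a plain array of rows.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): regroups incoming batches
 * of rows into batches of exactly $size rows. Only the last batch may be smaller.
 *
 * @param iterable<array<int, mixed>> $batches batches of arbitrary sizes
 * @return \Generator<int, array<int, mixed>>
 */
function rebatch(iterable $batches, int $size): \Generator
{
    // Because this is a generator, the check runs when iteration starts.
    if ($size < 1) {
        throw new \InvalidArgumentException('Batch size must be at least 1.');
    }

    $buffer = [];

    foreach ($batches as $batch) {
        foreach ($batch as $row) {
            $buffer[] = $row;

            // Emit a batch as soon as the buffer reaches the requested size.
            if (\count($buffer) === $size) {
                yield $buffer;
                $buffer = [];
            }
        }
    }

    // Emit whatever is left over as a final, smaller batch.
    if ($buffer !== []) {
        yield $buffer;
    }
}

// One oversized batch is split into smaller ones.
$split = \iterator_to_array(rebatch([[1, 2, 3, 4, 5]], 2), false);

// Several undersized batches are merged into bigger ones.
$merged = \iterator_to_array(rebatch([[1], [2], [3], [4]], 3), false);
```

The same call covers both existing use cases: the caller only states the desired size and does not need to know whether the incoming batches are larger or smaller than that.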