Add batchSize method to DataFrame #718

@norberttech

Description

Currently we can manipulate the batch size of a DataFrame in two ways:

  • DataFrame::parallelize
  • DataFrame::collect

The first one splits Rows into smaller batches; the second one merges small Rows into a bigger batch.
Even though both work fine, it is not always easy to determine which one to use, and
their names might be confusing to users who don't understand how DataFrame works.

We can solve both of those problems by adding:

DataFrame::batchSize(int $size) : self

It would split or merge the processed rows so that each batch matches the given size.
It should also be more intuitive for users.
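
To make the intended behavior concrete, here is a minimal sketch of the split-or-merge logic in plain PHP. It does not use Flow's API: `rebatch()` is a hypothetical helper, and rows are represented as plain array elements purely for illustration. The assumption is that the final batch may be smaller than the requested size when the rows don't divide evenly.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of Flow: regroups incoming batches of rows
 * into batches of $size rows. Larger batches are split, smaller ones merged.
 *
 * @param iterable<array<mixed>> $batches
 *
 * @return \Generator<int, array<mixed>>
 */
function rebatch(iterable $batches, int $size) : \Generator
{
    if ($size < 1) {
        throw new \InvalidArgumentException('Batch size must be greater than 0');
    }

    $buffer = [];

    foreach ($batches as $batch) {
        foreach ($batch as $row) {
            $buffer[] = $row;

            // Emit a batch as soon as the buffer reaches the requested size.
            if (\count($buffer) === $size) {
                yield $buffer;
                $buffer = [];
            }
        }
    }

    // Emit whatever is left; the final batch may be smaller than $size.
    if ($buffer !== []) {
        yield $buffer;
    }
}

// Splitting: one batch of 5 rows becomes batches of 2, 2 and 1 rows.
$split = \iterator_to_array(rebatch([[1, 2, 3, 4, 5]], 2), false);

// Merging: five single-row batches become batches of 3 and 2 rows.
$merged = \iterator_to_array(rebatch([[1], [2], [3], [4], [5]], 3), false);
```

The same function covers both directions, which is the point of the proposal: the caller only states the desired size and does not need to know whether the incoming batches are too big or too small.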


Status

Done
