Closed
Description
To improve DX, Flow should try to detect whether the current batchSize is suitable for a given Loader.

    (new Flow())
        ->read(
            Dbal::from_limit_offset(
                $sourceDbConnection,
                'source_dataset_table',
                new OrderBy('id', Order::DESC)
            )
        )
        ->withEntry('id', ref('id')->cast('int'))
        ->withEntry('name', concat(ref('name'), lit(' '), ref('last_name')))
        ->drop('last_name')
        ->write(Dbal::to_table_insert($dbConnection, 'flow_dataset_table'))
        ->run();

In this example, batchSize is equal to 1, which means Flow will insert rows into the database one by one.
This is easy to change by putting batchSize(1_000) just above write, but it requires the developer to know how loaders work internally.
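To make the cost concrete, here is a rough illustration in plain PHP of how many write calls a loader would make at different batch sizes. `countWriteCalls` is a hypothetical helper written for this issue, not part of Flow, and it assumes one write call per batch.

```php
<?php

declare(strict_types=1);

/**
 * Counts how many write calls a loader would make for $rowCount rows
 * when rows are handed to it in batches of $batchSize.
 * Hypothetical helper for illustration only; not part of Flow.
 */
function countWriteCalls(int $rowCount, int $batchSize): int
{
    if ($rowCount < 0 || $batchSize < 1) {
        throw new InvalidArgumentException('rowCount must be >= 0 and batchSize >= 1');
    }

    // Integer ceiling division: the last batch may be only partially filled.
    return intdiv($rowCount + $batchSize - 1, $batchSize);
}

echo countWriteCalls(10_000, 1), PHP_EOL;     // 10000
echo countWriteCalls(10_000, 1_000), PHP_EOL; // 10
```

With 10,000 rows, a batch size of 1 means 10,000 separate inserts, while a batch size of 1,000 means 10.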
Instead, we can use the Optimizer to check the current batchSize whenever a Loader is added, and automatically apply one if it wasn't set.
The exact numbers should be predefined. I think we can start from 1k for each of the following:
- ElasticSearch
- Dbal
- Meilisearch
For file-based loaders this is irrelevant, as most of them write rows one by one.
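The rule proposed above could be sketched roughly as follows. Everything here is hypothetical: the function name, the string loader identifiers, and the use of `null` to mean "batchSize not set" are assumptions made for illustration, and none of it is Flow's actual Optimizer API.

```php
<?php

declare(strict_types=1);

// Hypothetical names throughout: this is not Flow's real Optimizer API.
const DEFAULT_BATCH_SIZE = 1_000;

// Loader types assumed to benefit from batched writes (taken from the list above).
const BATCH_FRIENDLY_LOADERS = ['elasticsearch', 'dbal', 'meilisearch'];

/**
 * Returns the batch size the pipeline should use once a loader is added.
 *
 * @param string   $loaderType        hypothetical loader identifier
 * @param int|null $explicitBatchSize value set by the developer, or null if unset
 */
function resolveBatchSize(string $loaderType, ?int $explicitBatchSize): ?int
{
    // Never override a value the developer chose explicitly.
    if ($explicitBatchSize !== null) {
        return $explicitBatchSize;
    }

    // Apply the predefined default only to loaders that write in bulk.
    if (in_array($loaderType, BATCH_FRIENDLY_LOADERS, true)) {
        return DEFAULT_BATCH_SIZE;
    }

    // File-based and other loaders are left untouched.
    return null;
}

var_dump(resolveBatchSize('dbal', null)); // int(1000)
var_dump(resolveBatchSize('dbal', 50));   // int(50)
var_dump(resolveBatchSize('csv', null));  // NULL
```

The important property is that an explicit batchSize always wins, so the optimization only changes pipelines where the developer did not make a choice.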