Make it possible to pack tasks in joblib.Parallel #112

@ogrisel

joblib.Parallel is not efficient at scheduling small tasks because of interprocess communication overhead. Currently this can be worked around by grouping the tasks manually and writing a wrapper function that processes one group of tasks per call, before calling joblib.Parallel.
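For reference, the manual workaround described above might look roughly like the sketch below. The helper names `chunked`, `apply_to_group`, and `parallel_grouped` are hypothetical and are not part of joblib; only `Parallel` and `delayed` are existing joblib names.

```python
from itertools import islice


def chunked(iterable, group_size):
    """Yield successive lists of at most `group_size` items (hypothetical helper)."""
    iterator = iter(iterable)
    while True:
        group = list(islice(iterator, group_size))
        if not group:
            return
        yield group


def apply_to_group(func, group):
    """Run `func` on every input of one group inside a single worker call."""
    return [func(one_input) for one_input in group]


def parallel_grouped(func, inputs, group_size=10, n_jobs=2):
    """Dispatch groups instead of single tasks, then flatten the results."""
    # Imported here so the helpers above stay usable without joblib installed.
    from joblib import Parallel, delayed

    grouped_results = Parallel(n_jobs=n_jobs)(
        delayed(apply_to_group)(func, group)
        for group in chunked(inputs, group_size)
    )
    # Parallel returns results in dispatch order, so flattening keeps input order.
    return [result for group in grouped_results for result in group]
```

With these helpers, a call such as `parallel_grouped(my_function, many_many_inputs, group_size=10, n_jobs=42)` sends one message per group of 10 inputs rather than one per input. This is the boilerplate that every user currently has to write by hand.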

It would be more convenient to make joblib.Parallel do grouped dispatch internally by just passing the size of the group:

import joblib
# Proposed API: group_size is the new parameter suggested in this issue
output = joblib.Parallel(n_jobs=42, group_size=10)(
    joblib.delayed(my_function)(one_input) for one_input in many_many_inputs)
