RFC Consider making auto-memmapping a manual operation #1376

Over the years, I have come to feel that the auto-memmapping feature of the multiprocessing/loky backends has caused a lot of maintenance trouble. It is too magical, and the problems it causes are complex to diagnose.

I think we might consider making the memmapping of the input arrays passed in a generator of tasks an explicit operation.

However, I am not sure what the API would look like. We still want robust and automated collection of the temporary data on disk, and that data is tied to the state of the worker pool.

Furthermore, we do not want to do any memmapping for many backends (e.g. thread, dask, ray...).
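For reference, here is a minimal sketch of what explicit memmapping can look like today using only existing joblib functions (`dump`, `load` with `mmap_mode`, and `Parallel` with `max_nbytes=None`). It is not a proposal for the new API. The helper name `explicit_memmap_row_sums` and the temporary folder prefix are made up for illustration. Note that cleanup of the on-disk data is the caller's job here, which is exactly the part that would need to be tied to the worker pool in a real design.

```python
import os
import shutil
import tempfile

import numpy as np
from joblib import Parallel, delayed, dump, load


def explicit_memmap_row_sums(array, n_jobs=2):
    """Sum each row in parallel, memmapping the input by hand.

    Hypothetical helper for illustration only; not part of joblib.
    """
    tmp_dir = tempfile.mkdtemp(prefix="explicit_memmap_")
    try:
        path = os.path.join(tmp_dir, "data.mmap")
        dump(array, path)                   # write the array to disk once
        shared = load(path, mmap_mode="r")  # reopen it as a read-only np.memmap
        # max_nbytes=None disables the size-based automatic memmapping,
        # so the only memmap involved is the one created explicitly above.
        results = Parallel(n_jobs=n_jobs, max_nbytes=None)(
            delayed(np.sum)(shared[i]) for i in range(shared.shape[0])
        )
        del shared  # drop our own mapping before deleting the file
        return [float(r) for r in results]
    finally:
        # Manual cleanup: nothing ties this folder to the pool's lifetime.
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

With a thread-based backend the `dump`/`load` step would simply be skipped, since threads already share the caller's memory.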

Even though this is still very fuzzy to me, I figured it would be a good idea to open an issue to help the design considerations settle.

Related to:

- #912
- https://github.com/scikit-learn/scikit-learn/pull/25172
- https://github.com/scikit-learn/scikit-learn/issues/19608
