Avoid duplicating data in cross_val_score #7919

I'm using a large dataset that takes up most of my memory. `cross_val_score` cannot run on it (I get a `MemoryError`) because, AFAICT, the data is duplicated for each fold (via the `list()` call here):
https://github.com/scikit-learn/scikit-learn/blob/38f6a91566bc643e2a8f76beb16f3e673faab848/sklearn/model_selection/_validation.py#L131

Would it be possible to use views or otherwise avoid this duplication?
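
To make the question concrete, here is a minimal NumPy-only sketch of the difference between a view and a copy when selecting a fold. It is not scikit-learn's code: `contiguous_folds` is a hypothetical helper written for this illustration, and it assumes unshuffled, contiguous folds. Basic slicing returns a view onto the original array, whereas indexing with an integer array always allocates a copy.

```python
import numpy as np


def contiguous_folds(n_samples, n_splits):
    """Yield (start, stop) bounds of contiguous test folds.

    Hypothetical helper for illustration only; it mimics an
    unshuffled K-fold split and is not part of scikit-learn.
    """
    fold_sizes = np.full(n_splits, n_samples // n_splits, dtype=int)
    fold_sizes[: n_samples % n_splits] += 1  # spread the remainder over the first folds
    start = 0
    for size in fold_sizes:
        stop = start + int(size)
        yield start, stop
        start = stop


X = np.arange(20, dtype=float).reshape(10, 2)

for start, stop in contiguous_folds(len(X), n_splits=5):
    test_view = X[start:stop]               # basic slicing: a view, no new data buffer
    test_copy = X[np.arange(start, stop)]   # integer-array indexing: always a copy
    assert np.shares_memory(X, test_view)
    assert not np.shares_memory(X, test_copy)
```

Note the limitation: only the test fold is a single contiguous slice here. The training portion is generally two disjoint pieces (before and after the test fold), so it cannot be expressed as one NumPy view, and shuffled splits cannot be expressed as slices at all.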
