I'm using a large dataset which takes up most of my memory. cross_val_score is basically unable to run on it (I get a MemoryError) since AFAICT data is duplicated for each fold (with list() here):
scikit-learn/sklearn/model_selection/_validation.py, line 131 in 38f6a91:

    cv_iter = list(cv.split(X, y, groups))

Would it be possible to use views or otherwise avoid this duplication?
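
To make the question concrete, here is a minimal sketch on toy data. The arrays and the choice of `KFold` are assumptions for illustration, not my real setup. As far as I can tell, the list built by `cv.split` only holds integer index arrays, and the memory cost comes from selecting rows with those arrays, since NumPy fancy indexing returns a copy while a slice returns a view:

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy stand-in for the large dataset (assumption: a dense NumPy array).
X = np.arange(20, dtype=np.float64).reshape(10, 2)
y = np.arange(10)

cv = KFold(n_splits=5)  # no shuffling, so each test fold is contiguous
cv_iter = list(cv.split(X, y))  # list of (train_idx, test_idx) index arrays

train_idx, test_idx = cv_iter[0]

# Selecting rows with an integer index array copies the selected rows.
X_train_copy = X[train_idx]
copy_shares = np.shares_memory(X, X_train_copy)

# A contiguous fold can be taken as a slice instead, which is a view on X.
X_test_view = X[test_idx[0]:test_idx[-1] + 1]
view_shares = np.shares_memory(X, X_test_view)

print("fancy-indexed fold shares memory with X:", copy_shares)
print("sliced fold shares memory with X:", view_shares)
```

The slice only works here because an unshuffled `KFold` yields contiguous test indices. The training indices are generally not contiguous, so a single view would not cover them.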