A fair number of estimators currently have copy=True (or copy_X=True) by default. In practice, this means that the code looks something like:
X = check_array(X, copy=copy)
followed by some other calculations that may or may not modify X in place. When the subsequent operations are not done in place, we have made a wasteful copy for no good reason.
As discussed in #13923, one example is Ridge(fit_intercept=False), which copies X although no copy is needed. Actually, I can't find any in-place operations on X in Ridge even with fit_intercept=True, but maybe I am missing something. (found it)
I think in general it would be better to avoid the,
X = check_array(X, copy=copy)
pattern, and instead make a copy explicitly where it is needed. Maybe it could be OK to not make a copy with copy=True if no copy is needed. Alternatively we could introduce copy=None by default.
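To make the proposal concrete, here is a minimal sketch of copying only where an in-place operation actually happens. It uses plain NumPy rather than scikit-learn internals; `center_if_needed` is a hypothetical helper invented for illustration, and `np.asarray` merely stands in for the validation step.

```python
import numpy as np


def center_if_needed(X, fit_intercept=True, copy=True):
    """Hypothetical helper: copy X only on the branch that modifies it."""
    # Validation stand-in: np.asarray does not copy when X is already
    # a float64 ndarray.
    X = np.asarray(X, dtype=np.float64)
    if not fit_intercept:
        # Nothing below writes to X, so no copy is made, even with copy=True.
        return X
    if copy:
        # Explicit copy placed right before the in-place operation.
        X = X.copy()
    X -= X.mean(axis=0)  # in-place centering
    return X


X = np.array([[1.0, 2.0], [3.0, 4.0]])
X_before = X.copy()

out_no_intercept = center_if_needed(X, fit_intercept=False, copy=True)
out_centered = center_if_needed(X, fit_intercept=True, copy=True)
```

In this sketch, copy=True is read as "do not modify the caller's X" rather than "always copy", which is one possible interpretation of the suggestion above, not established behavior.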
It would also help to add a common test that checks that Estimator(copy=True).fit(X, y) doesn't change X.
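A rough sketch of what such a check could look like. `ToyEstimator` and `check_copy_true_preserves_X` are hypothetical names used only for illustration; they are not part of scikit-learn's existing estimator checks.

```python
import numpy as np


class ToyEstimator:
    """Hypothetical stand-in for an estimator with a copy parameter."""

    def __init__(self, copy=True):
        self.copy = copy

    def fit(self, X, y=None):
        if self.copy:
            X = X.copy()
        X -= X.mean(axis=0)  # in-place centering
        self.mean_abs_ = np.abs(X).mean()
        return self


def check_copy_true_preserves_X(estimator_cls):
    """Fail if fitting with copy=True modifies the caller's X."""
    rng = np.random.RandomState(0)
    X = rng.normal(size=(20, 3))
    y = rng.normal(size=20)
    X_orig = X.copy()
    estimator_cls(copy=True).fit(X, y)
    np.testing.assert_array_equal(X, X_orig)


check_copy_true_preserves_X(ToyEstimator)
```

A real common test would presumably need to handle both copy and copy_X, and skip estimators that have neither parameter.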