Deprecate copy in Birch

`Birch` doesn't perform in-place operations (at least not on the input array), so the `copy` parameter is useless and should be deprecated. It's even detrimental, because by default it makes an unnecessary copy of the input.

The only place where an in-place operation happens is in the `update` method of `_CFSubcluster`: https://github.com/scikit-learn/scikit-learn/blob/11e8c216698370520a47d0639c69d959c0312a25/sklearn/cluster/_birch.py#L315-L320

However, `update` is called in only two places. The first is in the `_split_node` function, but there we first create two new `_CFSubcluster` objects, so `update` performs its in-place operations on newly created data and the input data is not modified. The second is in the `insert_cf_subcluster` method of `_CFNode`, but it is only triggered if the subcluster has a child, which can only come from split subclusters (i.e. after `_split_node`), so again we're not modifying the input data.

	def update(self, subcluster):
	    self.n_samples_ += subcluster.n_samples_
	    self.linear_sum_ += subcluster.linear_sum_
	    self.squared_sum_ += subcluster.squared_sum_
	    self.centroid_ = self.linear_sum_ / self.n_samples_
	    self.sq_norm_ = np.dot(self.centroid_, self.centroid_)
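
To make the aliasing argument concrete, here is a minimal NumPy-only sketch. `ToySubcluster` is a hypothetical stand-in, not scikit-learn's `_CFSubcluster`: it keeps only `n_samples_` and `linear_sum_`, and it assumes that an empty subcluster starts with a scalar `0` as its linear sum. It shows that `+=` only touches the input when `linear_sum_` is a view into `X`, and not when the accumulating subcluster was freshly created, which is the situation described above for `_split_node`.

```python
import numpy as np


class ToySubcluster:
    """Hypothetical, stripped-down stand-in for _CFSubcluster."""

    def __init__(self, linear_sum=None):
        if linear_sum is None:
            # Assumption: an empty subcluster starts from a scalar zero.
            self.n_samples_ = 0
            self.linear_sum_ = 0
        else:
            self.n_samples_ = 1
            self.linear_sum_ = linear_sum  # may alias a row of X

    def update(self, other):
        self.n_samples_ += other.n_samples_
        self.linear_sum_ += other.linear_sum_  # in place if already an array


X_before = np.array([[1.0, 2.0], [3.0, 4.0]])

# Case 1: the accumulating subcluster aliases a row of X, so += writes into X.
X = X_before.copy()
aliased = ToySubcluster(X[0])
aliased.update(ToySubcluster(X[1]))
aliased_modified_input = not np.array_equal(X, X_before)

# Case 2: the accumulating subcluster is freshly created. The first += on the
# scalar 0 allocates a new array, and later updates mutate only that array.
X = X_before.copy()
fresh = ToySubcluster()
fresh.update(ToySubcluster(X[0]))
fresh.update(ToySubcluster(X[1]))
fresh_modified_input = not np.array_equal(X, X_before)

print(aliased_modified_input, fresh_modified_input)
```

Only the first case changes `X`. If the reasoning in this issue holds, `Birch` only ever hits the second case, so skipping the copy should be safe.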

Deprecate copy in Birch #29092
