[DOC] Improve documentation of DBSCAN memory use #28493

Merged
jeremiedbb merged 1 commit into scikit-learn:main from kno10:patch-13
Feb 21, 2024
@kno10 kno10 commented Feb 21, 2024

Original DBSCAN only queries one point at a time.
It is a scikit-learn limitation that the bulk query may use quadratic memory.

A better description of the memory use is already found further down, in the Notes:

This implementation bulk-computes all neighborhood queries, which increases
the memory complexity to O(n.d) where d is the average number of neighbors,
while original DBSCAN had memory complexity O(n). It may attract a higher
memory complexity when querying these nearest neighborhoods, depending
on the ``algorithm``.
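
To make the difference concrete, here is a minimal pure-Python sketch (not scikit-learn code) that counts how many neighbor entries are held at once under each strategy. The helper names `region_query`, `peak_entries_one_at_a_time` and `entries_bulk` are hypothetical, the brute-force scan stands in for whatever index is actually used, and the one-at-a-time count ignores the O(n) labels and seed list that original DBSCAN also keeps.

```python
import random


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (brute force)."""
    xi, yi = points[i]
    return [
        j
        for j, (x, y) in enumerate(points)
        if (x - xi) ** 2 + (y - yi) ** 2 <= eps ** 2
    ]


def peak_entries_one_at_a_time(points, eps):
    """Query one point at a time; only the current neighborhood is alive."""
    peak = 0
    for i in range(len(points)):
        neighbors = region_query(points, i, eps)  # dropped on the next iteration
        peak = max(peak, len(neighbors))
    return peak


def entries_bulk(points, eps):
    """Materialize every neighborhood up front, as a bulk query would."""
    neighborhoods = [region_query(points, i, eps) for i in range(len(points))]
    return sum(len(nb) for nb in neighborhoods)  # n * d entries in total


random.seed(0)
points = [(random.random(), random.random()) for _ in range(500)]
n = len(points)

one = peak_entries_one_at_a_time(points, eps=0.1)
bulk = entries_bulk(points, eps=0.1)
huge = entries_bulk(points, eps=2.0)  # eps larger than the data diameter

print(f"n={n}, one-at-a-time peak={one}, bulk={bulk}, bulk with huge eps={huge}")
```

With a small `eps` the bulk count is n times the average neighborhood size d; once `eps` covers the whole data set, d reaches n and the bulk count becomes n squared, which is where the quadratic memory comes from.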

Funnily, the incorrect "DBSCAN needs quadratic memory" claim was introduced later, in #26783

@github-actions
✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: f836eac.

@jeremiedbb jeremiedbb left a comment

LGTM, thanks @kno10

@jeremiedbb jeremiedbb merged commit e318019 into scikit-learn:main Feb 21, 2024