DOC Added information about space complexity to docs DBSCAN#26783

Merged
adrinjalali merged 4 commits into scikit-learn:main from StefanieSenger:docs_dbscan
Jul 20, 2023

@StefanieSenger
Member

Reference Issues/PRs

#26726

What does this implement/fix? Explain your changes.

Added information about space complexity to the docstring, because users were wondering about the huge RAM usage when the eps parameter is high and the min_samples parameter is low.
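
To make the RAM concern concrete, here is a minimal pure-Python sketch. It assumes an implementation that stores every point's eps-neighborhood up front; the helper names `bulk_neighborhoods` and `stored_entries` are hypothetical and are not part of scikit-learn. It only counts how many neighbor indices would be held in memory for a small and a large eps.

```python
def bulk_neighborhoods(points, eps):
    """Return, for every point, the indices of all points within eps of it.

    Brute-force sketch: all neighborhoods are kept in memory at once.
    """
    eps_sq = eps * eps
    return [
        [
            j
            for j, q in enumerate(points)
            if sum((a - b) ** 2 for a, b in zip(p, q)) <= eps_sq
        ]
        for p in points
    ]


def stored_entries(points, eps):
    """Total number of neighbor indices held across all neighborhoods."""
    return sum(len(nbrs) for nbrs in bulk_neighborhoods(points, eps))


# 200 points on a line, spaced one unit apart (illustrative data only).
points = [(float(i),) for i in range(200)]

small = stored_entries(points, eps=0.5)     # each point only sees itself
large = stored_entries(points, eps=1000.0)  # every point sees every point

print(small, large)  # prints "200 40000"
```

With a small eps the stored entries grow linearly with the number of points; once eps covers the whole dataset, every neighborhood contains all n points and the total reaches n².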

@github-actions

github-actions bot commented Jul 6, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: d200e03.

@adrinjalali
Member

I think you used Alt+Q to format the code (line length) and it has formatted the whole docstring, resulting in a larger than needed diff.

Could you please redo those? You can also set it in the settings of your vscode that Alt+Q only formats the current paragraph instead of the whole section.

@StefanieSenger
Member Author

Okay, did it. Thanks for your support!

@adrinjalali
Member

I'm wondering if this is good as is, or if it should be a note (.. note :: kinda thing) right before the example.

@StefanieSenger changed the title from "DOC Added information about space complexity" to "DOC Added information about space complexity to docs DBSCAN" on Jul 7, 2023
@StefanieSenger
Member Author

I've added a sentence to clarify what min_samples tunes, because I find its naming not very intuitive.

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@adrinjalali adrinjalali enabled auto-merge (squash) July 13, 2023 14:03
@adrinjalali adrinjalali merged commit 889b829 into scikit-learn:main Jul 20, 2023
@StefanieSenger StefanieSenger deleted the docs_dbscan branch July 21, 2023 09:28
punndcoder28 pushed a commit to punndcoder28/scikit-learn that referenced this pull request Jul 29, 2023
…earn#26783)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Sep 18, 2023
…earn#26783)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
jeremiedbb pushed a commit that referenced this pull request Sep 20, 2023
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
…earn#26783)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@kno10
Contributor

kno10 commented Feb 21, 2024

The claim "The worst case memory complexity of DBSCAN is O(n²), which can occur when the eps param is large and min_samples is low." is incorrect.

(Original) DBSCAN has linear memory requirements, and worst-case quadratic distance computations. The O(n²) memory use is due to the sklearn implementation, as it was already documented in the "Notes" just a bit further down.

In #28493 I propose a revised claim.
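
The distinction between linear memory and quadratic distance computations can be sketched as follows. This is a minimal illustration of a DBSCAN variant that queries one neighborhood at a time, not the scikit-learn implementation; `region_query`, `dbscan_on_demand`, and the `peak` counter are hypothetical names introduced here. Each point is enqueued at most once, so the indices held at any moment stay within a small multiple of n even when eps is large, while the brute-force queries still perform up to n² distance computations.

```python
from collections import deque

NOISE = -1
UNVISITED = None


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (brute force)."""
    p = points[i]
    eps_sq = eps * eps
    return [
        j
        for j, q in enumerate(points)
        if sum((a - b) ** 2 for a, b in zip(p, q)) <= eps_sq
    ]


def dbscan_on_demand(points, eps, min_samples):
    """Cluster points, computing each neighborhood only when it is needed.

    Returns (labels, peak), where peak is the largest number of indices
    held at once in the current neighborhood plus the expansion queue.
    """
    labels = [UNVISITED] * len(points)
    cluster = -1
    peak = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        # A point with too few neighbors (itself included) is noise for now.
        if len(region_query(points, i, eps)) < min_samples:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque([i])
        while queue:
            j = queue.popleft()
            nbrs = region_query(points, j, eps)
            if len(nbrs) >= min_samples:  # j is a core point: expand from it
                for k in nbrs:
                    if labels[k] is UNVISITED:
                        labels[k] = cluster
                        queue.append(k)  # enqueued at most once per point
                    elif labels[k] == NOISE:
                        labels[k] = cluster  # border point, not expanded
            peak = max(peak, len(nbrs) + len(queue))
    return labels, peak


labels, peak = dbscan_on_demand(
    [(0.0,), (1.0,), (2.0,), (10.0,)], eps=1.5, min_samples=2
)
print(labels)  # prints "[0, 0, 0, -1]"
```

The trade-off in this sketch is that no neighborhood is reused, so it gives up the speed of a bulk neighbor query in exchange for a memory footprint that does not depend on eps.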
