ENH Introduce dtype preservation semantics in DistanceMetric objects. #27006
Merged
jjerphan merged 10 commits into scikit-learn:main on Aug 10, 2023
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
…into dm_float32
This reverts commit ee4faf4.
Contributor
Author
I've reintroduced the dtype-dependent signatures. Do you think this warrants a separate changelog entry considering |
Member
I don't think this is necessary since |
OmarManzoor
approved these changes
Aug 10, 2023
Contributor
OmarManzoor
left a comment
LGTM. Thanks @Micky774
Member
I merged |
Reference Issues/PRs
What does this implement/fix? Explain your changes.
Preserves dtype when computing distances, under the assumption that the precision of the input data implies the preferred precision of the output data. Note that accumulation still largely occurs using float64_t, with some exceptions.
Any other comments?
Current benchmarks (generated here) suggest that there is no regression in the dense case (dist), and a 10-25% speedup in the sparse case (dist_csr).
Benchmark Plots
Memory profiling indicates a reduction of memory usage in this script from 763MiB to 382MiB.
cc: @jjerphan @OmarManzoor @thomasjpfan
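For readers skimming the description, here is a minimal NumPy sketch of the dtype-preservation semantics described above: the result comes back in the dtype of the input, while the accumulation is done in float64. It is not the Cython code touched by this PR. The helper name pairwise_euclidean and the choice of the Euclidean metric are assumptions made purely for illustration.

```python
import numpy as np


def pairwise_euclidean(X, Y):
    """Hypothetical helper illustrating dtype preservation.

    Not scikit-learn's implementation: a plain NumPy sketch that
    accumulates in float64 and returns the result in X's dtype.
    """
    out_dtype = X.dtype
    # Upcast before subtracting so squared differences accumulate in float64.
    diff = X[:, None, :].astype(np.float64) - Y[None, :, :].astype(np.float64)
    dist = np.sqrt(np.einsum("ijk,ijk->ij", diff, diff))
    # Cast back so the output precision follows the input precision.
    return dist.astype(out_dtype, copy=False)


rng = np.random.default_rng(0)
X32 = rng.random((5, 3)).astype(np.float32)
X64 = X32.astype(np.float64)

D32 = pairwise_euclidean(X32, X32)  # float32 in, float32 out
D64 = pairwise_euclidean(X64, X64)  # float64 in, float64 out

print(D32.dtype, D32.nbytes)
print(D64.dtype, D64.nbytes)
```

In this sketch the float32 result occupies half the bytes of the float64 one, which is consistent with the roughly halved memory usage reported above, although the profiled script may differ in other ways.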