ENH add pairwise distances backend for LocalOutlierFactor #26316
OmarManzoor wants to merge 7 commits into scikit-learn:main
Conversation
Micky774
left a comment
Hey there @OmarManzoor, always wonderful seeing a PR from you :)
I know you are still early in the process of finalizing this PR and it is only marked as a draft, but I figured some (potentially unsolicited) advice may help smooth out the road in front of you.
I also wanted to mention that this PR will require some thorough benchmarking, both to confirm that there is indeed a performance gain and to evaluate the best deployment strategies (i.e. `parallel_on_{X, Y}`), since the asymmetry of the algorithm may require us to adopt a different heuristic than the default for `ArgKmin`.
Please do not hesitate to ping if you have any questions or concerns. Thanks!
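For readers following along: the quantity this backend would fuse with the k-NN search is LOF's local reachability density (lrd). A minimal NumPy sketch of the standard definition (function and variable names here are illustrative, not the PR's actual Cython backend):

```python
import numpy as np

def local_reachability_density(dist, neigh_ind, k_dist):
    """Standard LOF lrd: inverse of the mean reachability distance.

    dist:      (n_samples, k) distances to each point's k nearest neighbors
    neigh_ind: (n_samples, k) indices of those neighbors
    k_dist:    (n_samples,) distance from each point to its k-th neighbor
    """
    # reach_dist(a, b) = max(k_dist(b), d(a, b))
    reach_dist = np.maximum(dist, k_dist[neigh_ind])
    # A small epsilon guards against division by zero for duplicated points.
    return 1.0 / (np.mean(reach_dist, axis=1) + 1e-10)
```

A fused backend would compute `reach_dist` chunk by chunk inside the argkmin pass instead of materializing the full `(n_samples, k)` arrays afterwards.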
sklearn/metrics/_pairwise_distances_reduction/_argkmin_lrd.pyx.tp (6 resolved review threads, outdated)
Micky774
left a comment
Looking good so far. Aside from the missing test, please ping me once you have some benchmark results ready :)
```python
raise ValueError(
    "Only float64 or float32 datasets pairs are supported at this time, "
    f"got: X.dtype={X.dtype} and Y.dtype={Y.dtype}."
)
```
Need to add a test for this
@OmarManzoor: you can use asv to easily run benchmarks between those revisions. You can take some inspiration from the existing ASV benchmark definitions and adapt them for this PR.
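If a full asv run is too slow for a first sanity check, timings can also be taken directly with the standard library. A sketch (the toy workload is illustrative; for this PR you would plug in the `LocalOutlierFactor(...).fit(X_train)` call on each branch):

```python
import timeit

def best_of(fn, repeat=5):
    # Run fn several times once each and keep the minimum: for a
    # deterministic workload the minimum is the least noisy estimate.
    return min(timeit.repeat(fn, number=1, repeat=repeat))

# Hypothetical workload; replace with the estimator call being benchmarked,
# e.g. lambda: LocalOutlierFactor(n_neighbors=10).fit(X_train)
elapsed = best_of(lambda: sum(range(100_000)))
print(f"{elapsed:.6f}s")
```

This gives a quick min-of-N wall-clock number, not asv's statistical machinery, so treat it only as a first-pass comparison.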
Right, thank you.
@jjerphan Could you kindly have a look at whether this benchmark seems correct: argkmin_lrd_benchmark?
I think this is going to take a lot of time to run.
@jjerphan @Micky774 These are the results that I got so far. I don't think this looks good: the performance seems to have decreased.
Can you try profiling (e.g. with Linux profiling tools)?
Thank you!
@Micky774 I tried using cProfile and these are the results for the `_argkmin_lrd.pyx` file.

Code:

```python
import cProfile
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

n_train = 1000
n_features = 100
rng = np.random.RandomState(0)
X_train = rng.rand(n_train, n_features).astype(np.float32)
cProfile.run("LocalOutlierFactor(n_neighbors=10).fit(X=X_train)")
```

```
ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     1    0.000    0.000    0.000    0.000  _argkmin_lrd.pyx:134(_finalize_results)
     1    0.031    0.031    0.074    0.074  _argkmin_lrd.pyx:34(compute)
     1    0.000    0.000    0.015    0.015  _argkmin_lrd.pyx:64(__init__)
```

It seems like the `compute` method is taking the actual time.
Python and native code can be profiled with `py-spy`. I would recommend setting thread affinity so that only one thread is used; on GNU/Linux, this can be done when launching the process. Profile files can be uploaded here so that they can be inspected by readers as well.
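One way to apply the thread-affinity advice without extra tooling is from Python itself, before the profiled workload runs. A sketch using only the standard library (`os.sched_setaffinity` is a Linux-only API; the helper's name is mine):

```python
import os

def pin_to_single_core(core=0):
    """Restrict the current process to one CPU core (Linux-only API),
    so that the profile is not confounded by thread parallelism."""
    os.sched_setaffinity(0, {core})   # pid 0 means "the current process"
    return os.sched_getaffinity(0)    # return the new affinity mask
```

Calling this at the top of the benchmark script, before `fit`, keeps all work on one busy core for the profiler to attach to.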
I think I realized what the issue is with the significant drop: the metric was being set to euclidean, which has a specialized implementation when we use `kneighbors`.
Cool!
These are the latest benchmarks with float32.

Benchmark code:

```python
import numpy as np
from .common import Benchmark
from sklearn.neighbors import LocalOutlierFactor


class ArgKminLRDBenchmark(Benchmark):
    param_names = ["n_train", "n_features"]
    params = [
        [1000, 10000, 100000],
        [100],
    ]

    def setup(self, n_train, n_features):
        rng = np.random.RandomState(0)
        self.X_train = rng.rand(n_train, n_features).astype(np.float32)
        self.y_train = rng.randint(low=-1, high=1, size=(n_train,))

    def time_fit(self, n_train, n_features):
        est = LocalOutlierFactor(n_neighbors=10, metric="manhattan").fit(X=self.X_train)
        self.estimator_ = est
```

PR - time_fit
```
========= ============
--        n_features
--------- ------------
n_train   100
========= ============
1000      16.9±0.9ms
10000     926±20ms
100000    1.54±0.2m
========= ============
```

MAIN - time_fit

```
========= ============
--        n_features
--------- ------------
n_train   100
========= ============
1000      16.5±0.7ms
10000     926±9ms
100000    1.49±0.01m
========= ============
```
Benchmarks with float64.

Benchmark code:

```python
import numpy as np
from .common import Benchmark
from sklearn.neighbors import LocalOutlierFactor


class ArgKminLRDBenchmark(Benchmark):
    param_names = ["n_train", "n_features"]
    params = [
        [1000, 10000, 100000],
        [100],
    ]

    def setup(self, n_train, n_features):
        rng = np.random.RandomState(0)
        self.X_train = rng.rand(n_train, n_features).astype(np.float64)
        self.y_train = rng.randint(low=-1, high=1, size=(n_train,))

    def time_fit(self, n_train, n_features):
        LocalOutlierFactor(n_neighbors=10, metric="manhattan").fit(X=self.X_train)
```

PR - time_fit
```
========= ============
--        n_features
--------- ------------
n_train   100
========= ============
1000      18.6±0.1ms
10000     1.11±0.02s
100000    1.77±0m
========= ============
```

MAIN - time_fit

```
========= ============
--        n_features
--------- ------------
n_train   100
========= ============
1000      18.7±0.4ms
10000     1.11±0.02s
100000    1.80±0.01m
========= ============
```
It looks like no significant improvement is observed.
Even with the number of neighbors set to 1000, no improvement was observed.

Benchmark code:

```python
import numpy as np
from .common import Benchmark
from sklearn.neighbors import LocalOutlierFactor


class ArgKminLRDBenchmark(Benchmark):
    param_names = ["n_train", "n_features"]
    params = [
        [100000],
        [100],
    ]

    def setup(self, n_train, n_features):
        rng = np.random.RandomState(0)
        self.X_train = rng.rand(n_train, n_features).astype(np.float32)

    def time_fit(self, n_train, n_features):
        LocalOutlierFactor(n_neighbors=1000, metric="manhattan").fit(X=self.X_train)
```

```
[  0.00%] · For scikit-learn commit d15e8c09 <pairwise_distance_backend_for_lof> (round 1/1):
[  0.00%] ·· Benchmarking conda-py3.9-cython-joblib-numpy-pandas-scipy-threadpoolctl
[ 50.00%] ··· argkmin_lrd.ArgKminLRDBenchmark.time_fit                          ok
[ 50.00%] ··· ========= ============
              --        n_features
              --------- ------------
              n_train   100
              ========= ============
              100000    1.68±0.01m
              ========= ============
[ 50.00%] · For scikit-learn commit 55af30d9 <main> (round 1/1):
[ 50.00%] ·· Building for conda-py3.9-cython-joblib-numpy-pandas-scipy-threadpoolctl..
[ 50.00%] ·· Benchmarking conda-py3.9-cython-joblib-numpy-pandas-scipy-threadpoolctl
[100.00%] ··· argkmin_lrd.ArgKminLRDBenchmark.time_fit                          ok
[100.00%] ··· ========= =========
              --        n_features
              --------- ---------
              n_train   100
              ========= =========
              100000    1.68±0m
              ========= =========

BENCHMARKS NOT SIGNIFICANTLY CHANGED.
```
After inspecting with py-spy: @jjerphan @OmarManzoor please do sanity check me on this, but AFAIK this is just ill-suited for further optimization via a dedicated backend. This is good information though, since it allows us to focus efforts in other areas -- and it probably should also set the precedent of confirming bottlenecks with an initial profiling before the work is done, rather than after :) Once you two are satisfied with this explanation, we can close this and move on.
Thank you for inspecting with py-spy. I think what you are saying makes sense and reinforces the observed benchmarks.
@jjerphan I think we can close this PR then?
Closing since no performance improvement can be reached. Sorry, @OmarManzoor. 🤦♂️ I wish I had provided more solid evidence about the room for performance improvement beforehand, for instance with a profile of the current execution of the algorithms.
I think I will reword the issue description in this regard.
No, no, there is no need to apologize. It was a good learning experience. We can try out
❤️
I have updated #25888 to reflect the other ongoing/open work for finalizing the optimization of
Reference Issues/PRs
Towards #25888
What does this implement/fix? Explain your changes.
Any other comments?
CC: @jjerphan Could you kindly have a look and see how this looks?