[MRG] MNT: Use nrm2 to find the residuals squared #11923
Merged
jnothman merged 1 commit into scikit-learn:master on Aug 31, 2018
Contributor
Author
Quick benchmark below for comparison.

In [1]: import numpy as np
In [2]: from scipy import linalg
In [3]: a = 2 * np.random.random((100, 110)) - 1
In [4]: %timeit b = a.copy();
4.66 µs ± 181 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [5]: %timeit b = a.copy(); b **= 2; b.sum()
19.5 µs ± 371 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [6]: nrm2, = linalg.get_blas_funcs(('nrm2',), (a,))
In [7]: %timeit nrm2(a) ** 2
10.7 µs ± 66.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Using BLAS's `nrm2` is a bit faster than squaring the residuals in place and summing them, so switch to using `nrm2` instead. Interestingly, it doesn't appear necessary to flatten the array first, as the BLAS function interprets the array as flat under the hood.
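As a rough illustration of the equivalence described above (not the code changed in this PR), the sketch below compares the sum of squares computed by squaring and summing against `nrm2(...) ** 2`, both with and without flattening. The array shape matches the benchmark; the variable names and the seeded random generator are assumptions made for the example.

```python
import numpy as np
from scipy import linalg

# Hypothetical stand-in for a residuals matrix (same shape as the benchmark).
rng = np.random.default_rng(0)
residuals = 2 * rng.random((100, 110)) - 1

# Baseline: copy, square in place, then sum all entries.
squared = residuals.copy()
squared **= 2
sum_of_squares = squared.sum()

# BLAS route: nrm2 returns the Euclidean norm, so squaring it
# gives the sum of squared entries.
nrm2, = linalg.get_blas_funcs(('nrm2',), (residuals,))
via_nrm2 = nrm2(residuals) ** 2

# Same computation on an explicitly flattened view, for comparison.
via_nrm2_flat = nrm2(residuals.ravel()) ** 2

print(np.isclose(sum_of_squares, via_nrm2))
print(np.isclose(via_nrm2, via_nrm2_flat))
```

The two results agree only up to floating-point rounding, since `nrm2` takes a square root that is then squared again, so a tolerance-based comparison such as `np.isclose` is the appropriate check.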
agramfort
approved these changes
Aug 28, 2018
jnothman
reviewed
Aug 29, 2018
Member
jnothman
left a comment
The gains here seem so tiny for something that's only called once per iteration (for a default 100 iterations) amidst much other logic. Why bother making this change?
Member
And which is more readable?
Contributor
Author
We already use
jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request on Sep 2, 2018
jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request on Sep 17, 2018