[MRG+1] Made PCA expose the singular values by tomlof · Pull Request #7685 · scikit-learn/scikit-learn

tomlof · 2016-10-17T13:23:18Z

Reference Issue

What does this implement/fix? Explain your changes.

I've made it so that the singular values from the underlying SVD are stored in the PCA classes under the attribute "singular_values_".

This was a trivial fix: all I did was save the singular values in an instance variable. I also added the attribute to the list of attributes in the documentation and added doc tests for it. I also added unit tests (test_singular_values) that cover the estimators PCA, IncrementalPCA and TruncatedSVD to the corresponding tests in sklearn/decomposition/tests.

Any other comments?

It may look like there are loads of commits in this fix, but the older commits listed below are things I was working on years ago that are no longer in my fork. This fix only adds changes to the modules "decomposition/incremental_pca.py", "decomposition/pca.py" and "decomposition/truncated_svd.py" and to their corresponding unit tests, and the changes are made on top of the latest commit to the main scikit-learn repository.

amueller

Looks good apart from minor comments. I'm not sure whether we want a test or not.

amueller · 2016-10-17T21:48:17Z

sklearn/decomposition/incremental_pca.py

-        to 1.0
+        to 1.0.
+
+    singular_values_ : array, [n_components]


please say array, shape (n_components,)

amueller · 2016-10-17T21:48:36Z

sklearn/decomposition/incremental_pca.py

        self.singular_values_ = None
        self.explained_variance_ = None
        self.explained_variance_ratio_ = None
+        self.singular_values_ = None


Is there any reason to add this here?

Not really, but the other attributes were set there, so I thought it would be good for consistency to have the singular values be set there as well.

amueller · 2016-10-17T21:49:24Z

sklearn/decomposition/pca.py

        If ``n_components`` is not set then all components are stored and the
        sum of explained variances is equal to 1.0.

+    singular_values_ : array, [n_components]


hm here all the docstrings seem to be using a different convention. I like shape + tuple better, that's the standard numpy way. But I don't mind that much.

I've updated them all to follow the numpy standard.
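For readers unfamiliar with the convention being discussed, here is a minimal sketch of an attribute entry written in the "shape + tuple" style. The class name `ExampleDecomposition` is hypothetical and exists only to carry the docstring; it is not part of scikit-learn.

```python
class ExampleDecomposition:
    """Hypothetical estimator, used only to illustrate the docstring layout.

    Attributes
    ----------
    singular_values_ : array, shape (n_components,)
        The singular values corresponding to each of the selected components.
    """


# The attribute entry is ordinary docstring text and can be inspected as such.
print(ExampleDecomposition.__doc__)
```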

amueller · 2016-10-17T21:50:15Z

sklearn/decomposition/pca.py

        self.explained_variance_ = exp_var = (S ** 2) / n_samples
        full_var = np.var(X, axis=0).sum()
        self.explained_variance_ratio_ = exp_var / full_var
+        self.singular_values_ = S.copy()  # Store the singular values.


why do you need to copy?

No need to copy here. I've removed it.
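To illustrate the point about the copy: plain assignment binds a second name to the same NumPy array, whereas `.copy()` allocates a new buffer. The sketch below uses a stand-in array `S` with made-up values. It assumes nothing modifies `S` after it is stored, in which case the copy gains nothing.

```python
import numpy as np

# Stand-in for the singular values returned by an SVD (illustrative values).
S = np.array([3.0, 2.0, 1.0])

stored = S         # plain assignment: a second name for the same array
copied = S.copy()  # explicit copy: a new, independent buffer

print(stored is S)                   # same object
print(np.shares_memory(copied, S))   # the copy does not share memory with S
```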

amueller · 2016-10-17T21:51:01Z

sklearn/decomposition/truncated_svd.py

+
+    singular_values_ : array, [n_components]
+        The singular values corresponsing to each of the selected components.
+        The singular values corresponds to the 2-norms of the ``n_components``


Maybe say "are the 2-norms" or "are equal to". You just used "corresponding". Also, there's a typo in "corresponding".
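The 2-norm statement in the docstring can be checked numerically. This is a minimal sketch using plain NumPy rather than the estimators themselves. It assumes the docstring refers to the columns of the data after projection onto the components; the data and variable names are illustrative.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(30, 5)

# Center the data, then take a thin SVD: Xc = U * diag(S) * Vt.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Project the centered data onto the right singular vectors.
projected = Xc.dot(Vt.T)

# Each projected column has a 2-norm equal to the matching singular value.
column_norms = np.linalg.norm(projected, axis=0)
print(np.allclose(column_norms, S))
```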

jnothman · 2016-10-19T12:51:32Z

sklearn/decomposition/pca.py

        explained_variance_ = (S ** 2) / n_samples
        total_var = explained_variance_.sum()
        explained_variance_ratio_ = explained_variance_ / total_var
+        singular_values_ = S.copy()  # Store the singular values.


Can't this be calculated by the user as np.sqrt(explained_variance_ * n_samples)?

Sorry. Stupid question. I've read the issue and figure this is all about making something comparable available in TruncatedSVD.
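The relation raised in the question can be sketched with plain NumPy. The divisor `n_samples` is taken from the diff shown above. If an estimator or release normalises the variance differently, the inverse formula changes accordingly. Data and variable names are illustrative.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(50, 4)
n_samples = X.shape[0]

# Center the data and compute the singular values of the centered matrix.
Xc = X - X.mean(axis=0)
S = np.linalg.svd(Xc, compute_uv=False)

# Normalisation as written in the diff: explained variance = S**2 / n_samples.
explained_variance = (S ** 2) / n_samples

# Inverting that normalisation recovers the singular values.
recovered = np.sqrt(explained_variance * n_samples)
print(np.allclose(recovered, S))
```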

jnothman · 2016-10-19T12:54:27Z

Please add tests.

tomlof · 2016-10-25T15:03:10Z

Do you mean to add more doc tests, or to add a unit test?

amueller · 2016-10-25T15:55:53Z

unit test

tomlof · 2016-10-25T22:09:19Z

Ok, I've added unit tests for the singular values to PCA, IncrementalPCA and TruncatedSVD.

amueller · 2016-10-26T19:03:14Z

Thanks. The tests are failing though.

jnothman

LGTM and thanks for the great tests! I only wonder if we should be refactoring the tests.

amueller · 2016-10-27T15:33:26Z

yeah it would be nice not to duplicate the code as much, but otherwise LGTM.

jnothman · 2016-10-27T21:09:54Z

@tomlof please add an entry in what's new and let us know so we can merge.

tomlof · 2016-10-27T22:06:16Z

Ok. I've updated the pull request description so that it mentions the unit tests.

jnothman · 2016-10-29T10:57:32Z

What's new is doc/whats_new.rst, our changelog.

jnothman · 2016-10-29T12:21:39Z

sklearn/decomposition/truncated_svd.py

            random_state=42, tol=0.0)
    >>> print(svd.explained_variance_ratio_)  # doctest: +ELLIPSIS
-    [ 0.0782... 0.0552... 0.0544... 0.0499... 0.0413...]
+    [ 0.0606... 0.0584... 0.0497... 0.0434... 0.0372...]


Why did these change?

Someone else had changed those doctests. You can see this in my merge commit 279fd60 above. I reverted the change, but when I ran the tests they failed, so I had to change the values back to what I had pulled from the main repo. That change back was included in ae86e2f.

I didn't check what had caused the expected output to change; I just assumed that since it was included in the main repo it had been validated. Perhaps someone updated sparse_random_matrix?

It appears to be the product of a change in sample_without_replacement in commit edc9e7. Let me know if there is anything else you want me to change.

Oh you're saying it was a merge error on your part and it's since been fixed to reflect master. All good, then.

jnothman · 2016-10-30T03:17:02Z

doc/whats_new.rst

+   - :class:`decomposition.PCA`, :class:`decomposition.IncrementalPCA` and
+     :class:`decomposition.TruncatedSVD` now expose the singular values
+     from the underlying SVD. They are stored in the attribute
+     `singular_values_`, like in :class:`decomposition.IncrementalPCA`.


fixed-width font requires double-backticks in RST

Thanks, I've updated.

jnothman · 2016-10-30T10:07:59Z

Thanks @tomlof

amueller · 2016-11-16T16:45:52Z

This seems to have caused a test failure in master: #7893

Tommy Löfstedt and others added 15 commits January 24, 2013 14:28

Manually resolved merge conflict

c637fb0

Added sparse NIPALS

8897ab5

Added sparse PCA (L1 penalised)

d2f913f

Work in progress. Added SVD and PLS-R.

4687649

Work in progress. Updated PLS-R and added soft thresholding to it

ef16c4c

Merge branch 'master' of https://github.com/scikit-learn/scikit-learn

2324939

Work in progress: Updated PLS-R.

e807663

Work in progress: Added several unit tests.

23c701a

Work in progress: Rescue-save.

510a9a1

Work in progress: Rescue-save.

c11f175

Work in progress.

5b4c183

Work in progress.

948f19a

Merge.

fd6c962

Merge.

38d7e1d

ENH: Added the singular values to PCA by a singular_values_ instance …

e3acdbf

…variable as per Issue scikit-learn#6955.

amueller requested changes Oct 17, 2016

DOC: Updates as per PR review.

51a71f0

jnothman reviewed Oct 19, 2016

tomlof added 2 commits October 25, 2016 23:03

Merge branch 'master' of https://github.com/scikit-learn/scikit-learn …

e9fe5e7

…into pca_expose_singular_values

TEST: Added unit tests for PCA, IncrementalPCA and TruncatedSVD.

c4a4943

tomlof added 2 commits October 26, 2016 19:47

BUG: Removed the use of new features from numpy.

44ab311

Merge branch 'master' of https://github.com/scikit-learn/scikit-learn …

15816fe

…into pca_expose_singular_values

tomlof added 3 commits October 26, 2016 22:59

TEST: Reduced the error thresholds in PCA singular value tests.

566f49d

TEST: Reduced the error thresholds in PCA singular value tests.

d3df7ed

MAINT: PEP8 compliance.

ce17751

jnothman approved these changes Oct 26, 2016

jnothman changed the title ~~Made PCA expose the singular values~~ [MRG+1] Made PCA expose the singular values Oct 26, 2016

amueller approved these changes Oct 27, 2016

tomlof added 3 commits October 29, 2016 13:35

MAINT: Merge.

279fd60

DOC: Updated whats_new.rst to include news in PCA classes.

9fc3420

TEST: Fixed doctests for truncated PCA.

ae86e2f

jnothman reviewed Oct 29, 2016

jnothman reviewed Oct 30, 2016

tomlof added 2 commits October 30, 2016 09:57

Merge branch 'master' of https://github.com/scikit-learn/scikit-learn …

2b731f0

…into pca_expose_singular_values

DOC: Fixed typo in whats_new.rst.

b5a6356

jnothman merged commit 4ae1013 into scikit-learn:master Oct 30, 2016

tomlof deleted the pca_expose_singular_values branch November 2, 2016 11:15

sergeyf pushed a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017

[MRG+1] ENH PCA variants now expose singular values (scikit-learn#7685)

1d05f81

Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

[MRG+1] ENH PCA variants now expose singular values (scikit-learn#7685)

142714e

paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

[MRG+1] ENH PCA variants now expose singular values (scikit-learn#7685)

d026214

rth mentioned this pull request Oct 18, 2017

decomposition.PCA has no attribute 'singular_values_' #9949

Closed

maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

[MRG+1] ENH PCA variants now expose singular values (scikit-learn#7685)

f1185b0

olegstikhin mentioned this pull request May 3, 2019

[MRG] DOC Added version information for PCA.singular_values_ #13776

Merged
