[MRG] Limiting n_components by both n_features and n_samples instead of just n_features#8486
[MRG] Limiting n_components by both n_features and n_samples instead of just n_features#8486wallygauze wants to merge 6 commits intoscikit-learn:masterfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## master #8486 +/- ##
==========================================
+ Coverage 95.48% 95.48% +<.01%
==========================================
Files 342 342
Lines 61000 61023 +23
==========================================
+ Hits 58246 58269 +23
Misses 2754 2754
Continue to review full report at Codecov.
|
|
You have flake8 failures that you need to understand. Also use a descriptive title please. |
|
Since my pull request introduces no change in coverage, I believe this codecov fail is irrelevant. |
|
please add a non-regression test. Start from the gist in the issue: |
|
While running the tests, I realised that pca.py can't handle the user not entering the number of components if the solver is 'arpack' : with that particular solver, n_components must be strictly less than min(n_samples, n_features), so the default (exactly min(n_samples, n_features)) would fail. |
sklearn/decomposition/pca.py
Outdated
| explained is greater than the percentage specified by n_components | ||
| n_components cannot be equal to n_features for svd_solver == 'arpack'. | ||
| explained is greater than the percentage specified by n_components. | ||
| if svd_solver == 'arpack', the number of components must be strictly |
| # matrix X) raise errors. | ||
| X = np.array([[0, 1, 0], [1, 0, 0]]) | ||
| for solver in solver_list: | ||
| for n_components in [-1, 3]: |
There was a problem hiding this comment.
You should use assert_raises_regex to check also the message return and not only assert_raises.
|
Thank you @glemaitre . |
|
There was seemingly a problem with Travis until recently, so I am assuming that is why no checks were performed. I am going to open a new pull request, so all checks are run. |
|
you don't have to close |
|
@glemaitre I did that, but it seemed I could no longer reopen this pull-request because amending the last commit rewrote history (which makes sense). Did I misunderstand what you advised? I perhaps should have pushed an empty commit on top. Regardless, I have opened a new pull-request already |
|
You should have reopen the PR and then do the amend and push. I think that if you push somewhere else you cannot reopen. Anyway it is fine. Guillaume Lemaitre INRIA Saclay Ile-de-France / Equipe PARIETALguillaume.lemaitre@inria.fr - https://glemaitre.github.io/
|
Reference Issue
Fixes #8484
What does this implement/fix? Explain your changes.
Documentation changes:
n_component parameter documentation, & mentions elsewhere in parameters section
n_components_ attribute documentation
unrelated (to issue) extra: corrected documentation for explained_variance_ratio_ attribute