Skip to content

[MRG+2] warning for PCA with sparse input#7649

Merged
amueller merged 1 commit intoscikit-learn:masterfrom
giorgiop:pca-sparse-input
Oct 18, 2016
Merged

[MRG+2] warning for PCA with sparse input#7649
amueller merged 1 commit intoscikit-learn:masterfrom
giorgiop:pca-sparse-input

Conversation

@giorgiop
Copy link
Copy Markdown
Contributor

@giorgiop giorgiop commented Oct 12, 2016

Fix for #7642.

@jnothman
Copy link
Copy Markdown
Member

Huh? If check_array here rejects sparse then why does #7642 (comment) claim it gets as far as subtracting a mean?

I don't think a warning followed by an exception is an improvement; but the exception should be raised and this behaviour tested.

@giorgiop
Copy link
Copy Markdown
Contributor Author

I think the point in #7642 is that, since there is an instruction subtracting the mean, the developer did not have in mind to support input sparsity in pca. This is indeed the case and in fact there is a check_arracy to perform exactly that check.

The issue converged to the point that the error is not informative enough and misled people into using sparse input with pca. What do you suggest instead of the additional warning? I don't like it very much either

@amueller
Copy link
Copy Markdown
Member

You can immediately raise an error instead of a warning?

@giorgiop
Copy link
Copy Markdown
Contributor Author

Now: raising an error and testing it.

@giorgiop giorgiop changed the title warning for PCA with sparse input [MRG] warning for PCA with sparse input Oct 17, 2016
@amueller
Copy link
Copy Markdown
Member

test failure

@giorgiop giorgiop force-pushed the pca-sparse-input branch 4 times, most recently from 705795c to 952d042 Compare October 18, 2016 04:22
@giorgiop
Copy link
Copy Markdown
Contributor Author

Should be fixed now.


def test_pca_spase_input():

X = np.random.rand(5, 4)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to use a seeded RNG, for instance "X = np.random.RandomState(0).rand(5, 4)". We really don't want to leave any non seeded RNG in the test suite.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't believe I forgot this :) Thanks.

@GaelVaroquaux
Copy link
Copy Markdown
Member

+1 aside the minor change I asked for.

@GaelVaroquaux GaelVaroquaux changed the title [MRG] warning for PCA with sparse input [MRG+1] warning for PCA with sparse input Oct 18, 2016
@GaelVaroquaux
Copy link
Copy Markdown
Member

+1. @jnothman : do you still have changes requests, or are you +1?

@jnothman
Copy link
Copy Markdown
Member

LGTm

@jnothman jnothman changed the title [MRG+1] warning for PCA with sparse input [MRG+2] warning for PCA with sparse input Oct 18, 2016
@amueller amueller merged commit 0198e2c into scikit-learn:master Oct 18, 2016
amueller pushed a commit to amueller/scikit-learn that referenced this pull request Oct 25, 2016
Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants