Added user guide documentation for permutation_test_score#14757
Added user guide documentation for permutation_test_score#14757aditi9783 wants to merge 9 commits intoscikit-learn:masterfrom
Conversation
|
@aditi9783 should we close this PR in favor of #14769? |
|
@NicolasHug Sure. |
…ction "Obtaining predictions by cross-validation" and modified function help text to follow PEP8.
|
Actually re-opening so that @aditi9783 can update (will probably close #14769 then ) |
NicolasHug
left a comment
There was a problem hiding this comment.
Thanks @aditi9783 and @vmanisha. I merged with master for the docs to compile (not sure what went wrong), and also made some very minor modifications.
Looks good!
| It also returns cross validation scores for each permutation of y labels. It | ||
| permutes the labels of the samples and computes the p-value against the null | ||
| hypothesis that the features and the labels are independent, meaning that there | ||
| is no difference between the classes. |
There was a problem hiding this comment.
"classes" implies classification, which we aren't necessarily talking about here.
There was a problem hiding this comment.
The function permutes class labels. Isn't classification implied in that case?
There was a problem hiding this comment.
No, it peemutes target values for each sample, not class labels
|
|
||
| It also returns cross validation scores for each permutation of y labels. It | ||
| permutes the labels of the samples and computes the p-value against the null | ||
| hypothesis that the features and the labels are independent, meaning that there |
There was a problem hiding this comment.
Should "features" be "predictions"? Or is this right, if we add "conditioned on the estimator"?
There was a problem hiding this comment.
Features are the input vectors (X), and thus not predictions.
There was a problem hiding this comment.
Yeah, so you need to add "given the estimator"
| approximates the probability that the average cross-validation score would be | ||
| obtained by chance if the target is independent of the data. | ||
|
|
||
| It also returns cross validation scores for each permutation of y labels. It |
There was a problem hiding this comment.
Talking about return value here confuses the matter if what we want to talk about is how the P value is constructed
There was a problem hiding this comment.
This documentation is for user guide. The function description and the paper cited give more details about how the p-value is constructed.
There was a problem hiding this comment.
Ordinarily our user guide is less focused on API things like what the function returns. Conversely, a description of how the algorithm works belongs in the user guide not in the docstring.
There was a problem hiding this comment.
Hi, we're trying to get all of the PR's from the WiMLDS sprint wrapped up, are we waiting on @aditi9783, or approval from another reviewer? Thanks!
Not completely sure what's going on with this one @reshamas
There was a problem hiding this comment.
@kellycarmody We are waiting on @aditi9783
@aditi9783 We use the user guide to explain the math with the details necessary to explain the function. The function's docstring is usually brief and links to the user guide. In this case, moving most of the docstring into the user guide would be good.
There was a problem hiding this comment.
Hi @thomasjpfan, @reshamas asked me to take over this PR, but I see that Nicolas Hug already approved the changes, and it is just waiting for one more reviewer.
Is the PR good? Or should I move most of the docstring into the user guide?
|
This PR is completed from our (me and my sprint partner, Manisha Verma)
side and is ready for merge. I don't know if we need approval from a core
contributor or if there is anything else we need to do.
Thanks,
Aditi
…On Wed, Sep 11, 2019, 8:48 AM kellycarmody ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In doc/modules/cross_validation.rst
<#14757 (comment)>
:
> @@ -300,6 +300,23 @@ section.
* :ref:`sphx_glr_auto_examples_model_selection_plot_cv_predict.py`,
* :ref:`sphx_glr_auto_examples_model_selection_plot_nested_cross_validation_iris.py`.
+
+.. _cv_significance_evaluation:
+
+Cross-validation significance evaluation
+----------------------------------------
+
+Significance of cross validation scores can be evaluated using the
+:func:`permutation_test_score` function. The function returns a p-value, which
+approximates the probability that the average cross-validation score would be
+obtained by chance if the target is independent of the data.
+
+It also returns cross validation scores for each permutation of y labels. It
Hi, we're trying to get all of the PR's from the WiMLDS sprint wrapped up,
are we waiting on
@aditi9783 <https://github.com/aditi9783>, or approval from another
reviewer? Thanks
Not completely sure what's going on with this one @reshamas
<https://github.com/reshamas>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#14757?email_source=notifications&email_token=AALMXJGABNVXXGXNP6WDO4DQJBPPJA5CNFSM4IPGTBXKYY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCEKC74A#discussion_r323043617>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AALMXJH5X6PT64OJFQK7V7TQJBPPJANCNFSM4IPGTBXA>
.
|
|
Hi @aditi9783, Thomas Fan mentioned that they are still waiting for you to close this out. Would you be able to work on this sometime soon so we can start closing the PR's from the sprint? Thanks! |
|
@amueller @jnothman @thomasjpfan Is this typical to include a reference link in documentation that may require payment? I was attempting to access the paper to better understand the comments on this PR. cc: @cmarmo (this PR is from NYC wimlds Aug 2019 sprint) |
|
In this specific case, it is better to cite the following: http://www.jmlr.org/papers/volume11/ojala10a/ojala10a.pdf which is an open-access JMLR paper. It should the extended version of the IEEE version which was a conference paper.
If we really want to acknowledge the original method of an author, it might be the only way sometimes. |
|
I propose adding this PR to the upcoming Berlin beginner sprint list of issues. cc: @adrinjalali @cmarmo |
|
@reshamas do you have a repo as you usually do for the sprints to add these to a project on that repo? |
|
@adrinjalali |
|
also added @cmarmo to repo: |
|
Hey I just started working on this citation problem @adrinjalali @reshamas @aditi9783 |
|
@cmarmo this PR is one of the last ones from the NYC Aug 2019 sprint. Does it need a reviewer? |
Hi @reshamas , if I understand correctly, @lucyleeow has taken over in #17373. |
|
doesn't that mean this is superseded and not stalled @cmarmo ? |
|
Closed by #17373. |
Reference Issues/PRs
Fixes #10905
Fixes #10931. Worked on the branch https://github.com/maskani-moh/scikit-learn/tree/fix-10905.
Closes #14769
What does this implement/fix? Explain your changes.
Added user guide documentation for permutation_test_score.
Any other comments?
Worked with @vmanisha at the NYC scikit-learn sprint 2019.