[WIP/RFC] Test docstring parameters (with order) by agramfort · Pull Request #9023 · scikit-learn/scikit-learn

agramfort · 2017-06-06T19:56:49Z

Reference Issue

Fixes #7758

What does this implement/fix? Explain your changes.

This adds a unit test to check that all parameters are documented
and in the right order.

Any other comments?

For now I have only fixed the datasets module, which I think is the least controversial.
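To make the idea concrete, here is a minimal sketch of the kind of check described above. It is not the code from this PR: the PR relies on numpydoc's `docscrape`, whereas this sketch uses a small regex-based parser as a stand-in so that it runs with the standard library only. The helper names `documented_params` and `mismatches` are hypothetical.

```python
import inspect
import re


def documented_params(doc):
    """Return the names listed under a numpydoc-style 'Parameters' heading."""
    lines = inspect.cleandoc(doc or "").splitlines()
    names, in_section = [], False
    for i, line in enumerate(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        # A section heading is a non-empty line underlined with dashes.
        if line.strip() and re.fullmatch(r"-{3,}", nxt.strip()):
            in_section = line.strip() == "Parameters"
            continue
        # Entries look like "name : type" and start at column zero.
        match = re.match(r"^\*{0,2}(\w+)\s*:", line)
        if in_section and match:
            names.append(match.group(1))
    return names


def mismatches(func):
    """Return a list of problems; an empty list means docstring and signature agree."""
    expected = [name for name in inspect.signature(func).parameters
                if name not in ("self", "cls")]
    found = documented_params(func.__doc__)
    problems = ["undocumented: " + n for n in expected if n not in found]
    problems += ["not in signature: " + n for n in found if n not in expected]
    # Only report ordering once the two sets of names agree.
    if not problems and expected != found:
        problems.append("order differs: %s vs %s" % (expected, found))
    return problems
```

Calling `mismatches` on a function returns one message per undocumented or stale parameter, and a single ordering message when the names agree but their order does not.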

jnothman · 2017-06-07T00:36:31Z

See also #7793.

I think there's no doubt that something like this could improve documentation quality. One big challenge is handling exceptional cases, including deprecation and *args (train_test_split, for instance).

I had grand plans to make something like this more descriptive, outputting a diff between the parameters and the docstring, so that it effectively fixes things for you. I have some code somewhere. Let me know if I should pull it out and share it. (Note that diffing docstrings is only feasible in some cases, e.g. where the docstring appears directly in the function in question and \ isn't used; it also requires changes to numpydoc, which I had hoped to push there.) I also hoped to handle, in a similar way, cases where the parameter name did not have a space before the :.
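As a rough illustration of the diff idea, the sketch below compares only the two lists of names, using the standard library's `difflib`. It does not diff or rewrite docstring text, which is the harder part mentioned above. The function name `parameter_diff` and the example names are hypothetical.

```python
import difflib


def parameter_diff(documented_names, signature_names):
    """Return unified-diff lines turning the documented names into the signature names."""
    return list(difflib.unified_diff(
        documented_names, signature_names,
        fromfile="docstring", tofile="signature", lineterm=""))


if __name__ == "__main__":
    # A parameter present in the signature but missing from the docstring
    # shows up as an added ("+") line.
    for line in parameter_diff(["n_samples", "random_state"],
                               ["n_samples", "norm_diag", "random_state"]):
        print(line)
```

An empty result means the two lists are identical, so a test could fail with the joined diff as its error message.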

But what I'd really like to see a test for is that fitted model attributes correspond to those in the docstring.
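A possible shape for such an attribute check, sketched under assumptions: fitted attributes are taken to be the public instance attributes whose names end in an underscore, and the docstring is parsed with a small regex helper rather than numpydoc. `ToyMean`, `section_names` and `undocumented_fitted_attributes` are hypothetical names used only for illustration.

```python
import inspect
import re


def section_names(doc, section):
    """Return the entry names listed under a numpydoc-style section heading."""
    lines = inspect.cleandoc(doc or "").splitlines()
    names, in_section = [], False
    for i, line in enumerate(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        # A section heading is a non-empty line underlined with dashes.
        if line.strip() and re.fullmatch(r"-{3,}", nxt.strip()):
            in_section = line.strip() == section
            continue
        match = re.match(r"^(\w+)\s*:", line)
        if in_section and match:
            names.append(match.group(1))
    return names


class ToyMean:
    """Hypothetical estimator that stores the mean of its input.

    Attributes
    ----------
    mean_ : float
        Mean of the training values.
    """

    def fit(self, X):
        self.mean_ = sum(X) / len(X)
        self.n_seen_ = len(X)  # set in fit but deliberately left undocumented
        return self


def undocumented_fitted_attributes(estimator):
    """Return fitted attributes (trailing underscore) missing from the class docstring."""
    documented = set(section_names(inspect.getdoc(type(estimator)), "Attributes"))
    fitted = [name for name in vars(estimator)
              if name.endswith("_") and not name.startswith("_")]
    return sorted(set(fitted) - documented)
```

For the toy class above, fitting and then calling `undocumented_fitted_attributes` reports the attribute that `fit` sets without documenting it.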

ogrisel · 2017-06-07T08:22:17Z

> But what I'd really like to see a test for is that fitted model attributes correspond to those in the docstring.

+1 but maybe for another PR once this or #7793 is reviewed and merged.

agramfort · 2017-06-07T08:24:18Z

@ogrisel do you think it's reasonable to fix all docstrings in this PR? otherwise we need to do the hack of flake8

ogrisel · 2017-06-07T09:05:51Z

sklearn/utils/tests/test_docstring_parameters.py

+        raise SkipTest(
+            "numpydoc is required to test the docstrings")
+
+    from numpydoc import docscrape


The codecov browser extension tells me that this line is never executed by our CI servers. It seems that we need to add the numpydoc module to at least a Python 3.6 and probably to a Python 2.7 build job in travis to properly cover this test code (and actually run the tests).
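For context, the skip-on-missing-dependency pattern in the snippet above can be sketched with the standard library alone. This is an illustration, not the PR's code: `import_or_skip` is a hypothetical helper and `unittest.SkipTest` stands in for whichever `SkipTest` the test module imports.

```python
import importlib
from unittest import SkipTest


def import_or_skip(module_name):
    """Import an optional dependency, or skip the calling test if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        raise SkipTest("%s is required to test the docstrings" % module_name)
```

The consequence raised in the comment follows directly: if the module is not installed on any CI job, every run takes the skip branch and the lines after the import are never covered.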

ogrisel · 2017-06-07T10:01:11Z

> @ogrisel do you think it's reasonable to fix all docstrings in this PR?

I think we can merge a first PR with the infrastructure to do the checks as long as it's sufficiently tested. Speaking of which, it would be great to have a couple of unit tests for check_parameters_match itself, to make sure that the error message is informative enough for common invalid docstring cases.

> otherwise we need to do the hack of flake8

As @jnothman said, there are probably functions that are exceptions to the behavioral rules encoded in your test function (e.g. train_test_split). So if we implement a systematic CI check, as done for flake8, we need an easy way to disable the check for specific functions.
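One simple way to provide such an opt-out is an explicit ignore list keyed by qualified name. The sketch below is an assumption about how this could look, not something taken from the PR; `IGNORED`, `qualified_name` and `functions_to_check` are hypothetical, and the listed entry is illustrative.

```python
# Hypothetical ignore list; entries use the module path reported by
# func.__module__, which may be a private module.
IGNORED = {
    "sklearn.model_selection._split.train_test_split",
}


def qualified_name(func):
    """Return 'module.qualname' for a function or class."""
    return "%s.%s" % (func.__module__, func.__qualname__)


def functions_to_check(funcs, ignored=IGNORED):
    """Drop functions whose qualified name is on the ignore list."""
    return [f for f in funcs if qualified_name(f) not in ignored]
```

Keeping the exceptions in one named collection makes each exemption visible in review, which a blanket skip would not.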

amueller · 2017-06-07T10:06:27Z

numpydoc? anyone wanna review #7355 ;) [currently it coughs up lots of errors but should still work]

amueller · 2017-06-07T15:57:05Z

sklearn/datasets/samples_generator.py

        The probability that a coefficient is zero (see notes). Larger values
        enforce more sparsity.

+    norm_diag : boolean, optional (default=False)


I guess we need to respect the order exactly?

amueller · 2017-06-07T15:57:39Z

sklearn/utils/tests/test_docstring_parameters.py

+                        '__neg__', '__hash__')
+
+public_modules = [
+    # the list of modules users need to access for all functionality


Why not all? Too much to do? Leave for future PRs?

you want a huge PR to review? :)

sklearn.base should be there, no? Why not use pkgutil?
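A sketch of what discovery with `pkgutil` could look like, as an alternative to a hand-written module list. The filtering rules (skip anything with a leading underscore, skip `tests` packages) are assumptions made for illustration, and `public_submodules` is a hypothetical helper. Note that `pkgutil.walk_packages` imports each subpackage in order to walk it.

```python
import importlib
import pkgutil


def public_submodules(package_name):
    """List a package and its submodules, skipping private and test modules."""
    package = importlib.import_module(package_name)
    names = [package_name]
    for info in pkgutil.walk_packages(package.__path__,
                                      prefix=package_name + "."):
        # Look at every component below the top-level package name.
        parts = info.name.split(".")[1:]
        if any(p.startswith("_") or p == "tests" for p in parts):
            continue
        names.append(info.name)
    return names


if __name__ == "__main__":
    # Demonstrated on a standard-library package to stay self-contained.
    print(public_submodules("json"))
```

The trade-off is that automatic discovery picks up new modules without anyone editing a list, but it also needs explicit exclusions for anything that should not be checked.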

amueller · 2017-06-07T15:58:56Z

sklearn/utils/tests/test_docstring_parameters.py

+]
+
+
+def check_parameters_match(func, doc=None):


This maybe should move to utils.testing? If it deserves its own tests, I don't think it should live in a test file.

This file currently contains three things from what I can see: a function to check docstrings, tests for this function, and tests that run this function on sklearn. I feel they should live in two or three different files since they are conceptually very separate.

ogrisel · 2017-06-07T16:25:05Z

sklearn/utils/tests/test_docstring_parameters.py

+
+def test_check_parameters_match():
+    check_parameters_match(f_ok)
+    assert_raise_message(RuntimeError, 'Unknown section Results',


Nitpick: it would be better to have: 'Unknown section: Results'

Or even:

'Invalid section: Results'

This is a numpydoc string, not mine.

agramfort · 2017-06-07T16:48:40Z

refactoring done

lesteve · 2017-06-08T14:52:12Z

.travis.yml

    # versions of numpy, scipy with ATLAS that comes with Ubuntu Trusty 14.04
    - env: DISTRIB="ubuntu" PYTHON_VERSION="2.7" CYTHON_VERSION="0.23.4"
-           COVERAGE=true
+           COVERAGE=true TEST_DOCSTRINGS="false"


No need to set TEST_DOCSTRINGS if you don't want to test the docstrings (similar to what we do with COVERAGE)
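The point about leaving the variable unset can be shown with a small sketch. In the PR the flag is read by the CI scripts; the Python below only illustrates the "unset means disabled" convention, and `docstring_tests_enabled` is a hypothetical helper.

```python
import os


def docstring_tests_enabled(environ=os.environ):
    """Treat a missing TEST_DOCSTRINGS variable the same as "false"."""
    return environ.get("TEST_DOCSTRINGS", "false").lower() == "true"
```

With a default like this, only the build jobs that should run the docstring tests need to set the variable.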

Update with master and fix merge conflicts

raghavrv · 2017-06-23T11:21:55Z

Closing this in favor of #9206

ogrisel reviewed Jun 7, 2017

agramfort force-pushed the test_docstring branch from 11429f8 to dc56cea Compare June 7, 2017 15:51

amueller reviewed Jun 7, 2017

ogrisel reviewed Jun 7, 2017

lesteve reviewed Jun 8, 2017

agramfort added 15 commits June 9, 2017 10:06

add test script to have docstrings consistent with function signatures

3ae41ff

fix docstrings in datasets

3a1b925

add tests

04da883

update travis

00ebcfc

refactor

0b2c0eb

make travis happy?

5495e0e

do not crash when y=None for API reason

f3cefca

more fixes

a755044

more on metrics module

9fa05d8

more

f3d9c11

more

42b2a2d

more

98686e4

more

46abcd8

more

27936f4

more

88c71f3

agramfort force-pushed the test_docstring branch from 89390ba to 88c71f3 Compare June 9, 2017 09:02

more

c0a0630

agramfort and others added 6 commits June 9, 2017 12:03

more

d48d425

more

cd36d2d

more

4183f6a

more

a15459d

more

9a6f01d

Merge master

ed1a350

raghavrv mentioned this pull request Jun 17, 2017

Update with master and fix merge conflicts agramfort/scikit-learn#15

Merged

agramfort and others added 2 commits June 18, 2017 09:10

Merge pull request #15 from raghavrv/test_docstring_params

48f7433

Update with master and fix merge conflicts

Merge branch 'master' into test_docstring

b93b4cc

This was referenced Jun 22, 2017

[MRG+1] Do not transform y #9180

Merged

[MRG + 1 (rv) + 1 (alex) + 1] Add a check to test the docstring params and their order #9206

Merged

raghavrv closed this Jun 23, 2017

6 participants
