DOC Ensures that LatentDirichletAllocation passes numpydoc validation#20402
DOC Ensures that LatentDirichletAllocation passes numpydoc validation#20402g4brielvs wants to merge 1 commit intoscikit-learn:mainfrom g4brielvs:numpydoc-validation-latentdirichletallocation
Conversation
| Returns | ||
| ------- | ||
| self | ||
| self: |
There was a problem hiding this comment.
self : object for consistency with other places in the package
There was a problem hiding this comment.
yes, but I don't think it's correct. The base class for all estimators is BaseEstimator not object, so I think there is no point in indicating the object dtype.
| Returns | ||
| ------- | ||
| self | ||
| self: |
There was a problem hiding this comment.
The initial version was correct for cases where there are no types, and it's a bit pointless to indicate object dtype for estimators which is also not accurate https://numpydoc.readthedocs.io/en/latest/format.html#parameters
cc @glemaitre
There was a problem hiding this comment.
it's a bit pointless to indicate object dtype for estimators which is also not accurate
I agree. However, we have this way of documenting everywhere. I would prefer to keep this inaccurate convention until we do a find/replace regex in another PR. Would it not be easier to make the change?
There was a problem hiding this comment.
Right, but my point is there is no issue with this line of docstring. It should pass the numpydoc validation, I think? And if so I would rather we didn't change it instead of introducing unnecessary code churn with a solution we know is not helpful / correct.
self : object is used elsewhere, but 27% of estimators still use the current conversion,
$ rg "^\s+self$" | grep self | wc -l
38
$ rg "^\s+self\s*:$" | grep self | wc -l
2
$ rg "^\s+self\s*:\s*object$" | grep self | wc -l
103There was a problem hiding this comment.
@rth Thank you so much for reviewing my PR. The numpydoc validation requires a description and that's the reason a added a tautology. I'll be more than happy to make any changes according to your guidance.
There was a problem hiding this comment.
I think the description is ok. The point of @rth is about self : object vs self alone on the current line, not the description below.
| [2] "Stochastic Variational Inference", Matthew D. Hoffman, David M. Blei, | ||
| Chong Wang, John Paisley, 2013 | ||
|
|
||
| [3] Matthew D. Hoffman's onlineldavb code. Link: |
There was a problem hiding this comment.
I think,
| [2] "Stochastic Variational Inference", Matthew D. Hoffman, David M. Blei, | |
| Chong Wang, John Paisley, 2013 | |
| [3] Matthew D. Hoffman's onlineldavb code. Link: | |
| .. [2] "Stochastic Variational Inference", Matthew D. Hoffman, David M. Blei, | |
| Chong Wang, John Paisley, 2013 | |
| .. [3] Matthew D. Hoffman's onlineldavb code. Link: |
but need to check the rendering (ci/circleci: doc artifact CI job), currently it doesn't look ideal: https://142780-843222-gh.circle-artifacts.com/0/doc/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html
There was a problem hiding this comment.
It might raise an error if the article is not linked (somewhere we expect [2]_) in the docstring. If this is not the case, we could either link where it is meaningful, otherwise, we could as well remove the reference.
|
Updating with correct spelling: #DataUmbrella sprint |
|
OK.
…On Mon, 28 Jun 2021 at 11:44, Roman Yurchak ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In sklearn/decomposition/_lda.py
<#20402 (comment)>
:
>
Returns
-------
- self
+ self:
Right, but my point is there is no issue with this line of docstring. It
should pass the numpydoc validation, I think? And if so I would rather we
didn't change it instead of introducing unnecessary code churn with a
solution we know is not helpful / correct.
self : object is used elsewhere, but 27% of estimators use the current
conversion,
$ rg "^\s+self$" | grep self | wc -l
38
$ rg "^\s+self\s*:$" | grep self | wc -l
2
$ rg "^\s+self\s*:\s*object$" | grep self | wc -l
103
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#20402 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABY32P3LWJK22RLIEIMMEW3TVBAAXANCNFSM47LWDQNA>
.
--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/
|
rth
left a comment
There was a problem hiding this comment.
The numpydoc validation requires a description and that's the reason a added a tautology.
If untyped self doesn't pass validation than it's an instance of numpy/numpydoc#242. LGTM for the rest except for the https://github.com/scikit-learn/scikit-learn/pull/20402/files#r659270234 for which I'm still not sure.
|
I made the changes in another PR and merge with the current branch and authorship. |
Reference Issues/PRs
Addresses #20308
What does this implement/fix? Explain your changes.
This PR ensures
LatentDirichletAllocationis compatible with numpydocdLatentDirichletAllocationfromDOCSTRING_IGNORE_LISTAny other comments?
Thanks #DataUmbrella