Skip to content

Separate documentation for contributing and developing estimators in to different files #14582#14611

Merged
thomasjpfan merged 15 commits intoscikit-learn:masterfrom
arpanchowdhry:master
Aug 27, 2019
Merged

Separate documentation for contributing and developing estimators in to different files #14582#14611
thomasjpfan merged 15 commits intoscikit-learn:masterfrom
arpanchowdhry:master

Conversation

@arpanchowdhry
Copy link
Copy Markdown
Contributor

Addresses #14582

I used the following as a guideline to separate sections. Also this is just an initial version to get feedback. The new file with developing estimators is pretty bare bones and might need more verbiage.

Ways to contribute - Contributing
Submitting a bug report or a feature request - Contributing
How to make a good bug report
Contributing code - Contributing
How to contribute
Pull request checklist
Continuous Integration (CI)
Stalled pull requests
Issues for New Contributors
Documentation - Contributing
Building the documentation
Guidelines for writing documentation
Generated documentation on CircleCI
Testing and improving test coverage - Developing
Writing matplotlib related tests
Workflow to improve test coverage
Issue Tracker Tags - Contributing - I don't think this belongs here in any case
Coding guidelines - Contributing for top matter
Input validation - Developing?
Random Numbers - Developing
Deprecation - Contributing
Change the default value of a parameter - Contributing
Python versions supported - Contributing
Code Review Guidelines - Contributing
APIs of scikit-learn objects - Developing
Different objects
Estimators
Instantiation
Fitting
Estimated Attributes
Optional Arguments
Pairwise Attributes
Rolling your own estimator - Developing
get_params and set_params
Parameters and init
Cloning
Pipeline compatibility
Estimator types
Specific models
Estimator Tags
Reading the existing code base - Contributing

Copy link
Copy Markdown
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking care of this @arpanchowdhry .

Just a few comments for now. You will also need to add the reference to doc/developers/index.rst for the new file to be rendered.

.. _api_overview:

APIs of scikit-learn objects
============================
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should come first?

@@ -0,0 +1,761 @@
.. _developing_estimators:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is intended to be about development in general, not just developing estimator. Maybe change the ref/file/section to e.g. "Developing scikit-learn" ? (not great either, others will have better ideas)

@arpanchowdhry
Copy link
Copy Markdown
Contributor Author

@NicolasHug Hi, I made the required changes, but I am seeing all sorts for failures due the reference addition in index.rst. Any thoughts?

@NicolasHug
Copy link
Copy Markdown
Member

I think the CI is going berserk lately, your docs build correctly on my machine.

Copy link
Copy Markdown
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. It does need a little more work.

@@ -0,0 +1,761 @@
.. _developing_scikit:

============
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please extend these lines to match the length of the text

.. _developing_scikit:

============
Developing scikit-learn
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Developing scikit-learn
Developing scikit-learn estimators

3. Loop.


Issue Tracker Tags
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure this belongs in the contributer guide, not here: it is about the specific development processes for the scikit-learn project

Coding guidelines
=================

The following are some guidelines on how new code should be written. Of
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thesl issue suggests that the top matter from this section goes into the contributing guide, not here. Or maybe we can just say that the following are some standards used within scikit-learn, which may be appropriate to adopt in external projects for the sake of consistency

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnothman Thanks for the feedback. I have made all the requested changes. But I am still not sure about how to properly handle the 'Coding guidelines' section in the new file. I just changed the first line to reflect what you had commented above. But if further modifications are needed please let me know.

Copy link
Copy Markdown
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more.

Coding guidelines
=================

The following are some standards used within scikit-learn, which may
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite being my words, I'm not happy with this result. Let's use the existing "The following are some guidelines on how new code should be written for inclusion in scikit-learn, and which may be appropriate to adopt in external projects."


.. _coding-guidelines:

Coding guidelines
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this belongs below "Rolling your own estimator"

find bugs in scikit-learn.

* Use the `numpy docstring standard
<https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt>`_
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


.. _testing_coverage:

Testing and improving test coverage
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm now thinking this belongs back in the contributing guide. It's more specific to scikit-learn core than I had realised. (It would be nice to eventually include some general comments here on how to test estimators.)


* **Please don't use** ``import *`` **in any case**. It is considered harmful
by the `official Python recommendations
<https://docs.python.org/2/howto/doanddont.html#from-module-import>`_.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnothman Thanks for the feedback. I made all the changes you recommended above.

@arpanchowdhry
Copy link
Copy Markdown
Contributor Author

@jnothman Let me know if you have any other feedback.

Copy link
Copy Markdown
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Coding Guidelines section is still somewhat duplicated between the two sections. The sections on "Deprecation" and "Change the default value of a parameter" could be under a new heading called "Maintaining backwards compatibility" instead of the Coding Guidelines. I don't think "
Python versions supported" needs to be in this file.

@@ -0,0 +1,671 @@
.. _developing_scikit:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're going to need to give this a better filename. I'm not sure yet what's right. coding? implementation? api?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I like 'developing' more than 'coding', but I will rely on your judgement.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Developing" wrt an OSS project often includes project & issue management, personnel, etc. So it covers "contributing" as well as this. Rather this section is a guide to engineering, etc.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grasping at straws now, but maybe use 'enhancement' or 'enhancing_scikit'. That word is used heavily at my day job :)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not about enhancing. This is about how to implement/develop/code/build your own estimators. The filename should be a single word, not two. I'm okay with you choosing one of those words. Thanks

==================================
Developing scikit-learn estimators
==================================

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Insert .. currentmodule:: sklearn

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnotham I am not sure why but I am getting build failures with '.. currentmodule:: sklearn'. But once I removed it all the failures are gone.

@arpanchowdhry
Copy link
Copy Markdown
Contributor Author

arpanchowdhry commented Aug 19, 2019

The Coding Guidelines section is still somewhat duplicated between the two sections. The sections on "Deprecation" and "Change the default value of a parameter" could be under a new heading called "Maintaining backwards compatibility" instead of the Coding Guidelines. I don't think "
Python versions supported" needs to be in this file.

@jnothman I can take out 'Coding Guidelines' from one of the files in you would like, but seems useful in both places though.

@jnothman
Copy link
Copy Markdown
Member

jnothman commented Aug 19, 2019 via email

@jnothman
Copy link
Copy Markdown
Member

Those build failures were not due to any of your changes, but due to failures elsewhere that we subsequently fixed.

Copy link
Copy Markdown
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to approve this apart from that filename.

@@ -0,0 +1,671 @@
.. _developing_scikit:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not about enhancing. This is about how to implement/develop/code/build your own estimators. The filename should be a single word, not two. I'm okay with you choosing one of those words. Thanks

@arpanchowdhry
Copy link
Copy Markdown
Contributor Author

@jnothman Thanks for the feedback. I changed the filename to develop.

Copy link
Copy Markdown
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a big improvement to me.

@jnothman
Copy link
Copy Markdown
Member

Could we get another pair of eyes on this and merge?

@jnothman jnothman added High Priority High priority issues and pull requests Waiting for Reviewer labels Aug 26, 2019
Copy link
Copy Markdown
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great @arpanchowdhry !

@thomasjpfan thomasjpfan merged commit dc11759 into scikit-learn:master Aug 27, 2019
@NicolasHug
Copy link
Copy Markdown
Member

Thanks for the work @arpanchowdhry!

@jnothman @thomasjpfan sorry I wasn't able to review earlier. I feel that the Plotting API section is in an awkward position now. Maybe we can move it to the Developing scikit-learn estimators section if we rename this one to something else?

@thomasjpfan
Copy link
Copy Markdown
Member

Both sections are kind of awkward for Plotting API to be in. I considered “contributing” = “anything that is not about developing an estimator”.

Developing scikit-learn estimators is really well targeted to third party libraries. Adding the plotting to the end seems a little strange, even with a change in title.

Plotting API most likely needs its own section, considering with PDPs we have space stealing.

@jnothman
Copy link
Copy Markdown
Member

jnothman commented Aug 27, 2019 via email

@NicolasHug
Copy link
Copy Markdown
Member

I thought too, especially considering we have coding style recommendation

@thomasjpfan
Copy link
Copy Markdown
Member

After describing what is an estimator, it goes into “rolling your own estimator”. Then estimator tags. These topics are very generic about “how to create an estimator” Finally, at the end it goes into sklearn specific “coding guidelines”.

@thomasjpfan
Copy link
Copy Markdown
Member

It is also well targeted for this specific library as well. ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Documentation High Priority High priority issues and pull requests Waiting for Reviewer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants