Skip to content

Conversation

@snehankekre
Copy link
Contributor

@snehankekre snehankekre commented Nov 28, 2022

📚 Context

The docstrings in Streamlit's codebase (specifically, the public API) loosely conform to the numpy (aka numpydoc) convention. The specific rules of the numpy convention we follow and don't follow haven't been explicitly laid out.

  • What kind of change does this PR introduce?

    • Bugfix
    • Feature
    • Refactoring
    • Other, please describe: Doc improvement request

🧠 Description of Changes

This PR does two things:

  1. Makes explicit the rules of the numpy convention we follow and those we ignore. Read Error Codes for an exhaustive list of all possible rules.
    • We want to limit ourselves to validate docstrings that are part of Streamlit's public API. However, pydocstyle operates on .py files and not function names. As such, the pre-commit hook in point 2 validates docstrings in .py files in lib/streamlit/ (which contains all .py files corresponding to all public Streamlit commands), excluding lib/streamlit/vendor and lib/streamlit/watcher (both sub dirs don't contain .py files with public API commands).
  2. Adds a pre-commit hook that uses pydocstyle to validate if docstrings pass all the rules except the ones mentioned below:
    - repo: https://github.com/pycqa/pydocstyle
        rev: 6.1.1
        hooks:
          - id: pydocstyle
            name: Run pydocstyle
            args:
              - --convention=numpy
              - --add-ignore=D100,D101,D104,D105,D106,D107
              - --add-select=D417
              - --match-dir=(lib\/streamlit\/)(?!vendor)(?!watcher)

Rules we follow

The subset of rules from pydocstyle's Error Codes we follow are:

Error code Description
Missing Docstrings
D102 Missing docstring in public method
D103 Missing docstring in public function
Whitespace Issues
D200 One-line docstring should fit on one line with quotes
D201 No blank lines allowed before function docstring
D202 No blank lines allowed after function docstring
D204 1 blank required after class docstring
D205 1 blank line required between summary line and description
D206 Docstring should be indented with spaces, not tabs
D207 Docstring is under-indented
D208 Docstring is over-indented
D209 Multi-line docstring closing quotes should be on a separate line
D210 No whitespaces allowed surrounding docstring text
D211 No blank lines allowed before class docstring
D214 Section is over-indented
D215 Section underline is over-indented
Quotes Issues
D300 Use """triple double quotes"""
D301 Use r""" if any backslashes in a docstring
D302 Deprecated: Use u""" for Unicode docstrings
Docstring Content Issues
D400 First line should end with a period
D401 First line should be in imperative mood
D403 First word of the first line should be properly capitalized
D404 First word of the docstring should not be This
D405 Section name should be properly capitalized
D406 Section name should end with a newline
D407 Missing dashed underline after section
D408 Section underline should be in the line following the section's name
D409 Section underline should match the length of its name
D410 Missing blank line after section
D411 Missing blank line before section
D412 No blank lines allowed between a section header and its content
D414 Section has no content
D417 Missing argument descriptions in the docstring
D418 Missing attribute description in the docstring

Rules we ignore

Error code Description
Missing Docstrings
❌ D100 Missing docstring in public module
❌ D101 Missing docstring in public class
❌ D104 Missing docstring in public package
❌ D105 Missing docstring in magic method
❌ D106 Missing docstring in public nested class
❌ D107 Missing docstring in init

What's not covered in this PR

  • All existing docstrings that break any of the above rules we're supposed to follow will not be retroactively fixed. i.e. The pydocstyle pre-commit hook operates only on staged files in git.
    • We choose to operate only on modified files and run the checks locally so as to not disrupt developers' workflows. As a developer contributing to the codebase, one would rather not see hundreds of docstring errors that unrelated to one's contribution; both locally and in CI (GHA)
  • The hook does not modify files. It only checks if files follow rules (think linting) and outputs what files, if any, break the rules.

What's next?

The PR ensures that the pydocstyle check will run against any new or modified .py files in (lib\/streamlit\/)(?!vendor)(?!watcher). It leaves unanswered the question: "How do we ensure existing docstrings are validated ?".

That question can be answered using one or both of these options:

  1. The Streamlit Documentation team in subsequent PRs fixes all existing the docstrings in violation of rules
  2. Let the Community tackle in separate PRs fixes for each violated rule

This PR, however, is independent of the above question and can be merged before it is answered.

@snehankekre snehankekre requested review from sfc-gh-kbregula and vdonato and removed request for sfc-gh-kbregula November 28, 2022 13:35
@snehankekre snehankekre added the security-assessment-completed Security assessment has been completed for PR label Nov 28, 2022
@snehankekre
Copy link
Contributor Author

I'm not sure why autoflake is failing in js_test and py_version. The diffs don't include the files that are failing..

@snehankekre snehankekre marked this pull request as draft November 29, 2022 09:55
@snehankekre
Copy link
Contributor Author

Changed to a draft PR as some aspects of this change could use Product review first.

run: |
# Run eslint as a standalone command to generate the test report.
PRE_COMMIT_NO_CONCURRENCY=true SKIP=eslint pipenv run pre-commit run --show-diff-on-failure --color=always --all-files
PRE_COMMIT_NO_CONCURRENCY=true SKIP=eslint,pydocstyle pipenv run pre-commit run --show-diff-on-failure --color=always --all-files
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to disable this check in CI? Can you update the comment below?

- --convention=numpy
- --add-ignore=D100,D101,D104,D105,D106,D107
- --add-select=D417
- --match-dir=(lib\/streamlit\/)(?!vendor)(?!watcher)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use the files/exclude option for pre-commit-hook, this check will only run when we make changes to matching files.
Screenshot 2022-11-29 at 16 27 06

Now, this check will be run when we modify any python file.

    types: [python]

https://github.com/PyCQA/pydocstyle/blob/master/.pre-commit-hooks.yaml#L6

@sfc-gh-kbregula
Copy link
Contributor

I'm not sure why autoflake is failing in js_test and py_version. The diffs don't include the files that are failing..

Here is fix: #5790

@stale
Copy link

stale bot commented Dec 15, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

security-assessment-completed Security assessment has been completed for PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants