Skip to content

[docs] Generate linting docs from source#17189

Merged
jugglinmike merged 1 commit intoweb-platform-tests:masterfrom
bocoup:docs-generate-lint-docs
Nov 14, 2019
Merged

[docs] Generate linting docs from source#17189
jugglinmike merged 1 commit intoweb-platform-tests:masterfrom
bocoup:docs-generate-lint-docs

Conversation

@jugglinmike
Copy link
Contributor

When new rules have been added to WPT's "lint" tool, the corresponding
documentation has not always been updated [1] [2] [3]. The static list
of rules currently describes only 22 of the 53 available rules.
Automatically generating documentation from source code helps avoid this
state and the confusion it can cause contributors.

Rely on the previously-implemented source code structure [4] during
documentation generation to automatically create a listing of all
available linting rules.

Although the Sphinx documentation generator includes a built-in
extension for generating documentation from Python source code, the
output of that extension is designed to document Python primitives such
as functions and classes. Such a format is inappropriate for this case
because the users of the linting tool do not interact with the internals
in this way. Define a custom docutils directive to tailor the
documentation to the needs of its audience.

[1] #5299
[2] #10501
[3] #11479
[4] #16268


As noted in the commit message, this nearly doubles the number of documented
linting rules. Over time, we can further improve it by separating out explicit
"to fix" instructions (and maybe even require that new rules do the same).

Copy link
Member

@sideshowbarker sideshowbarker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest the existing descriptions from the docs/writing-tests/lint-tool.md file — rather than just being dropped — should be migrated over to the tools/lint/rules.py file, to replace the corresponding existing embedded descriptions in that file. Rationale: The existing descriptions in the docs/writing-tests/lint-tool.md file in most cases seem more helpful than the corresponding existing embedded descriptions in the tools/lint/rules.py file. That said, of course the descriptions should be evaluated case-by-case rather than just being replaced without reviewing them.


if rule["to_fix"]:
definition += nodes.strong(text="To fix:")
definition += WPTLintRules._parse_markdown(rule["to_fix"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we parse the to_fix as Markdown, but not the description?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that's a mistake. Good catch!

@jugglinmike
Copy link
Contributor Author

@sideshowbarker @gsnedders the descriptions are now parsed as Markdown, and the website's text has been preserved in all cases where it was more useful. Additionally, the list is lexicographically sorted now.

Copy link
Member

@sideshowbarker sideshowbarker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful

class InvalidTimeout(Rule):
name = "INVALID-TIMEOUT"
description = "Invalid timeout value %s"
description = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why aren't we just using a triple-quoted string here, as we did in earlier versions of the PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-line strings are fine when rendering markup, but they aren't appropriate for the CLI. Since multi-line strings are easier to maintain, I experimented with normalizing the values at runtime using Python's standard library (specifically textwrap.dedent and inspect.cleandoc). That takes care of indentation, but the interstitial newlines are still a problem.

Rather than further increase complexity with custom formatting code, I opted for the uglier syntax.

Copy link
Member

@sideshowbarker sideshowbarker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refinements all LGTM

@jugglinmike
Copy link
Contributor Author

@gsnedders do you support my decision about multiline strings? If so, would you mind merging this?

@gsnedders
Copy link
Member

@jugglinmike Can you add a comment or something at the top justifying this?

I feel like this is something we're highly likely to regress in future, though. And probably add a parametrised test that goes through all the rules and checks the description/to_fix doesn't contain any newlines?

@jugglinmike
Copy link
Contributor Author

@gsnedders I suggested the requirement to author text without newlines as a compromise to limit complexity. In some ways, writing tests cuts against that goal. It's not that I'm against testing, but I'd rather test something that supports a more Pythonic contribution experience. That is, instead of complaining when folks naturally reach for multi-line strings, I'd rather try to support that pattern and test the logic which enables it.

Do you think it's better to treat this as an output sanitization problem? A utility function like this would be easy to test:

def prep_for_command_line(text):
    return inspect.cleandoc(text).replace('\n', ' ')

@gsnedders
Copy link
Member

@jugglinmike we need more aggressive whitespace normalisation that that, don't we? or we'll just end up with lots of consecutive whitespace? (or does that work fine because it's interpreted as Markdown and whitespace is otherwise collapsed?)

@jugglinmike
Copy link
Contributor Author

I think the combination of inspect.cleandoc and the removal of newlines should have us covered. Can you think of any specific cases where that wouldn't be enough?

import inspect

strings = [
    '''a
       b
       c''',
    '''
    a
    b
    c''',
    '''
    a
    b
    c
    ''',
    '''a
    b
    c
    '''
]

for string in strings:
    print inspect.cleandoc(string).replace('\n', ' ')
a b c
a b c
a b c
a b c

@gsnedders
Copy link
Member

@jugglinmike I got side-tracked and forgot to check what inspect.cleandoc does 🙂

@jugglinmike
Copy link
Contributor Author

@gsnedders I now remember another wrinkle: the tests inspect the Python objects directly (as opposed to the CLI output). That means we can't clean these values as we write them out--they have to be stored in sanitized form.

I've added a function named collapse and called it for each multiline string value. This might be a little more ergonomic than string concatenation, but it still requires contributors to discover a local convention.

Another solution might be perform the collapse operation in a metaclass. That kind of trades "too explicit" for "too implicit" though.

Whatever we decide, future contributors will be made aware of extra whitespace without any new tests because the current tests assert the description value. In light of that (and these alternate approaches), do you have a preference for how we structure this code?

@jugglinmike
Copy link
Contributor Author

@gsnedders There was a strange failure in Taskcluster that seemed to be unrelated to the change

logs
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/p/python3-stdlib-extensions/python3-lib2to3_3.6.7-1~18.04_all.deb  404  Not Found [IP: 91.189.88.175 80]
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/p/python3-stdlib-extensions/python3-distutils_3.6.7-1~18.04_all.deb  404  Not Found [IP: 91.189.88.175 80]
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/p/python3.6/libpython3.6_3.6.7-1~18.04_amd64.deb  404  Not Found [IP: 91.189.88.175 80]
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/p/python3.6/libpython3.6-dev_3.6.7-1~18.04_amd64.deb  404  Not Found [IP: 91.189.88.175 80]
E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/p/python-cryptography/python3-cryptography_2.1.4-1ubuntu1.2_amd64.deb  404  Not Found [IP: 91.189.88.175 80]
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/p/python3.6/python3.6-dev_3.6.7-1~18.04_amd64.deb  404  Not Found [IP: 91.189.88.175 80]
E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/universe/p/python-pip/python3-pip_9.0.1-2.3~ubuntu1_all.deb  404  Not Found [IP: 91.189.88.175 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
[taskcluster 2019-09-05 17:55:57.623Z] === Task Finished ===
[taskcluster 2019-09-05 17:55:57.701Z] Unsuccessful task run with exit code: 100 completed in 47.098 seconds

...except that it was consistent. I've rebased to try to address that (the prior version of this branch is available here). Also fixed a problem flagged by the type system.

Would you mind taking another look at this?

@jugglinmike
Copy link
Contributor Author

@gsnedders gh-19952 improved the feedback we get for documentation patches. I've merged master into this branch so we can see exactly what effect this pull request has on the generated content. Does it look good to you?

When new rules have been added to WPT's "lint" tool, the corresponding
documentation has not always been updated [1] [2] [3]. The static list
of rules currently describes only 22 of the 53 available rules.
Automatically generating documentation from source code helps avoid this
state and the confusion it can cause contributors.

Rely on the previously-implemented source code structure [4] during
documentation generation to automatically create a listing of all
available linting rules.

Although the Sphinx documentation generator includes a built-in
extension for generating documentation from Python source code, the
output of that extension is designed to document Python primitives such
as functions and classes. Such a format is inappropriate for this case
because the users of the linting tool do not interact with the internals
in this way. Define a custom docutils directive to tailor the
documentation to the needs of its audience.

[1] web-platform-tests#5299
[2] web-platform-tests#10501
[3] web-platform-tests#11479
[4] web-platform-tests#16268
@jugglinmike
Copy link
Contributor Author

I've rebased this as per the recent advice to contributors. The prior version of this branch is available here.

@gsnedders Last call :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants