Provide compatibility with mkdocs-material blog plugin#72
Provide compatibility with mkdocs-material blog plugin#72manuzhang merged 4 commits intomanuzhang:mainfrom
Conversation
|
@manuzhang should the linting errors be fixed as part of this PR? |
|
Yes, please fix it |
htmlproofer/plugin.py
Outdated
| try: | ||
| return files[search_path] | ||
| except KeyError: | ||
| for file in files: |
There was a problem hiding this comment.
Can this impact performance significantly?
There was a problem hiding this comment.
It changes the O for lookup to be linear so for huge sites, it could have a large impact.
Unfortunately the faster lookup that existed previously is also a bug.
The performance could be addressed by creating a Dictionary at the time that lookup is needed after all the File objects have been fully configured. This would have better performance characteristics but is also more obscure and probably a bit harder to maintain.
On almost all sites, the impact of this check should not be particularly noticeable to a "reasonable human".
There was a problem hiding this comment.
Note, I'm not sure where the proper place to build that Dictionary to speed the lookups in find_source_file would be.
My python is not the greatest. Ideally, if you want to do that change, I'd prefer to hand it off to you.
There was a problem hiding this comment.
Me neither ;). @johnthagen could you please take a look?
There was a problem hiding this comment.
With the change I had to make to get this to work for the Windows tests, the cost has increased some as os.path.normpath needs to be called on file.url for the comparison to work, so that is an additional bit of cost.
I ran again locally with the source for ponylang.io (https://github.com/ponylang/ponylang-website) and if there is a difference, it isn't noticeable to me.
There was a problem hiding this comment.
If I understand the lifecycle correctly then, self.files could be turned into an optimized dictionary in on_post_page anytime before the for a in soup.find_all('a', href=True): loop.
If that is correct, I do a dictionary based structure that mimics the old data structure but has the correct information.
Can you confirm that my understanding of the lifecycle is correct?
There was a problem hiding this comment.
I'll make that change in the not so distant future. Before the end of the weekend at the latest.
7fe9b58 to
99707e7
Compare
|
@manuzhang this particular implementation apparently has an issue with windows. I'm guessing that is a path separator issue. I'll look at it later. |
9f7a612 to
b99b84c
Compare
|
I think I know what the windows issue is. Testing that out now. |
eee2979 to
c3eb1df
Compare
Prior to this commit, certain assumptions were made about the files seen in `on_files` that are not true when the mkdocs-material blog plugin is used. The url seen at the time `on_files` is called is not guaranteed to be the final url that will appear in html. With the blog plugin, at the time `on_files` is called, the value will be something like: `blog/posts/foo.md` but at the time we are trying to get a mapping, the url will be something like: `blog/2024/01/foo.html` Due to this change, htmlproofer would fail to validate the url despite it being valid. To address this, instead of looking up pages from a Dict where the key is set at the time of `on_files`, we know store a list of Files and check for the `search_path` for a url against the `url` attribute of each File to try to find the mapping.
cabe052 to
8d9d9db
Compare
|
@manuzhang all set. with the optimization, all the old tests continued to work unchanged. It ends up being a better change. I tested this locally as well with a site that uses the blog as an extra level of verification. |
Prior to this commit, certain assumptions were made about the files seen in
on_filesthat are not true when the mkdocs-material blog plugin is used.The url seen at the time
on_filesis called is not guaranteed to be the final url that will appear in html. With the blog plugin, at the timeon_filesis called, the value will be something like:blog/posts/foo.mdbut at the time we are trying to get a mapping, the url will be something like:
blog/2024/01/foo.htmlDue to this change, htmlproofer would fail to validate the url despite it being valid.
To address this, instead of looking up pages from a Dict where the key is set at the time of
on_files, we know store a list of Files and check for thesearch_pathfor a url against theurlattribute of each File to try to find the mapping.