Better file gathering and validation for the info plugin#6826
Better file gathering and validation for the info plugin#6826squidfunk merged 6 commits intosquidfunk:masterfrom
Conversation
There was a problem hiding this comment.
Thanks for the PR! I see that you went down the 'more specific' road. As I argued in #6750 (comment), I don't think this is what we should do, because we'll be adding exceptions, exceptions and more exceptions. For instance, mkdocstrings usage is not covered in this solution, as far as I can tell. Additionally, the site package exclusion is based on git, which we should do programmatically, because we'll be adding more and more entries to the .gitignore as we discover other preferences for naming folders for virtual environments.
Moreover, configuration inheritance will likely not catch all cases, as we discussed in #6750. Collections of paths is based on a curated list of names of parameters, which is too labor intense, given that packing up of bug reports should 'just work'. Thus, as outlined in my last comment, I regard it as a better approach to go down the following road:
- Collect all files recursively from current directory
- Filter files that are in our
.gitignoreor insite-packages(this is essential) - Iterate and refine our
.gitignoreif we find other files that we always want to exclude
This should also allow for the projects plugin and other multi-configuration plugins to be accurately collected + is a solution that does not require maintenance from our side as the ecosystem grows.
Yes, I did, but I want to limit the specificity to MkDocs itself, as I mentioned:
which wasn't clear enough, sorry. I want to keep validation for:
I believe this is a good bare minimum to detect running In the code currently there is also the validation for markdown_extensions, which I checked mostly use 3 generic names for path options, but it will be removed soon, as by default, running I agree that If you still insist that this simple validation is out of scope I'll remove it completely ✌️
Not completely accurate. In the mkdocs-material/src/plugins/info/plugin.py Lines 364 to 368 in 1e7c2f2
I think the user might have different virtual environments, perhaps one with
Yes, I understand that there might be more custom generated paths in the project directory. I already add a mkdocs-material/src/plugins/info/plugin.py Line 174 in 1e7c2f2
I'm a bit busy today, but I will still try to clean up the PR before "tomorrow" comes 😅 EDIT: After some more pondering I think that |
|
Checking for So, to sum up: the aim should be: pack up all files that are used to run
Yet, packing up an example with
Yet, only one environment is active. I doubt that users use two virtual environments in one project. It sounds like an anti-pattern to me, and IMHO, we should optimize for the common case where only a single environment is present, i.e., the active environment, which we should resolve as discussed.
Sounds good! We likely need to repeat this for all projects, i.e, all
No hurry! I hope my reasoning makes sense. If not, don't hesitate to ask. I hope I could make it clearer why we need to make the plugin more resilient and less strict, yet keep it simple and powerful. Too much magic is like not a good idea for something that should work with minimal effort from a usage and maintenance perspective.
The |
|
I feel like there is some misunderstanding, because
indicates to me that you believe I'm using resolve INHERIT to get the parent.yml file and then include it in the zip file. If I'm wrong then sorry. I resolve INHERIT because MkDocs doesn't include it in MkDocsConfig, and I want to validate that it is inside the Apart of that I don't use the INHERIT config, and use everything that is provided by the already validated and resolved MkDocsConfig and everything inside the current working directory of
I just want to avoid missing any files, so if there is any invalid path outside of CWD the user gets notified and the program ends. I ended up with checking the following paths, only to inform the user that the info plugin won't find the files: mkdocs-material/material/plugins/info/plugin.py Lines 95 to 102 in 8f48c99 mkdocs-material/material/plugins/info/plugin.py Lines 298 to 316 in 8f48c99
Good idea. I will also consider adding some kind of output/log copy in the zip. But perhaps that's a feature for a future PR.
OK, I removed the static names, I also modified the exclusion logic, so that exclusion patterns for venvs are generated.
I considered excluding site_dir based on the inclusion of
Thanks, but not only did I completely miss both deadlines I set in the PR, I suggested I would fix the plugin some months ago 🥶 So I'm trying to focus on this PR for now.
I meant paths that are called Here is a slightly modified
The gathered project zip (inside of the zip above) isn't runnable because the I think it works fine, perhaps the exclusion is too lax. |
I think I misread the implementation then, sorry! In that case, yes, we can try including the
Also a great idea, albeit it might be a catch-22?
Again, no hurry. Albeit we're trying to fix bugs in the plugin, they happen only very rarely with most bug reports being complete and runnable (if users stick to our bug reporting guidelines and include the reproduction in the first place).
Ah yes, because of the ignore patterns. Yes! I read that this is another reason to be very careful when using exclusion patterns, so doing exclusion as programmatically as possible sounds like a safer approach.
Hmm, so the hook should be included I think, albeit the plugin should warn you about using hooks. We also need to make sure that mkdocstrings works as expected. Here's an example that was manually packed because the info plugin did not include all files. We need to make sure that the packed archive has exactly the same contents as this:
We'll learn along the way |
It's good we're finally on the same page. The confusion probably came from the first commit I left for the time being and it will be squashed later..., it included the INHERIT path inclusion statically, but in a further commit I changed to the recursive path lookup. It's definitely not the cleanest PR (as usual), but I wanted to show that I started working on it, instead of make "assurances" in the Issue comments. Your comments definitely keep me within bounds 😄
I forgot to mention that it did warn me about using hooks, so I enabled the config option that allows to create the archive despite bad validation. But when the user sees the warning they can just remove the
From the example zip I see it uses a 9.5.11+insiders.4.53.0-strings.zip
LOGINFO - Started archive creation for bug report
Please name your bug report (2-4 words): strings
DEBUG - Excluded pattern '*.zip': /9.5.11+insiders.4.53.0-strings.zip
INFO - UserWarning: Duplicate name: '9.5.11+insiders.4.53.0-strings/requirements.lock.txt'
File "C:\MyFiles\Python312\Lib\zipfile\__init__.py", line 1684, in _open_to_write
self._writecheck(zinfo)
File "C:\MyFiles\Python312\Lib\zipfile\__init__.py", line 1786, in _writecheck
warnings.warn('Duplicate name: %r' % zinfo.filename, stacklevel=3)
INFO - UserWarning: Duplicate name: '9.5.11+insiders.4.53.0-strings/platform.json'
File "C:\MyFiles\Python312\Lib\zipfile\__init__.py", line 1684, in _open_to_write
self._writecheck(zinfo)
File "C:\MyFiles\Python312\Lib\zipfile\__init__.py", line 1786, in _writecheck
warnings.warn('Duplicate name: %r' % zinfo.filename, stacklevel=3)
INFO - Archive successfully created:
9.5.11+insiders.4.53.0-strings/docs/index.md 404.0 B
9.5.11+insiders.4.53.0-strings/mkdocs.yml 342.0 B
9.5.11+insiders.4.53.0-strings/platform.json 282.0 B
9.5.11+insiders.4.53.0-strings/platform.json 79.0 B
9.5.11+insiders.4.53.0-strings/requirements.lock.txt 350.0 B
9.5.11+insiders.4.53.0-strings/requirements.lock.txt 395.0 B
9.5.11+insiders.4.53.0-strings/src/package/__init__.py 288.0 B
9.5.11+insiders.4.53.0-strings.zip 3.4 kBPerhaps the best approach would be to allow the user to set Another approach is a bit risky exclusion pattern again Actually all of the above applies to I guess the previous warnings were created because there was no exclusion logic to prevent files from being added to the zip. Since there is such a thing now, then perhaps the warnings shouldn't be as strict. |
The idea of the "ignore violation" flag is: warn the user about customizations and hooks and terminate unless the flag is set, so if the plugin progresses beyond the termination because of the flag, customizations and hooks should be packaged up as well. This might be necessary in very rare cases, e.g. for the customizations we advertise on our docs. Thus, customizations and hooks should never be ignored. The inline comments or our previous conversation might not have made this clear. Additionally, there's no need to remove
We should include it, e.g., as we advertise something similar for the projects plugin transform (to make usage simpler).
That is very, very weird. I'm not sure why this is the case!? |
|
As an addendum: I wouldn't handle the case where customizations are packaged up that are not used. This would mandate resolving things from |
I didn't mean in the plugin. I just stated the fact that after unpacking a packed zip, before running
I don't know why the Python ZipFile allows for that, or why doesn't 7-zip crash when I open the zip later.
Happy days, I don't have to do 4D logic in my head to handle this branching logic. I finally tested #2346 - (9-community-multi-lang.zip) And it turned out that we have to exclude site_dir based on the existence of
So I think I'm finished, I only need to fixup the commits and squash them into 2 or 3 properly categorized names. |
Only if you run
Agreed.
I'm not sure I entirely understand. So, you essentially want to use
I've started modernizing and professionalizing our Python code base with the latest iteration of the tags plugin in Insiders. Eventually, I want to migrate all plugins to those patterns, since they are now properly documented with Docstrings and cut much, much smaller. If you find a good abstraction of things in the same style, you can definitely go ahead. We can also discuss the conventions I used, as Python is still like Spanish for me – still learning a lot. |
|
To add on the last part: if this is too much of an ask, please don't bother. I see it as my obligation to do that, but since you asked for refactoring opportunities that could be taken, I wanted to point out my latest advancements in this area. I will definitely refactor the info plugin at some point, but if you want to go ahead and help with that, you are very much invited. Again, and I can't stress this enough: thanks for being such an awesome part of our inner community! Your time investments into Material for MkDocs are very valued and much appreciated. Great work! |
🙄I forgot that serve won't run info, so yeah my idea was short-sighted.
Recommending something new and better is fine, not supporting something old and "widely" used isn't.
The most I can do, is to split the I'm also more of a "fan" of flat package structures, and your tags plugin implementation is deeply nested, so I would probably need some diagram to see the whole picture 😅I'm just a simple coder, so deep abstraction structures scare me, reminds me of Java-like business grade structures which are the devil 😆
This is very much welcome, once the Pylint, the tests, and some auto formatter is introduced (like
I appreciate your time investment as well, this back and forth in the PR really helped me stay on track and not spiral into endless edge case dimension ✌️ |
3f3c0fe to
f1a40ab
Compare
|
I've managed to squash 17 commits into 6. 9.5.12+insiders.4.53.0-final-examples-projects.zip |
f1a40ab to
9108408
Compare
|
Sorry about the last force push I decided to only check for |
|
The issue template for bugs asks for a minimal reproduction and links to the page where it says to use the info plugin. |
Can you somehow back up that this is widely used? From what I've seen it isn't. Also, "widely" might be hard to define here. We can keep this approach for now, but if we don't receive any reproductions using it in the coming months, we'll consider removing it again. We have it in the history, so we can go back at any time, but in general, less code to maintain is always something to aim for.
|
|
Great idea, added in ecac4e7. |
I said "widely" based on the fact it's the most popular discussion in the Privacy Guides uses it, though it's my low hanging fruit to support my claim😆 I also checked FastAPI's docs, and they use a similar structure (closer to the I also checked that site_name: My Docs
plugins: []Perhaps I should remove the dynamic exclusion pattern generation for Note: EDIT: Yes, I think this is the best way 😮💨 I'll draft the PR, sorry again for the lack of foresight Cool that #6859 got merged, but the fixes for info plugin aren't merged yet, so people could create more reproductions with missing files 😅 |
9108408 to
7804c47
Compare
I'm afraid I'm going to need a little more time to review this. The changes are quite fundamental and #6859 only included documentation changes. Please give me some more time. |
I didn't want to come off as rushing you, I just made the observation that the issue template got modified before the plugin was updated, and that the order of things was not optimal. Anyways, more time in the PR pipeline allows me to spiral in the edge case dimension a bit longer, so I hope my force pushes don't break your review flow 😅 |
|
After some more thought I believe that any .dotfile in the root directory should be excluded. So I'm inclined to replace the fnmatch patterns with regex patterns which would allow to use |
I wasn't sure whether this PR was ready, because you did not re-request a review + it appeared to me that you were still working on it. As we made sure we're aligned with the general direction, and I trust you testing the examples we talked about, my work will be mainly to check it out some of the examples to learn about the new UX and glaze over the code base, maybe do some formatting. Just request a review when you're done and I look into it I don't think that the merge of #6859 is out of order, since using the info plugin is mandatory as mentioned in the description of the Reproduction section of our issue template, so the PR just tries to catch users not having read that.
I'd say we keep it simple first and not try to overthink this. We already changed so much on a very critical component. When we release this with the next version of Material for MkDocs, all people that update will use the new implementation when reporting bugs, so we should try and keep it as minimal as possible while fixing the known issues and aim to reach stability quickly. If we run into problems, we fix them (as we're doing here now). |
|
Additionally, please move the PR out of draft state, once it is stable from your perspective |
7804c47 to
15d0b11
Compare
Your assessment turned out to be true, I didn't press the button to re-request the review, but I provided the "final" examples here: #6826 (comment) However, this time I hope it's truly final 💪 The change from DEBUG - Excluded pattern '.*\.cache/?': /.cache/
DEBUG - Excluded pattern '^/\.vscode/': /.vscode/
DEBUG - Excluded pattern '/site/': /site/
DEBUG - Excluded pattern '/venv/': /venv/
DEBUG - Excluded pattern '^/\.[^/]+$': /.editorconfig
DEBUG - Excluded pattern '^/\.[^/]+$': /.gitignore
DEBUG - Excluded pattern '^/.*\.zip': /7-projects-examples.zip
DEBUG - Excluded pattern '/examples/fonts_site_dir/': /examples/fonts_site_dir/
DEBUG - Excluded pattern '.*__pycache__/': /examples/__pycache__/
DEBUG - Excluded pattern '/examples/blog-basic/site/': /examples/blog-basic/site/
DEBUG - Excluded pattern '.*__pycache__/': /hooks/__pycache__/So in the examples repository, I use the |
squidfunk
left a comment
There was a problem hiding this comment.
From a functional perspective, this is top-notch! I only have primarily formatting- and comment-related remarks that we should address before merging. I hope that it's not too pendantic, but I always try to put myself in my own shoes 3 months from now when I have forgotten what and why I did specific things. Especially when I jump between TypeScript and Python, extensive comments on design and architecture decisions make it much easier to rebuild an overview of the project after some time has passed
Add more elaborate comments Fix lines to span to 80 characters Fix import order Fix variable naming
15d0b11 to
a6ad363
Compare
|
Thanks again for your awesome work! We'll release it in the coming days! |
|
I just updated the dependencies in dc808ca and had to make a change to the build process, because the |
|
I got caught red handed I didn't completely follow the contribution guide. At some point I stopped rebuilding the project, but instead just copied the material/plugins/info directory to the src/plugins directory, because I noticed it doesn't make a difference 😅 Seeing the code responsible for copying the config files and assets perhaps naming the file .gitignore wasn't the best idea. Especially now after I stopped using fnmatch and started using regex, I think this should be it's own patterns.py file with a list of patterns in python which is later simply imported. Of course I didn't think of this sooner because it worked, and I didn't stop to think of the build process. From an IDE perspective the current file is recognised as a .gitignore file while it's not such file 😵💫 If you won't do it sooner with the plugin rewrite, I will revisit this again in the future, and fixup the methodology of storing patterns. Unless I should do it immediately? |
Makes sense! The next release will take 2-3 days at least, so if you can squeeze it in, you can go ahead. If not, no problem, but then we should add a comment with a |
|
Great, will do it today ✌️ |
Fixes #6750
Took a bit longer than I expected to make a minimal viable improvement, or perhaps I was running circles in my head for too long instead of actually coding via trial and error 😅
I also played a bit too much with the validation for different things relative to the root directory.
I'll simplify it, but I don't want to remove it completely.
As for the
info.gitignoreI hope the capabilities aren't too limiting compared to real.gitignorelogic.It's likely I'm missing something when it comes to
fnmatch.Once I'm closer to finishing, I'll also squash some of the commits to keep only relevant final changes.
Task list:
mkdocs buildat the top root directory.Currently it's not basic enough, rather convoluted.projectsplugin docs, like the examples repositoryregexfromfnmatch