Automatically stubbing the pyi files while keeping inspecting ability#509
Automatically stubbing the pyi files while keeping inspecting ability#509n1t0 merged 14 commits intohuggingface:masterfrom
pyi files while keeping inspecting ability#509Conversation
n1t0
left a comment
There was a problem hiding this comment.
Awesome! That is a great addition and I'm really happy that we won't have to sync these .pyi files manually anymore! 🎉
I must say I'm strongly against generating the .py files though. This doesn't really add any value while on the contrary, it takes away a good part of our freedom while also degrading the user experience.
Also, we shouldn't be introducing breaking changes for the purpose of generating files automatically. Maintaining the .py files is a breeze, and getting rid of the .pyi is where 99% of the value lies.
So let's generate only the .pyi files.
| /// | ||
| #[pyclass(dict, module = "tokenizers", name=AddedToken)] | ||
| #[text_signature = "(content, single_word=False, lstrip=False, rstrip=False, normalized=True)"] | ||
| #[text_signature = "(self, content, single_word=False, lstrip=False, rstrip=False, normalized=True)"] |
There was a problem hiding this comment.
I think we should probably remove all the self everywhere and document all the pyclass using __new__ instead. (I guess its the reason you add them in the first place?). This would be more accurate, what do you think?
There was a problem hiding this comment.
Well all code was previously documenting init. I feel it's more pythonic than new even though it's really a __new__ . self would become cls then.
There was a problem hiding this comment.
Alright, seems fair. I don't think it changes anything anyway.
|
|
||
| def do_black(content, is_pyi): | ||
| mode = black.Mode( | ||
| target_versions={black.TargetVersion.PY36}, |
There was a problem hiding this comment.
We are still using PY35 everywhere else, so we should stay consistent here.
There was a problem hiding this comment.
True I thought I had changed this. my bad.
| """ WordPiece Decoder """ | ||
| def __init__(self, suffix="</w>"): | ||
| pass | ||
| def decode(self, tokens): |
There was a problem hiding this comment.
All the inherited methods are duplicated on the children classes. I think we should remove them as the various IDE should be able to understand this by themselves. (Just like they would with plain Python files)
There was a problem hiding this comment.
That makes a lot more checks (checking mro, uptree for existing signatures, checking signatures are the same and so one).I I really don't think it's worth it as it's generated code now. The possibility of a bug that shadows existing functions outweighs the benefit of writing less generated code in my mind.
|
Latest iterations keeps the root I still fill that getting as much stuff generated as possible will make quirks and inaccuracies in the bindings or docs as low as possible and should be a desired goal. |
|
I'm not a fan of this hybrid generation, where sometimes it gets automatically generated, sometimes it doesn't, and you have to remember to import things manually just in some cases. I'd prefer to have either 100% of the exports automatically generated, or just no generation at all (speaking about the Also, I'd love to have it done automatically when calling |
We're only talking about
I wanted to do that, but in order to inspect the library it needs to be built, and somehow it does not seem to work in the hook I found. https://www.anomaly.net.au/blog/running-pre-and-post-install-jobs-for-your-python-packages/ (After |
At the moment, this concerns
Well, if we want to go with this, I think we'll have to find a way. |
|
@Narsil what's the status on this? I'd like to update the Python API reference pretty soon, and there is a huge overlap with this PR so it'd be better to wrap this up first. |
|
Well this PR is ready. You still can comment the code responsible for generating |
Context:
pyifiles was becoming tediousand would often end up being wrong or outdated.
inspectableobjects from python sideHowever, code editors usually don't use real python code to inspect.
That was the reason for the existence of
pyifiles.Proposed change:
inspectmodule to go through therustbindings, and generate all python stub filesaccordingly.
__init__.pyand__init__.pyiwill be written bystub.pyfile. Edit: With thegenerated headertypingannotations (inspect.signature(__text_signature__)does not support them link as some environments do use inspect (most REPLs, iPython, notebooks..)
another issue with annotations was importing them wherever they were defined which poses a challenge
for custom defined ones.
We also drop custom ENUMs that onlyemulatedstring conversion, in favor of explicit stringswith a comment detailing available options. We could have a regular python file implement them
but then
stub.pywould need to import them, which it does not have a simple mechanism for.python stub.py --checkwill now check that we correctly ran the command as code style to make surewe are in sync with the rust doc.
python stub.pyandpython stub.py --checkneed to run after the lib was compiled withpython setup.py developor something similar to inspect the actual lib.Because of that, github workflow was changed to run
check-styleafter building the lib@propertyto be able to document them. Currentlyno good system to detect presence of a
@XXX.setterwas discovered. It seems all attributesdo have a setter, but yields and exception when run on
#[getter]only. It should not matterthat much to autocomplete facilities.
implementationsis kept in place asstub.pywill simply ignore unknown files keeping the abilityto add new python code, just not in the ones written through rust.
Limits:
stub.pybut we should not get bittenby this, as the real black runs after
stub.py.No safeguard against overwriting real python files right now.