Parse top-level import name from wheel file#3006
Conversation
rth
left a comment
There was a problem hiding this comment.
Thanks! The nice thing is it also catches private imports (e.g. _pytest). It's an interesting approach, but note that are a number of problematic imports it finds below. Some are easy to fix, others are harder. Also this is hard to notice / review for users so I wonder whether the current system (but possibly with auto-generated list of imports by mkpkg using this code) so it could be reviewed would be more reliable.
test and tests should be explicitly excluded,

Not sure about those lib* folders, but maybe not critical,

I stopped at "l" letter in repodata.json. I imagine there are more of the same types of issues later.
Right... it seems like the
Yeah, that would be more flexible. |
|
The issue with auto-generating top level imports in mkpkg is that the final directory structure is not visible until we build a wheel. |
Don't forget about implicit namespace packages https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#native-namespace-packages |
There was a problem hiding this comment.
Updated such as:
- Add a new key
package/top-levelin meta.yaml which is used for top-level import names. package/top-levelis auto calculated inmkpkgcommand. This is done by downloading a wheel file of a native platform using its directory structure. (I hope the directory structure doesn't vary between platforms...)- If a wheel file doesn't exist, it will add a placeholder so one can manually add top-level import names.
Also I stopped using top_level.txt since it often contains invalid import names as pointed out by Roman.
To make a review easier, I added comments on packages that have changes.
| version: 21.4.0 | ||
| top-level: | ||
| - attr | ||
| - attrs |
| version: "1.79" | ||
| top-level: | ||
| - Bio | ||
| - BioSQL |
| version: 3.5.2 | ||
|
|
||
| top-level: | ||
| - pylab |
There was a problem hiding this comment.
matplotlib has a top-level file pylab.py. Not sure about the usage of it.
| name: pycryptodomex | ||
| version: 3.15.0 | ||
| top-level: | ||
| - Cryptodome |
There was a problem hiding this comment.
Intestingly, this file was named wrong (meta.yml), so this package was not built.
There was a problem hiding this comment.
It seems like the build is faling, disabled for now.
| name: pyrsistent | ||
| version: 0.18.1 | ||
| top-level: | ||
| - _pyrsistent_version |
There was a problem hiding this comment.
It has a top-level version file...
| @@ -1,6 +1,8 @@ | |||
| package: | |||
| name: ruamel | |||
| name: ruamel.yaml | |||
There was a problem hiding this comment.
ruamel.yaml is a correct package name.
| version: 0.19.3 | ||
| top-level: | ||
| - skimage | ||
| # - doc # scikit-image has top level `doc` directory... but it is too common name so we don't want to use it as a hint for top-level import. |
There was a problem hiding this comment.
scikit-image have top-level doc directory, but the name doc is too common so I think we don't want to load scikit-image is someone do import doc.
| name: setuptools | ||
| version: 62.6.0 | ||
| top-level: | ||
| - _distutils_hack |
There was a problem hiding this comment.
setuptools has _distutils_hack.
| name: sympy | ||
| version: 1.10.1 | ||
| top-level: | ||
| - isympy |
There was a problem hiding this comment.
sympy has top-leve isympy.py file.
| name: toolz | ||
| version: 0.11.2 | ||
| top-level: | ||
| - tlz |
|
Sorry for the delay on review for this @ryanking13. I'll try to look at it when I get a chance. Seems like it's in good shape. |
| version: 0.19.3 | ||
| top-level: | ||
| - skimage | ||
| # - doc # scikit-image has top level `doc` directory... but it is too common name so we don't want to use it as a hint for top-level import. |
|
Thanks for the thorough reviews! |
pycryptodomex was added in #2966, but it had an invalid recipe name meta.yml (not meta.yaml) so our build system didn't build that package. I found that in #3006, and I also found that it is not building well, so I disabled it then. So, in other words, pycryptodomex never worked in Pyodide. I would like to remove it from the changelog and the repository for now, so that we don't add it to our next stable release accidentally. Perhaps someone interested can re-add this package.




Description
Changes how
importskey in repodata.json is calculated. This key is used inloadPackagesFromImportto get the top-level import name of a package.Previously we were using
test/importskey in meta.yaml to calculate this key, without mentioning this in docs. Therefore it was very easy to make a mistake such as #3001.This PR changes this by calculating top-level imports dynamically from the wheel file:
1) Try to parse the top-level import name fromtop_level.txtfile inside the wheel file. This file exists if the package is built with setuptools, but may not exist if the package is built with flit, poetry, etc.2) Iterate the top-level directories and files inside the wheel file, and add to top-level import if there are directories containing python file or if there is a top-level python file.
3) If both of the above fails, use the package name inside meta.yaml as a top-level import name (e.g. shared libraries)
FYI these packages don't have top_level.txt:- packages that use flit:pyparsing,more_itertools,tomli,tomli_w,threadpoolctl,typing_extensions- packages that use hatchling:jsonschema,soupsieve,~~- packages that use poetry:
pkgconfig- packages that use distutils:distlib(changed to setuptools in latest release)Checklists