Change platform to Pyodide#3823

Closed
hoodmane wants to merge 8 commits into pyodide:main from hoodmane:platform-pyodide

@hoodmane
Member

@hoodmane hoodmane commented May 3, 2023

Work towards Pyodide wheels on pypi.

One question is what happens if we have an emergency need of an ABI break in between
major Python version updates. But we can just not do that I suppose.

@ryanking13
Member

> One question is what happens if we have an emergency need of an ABI break in between
> major Python version updates. But we can just not do that I suppose.

Maybe we can store some more information in the platform string, for instance pyodide_<version>_<something_else>? Then downstream packages like micropip can handle things like an ABI break if one happens.
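
As a rough illustration of how a downstream installer could react to such a tag, here is a minimal sketch. The `pyodide_<version>_<extra>` layout, the single-token version, and both function names are assumptions made up for this example; none of this is micropip's actual API.

```python
from typing import NamedTuple, Optional


class PyodidePlatform(NamedTuple):
    version: Optional[str]  # hypothetical version token, None for bare "pyodide"
    extra: Optional[str]    # hypothetical extra ABI marker, None when absent


def parse_pyodide_platform(tag: str) -> Optional[PyodidePlatform]:
    """Parse an assumed 'pyodide[_<version>[_<extra>]]' platform tag."""
    parts = tag.split("_", 2)
    if parts[0] != "pyodide":
        return None  # not a Pyodide platform tag at all
    version = parts[1] if len(parts) > 1 else None
    extra = parts[2] if len(parts) > 2 else None
    return PyodidePlatform(version, extra)


def is_installable(wheel_tag: str, runtime_tag: str) -> bool:
    """Accept a wheel only if every field it specifies matches the runtime."""
    wheel = parse_pyodide_platform(wheel_tag)
    runtime = parse_pyodide_platform(runtime_tag)
    if wheel is None or runtime is None:
        return False
    # A field the wheel leaves unspecified (None) is treated as a wildcard.
    return all(w is None or w == r for w, r in zip(wheel, runtime))
```

With this rule a wheel tagged plain `pyodide` stays installable everywhere, while a wheel that names an extra marker is rejected by a runtime with a different one.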

@hoodmane
Member Author

hoodmane commented May 3, 2023

Well, for example, the manylinux tag started out as just manylinux (manylinux1), then they switched to manylinux_year (manylinux2010, manylinux2014), and then to manylinux_libc_version (manylinux_2_17 and so on).

I think it's fine to start with just pyodide and put in more stuff once we know we need it.

@ryanking13
Member

> I think it's fine to start with just pyodide and put in more stuff once we know we need it.

Makes sense. Thanks!

@henryiii
Contributor

henryiii commented May 4, 2023

One difference is that manylinux* is never produced when you make the wheels, but only when you process them via auditwheel.

@hoodmane hoodmane force-pushed the platform-pyodide branch from e8cd3d0 to ac64494 Compare May 6, 2023 20:11
@hoodmane
Member Author

hoodmane commented May 6, 2023

npm ERR! cb() never called!
npm ERR! This is an error with npm itself. Please report this error at:
npm ERR! https://npm.community

Interesting...

@rth
Member

rth commented May 9, 2023

So, if I understand correctly, this would replace the platform tag emscripten_3_1_32_wasm32 with pyodide, right?

I understand that emscripten + version is not specific enough to indicate all the custom flags we may be using but I feel like just relying on the Python version is not sufficient, particularly if we are designing a new platform tag from scratch.

> One question is what happens if we have an emergency need of an ABI break in between
> major Python version updates.

Yeah, there is that. Also,

  • at some point, wasm64 will be stable enough, I guess, and the platform tag should indicate that wasm32 and wasm64 are not the same thing.
  • same goes for threading and/or SIMD if people (or us) do builds without runtime detection
  • more generally people can (and do) have builds of Pyodide with some custom flags that may not be compatible. We should be able to tell they are not compatible. Otherwise, we would just end up with wheels that crash at runtime.

So, PyPI aside, I would have expected something like pyodide_202305_wasm32, or maybe a hash of the relevant compilation flags instead of the date. But I guess for the PyPI PEP one would want a fixed platform tag? That would be OK, but we still need a plan for how we encode that extra information, to be able to tell if wheels are actually compatible. Couldn't we put arbitrary things into the build tag? From what I understand there are no strict rules about what goes there, so we could say in the PEP that the platform tag is pyodide_wasm32 and then, at least in micropip, check that the build tag matches the expected value. Though it feels a bit like a hack.
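
For reference, a wheel filename has the form `{name}-{version}(-{build})?-{python}-{abi}-{platform}.whl`, and the wheel spec does require the optional build tag to start with a digit, so an ABI marker stored there would need a numeric prefix. Below is a minimal sketch of the build-tag check described above. The marker value `1abi2023` and both function names are invented for illustration and are not part of micropip.

```python
from typing import Optional


def build_tag(filename: str) -> Optional[str]:
    """Return the optional build tag of a wheel filename, or None."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:  # name, version, python, abi, platform
        return None
    if len(parts) == 6:  # name, version, build, python, abi, platform
        build = parts[2]
        if not build[:1].isdigit():
            raise ValueError(f"build tag must start with a digit: {build!r}")
        return build
    raise ValueError(f"unexpected wheel filename: {filename}")


EXPECTED_BUILD_TAG = "1abi2023"  # hypothetical marker the runtime expects


def build_tag_matches(filename: str) -> bool:
    """Hypothetical installer-side check on top of a fixed platform tag."""
    return build_tag(filename) == EXPECTED_BUILD_TAG
```

A wheel without a build tag would be rejected by `build_tag_matches`, which is one reason this approach reads as a workaround rather than a clean design.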

Or alternatively pyodide_wasm32 is the equivalent of manylinux wheels that are tied to the Python version, but we also still support emscripten_<hash>_wasm32 wheels (which are equivalent of distribution dependent linux wheels). And so the workflow would be to,

  • pyodide build produces emscripten_<hash>_wasm32 wheels
  • pyodide auditwheels, has a pre-defined mapping of python version / emscripten version / emscripten compilation flags, checks that they are compatible with the to-be-written PEP and converts them to pyodide_wasm32 wheels
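
The relabeling step in this workflow can be sketched as a lookup against a table of blessed combinations. The table contents and the `retag` name are assumptions for illustration (the `emscripten_3_1_32_wasm32` tag is the one mentioned earlier in this thread), and a hash-based tag would be looked up the same way.

```python
# Hypothetical table of (python tag, build platform tag) pairs that are
# known to satisfy the to-be-written PEP, mapped to the portable tag.
BLESSED = {
    ("cp311", "emscripten_3_1_32_wasm32"): "pyodide_wasm32",
}


def retag(filename: str) -> str:
    """Return the wheel filename with a portable platform tag if blessed."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are python, abi and platform tags.
    head, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    portable = BLESSED.get((python_tag, platform_tag))
    if portable is None:
        return filename  # unknown combination: keep the specific tag
    return f"{head}-{python_tag}-{abi_tag}-{portable}.whl"
```

Renaming alone would not be enough in practice: the `Tag:` entries in the wheel's `.dist-info/WHEEL` file, and the `RECORD` hashes, would also need updating, as auditwheel does for manylinux wheels.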

WDYT?

@henryiii
Contributor

scikit-build/scikit-build#977 should cover whatever is chosen here, I think. :)

@ryanking13
Member

ryanking13 commented May 13, 2023

I like the idea of supporting both emscripten_<hash>_wasm32 and pyodide_wasm32 platforms, as there might be other CPython Emscripten platforms in the future. By the way, I think pyodide-build needs to produce pyodide_wasm32 wheel while auditwheel-emscripten should convert it to emscripten_<hash>_wasm32, not vice versa (which is why I named it auditwheel-emscripten not auditwheel-pyodide).

About hashing build flags, I think there might be flags that are compatible with each other, so maybe we can use bit flags instead? For example, suppose we use an 8-byte (64-bit) flag that is encoded by:

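def platform(simd: bool, threading: bool) -> str:
    """Encode build features as a 64-bit flag inside the platform tag."""
    flag = 0
    if simd:
        flag |= 1 << 0  # bit 0: SIMD
    if threading:
        flag |= 1 << 1  # bit 1: threading
    # ... further features would take the next bits

    # 16 hex digits, without the "0x" prefix that hex() would add
    return f"emscripten_{flag:016x}_wasm32"

# e.g. platform(simd=True, threading=True) -> emscripten_0000000000000003_wasm32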

So downstream package managers can parse these flags and check the flags are compatible with their distribution.
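
On the consuming side, the check could look like the sketch below. The bit assignments, the 16-hex-digit layout, and the function names are assumptions chosen to match the example above and do not describe an existing tool.

```python
import re

# Hypothetical bit assignments, mirroring the encoder sketched above.
SIMD = 1 << 0
THREADING = 1 << 1

# Assumed tag layout: emscripten_<16 lowercase hex digits>_wasm32
TAG_RE = re.compile(r"^emscripten_([0-9a-f]{16})_wasm32$")


def decode_flags(tag: str) -> int:
    """Extract the 64-bit feature flag from a platform tag."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a flag-encoded emscripten tag: {tag}")
    return int(match.group(1), 16)


def wheel_is_usable(wheel_tag: str, runtime_flags: int) -> bool:
    """A wheel is usable if it requires no feature the runtime lacks."""
    required = decode_flags(wheel_tag)
    return (required & ~runtime_flags) == 0
```

This treats every bit as an optional capability. Features that change the ABI would need an exact match instead of a subset test.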

@hoodmane
Member Author

There are really two different things here. For example, simd has no ABI consequences, but it will only work on the architecture wasm32 + simd. But wasm32+simd is actually a different architecture and not a different platform. Another example like this is tail calls. These can be feature detected at runtime and the appropriate one can be chosen.

On the other hand, wasm bigint and wasm exception handling have ABI consequences too, so wheels that don't use these features don't get along with interpreters that do use them / vice versa. So either we use them everywhere and don't support runtimes missing these features or we use them nowhere.

I'm not really sure in which category pthreads lies except that for now we can't use it at all with dynamic linking...
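
The two categories described above lead to two different compatibility rules, which a short sketch can make concrete. The `Build` class, the feature names as strings, and the `can_load` function are all hypothetical; the sketch only assumes that ABI-affecting settings must agree exactly while architecture features need only be available at runtime.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Build:
    # Settings with ABI consequences, e.g. {"wasm_bigint", "wasm_exceptions"}
    abi: FrozenSet[str]
    # Architecture features with no ABI consequences, e.g. {"simd", "tail_call"}
    arch: FrozenSet[str]


def can_load(wheel: Build, runtime: Build) -> bool:
    """ABI settings must be identical; arch features must be supported."""
    return wheel.abi == runtime.abi and wheel.arch <= runtime.arch


runtime = Build(abi=frozenset({"wasm_bigint"}), arch=frozenset({"simd"}))
plain_wheel = Build(abi=frozenset({"wasm_bigint"}), arch=frozenset())
no_bigint_wheel = Build(abi=frozenset(), arch=frozenset())
```

Here `plain_wheel` loads on `runtime` even though it uses no SIMD, whereas `no_bigint_wheel` does not, because it disagrees with the interpreter about an ABI setting.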

@rth
Member

rth commented May 13, 2023

> actually a different architecture and not a different platform.

What's a platform tag then? I thought it was more or less OS/runtime + arch; as in linux_x86_64 for instance. At least all common platform tags seem to have the arch in the name (aside from any), so IMO it would make sense to follow that logic.

But then interestingly, whether you can use SIMD or GPU natively is not part of the platform tag.

In any case, I guess there are some parallels with linux wheels here. The questions we need clear answers to are:

  1. what's the platform for a package that follows the to-be-written PEP and that can be uploaded to PyPI
  2. what's the platform of a Python wheel built by emscripten with arbitrary flags (with or without pyodide)

> By the way, I think pyodide-build needs to produce pyodide_wasm32 wheel while auditwheel-emscripten should convert it to emscripten_<hash>_wasm32, not vice versa

OK, but if say you take the artifact of emscripten-forge put it in a zip file/wheel, what would be the platform tag of that? It doesn't necessarily follow the PEP but it is certainly weird to call it a pyodide platform wheel.

So I follow the logic that pyodide produces pyodide tagged wheels that are later standardized via auditwheel-emscripten to emscripten_wasm32. But that doesn't work with the possibility that other people also build emscripten Python wheels not using pyodide.

Anyway, maybe it would be good to have a draft of the PEP before doing any of the implementation changes because there are a lot of open questions :) Or maybe we could have a separate call about this.

@ryanking13
Member

ryanking13 commented May 13, 2023

> So I follow the logic that pyodide produces pyodide tagged wheels that are later standardized via auditwheel-emscripten to emscripten_wasm32. But that doesn't work with the possibility that other people also build emscripten Python wheels not using pyodide.

auditwheel-emscripten works (or at least I want to make it work) with wheels that are not built with pyodide-build. It only relies on the dynamic linking spec of Emscripten and it does not use any pyodide specific features except that it uses pyodide CLI entrypoint. So for example, it works well with wheels built with Rust toolchain.

So I think auditwheel-emscripten can do the job of auditing and relabeling any type of Emscripten wheel: pyodide wheels, emscripten-forge wheels (if they have any).

> Anyway, maybe it would be good to have a draft of the PEP before doing any of the implementation changes because there are a lot of open questions :) Or maybe we could have a separate call about this.

Sounds good! I'm also looking forward to hearing from Hood about WASM Summit in Pycon.

@hoodmane
Member Author

> So I think auditwheel-emscripten can do the job of auditing and relabeling any type of Emscripten wheels: pyodide wheels, emscripten-forge wheels (if they have).

This seems a bit optimistic to me, since there is a very wide variety of different ABIs. I was thinking that we would standardize a Pyodide ABI and then put Pyodide in the platform tags. It might even be good to go the other way: by default tag the wheel with emscripten_version_abiflags_wasm32 and switch to pyodide_version_wasm32 if it is tested to match. Then emscripten can be like linux: a bit under-determined as a platform tag. And then pyodide isn't really like manylinux; it's more like a specific ABI that is blessed by the PEP. The pypa people all seemed pretty open to blessing Pyodide's way of doing things in the short term (e.g., for the next few years).

One disadvantage of the name auditwheel_emscripten is that it makes it sound more closely related to the emscripten project whereas it will probably stay more closely related to Pyodide.

> it would be good to have a draft of the PEP before doing any of the implementation changes

I guess I should try to start a draft soon. I'll make a document and share it with you all. I had in my head the following set of tasks:

  1. finish cibuildwheel support
  2. add Pyodide to more projects' CIs using cibuildwheel
  3. draft the PEP
  4. improve auditwheel_emscripten so that it can detect PEP-compliant wheels

@hoodmane
Member Author

I guess for auditwheel-emscripten it would be nice to be able to give it a pair of modules, a main module and a side module, and validate whether they can be linked together. So the main condition is that the imports of the side module need to be a subset of the exports of the main module, and the types need to match.

But we presumably need additional metadata to determine if they agree about exception handling / stack unwinding ABIs, wasm bigint, etc. I think many of these ABI changes are not visible from the export/import types. We see this because an import/export type mismatch will generally fail either when the side module is loaded or inside of dlsym. It's very common to see other sorts of failures when we update Emscripten or when we change ABI settings. So that makes me think something more subtle is often going on. It's easy for the argument in both cases to be an i32 but it just means completely different things.
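
The import/export condition can be sketched as follows, assuming the imports and exports have already been extracted from the two wasm modules into plain dictionaries. The `Signature` alias, the dictionary layout, and the `link_errors` name are hypothetical, and as noted above this check says nothing about ABI settings that are invisible in the signatures.

```python
from typing import Dict, List, Tuple

# (parameter types, result types), e.g. (("i32", "i32"), ("i32",))
Signature = Tuple[Tuple[str, ...], Tuple[str, ...]]


def link_errors(
    side_imports: Dict[str, Signature],
    main_exports: Dict[str, Signature],
) -> List[str]:
    """List the reasons a side module cannot link against a main module."""
    errors = []
    for name, wanted in sorted(side_imports.items()):
        provided = main_exports.get(name)
        if provided is None:
            errors.append(f"missing export: {name}")
        elif provided != wanted:
            errors.append(f"type mismatch: {name}")
    return errors


main_exports = {
    "malloc": (("i32",), ("i32",)),
    "PyLong_FromLong": (("i32",), ("i32",)),
}
side_imports = {
    "malloc": (("i32",), ("i32",)),
    "PyLong_FromLongLong": (("i64",), ("i32",)),
}
```

An empty result from `link_errors` is a necessary condition for linking, not a sufficient one: two modules can agree that an argument is an `i32` while disagreeing about what it means.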

@rth
Member

rth commented May 15, 2023

> finish cibuildwheel support
> add Pyodide to more projects' CIs using cibuildwheel

Thanks for working on those! Maybe we should be sure of the platform we would have and the interaction with auditwheel-emscripten before spreading it in the ecosystem? I guess it's OK to use our current platform tag, as that would work with the current stable pyodide release.

@hoodmane
Member Author

I think the idea is that we update the platform with whatever changes we need now. In any case cibuildwheel will need to be updated to add new versions of Pyodide. We are pretty certain that whatever our platform is we will want people to be able to produce wheels using cibuildwheel. Maybe some minor things will need to change but I think we can update them ourselves (as we've been doing with numpy for instance).

@hoodmane hoodmane mentioned this pull request Oct 12, 2023
3 tasks
@hoodmane hoodmane closed this Oct 12, 2023

4 participants