Description
🚀 Feature
This issue proposes creating a web proxy that sets CORS headers for PyPI-like indices, such as Anaconda.org, that do not set such headers themselves. Currently, package managers for Pyodide such as micropip and piplite (i.e., the abstraction over micropip) are unable to download packages from such indices (see pyodide/micropip#101). With CORS headers set, code running in a browser tab or web worker on any webpage can access resources from other servers; in this case, wheels from Anaconda.org, the primary platform for uploads of nightly wheels for Scientific Python projects, as tracked in #3049 (comment).
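As a rough illustration of what "setting CORS headers" means for such a proxy, the sketch below merges a permissive set of CORS response headers into whatever the upstream index returned. The header names are the standard CORS response headers; the helper name `with_cors` and the wildcard policy are assumptions chosen for a public, read-only index, not something taken from an existing implementation.

```python
# Standard CORS response headers. The permissive "*" values are an assumption
# that suits a public, read-only package index.
CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET, HEAD, OPTIONS",
    "Access-Control-Allow-Headers": "*",
}


def with_cors(upstream_headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the upstream response headers with CORS headers added.

    `with_cors` is a hypothetical helper name used only for this sketch.
    """
    merged = dict(upstream_headers)  # copy, so the caller's dict is untouched
    merged.update(CORS_HEADERS)
    return merged
```

Without `Access-Control-Allow-Origin` on the response, the browser blocks a cross-origin fetch of the wheel before micropip ever sees the bytes.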
Motivation
The major motivator is being able to serve nightly Pyodide wheels for various projects at different levels of scale based on the traffic their documentation website receives. These wheels will be further utilised for interactive documentation utilities for the latest/unreleased versions or specific released versions of a project's documentation: Quansight-Labs/czi-scientific-python-mgmt#19.
This will embed the installation command in a cell, which could later be hidden through JupyterLite for a cleaner UX: jupyterlite/jupyterlite#975, jupyterlite/jupyterlite#508. Similar requests have been received for other projects as well: scikit-hep/pyhf#1826
Pitch
Refactoring the existing project by @Carreau, https://github.com/Carreau/cloudflare-pypi-multi-index, seems to be the best approach right now. The steps could be as follows:
- set up a web proxy,
- host it through a VPS or any of the major cloud service providers on a paid plan (where @rgommers mentions that @Quansight can help out with this),
- this retrieves wheels on demand from the Anaconda.org PyPI index with CORS headers set, and
- therefore, micropip.install() can proceed with the installation of these wheels through this index URL in a cell
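The steps above could be sketched with nothing but the Python standard library. This is a minimal illustration under stated assumptions, not the design of the linked Cloudflare project: `UPSTREAM` is a placeholder URL, and `upstream_url`, `CORSProxy`, and `serve` are hypothetical names. It forwards each GET to the upstream index and adds CORS headers to the response. It does not rewrite links inside index pages, which a real proxy would likely need to do so that wheel downloads also go through the proxy.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.error import HTTPError
from urllib.request import urlopen

# Assumption: placeholder upstream; substitute the real index base URL.
UPSTREAM = "https://index.example.invalid"


def upstream_url(path: str, upstream: str = UPSTREAM) -> str:
    """Map a request path on the proxy to the matching upstream URL."""
    return upstream.rstrip("/") + "/" + path.lstrip("/")


class CORSProxy(BaseHTTPRequestHandler):
    """Forward GET requests upstream and add CORS headers to the reply."""

    def _send_cors_headers(self) -> None:
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Access-Control-Allow-Methods", "GET, OPTIONS")

    def do_OPTIONS(self) -> None:
        # Answer CORS preflight requests without contacting the upstream.
        self.send_response(204)
        self._send_cors_headers()
        self.end_headers()

    def do_GET(self) -> None:
        try:
            with urlopen(upstream_url(self.path)) as resp:
                body = resp.read()
                status = resp.status
                content_type = resp.headers.get(
                    "Content-Type", "application/octet-stream"
                )
        except HTTPError as err:
            # Pass upstream HTTP errors (e.g. 404) through to the client.
            body, status, content_type = err.read(), err.code, "text/plain"
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self._send_cors_headers()
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the proxy on localhost until interrupted."""
    ThreadingHTTPServer(("127.0.0.1", port), CORSProxy).serve_forever()
```

With such a proxy running, and assuming a micropip version that supports the `index_urls` parameter, a notebook cell could then run something like `await micropip.install("numpy", index_urls=["http://127.0.0.1:8080/simple"])`, where the package name and URL are illustrative.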
Alternatives
Various alternatives exist:
- Embedding the JupyterLite build artifacts and the custom Emscripten wheel on a data storage service such as AWS S3 where they can be loaded from (similar to how awkward and awkward-cpp do it: docs: new Try-It page based on plain Pyodide scikit-hep/awkward#3058).
- Push the artifacts to GitHub releases and serve an index such as https://github.com/astariul/github-hosted-pypi; here, GitHub Pages as a platform isn't meant to serve as a CDN and there are restrictions around violating GitHub's ToS (terms of service)
- For documentation on GitHub Pages, the wheels can be pushed to the Git source tree, though this would be a bad idea because it would bloat the repository with large binary files (however, downloading the wheels from a GitHub Release and embedding them at runtime through a workflow that pushes artifacts to the gh-pages branch could be a reliable and reasonable option; people will have to avoid pulling the gh-pages branch and pull just the main branch or similar)
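For the static-hosting alternatives, the index itself can be a plain HTML page in the PEP 503 "simple" layout that links to wheels stored elsewhere (for example, release assets). The sketch below generates such a page; `normalize` and `simple_index_page` are hypothetical helper names, and the wheel URL in the usage note is a made-up placeholder. Whether the host serving the wheels sets CORS headers is a separate question that this page does not solve.

```python
import re
from html import escape


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def simple_index_page(project: str, wheel_urls: list[str]) -> str:
    """Build a minimal PEP 503 project page linking to the given wheels."""
    links = []
    for url in wheel_urls:
        filename = url.rsplit("/", 1)[-1]  # link text is the wheel filename
        href = escape(url, quote=True)
        links.append('    <a href="' + href + '">' + escape(filename) + "</a><br/>")
    title = escape("Links for " + normalize(project))
    return "\n".join(
        [
            "<!DOCTYPE html>",
            "<html>",
            "  <head><title>" + title + "</title></head>",
            "  <body>",
            *links,
            "  </body>",
            "</html>",
        ]
    )
```

For example, `simple_index_page("Example_Pkg", ["https://example.invalid/example_pkg-0.1-py3-none-any.whl"])` yields a page that would be written to `simple/example-pkg/index.html` on the static host.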
Additional context
It is to be noted that PyPI previously did not set CORS headers but now does, through pypi/warehouse#4687 and pypi/conveyor#5. The simple API also sets CORS headers via pypi/warehouse#13222 which are used by micropip to download pure Python packages.
However, moving forward with PyPI as a supported platform for hosting Pyodide wheels requires the provision of a PEP and its subsequent acceptance.
Also, cc: @steppi @melissawm for visibility