
Web proxy to work around the lack of CORS headers for nightly wheels uploaded to the Anaconda.org index #4898

@agriyakhetarpal


🚀 Feature

This issue proposes creating a web proxy that sets CORS headers for PyPI-like indices, such as Anaconda.org, that do not set them. Currently, package managers for Pyodide such as micropip and piplite (i.e., the abstraction over micropip) are unable to download packages from such indices (see pyodide/micropip#101). CORS headers allow code running in a browser, whether in a web page or in a web worker, to access resources from other origins: in this case, wheels from Anaconda.org, the primary platform for uploads of nightly wheels for Scientific Python projects, being tracked in #3049 (comment).
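
To make the failure mode concrete, here is a rough sketch of the check a browser performs before handing a cross-origin response to the page. It is a simplification (real CORS handling also covers preflight requests and credentials), and the function name `allows_cross_origin` is made up for illustration:

```python
def allows_cross_origin(response_headers: dict[str, str], origin: str) -> bool:
    """Approximate the browser's check on Access-Control-Allow-Origin.

    Simplified sketch: ignores preflight requests and credentialed requests.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value.strip() for name, value in response_headers.items()}
    allowed = lowered.get("access-control-allow-origin")
    # The response is readable only if the header is present and either
    # is the wildcard or matches the requesting origin exactly.
    return allowed is not None and allowed in ("*", origin)


# An index response without CORS headers is blocked for any web page:
print(allows_cross_origin({"Content-Type": "text/html"}, "https://docs.example.org"))
# The same response with a wildcard header would be readable:
print(allows_cross_origin({"access-control-allow-origin": "*"}, "https://docs.example.org"))
```

The origin `https://docs.example.org` is a placeholder for a documentation site; an index that never sends the header fails this check for every origin, which is the situation described above.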

Motivation

The major motivator is being able to serve nightly Pyodide wheels for various projects at different levels of scale, based on the traffic their documentation websites receive. These wheels will then be used by interactive documentation utilities for the latest/unreleased versions, or for specific released versions, of a project's documentation: Quansight-Labs/czi-scientific-python-mgmt#19.

The interactive documentation will embed the installation command in a cell, which could later be hidden through JupyterLite for a cleaner UX: jupyterlite/jupyterlite#975, jupyterlite/jupyterlite#508. Similar requests have been received for other projects as well: scikit-hep/pyhf#1826

Pitch

Refactoring this existing project by @Carreau: https://github.com/Carreau/cloudflare-pypi-multi-index seems to be the best idea right now. A concise description of the sequence of steps could be as follows:

  1. set up a web proxy,
  2. host it on a VPS or with any of the major cloud service providers on a paid plan (@rgommers mentions that @Quansight can help out with this),
  3. the proxy retrieves wheels on demand from the Anaconda.org PyPI index and serves them with CORS headers set, and
  4. therefore, micropip.install() can proceed with the installation of these wheels through this index URL in a cell.
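
The core of step 3 could look roughly like the sketch below. This is not the code from the project linked above; the upstream host `https://pypi.anaconda.org`, the header set, and the helper names `upstream_url` and `with_cors` are all assumptions made for illustration, and the actual HTTP fetching and serving is left out:

```python
# Assumed upstream host for the Anaconda.org PyPI-style index.
UPSTREAM_INDEX = "https://pypi.anaconda.org"

# Permissive CORS headers for read-only access (an illustrative choice).
CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET, HEAD, OPTIONS",
    "Access-Control-Allow-Headers": "*",
}


def upstream_url(request_path: str) -> str:
    """Map a path requested from the proxy to the same path upstream."""
    return UPSTREAM_INDEX + "/" + request_path.lstrip("/")


def with_cors(upstream_headers: dict[str, str]) -> dict[str, str]:
    """Copy upstream headers, replacing any CORS headers with our own."""
    merged = {
        name: value
        for name, value in upstream_headers.items()
        if not name.lower().startswith("access-control-")
    }
    merged.update(CORS_HEADERS)
    return merged


# Example: a request for a project's index page is forwarded upstream,
# and the response headers gain the CORS entries before being returned.
print(upstream_url("/scientific-python-nightly-wheels/simple/numpy/"))
print(with_cors({"Content-Type": "text/html"}))
```

A real proxy would probably also need to rewrite any absolute wheel links in the index pages so that downloads go through the proxy as well. For step 4, micropip.install() accepts an `index_urls` argument that could point at the proxy's URL.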

Alternatives

Various alternatives exist:

  • Host the JupyterLite build artifacts and the custom Emscripten wheel on a data storage service such as AWS S3, where they can be loaded from (similar to how awkward and awkward-cpp do it; see "docs: new Try-It page based on plain Pyodide", scikit-hep/awkward#3058).
  • Push the artifacts to GitHub Releases and serve an index such as https://github.com/astariul/github-hosted-pypi. Here, GitHub Pages as a platform isn't meant to serve as a CDN, and GitHub's ToS (terms of service) place restrictions on such use.
  • For documentation on GitHub Pages, the wheels can be pushed to the Git source tree, though this would be a bad idea because it would bloat the repository with large binary files. However, downloading the wheels from a GitHub Release and embedding them at runtime through a workflow that pushes artifacts to the gh-pages branch could be a reliable and reasonable option; people would have to avoid pulling the gh-pages branch and pull just the main branch or similar.

Additional context

Note that PyPI previously did not set CORS headers but now does, through pypi/warehouse#4687 and pypi/conveyor#5. The Simple API, which micropip uses to download pure Python packages, also sets CORS headers via pypi/warehouse#13222.

However, moving forward with PyPI as a supported platform for hosting Pyodide wheels requires a PEP to be written and subsequently accepted.

Also, cc: @steppi @melissawm for visibility

Labels: enhancement
