Skip to content

Load Pyodide runtime through HTTP#2451

Merged
garrettgu10 merged 9 commits intomainfrom
ggu/load-pyodide-from-http
Aug 9, 2024
Merged

Load Pyodide runtime through HTTP#2451
garrettgu10 merged 9 commits intomainfrom
ggu/load-pyodide-from-http

Conversation

@garrettgu10
Copy link
Copy Markdown
Contributor

@garrettgu10 garrettgu10 commented Jul 26, 2024

This loads the Pyodide bundle through HTTP. It does this in a synchronous way, by creating a new kj::Thread and immediately disowning it, causing the current thread to be blocked on the completion of the new thread.

When a disk cache directory is provided at the command line using pyodide-bundle-disk-cache, the directory is used to cache pyodide bundles.

Existing test cases have been edited to use the pyodide.capnp.bin file generated through bazel, so changes in src/pyodide will be reflected instantly if tested through bazel test .... Otherwise, the correct version according to compat date will be fetched from the internet.

Here's a plan for how to test the disk cache writing (reading is covered by bazel tests) functionality:

mkdir /tmp/pyodide
./bazel-bin/src/workerd/server/workerd serve samples/pyodide/config.capnp --experimental --pyodide-bundle-disk-cache-dir=/tmp/pyodide --verbose
# should print loading pyodide package from internet
./bazel-bin/src/workerd/server/workerd serve samples/pyodide/config.capnp --experimental --pyodide-bundle-disk-cache-dir=/tmp/pyodide --verbose
# should no longer print loading pyodide package from internet & just work

@garrettgu10 garrettgu10 requested review from a team as code owners July 26, 2024 20:21
@garrettgu10 garrettgu10 requested review from dom96 and fhanau July 26, 2024 20:21
@garrettgu10 garrettgu10 force-pushed the ggu/load-pyodide-from-http branch from 62fdc3e to 51632f2 Compare July 29, 2024 14:15
@garrettgu10 garrettgu10 marked this pull request as draft July 29, 2024 14:23
@jasnell jasnell added the python Issues/PRs relating to Python Workers label Jul 31, 2024
garrettgu10 and others added 2 commits August 5, 2024 14:09
Co-authored-by: Hood Chatham <roberthoodchatham@gmail.com>
@garrettgu10 garrettgu10 force-pushed the ggu/load-pyodide-from-http branch from 281902c to e556b73 Compare August 5, 2024 19:09
@garrettgu10 garrettgu10 changed the title [do not merge] Load Pyodide runtime through HTTP Load Pyodide runtime through HTTP Aug 5, 2024
@garrettgu10 garrettgu10 marked this pull request as ready for review August 5, 2024 21:20
@garrettgu10
Copy link
Copy Markdown
Contributor Author

One point of jank is that the tip-of-tree pyodide bundle is still named pyodide-2.capnp.bin since the version number is still hardcoded in workerd. Ideally we want this to be named pyodide-latest.capnp.bin or something similar but this would require us to finish the story on compat-date -> pyodide version mapping.

@garrettgu10 garrettgu10 requested a review from hoodmane August 5, 2024 21:28
…k cache, using generated tip-of-tree pyodide to pre-populate disk cache
@garrettgu10 garrettgu10 force-pushed the ggu/load-pyodide-from-http branch from ada41a7 to 0e324a8 Compare August 5, 2024 21:36
@garrettgu10
Copy link
Copy Markdown
Contributor Author

garrettgu10 commented Aug 5, 2024

Another thing we need to figure out before merging -- how can we determine if a worker is a Python worker or not? We need to know this so that we don't fetch the Pyodide bundle when the worker doesn't need it.

@hoodmane
Copy link
Copy Markdown
Contributor

hoodmane commented Aug 6, 2024

Since it's blocking, why not call fetchPyodideBundle in workerd-api just before it's needed?

@garrettgu10 garrettgu10 requested a review from hoodmane August 6, 2024 16:16
@dom96
Copy link
Copy Markdown
Contributor

dom96 commented Aug 6, 2024

finish the story on compat-date -> pyodide version mapping.

This story is complete with #2464. You should be able to use it to determine the bundle filename to download.

how can we determine if a worker is a Python worker or not?

By checking the WorkerBundle. We've got the hasPythonModule for this in EW which you should be able to recreate here.

@garrettgu10
Copy link
Copy Markdown
Contributor Author

garrettgu10 commented Aug 6, 2024

@dom96 ah, makes sense. We should probably copy pyodide-2.capnp.bin to pyodide-0.26.0a2.capnp.bin on our bucket so we can use that. I'll do that

Copy link
Copy Markdown
Contributor

@dom96 dom96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I think you should integrate PythonSnapshotRelease into here before merging. You probably want to move getPythonSnapshotRelease out of EW and into workerd, then use it here to get the right Python version to load.

@hoodmane
Copy link
Copy Markdown
Contributor

hoodmane commented Aug 7, 2024

Could you split "refactor bazel test setup" out into a separate PR?

@garrettgu10 garrettgu10 force-pushed the ggu/load-pyodide-from-http branch 3 times, most recently from e9d6b11 to c312b50 Compare August 7, 2024 19:14
@garrettgu10 garrettgu10 requested a review from dom96 August 7, 2024 20:25
@garrettgu10 garrettgu10 force-pushed the ggu/load-pyodide-from-http branch from 96fef97 to c3d0a86 Compare August 8, 2024 14:19
Comment on lines +542 to +544
if (fetchPyodideBundle(impl->pythonConfig, "dev"_kj)) {
modules->addBuiltinBundle(getPyodideBundle("dev"_kj), kj::none);
} else {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll want to select between "dev" and the appropriate version by setting an appropriate compat flag so that tests can be a bit more declarative. I think it would be confusing if you try to test a specific version but the test uses latest because it exists. But we can deal with that in a followup.

}
}

bool fetchPyodideBundle(const api::pyodide::PythonConfig& pyConfig, kj::StringPtr version) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be tidier to have fetchPyodideBundle return a kj::Maybe<> directly, but we can tidy up that sort of thing in followup PRs.

Copy link
Copy Markdown
Contributor

@hoodmane hoodmane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @garrettgu10! Let's merge this.

@garrettgu10 garrettgu10 merged commit 5c21d27 into main Aug 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Issues/PRs relating to Python Workers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants