Skip to content

Support loading shared libraries inside a wheel file globally#3264

Merged
ryanking13 merged 12 commits intopyodide:mainfrom
ryanking13:vendorlibs
Nov 29, 2022
Merged

Support loading shared libraries inside a wheel file globally#3264
ryanking13 merged 12 commits intopyodide:mainfrom
ryanking13:vendorlibs

Conversation

@ryanking13
Copy link
Copy Markdown
Member

@ryanking13 ryanking13 commented Nov 17, 2022

Description

Resolve #3167
Close #3183

This is a final split-up of #3183.

This PR mostly does two things:

  1. Add a way to detect whether libraries inside the wheel should be loaded globally or not: If a library is needed by another library inside a wheel file, load it globally.

  2. Apply vendor-sharedlib (Add vendor-sharedlib key in meta.yaml spec #3234) keys to some packages that depend on other shared libraries: fiona (gdal), pyheif (libheif), h5py (libhdf5), shapely (geos).

Note that @rth had some concerns about downloading the same library multiple times (#3167). For now I applied it to the shared libraries that are used in only one or two packages, but we can discuss more about this.

For packages we built in tree, I'm less sure, as this would lead to potentially redundant downloads (e.g. BLAS downloaded multiple times for numpy and scipy once we switch to a better BLAS). Though it would be worth analysing how many dependent packages each shared library has in the current dependency graph, and whether indeed vendoring it inside the wheel would be that significant.

Checklists

Copy link
Copy Markdown
Member

@hoodmane hoodmane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @ryanking13!

Comment on lines +184 to +185
// TODO(ryanking13): It is not clear why global libraries should be loaded first.
// But without this, we get a fatal error when loading the libraries.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's just lucky we understand what's going on as much as we do.

Comment on lines +213 to +216
const needed = Module.getDylinkMetadata(binary).neededDynlibs;
needed.forEach((lib: string) => {
globalLibs.add(lib);
});
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a conservative overestimate of the libs that need to be loaded globally? Or is this exactly the condition? Also, does this code also run for shared libraries that are loaded separately? It's fairly efficient to turn an originally local library into a global one when it turns out to be needed later. Do we handle this case?

In any case, it's really nice to see these improvements.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a conservative overestimate of the libs that need to be loaded globally? Or is this exactly the condition?

Since neededDynlibs contains all dynamic libraries linked during the compilation (via -l flag), I think this is not an overestimate unless we find a way to load symbols of external libraries loaded locally.

On the other hand, this can rather be an "underestimate" of libs, because users who write compile commands might omit to specify dynamic libraries that should be linked. For example: in #3234, I found and fixed that our sharedlib-test-py package wasn't linked to sharedlib-test.so correctly.

Also, does this code also run for shared libraries that are loaded separately?

Good point, I think we can skip this part for shared libraries that are loaded separately.

It's fairly efficient to turn an originally local library into a global one when it turns out to be needed later. Do we handle this case?

No, I think we never turn a local library into a global one for now. Since global flag is propagated to dependencies when calling loadDynamicLibrary, a locally loaded dynamic library does not load its dependencies globally.

(I'm not sure this behavior is a bug in Emscripten, probably I should open an issue in Emscripten).

Copy link
Copy Markdown
Member Author

@ryanking13 ryanking13 Nov 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm not sure this behavior is a bug in Emscripten, probably I should open an issue in Emscripten).

Opened an issue in Emscripten (emscripten-core/emscripten#18264). If it is accepted, it might be possible to remove whole global checks.

@ryanking13 ryanking13 merged commit 6bf2550 into pyodide:main Nov 29, 2022
@ryanking13
Copy link
Copy Markdown
Member Author

Thanks for the review!

@ryanking13 ryanking13 deleted the vendorlibs branch November 29, 2022 04:40
@majinge7846 majinge7846 mentioned this pull request Jun 25, 2023
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Embed external shared libraries into the wheel

2 participants