draft: per-dso .so resolution cache on glibc#207893
draft: per-dso .so resolution cache on glibc#207893pennae wants to merge 3 commits intoNixOS:stagingfrom pennae:ldsocache
Conversation
|
looks like the ofborg errors were due to patchelf relocating writable sections into non-writable segments. gonna take that down as another potential bug |
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: |
|
ran nixpkgs-review on the branch until nix got oomkilled. after 30k of 80k packages built and only 117 failures (most of which were dead links, broken FOD hashes, or requireFile instances) it looks like no errors are going to show up there. |
|
First, let me say that this is all very interesting and I appreciate the things I'm learning here. I've read through all the linked materials (incl. @fzakaria 's thoughts and prototypes) and wonder what's the fundamental critique of an approach that leverages nix' stable store locator properties to evict a system that originally was made to accommodate unstable locators with magic? The above comparison to the other approaches doesn't seem to sufficiently convey that critique, in my view. Especially, since these seem to approach the problem from first principles. I'd like to better understand those considerations that lead to the critique and I'm unable to infer. Can you detail? |
|
it's mostly about not behaving too differently from other systems that use glibc. approaches like shrinkwrap work perfectly fine (better than this, depending on your metric), but it doesn't allow replacing libraries using LD_LIBRARY_PATH for testing purposes or other reasons. that's probably a rather rare thing to do, but if we can avoid baking such a difference in behavior into the core of nixos it's probably worth doing. the guix approach probably has namespacing issues. if you use |
|
Interesting approach - might be worth sharing the ideas with the GNU libc or musl mailing list. Can you show using here's a real question: This is the question I keep asking. We need to challenge the notion that we must even have a single binary. Very cool PR -- been enjoying following along your debugging adventures :) |
readelf says: if this goes anywhere patchelf will get a nice formatter to read these.
(it's not just in the executable, it's in every DSO, but more importantly) because nixos relocates and relinks executables sometimes, eg to build stage1. adding info like this outside of patchelf which runs all the time anyway would duplicate a lot of effort into ld.bfd, ld.gold, lld, mold, ... while patchelf is not a great place to have this, it looks like it's the least bad place. |
|
Ofc, Farid just wanted to avoid another shameless plug, so here it is for what he means: #207061 (comment) 😁 |
|
oh, one more thing. just to be clear about our intentions here. this thing will remain a draft until there's an agreement that it's either a good idea worth pursuing further or it's killed off by someone. we don't have the energy to do all the touch-ups alluded to in all the todos, add tests etc etc and then throw it all away because no agreement whether it's wanted at all is forthcoming. would be nice to have this merged. great even, given the performance improvements it can bring. but we really can't put in all that work and then have it be lost to time |
There was a problem hiding this comment.
We should comment why this is required
There was a problem hiding this comment.
it's not, that's just a debugging aid (like everything else except the two patches and hooking them into their respective derivations) (and the stdenv hook)
There was a problem hiding this comment.
at least a few comments would not hurt
There was a problem hiding this comment.
will try to add exhaustive documentation for the final patches. not going to do that just yet though, while it's still unclear whether there's any chance for this approach at all.
the basic idea is simple though: add an elf note to each DSO that contains a zero-terminated array of zero-terminated string pairs (soname, path), then check that note before doing filesystem lookups. most of the complexity is (of course) in patchelf, and in working around qemu bugs 😕
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: |
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2022-01-03-nixpkgs-architecture-team-meeting-23/24404/1 |
|
Can you describe in a bit more detail what the resolution process at runtime looks like? In particular, what happens when In principle this approach seems good to me, though I'm slightly worried about maintaining a non-trivial patch to glibc. I wonder if upstream glibc might be interested in this... |
the resolution cache added here is an additional source of edit the glibc default is to look first at LD_LIBRARY_PATH with its normal search algorithm (colon-separated list of dirs, hwcaps), then repeat the same on DT_RUNPATH if nothing was found. we just cache the results of the DT_RUNPATH lookup where we know that a specific path would be selected for a given needed soname (because hwcaps does not figure in the lookup but it is contained an runpath and thus would be found during the runpath lookup trial). this cache remains valid even when directories are appended to runpath or directories that don't contain any needed sonames are inserted anywhere, but it's better to just rebuild/nuke in all cases for consistency. stage1 currently collects libraries into an extra-utils drv without nuking the cache, leaving a few dead references around--but those don't hurt, since the failing lookups through the resolution cache aren't fatal. lookup proceeds with runpath and finds the moved libs just fine in that case.
yeah, that's always a concern. luckily this one can just be dropped without losing functionality, just some performance. and the lookups themselves are pretty easy, in a part of the algorithm that's not all that likely to change very much. (although hwcaps did happen. hopefully that's the last one.)
that would be the ideal case! other systems (we know of guix or spack) could benefit from this too. even FHS distros could benefit, though not nearly as much (since the resolution cache only optimises lookups that are served from RUNPATH). |
There was a problem hiding this comment.
we should probably bail here and not add a cache if the path contains $VARs, eg $ORIGIN. we could also resolve them, but that seems messy and error-prone, if it's not very common bailing out seems better. we could also add a "consult this rpath entry" instead, giving partial caching with no effect on search order. same if hwcaps directories are present.
There was a problem hiding this comment.
Can we not just ignore the variables and continue processing the other colon delimited paths?
There was a problem hiding this comment.
no, that would change the resolution order. imagine a runpath $ORIGIN:/some/dep, where both contain a libfoo.so and the program also needs a libfoo.so. we have to make a note of some kind that $ORIGIN must be visited, otherwise we'd not use the one from $ORIGIN, counter to the normal resolution order
hinshun
left a comment
There was a problem hiding this comment.
We're been experimenting with this patch internally with positive results. Compared to the other solutions, this one has a few advantages:
- Shrinkwrapping or setting absolute sonames causes issues in a local development workflow (outside of nix) where the same shared object from two different nix closures will both be loaded
- LD_LIBRARY_PATH and LD_PRELOAD continue to function, this is very important for us in existing local / debugging workflows
Without the suggestions in this review, I'm seeing page/section alignment errors from ld.
There was a problem hiding this comment.
| + const auto nameSize = roundUp(strlen(ELF_NOTE_NIXOS) + 1, 4); | |
| + const auto nameSize = roundUp(strlen(ELF_NOTE_NIXOS) + 1, getPageSize()); |
There was a problem hiding this comment.
| + const auto descSize = roundUp(noteSize, 4); | |
| + const auto descSize = roundUp(noteSize, getPageSize()); |
There was a problem hiding this comment.
| + wri(notePhdr.p_align, 4); | |
| + wri(notePhdr.p_align, getPageSize()); |
There was a problem hiding this comment.
| + wri(noteShdr.sh_addralign, 4); | |
| + wri(noteShdr.sh_addralign, sectionAlignment); |
$ORIGIN needs handling. we'll just bail on anything with a $ in it
|
rebased to current master, updated to not cache just a path for every needed soname but a search path (so we can properly handle |
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: |
|
In case someone wants to try this out, simply cherry-pick the latest commit from https://github.com/DavHau/nixpkgs/commits/stat-storm I cleaned it up and applied the latest review suggestions. |
thank you for not crediting anyone in your branch. it really makes us feel appreciated. |
originally authored by @pennae via NixOS#207893 See original PR for more information
|
Please forgive me. Credited now via commit message. |
Description of changes
this is an alternative to #207061. to reduce stat storms and general slowness during application startup we save a resolution cache into each dso that maps DT_NEEDED entries to their corresponding path in the configured runpath. this should Just Work™.
we think this has many benefits over that, shrinkwrap, and other solutions:
LD_LIBRARY_PATHwe've tested this on a cross-compiled armv7 system (in qemu-user, got bored of the board taking so long to load) running
strace -cf /run/current-system/bin/switch-to-configuration test. unpatched nixos takes 44s to do this, with these patches that drops to 29s owing to almost 24000 failingopenatandstatcalls that are avoided by using the cache.this is still very rough, full of debug code, tests are missing, and the patchelf code should be upstreamed (but is included here for ease of testing).
Things done
sandbox = trueset innix.conf? (See Nix manual)nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/)nixos/doc/manual/md-to-db.shto update generated release notes