preserve the stage directory across installs of the same package for incremental make #25851
Replies: 2 comments 5 replies
I'm seeing that […] Regardless, the intent of this idea isn't to put the onus on a specific spack user running […]
I totally didn't think of this at first, but I'm now wondering whether this approach may be extensible to perform ABI diffing with libabigail after the fact as I believe @vsoch was working on a while ago. This is prompted by a further tweet from the same developer (regarding the idea of patching a change from within a persisted stage and rebuilding with incremental make): https://twitter.com/richfelker/status/1435689078587105280?s=21
I can't immediately see how to directly use the concept of binary diffs in spack, as I'm not incredibly familiar with binary diffing in general, but it may become useful if an OS package manager like alpine's wanted to delegate to spack to provide a build environment. I'm aware that's not a very strong use case. I mostly wanted to raise the idea of binary diffing from this second tweet because, in my eyes, it would be more easily enabled by the main proposal in this discussion.

One vague idea: I am aware that we have a lot of tooling in spack to modify e.g. rpaths to make shared libraries relocatable, and that we can use libabigail to get semantic ABI information from the output of a build. One extension to the proposal I made in this discussion (using the idea from this second tweet) might be to improve libabigail's ability to perform meaningful ABI diffs across successive package builds by retaining the build output with debug symbols in a subdirectory of the install tree in a buildcache, while continuing to strip debug symbols and/or cut out unwanted libraries to reduce the size of the binary package that spack users actually install from the buildcache during normal usage. Just spitballing.
Problem: enabling incremental `make` across different versions of a package

This thought was inspired by this tweet about a surprising (to me) security implication of wiping staging directories by default in alpine linux's `abuild` build system: https://twitter.com/RichFelker/status/1435659133223587843?s=20. In particular, when the alpine maintainers needed to rebuild firefox in response to a zero-day, they ran up against the inherent time pressure of rebuilding a very large codebase like firefox from scratch. The tweet proposes the following workflow for security-sensitive projects with a large attack surface that take a long time to rebuild:

[…] `make`.

Application to Spack
There are two points compared/contrasted to spack that I can immediately spot from the above scenario:
1. In `abuild`, packages usually have a single totally-ordered version lineage, so obtaining the "patch to fix the zero-day" is often relatively easy to calculate, and it may occasionally be provided directly by the upstream maintainer when a new source tarball is created.
2. The `abuild` build system and aports repository allow users to build packages from source, but this isn't done in most cases (I only do this for a few packages like the kernel on my alpine install). Similarly for spack, many sites use `spack buildcache create` to avoid their users having to rebuild common packages from scratch each time.

Alternative Approaches
- `spack dev-build` allows maintaining an external stage of a specific package, while setting the version to allow it to be used as part of a larger spack build. This can be used with `spack stage -p <PATH>` to retrieve the appropriate source checkout.
- […] `spack install`).
- `spack install --keep-stage` (similarly to `abuild`'s `-k` flag) will retain the stage directory after a successful installation of a package.

Missed opportunity: maintaining an untouched stage directory
Many/most of the underlying build systems used by the codebases that spack packages (specifically `make` and things that produce `Makefile`s) rely on timestamps to calculate what to rebuild (this is also true of `abuild`). While some build systems such as pants and bazel rely on recursively checksumming directory contents to know what to rebuild, those tools do not understand `Makefile`s or `./configure` scripts, and require forking such packages in order to insert their own build setup, which then quickly diverges from upstream and imposes a significant maintenance burden. While I proposed partially adopting their model in #20407 for some cases, it remains difficult to see how to incrementally introduce the benefits of those tools into any of the packages that spack supports.

Therefore, to avoid an immense continuing effort to rewrite the build processes of upstream packages, spack often relies on timestamps to calculate what to rebuild. This actually works great in the case of `spack dev-build` or `spack build-env`, enabling package developers to use spack to pull in any dependencies, then iterate via editing and incrementally rebuilding, without doing a whole fresh build repeatedly. However, as per above, `spack dev-build` doesn't plug into `spack install`, and spack currently can't reuse the same stage directory for a new version of a previously-built package. This means that:

[…]

As in the linked tweet at the top, (1) may widen the time range during which spack users at a specific site are exposed to a zero-day, although this isn't really an issue since `spack install` will just build from source if a package version isn't available from a buildcache. However, (2) means users trying out any version of a package that's not already built have to wait a long time to do so, and maybe more significantly, users doing a `spack dev-build` of a previously-built package still have to wait for a clean rebuild before being able to do their work. This is especially tragic if their site e.g. reserves beefy machines for building LLVM from source, but not for individual spack users (as with alpine linux and `abuild`/`apk install`). To make the point stark, none of these issues arise when people build everything from source by hand without spack, which to me implies there is an opportunity here to solve a more general problem than the one mentioned in the linked tweet.

Proposal: specify a persistent staging directory for a package to reuse across builds
- […] `spack install --persist-stage=package1,package2,...`.
- […] `~/.spack/caches/persisted-stages/<checksum>.json`.
- […] `spack buildcache create --propagate-persisted-stages --persist-stage=package1,package2,...` retain the stage directory in the install prefix for the specified packages, preserving the files' modification times.
- […] `spack install --persist-stage=package1,package2,...` to look for a persisted stage directory in a buildcache or install db for the specified packages (after looking in `~/.spack/caches/persisted-stages/<checksum>.json` for those packages).
- […] `spack stage` of the new package vs the persisted stage (diffing over just the files that exist in the new package's stage), apply the diff, then build within the persisted stage directory.
- […] `~/.spack/caches/persisted-stages/<checksum>.json`.
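To make the "diff the new stage against the persisted stage, apply the diff, then build" step a bit more concrete, here is a minimal sketch of how it could work. It is not existing spack code: the function name `sync_into_persisted_stage` and its arguments are hypothetical, and it copies whole changed files instead of generating and applying a textual patch. The point it tries to show is that files with identical contents are never rewritten, so their modification times survive and a timestamp-based tool like `make` would only rebuild what depends on the files that actually changed.

```python
from __future__ import annotations

import filecmp
import shutil
from pathlib import Path


def sync_into_persisted_stage(new_stage: Path, persisted_stage: Path) -> list[str]:
    """Copy new or changed files from new_stage into persisted_stage.

    Hypothetical helper, not part of spack. Only files that exist in
    new_stage are considered, so build outputs already sitting in
    persisted_stage (object files etc.) are never deleted.
    Returns the relative paths of the files that were written.
    """
    changed = []
    for src in sorted(new_stage.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(new_stage)
        dst = persisted_stage / rel
        # shallow=False forces a byte-for-byte comparison; size and mtime
        # alone are useless here because a fresh checkout has new mtimes.
        if dst.is_file() and filecmp.cmp(src, dst, shallow=False):
            continue  # identical content: leave dst and its mtime untouched
        dst.parent.mkdir(parents=True, exist_ok=True)
        # Writing the file gives dst a fresh mtime, which is what marks
        # its dependents as out of date for a timestamp-based build.
        shutil.copyfile(src, dst)
        changed.append(rel.as_posix())
    return changed
```

After a call like `sync_into_persisted_stage(Path("new-stage"), Path("persisted-stage"))` (both paths are placeholders), running the package's usual build inside the persisted stage should be incremental. This sketch ignores files deleted between versions, symlinks, and file permissions, all of which a real implementation would need to handle.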