Conversation
|
I've driven this around extensively, poking in particular at upstream use cases. It would appear that upstreams has advanced enough since #11871 that a number of problems fixed in the original patch set are no longer an issue. So far as upstreams working without configuration, I also noticed that a shared installation could configure itself as an upstream in $spack/etc/spack/defaults/upstreams.yaml, and that just worked for both the user managing that spack instance and a different user loading that spack instance. Neat! Probably not the greatest way to do it, but allows for doing something close until we have time to do another PR. |
FWIW I added the
In other words, I didn't expect what you did to work. We might want to prevent that from happening, because I'm not sure what is actually going on when the upstream is the same as the root. (this is mentioned in the subsection "New config scopes for user/admin interaction"; it would also be possible to implicitly omit an upstream if it matches the root) |
|
I generally think this is very useful, and is specifically useful for pip/apt. What do we need to do to get this beyond a draft? What can I do to help? |
|
It would be good to add to the documentation for scopes. Here's something I wrote for configuration.rst, but would go much better in this PR: and |
psakievich
left a comment
There was a problem hiding this comment.
Lot's of great work here. Specific comments are added for sections I have design concerns on. I would also like to see some unit-tests for:
- XDG mapping/fall back behaviors. Seems like a lot of permutations are possible here. Seeing the tests would help me to spot and/or understand corner cases.
- site admin stuff (I'm kind of just trusting this works as intended without a use case or docs in front of me)
- file permission issue handling for users interfacing with an admin install
IMO Those seem to be the main new features that are not tested in the current test-suite.
psakievich
left a comment
There was a problem hiding this comment.
Mainly commented on docs. Overall I think it's cool to see the way this has come together. There's certainly a lot more nuance and detail to this effort than I realized when I first got involved. Major thanks to @scheibelp to keeping with it.
| #. ``SPACK_DATA_HOME`` env var if that is set | ||
| #. Under ``SPACK_HOME`` env var; for ``$data_home``, it is ``$SPACK_HOME/.local/share/spack`` | ||
| #. ``config:locations:data`` | ||
| #. Under ``config:locations:home``; for ``$data_home`` it is ``$spack_home/.local/share/spack`` | ||
| #. ``XDG_DATA_HOME/spack`` if XDG_DATA_HOME is set | ||
| #. Under the default for ``XDG_DATA_HOME``: ``~/.local/share/spack`` | ||
|
|
||
| ``config:locations:home`` / ``SPACK_HOME`` can be used to control all 3 of ``data_home``, ``cache_home``, and ``state_home``. |
There was a problem hiding this comment.
I think the ordering is clever with the way you've weaved in and out of XDG. But it is also complex the way we weave from env variable to config back to env variable.
This is kind of addressed in the upper section, but I'd lay out the intention of XDG in a more upfront manner. There are at least 4 distinctive layers of customization profiles here:
- User env has no XDG_*, gets pure defaults (which fall back to XDG inspired defaults)
- User env has XDG_*, but other wise pure defaults (data starts moving around)
- User sets SPACK_HOME (Move everything together)
- User starts playing with all the other knobs to get truly custom data movements (with a further hierarchy between config and env variables).
This was a little confusing to me when reading. I'd reorder for the explanation.
I'd go with a table showing the defaults and XDG mappings. Then I'd say something like:
"This is intended to map to the XDG standard which allows users to collectively customize where data goes by default across many programs. We have also provided a way to customize specifically for spack."
- Everything is designed to nest under $SPACK_HOME for a wholistic data relocation strategy
- $SPACK_HOME is set by
- (Env) SPACK_HOME
- (Config) config:locations:home
- (Default) XDG_CONFIG_HOME or ~/.local/spack
- Specific sections can still be re-located independent of SPACK_HOME or XDG via env-variables or config
Then show the hierarchy for $data_home, or maybe present them all as a table?
There was a problem hiding this comment.
This is probably one of the most breaking PRs in a long time 😬 and to me the benefits are marginal.
I think it's very risky to merge this, cause it undermines the narrative that Spack v1 is a stable tool. I would not merge this before it is made backward compatible.
Review from a user perspective without looking at the code and pr description (say I just updated Spack).
First thing I see is multiline warnings on each command:
==> Warning: Bypassing config in '/Users/user/.spack' in favor of '/Users/user/.config/spack'.
Detected config in old location: '/Users/user/.config/spack'.
Set `SPACK_USER_CONFIG_PATH` or modify your include.yaml to use '/Users/user/.spack'
or move files from '/Users/user/.spack' to '/Users/user/.config/spack' to suppress this warning.
- What does "bypassing config" mean and why does the second line say "detected"? Raises the question if you detect it, why can't you use it?
- I have no means to judge whether (1)
SPACK_USER_CONFIG, (2)include.yaml(I've never used this and wouldn't even know the syntax, also... what include.yaml?), or (3) "move files" (what files?) is better.
As a Spack developer I happen to know that mv ~/.spack ~/.config/spack is not what is meant. Other users won't have a clue. They certainly won't know that ~/.spack/bootstrap is not relocatable.
Normally (buildcache, package repo, config deprecations) these warning messages show a command to run to migrate. Not here...
Next thing I tried was spack find, which worked 🤞. Then spack config blame c<tab> and it spits out the 4 line warning again (annoying). That's how I learned that the warning indeed means it does not respect my user config.
Running spack install zlib it installs zlib in $spack/opt/spack but clones a fresh spack-packages on a different commit. That's going to frustrate so many users.
I do not get why Spack is backwards compatible with install_tree, but that it refuses to use my config from ~/.spack, even though I never had a ~/.local/spack dir.
In my opinion you should make this backwards compatible and just branch on existence of ~/.spack and non-existence of ~/.local/spack, if so assume old Spack structure. Then deprecate ~/.spack in v1.3 or v1.4.
Few more thoughts:
- It's likely breaking existing CI workflows: fresh clone of spack to
$ci_temp_dir/spack; the expectation was that installs go into$ci_temp_dir/spack/opt/spack, but now they end up in~/.local/share/spack/installs. If there's no isolation (jenkins, self-hosted gitlab runners) these are possibly persistent. That will probably raise a bunch of user complaints. How should user configure this? ~/.local/share/spack/installsis shared, that complicates having two Spack instances sayv1.3andv1.4wherev1.4migrates the database to a new format as a side effect of saving; now thev1.3instance is no longer usable.- the path
~/.local/share/spack/installsis longer than my previous~/spack/opt/spack; deeply nested paths are bad for performance (kernel has to resolve them, realpath is more stat calls) and can run into path length and other limits faster. - On HPC systems, the home dir is not particularly fast; I would normally always
cdto the faster shared filesystem, clone Spack there, and then run install. Now I would have to clone it, configure it, and then install. That's another step, I'm sure we'll get complaints about that, given that all our examples of using spack aregit clone; . setup-env.sh; spack install. (In that sense I'm also confused that people asked for this...)
|
I also tried the PR briefly, and agree with @haampie that we should make it backward compatible first. Asking the users to update manually after a At a high-level I think it would be better to have a single marker that tells Spack which "layout" is in use, instead of migrating parts of the configuration per-path. The current per-path fallback heuristics are fragile because they have no single source of truth. A single marker solves this:
If possible, I'd try to stick to: and avoid emitting warnings on each command. More detailed considerations on warnings and defaultsThe warnings are really shown every time: which will likely cause a lot of gripes (see above where I ask for On top of not accounting for the default config location, it seems to me that moving defaults (e.g. from
|
| Run this first-thing when starting a spack child process that is | ||
| used for builds. Resolve all variables that are XDG-dependent, in |
There was a problem hiding this comment.
This feels suboptimal. For any new lazy constant added, you have to remember to extend it. Also, you don't call it in the new installer (and preferably shouldn't).
It'd be better to wrap these constants in an other object and pass them as function argument (i.e. serialization in the spawn/forkserver case):
- In the spawn case you can't expect the subprocess to reinitialize the state exactly as the parent process.
- Not all functions that use multiprocessing want to use
GlobalStateMarshaler, because it has enormous overhead: it serializes the active environment (afaik redundantly, but well...).
…ncluding gpg keys, install paths); new checkouts will write into new (shared) paths; tests are partially updated Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
…ey are now getting used) Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
… paths in default config) Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>
Signed-off-by: Peter Scheibel <scheibel1@llnl.gov>

Closes #15939
Closes #49939
Closes #11871
Fixes #52251
This is a redo of #11871, which aims to make Spack pip/apt-installable. Use cases are described in detail here (they are numbered, and this PR description may refer to them like "U1"), along with parts of 11871 that are not handled here.
Moving Spack-generated artifacts outside of
$spackThe biggest blocker to creating a shared spack instance is that Spack will default to writing data into its own prefix. For new clones of Spack, this PR moves all generated files out of the cloned directory and into
~. This includes:For old clones:
$spackdirectories.~and an old spack instancegit pulls this logic, the old instance will start using~. A warning is generated in this case1-3 are large. Before this PR there were config options to relocate each of these individually (and those still take precedence). This PR offers additional mechanisms to set a single env var or config option in order to relocate all of them together.
Controlling install location
This PR establishes the following order:
config:install_tree:root:...SPACK_DATA_HOMEenv var if that is setSPACK_HOMEenv var if that is setconfig:locations:dataif that is setconfig:locations:homeif that is setXDG_DATA_HOMEenv var if that is setXDG_DATA_HOME:~/.local/share/spackIf an older clone of spack pulls this, and has existing installs in
$spack/opt/spack(the old default), then:Other notes:
config:install_tree:rootso that users who dospack config get configwill see that setting, but it uses a special default variable that effectively behaves as if it were unset.config:environments_rootconfig:source_cache)SPACK_DATA_HOME,XDG_DATA_HOMEorconfig:locations:data$spackor~before can be relocated by settingSPACK_HOMEorconfig:locations:home, except for the user config scopeetc/spack/include.yamlsetspath: "$spack_home/.config/spack/"for the user scope. In order to influence this, the user must set theSPACK_HOMEenv var, or they must setconfig:locations:homeat a scope that is lower priority than thespackscope (right now that is awkward - see: Shared spack #47615 (comment))include:config to reference config-based location variables causes too many headachesconfig:locations:disable_env:true(i.e. 2, 3, 6 will be ignored).Admin management, pip-installed spack
$spack. As of now, everything goes into~. I think this default behavior will be sufficient to supportpip installed spackinclude:config in thespack scope:upstreams:xconfigwhen: SPACK_ADMIN not in envinstall_tree:root:xconfigwhen: SPACK_ADMIN in envavoiding
~Before this PR, you could do
and nothing would be read from or written to ~
With this PR, that would not be enough. Some ways to do that here:
spack isolatecommand that you can run to configure spack to write into$spackSPACK_HOME=x SPACK_USER_CONFIG_PATH=x/config, everything will be written to xSPACK_HOME=x SPACK_DISABLE_LOCAL_CONFIG=1config:locations:dataoutside of ~, this will relocate everything that takes up significant space (envs, installs, cached downloads)To completely move everything, in an env you can do:
Users who have write access to $spack can do
XDG compliance
Everything that was written into
~/.spackis now relocated to XDG-compliant alternatives:~/.config/spack)~/.local/state/spack$SPACK_USER_CACHE_PATH(highest priority),$SPACK_STATE_HOME, and$XDG_STATE_HOMEEverything moving out of
$spackis moving into XDG-compliant default locations under~.$SPACK_DATA_HOME/gpg-keys$SPACK_DATA_HOME(e.g.$SPACK_DATA_HOME/lmod)See also: #49939
Crosstalk between spack instances
config:misc_cachestill exists, so users can share a single misc cache among multiple instances of spack~/.config/spackUse cases/considerations
All items here are addressed, although there is some friction with [4]
a. sub-case, with
--user(i.e. user may have write permissions to the install location)a. but users might choose to clone spack themselves
b. and they may or may not want to use an admin-curated upstream: it should be possible for users to get an upstream DB by default from an admin without running any extra commands
~may or may not have significant storage spacea. admin should be able to define config that only applies to users, not admins
b. Multiple admins may all want to share a config, but not provide that config to an end-user
c. users may want their install location to differ on a per-spack basis (but still live outside of the spack prefix)
a. new
git clones of spackb.
git pulls in existing clones (and e.g. should be able to use gpg keys)c. Discarding a pre-existing download cache is not a problem
TODOs
Maybe TODOs
it may be useful to strip out vendored libs (i.e. code in.lib/spack/external/) and handle it w/requirements.txtThat is possibly complicated if the versions we vendor (intended to support back to Python 3.6) are no longer supported/availablespack.vendornamespace, so long as those top-level modules do not themselves require vendored libs, this would prevent vendored Spack python deps from interfering with imports.Things not translated from #11871
If prefixed w/ "drop", I think we can skip it entirely
apt/pip-installablespack --install-tree=... installstore.pydepends on config (so this would introduce a cyclic dependency).Other changes
Not essential to this PR, but related to other changes made here
sys.addaudithook); it is enabled with--guard-writes-into-spack