In the OLCF's software management workflow, we would like to generate binary caches for packages built in a staging/testing/preview environment that can be used to rapidly deploy packages in another "production" environment.
In our fork of spack (from v0.12.1), spack generates build caches for each package in the DAG sequentially. It generates a build cache for each dependency until it encounters the first package that is not relocatable. At that point, spack buildcache create exits and the remaining packages in the DAG are not processed. Any relocatable package that has not yet been processed is left without a build cache unless one is created explicitly for it.
We would like spack to generate build caches for as many dependencies as possible, so that other spack instances can later reuse them and only need to build the non-relocatable packages from source.
In the reproducer example that follows, the first dependency spack processes in the DAG is non-relocatable, so no binary caches are produced at all. However, we've seen cases where the first two or three dependencies processed are relocatable: spack successfully produces binary caches for them, then encounters a non-relocatable dependency and exits before processing the rest. This leads me to believe it's not important that all the dependencies in a DAG be cache-able for the relocatable ones to still be usefully cached. Is this belief correct? And if so, can we have spack generate as many caches as possible for a given input spec?
Steps to reproduce the issue
$ spack spec -lINt netcdf@4.6.1%gcc@6.4.0
Input spec
--------------------------------
- [ ] netcdf@4.6.1%gcc@6.4.0
Concretized
--------------------------------
[+] gzpquhc [ ] builtin.netcdf@4.6.1%gcc@6.4.0~dap~hdf4 maxdims=1024 maxvars=8192 +mpi+parallel-netcdf+pic+shared arch=linux-rhel7-ppc64le
[+] 6rzaif5 [bl ] ^builtin.hdf5@1.10.3%gcc@6.4.0+cxx~debug+fortran+hl+mpi+pic+shared~szip~threadsafe arch=linux-rhel7-ppc64le
[+] acsucgh [bl ] ^builtin.numactl@2.0.11%gcc@6.4.0 patches=592f30f7f5f757dfc239ad0ffd39a9a048487ad803c26b419e0f96b8cda08c1a arch=linux-rhel7-ppc64le
[+] 4um5hjo [bl ] ^olcf.spectrum-mpi@10.3.0.0-20190419%gcc@6.4.0 arch=linux-rhel7-ppc64le
[+] fvgnqf6 [bl ] ^builtin.zlib@1.2.11%gcc@6.4.0+optimize+pic+shared arch=linux-rhel7-ppc64le
[+] sbessrn [b ] ^builtin.m4@1.4.18%gcc@6.4.0 patches=3877ab548f88597ab2327a2230ee048d2d07ace1062efe81fc92e91b7f39cd00,c0a408fbffb7255fcc75e26bd8edab116fc81d216bfd18b473668b7739a4158e,fc9b61654a3ba1a8d6cd78ce087e7c96366c290bc8d2c299f09828d793b853c8 +sigsegv arch=linux-rhel7-ppc64le
[+] hdr43hr [bl ] ^builtin.parallel-netcdf@1.8.1%gcc@6.4.0+cxx+fortran+pic arch=linux-rhel7-ppc64le
$ spack buildcache create \
    -d ${build_cache_dir} \
    -k "${signing_key}" \
    /gzpquhc
==> Found at least one matching spec
==> examining match netcdf@4.6.1%gcc@6.4.0~dap~hdf4 maxdims=1024 maxvars=8192 +mpi+parallel-netcdf+pic+shared arch=linux-rhel7-ppc64le
==> adding matching spec netcdf@4.6.1%gcc@6.4.0~dap~hdf4 maxdims=1024 maxvars=8192 +mpi+parallel-netcdf+pic+shared arch=linux-rhel7-ppc64le
==> recursing dependencies
==> skipping external or virtual dependency numactl@2.0.11%gcc@6.4.0 patches=592f30f7f5f757dfc239ad0ffd39a9a048487ad803c26b419e0f96b8cda08c1a arch=linux-rhel7-ppc64le
==> adding dependency spectrum-mpi@10.3.0.0-20190419%gcc@6.4.0 arch=linux-rhel7-ppc64le
==> adding dependency zlib@1.2.11%gcc@6.4.0+optimize+pic+shared arch=linux-rhel7-ppc64le
==> adding dependency hdf5@1.10.3%gcc@6.4.0+cxx~debug+fortran+hl+mpi+pic+shared~szip~threadsafe arch=linux-rhel7-ppc64le
==> adding dependency parallel-netcdf@1.8.1%gcc@6.4.0+cxx+fortran+pic arch=linux-rhel7-ppc64le
==> adding dependency netcdf@4.6.1%gcc@6.4.0~dap~hdf4 maxdims=1024 maxvars=8192 +mpi+parallel-netcdf+pic+shared arch=linux-rhel7-ppc64le
==> writing tarballs to ./build_cache
==> creating binary cache file for package parallel-netcdf@1.8.1%gcc@6.4.0+cxx+fortran+pic arch=linux-rhel7-ppc64le
==> Error:
/tmp/tmpAtrpKQ/parallel-netcdf-1.8.1-hdr43hrl4opcz2yagwqk4k5zjxmg2bep/bin/pnetcdf_version
contains string
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/opt/spack/20180914
after replacing it in rpaths.
Package should not be relocated.
Use -a to override.
$ ls -l "${build_cache_dir}"
total 4
drwxrwsr-x 3 <REDACTED> <REDACTED> 4096 Jun 4 12:35 linux-rhel7-ppc64le
No binary caches are produced for any of the packages, even the ones that are relocatable.
Expected result
hdf5 and parallel-netcdf are the only packages that are not relocatable, and numactl is an external package here, so we expect binary caches to be produced for everything else. What follows is the output of a run with the proposed patch described further below.
$ spack buildcache create /gzpquhc
==> Found at least one matching spec
==> examining match netcdf@4.6.1%gcc@6.4.0~dap~hdf4 maxdims=1024 maxvars=8192 +mpi+parallel-netcdf+pic+shared arch=linux-rhel7-ppc64le
==> adding matching spec netcdf@4.6.1%gcc@6.4.0~dap~hdf4 maxdims=1024 maxvars=8192 +mpi+parallel-netcdf+pic+shared arch=linux-rhel7-ppc64le
==> recursing dependencies
==> skipping external or virtual dependency numactl@2.0.11%gcc@6.4.0 patches=592f30f7f5f757dfc239ad0ffd39a9a048487ad803c26b419e0f96b8cda08c1a arch=linux-rhel7-ppc64le
==> adding dependency spectrum-mpi@10.3.0.0-20190419%gcc@6.4.0 arch=linux-rhel7-ppc64le
==> adding dependency zlib@1.2.11%gcc@6.4.0+optimize+pic+shared arch=linux-rhel7-ppc64le
==> adding dependency hdf5@1.10.3%gcc@6.4.0+cxx~debug+fortran+hl+mpi+pic+shared~szip~threadsafe arch=linux-rhel7-ppc64le
==> adding dependency parallel-netcdf@1.8.1%gcc@6.4.0+cxx+fortran+pic arch=linux-rhel7-ppc64le
==> adding dependency netcdf@4.6.1%gcc@6.4.0~dap~hdf4 maxdims=1024 maxvars=8192 +mpi+parallel-netcdf+pic+shared arch=linux-rhel7-ppc64le
==> writing tarballs to ./build_cache
==> creating binary cache file for package parallel-netcdf@1.8.1%gcc@6.4.0+cxx+fortran+pic arch=linux-rhel7-ppc64le
==> Warning:
/tmp/tmpd6qqWI/parallel-netcdf-1.8.1-hdr43hrl4opcz2yagwqk4k5zjxmg2bep/bin/pnetcdf_version
contains string
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/opt/spack/20180914
after replacing it in rpaths.
Package should not be relocated.
Use -a to override.
==> creating binary cache file for package hdf5@1.10.3%gcc@6.4.0+cxx~debug+fortran+hl+mpi+pic+shared~szip~threadsafe arch=linux-rhel7-ppc64le
==> Warning:
/tmp/tmpYecsFk/hdf5-1.10.3-6rzaif5azberzazrue4ryftlk5g4vcp4/lib/libhdf5.so.103.0.0
contains string
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/opt/spack/20180914
after replacing it in rpaths.
Package should not be relocated.
Use -a to override.
==> creating binary cache file for package netcdf@4.6.1%gcc@6.4.0~dap~hdf4 maxdims=1024 maxvars=8192 +mpi+parallel-netcdf+pic+shared arch=linux-rhel7-ppc64le
gpg: using "<REDACTED>" as default secret key for signing
==> creating binary cache file for package spectrum-mpi@10.3.0.0-20190419%gcc@6.4.0 arch=linux-rhel7-ppc64le
gpg: using "<REDACTED>" as default secret key for signing
==> creating binary cache file for package zlib@1.2.11%gcc@6.4.0+optimize+pic+shared arch=linux-rhel7-ppc64le
gpg: using "<REDACTED>" as default secret key for signing
$ ls -l "${build_cache_dir}"
total 20K
drwxrwsr-x 3 <REDACTED> <REDACTED> 4.0K Jun 4 12:35 linux-rhel7-ppc64le/
-rw-rw-r-- 1 <REDACTED> <REDACTED> 754 Jun 4 12:44 index.html
-rw-rw-r-- 1 <REDACTED> <REDACTED> 3.8K Jun 4 12:44 linux-rhel7-ppc64le-gcc-6.4.0-netcdf-4.6.1-gzpquhcgd7zvrohl4f7l4c5dg7ysgrlq.spec.yaml
-rw-rw-r-- 1 <REDACTED> <REDACTED> 636 Jun 4 12:44 linux-rhel7-ppc64le-gcc-6.4.0-spectrum-mpi-10.3.0.0-20190419-4um5hjogm3tepg4xe23hrptlrs2y7ez6.spec.yaml
-rw-rw-r-- 1 <REDACTED> <REDACTED> 657 Jun 4 12:44 linux-rhel7-ppc64le-gcc-6.4.0-zlib-1.2.11-fvgnqf6k3ffhltldndu7pmntzvoyfsk4.spec.yaml
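A listing like the one above can be checked against the DAG with a few lines of Python. This is only a sketch: it assumes the hash is the last dash-separated field before .spec.yaml, as in the filenames shown, and the helper names are hypothetical rather than part of spack.

```python
import os


def cached_hashes(build_cache_dir):
    """Return the full hashes of all *.spec.yaml files in the directory."""
    suffix = ".spec.yaml"
    hashes = set()
    for filename in os.listdir(build_cache_dir):
        if filename.endswith(suffix):
            # Assumption: the hash is the last dash-separated field
            # before the suffix, as in the listing above.
            hashes.add(filename[:-len(suffix)].rsplit("-", 1)[-1])
    return hashes


def missing_from_cache(build_cache_dir, short_hashes):
    """Return the short hashes that have no matching spec.yaml file."""
    found = cached_hashes(build_cache_dir)
    return [h for h in short_hashes
            if not any(full.startswith(h) for full in found)]
```

Called with the short hashes gzpquhc, 6rzaif5, 4um5hjo, fvgnqf6 and hdr43hr on the directory listed above, this would report 6rzaif5 (hdf5) and hdr43hr (parallel-netcdf) as missing, matching the two packages that were warned about.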
Error Message
The unexpected behavior appears to be in spack.binary_distribution::build_tarball as called by spack.cmd.buildcache::createtarball:
$ spack --stacktrace buildcache create /gzpquhc
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/cmd/buildcache.py:299 ==> Found at least one matching spec
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/cmd/buildcache.py:302 ==> examining match netcdf@4.6.1%gcc@6.4.0~dap~hdf4 maxdims=1024 maxvars=8192 +mpi+parallel-netcdf+pic+shared arch=linux-rhel7-ppc64le
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/cmd/buildcache.py:307 ==> adding matching spec netcdf@4.6.1%gcc@6.4.0~dap~hdf4 maxdims=1024 maxvars=8192 +mpi+parallel-netcdf+pic+shared arch=linux-rhel7-ppc64le
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/cmd/buildcache.py:309 ==> recursing dependencies
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/cmd/buildcache.py:315 ==> skipping external or virtual dependency numactl@2.0.11%gcc@6.4.0 patches=592f30f7f5f757dfc239ad0ffd39a9a048487ad803c26b419e0f96b8cda08c1a arch=linux-rhel7-ppc64le
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/cmd/buildcache.py:317 ==> adding dependency spectrum-mpi@10.3.0.0-20190419%gcc@6.4.0 arch=linux-rhel7-ppc64le
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/cmd/buildcache.py:317 ==> adding dependency zlib@1.2.11%gcc@6.4.0+optimize+pic+shared arch=linux-rhel7-ppc64le
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/cmd/buildcache.py:317 ==> adding dependency hdf5@1.10.3%gcc@6.4.0+cxx~debug+fortran+hl+mpi+pic+shared~szip~threadsafe arch=linux-rhel7-ppc64le
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/cmd/buildcache.py:317 ==> adding dependency parallel-netcdf@1.8.1%gcc@6.4.0+cxx+fortran+pic arch=linux-rhel7-ppc64le
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/cmd/buildcache.py:317 ==> adding dependency netcdf@4.6.1%gcc@6.4.0~dap~hdf4 maxdims=1024 maxvars=8192 +mpi+parallel-netcdf+pic+shared arch=linux-rhel7-ppc64le
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/cmd/buildcache.py:320 ==> writing tarballs to ./build_cache
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/cmd/buildcache.py:323 ==> creating binary cache file for package parallel-netcdf@1.8.1%gcc@6.4.0+cxx+fortran+pic arch=linux-rhel7-ppc64le
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/lib/spack/spack/binary_distribution.py:337 ==> Error:
/tmp/tmpOLsaPk/parallel-netcdf-1.8.1-hdr43hrl4opcz2yagwqk4k5zjxmg2bep/bin/pnetcdf_version
contains string
/autofs/nccs-svm1_sw/.b2/.swci/1-compute/opt/spack/20180914
after replacing it in rpaths.
Package should not be relocated.
Use -a to override.
In particular, the exception handling in build_tarball (binary_distribution.py:337 in the trace above) calls tty.die on the first general exception rather than raising an exception that could be caught in the loop over dependency specs in cmd.buildcache::createtarball.
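To illustrate why the caller cannot recover on its own, here is a minimal sketch. It assumes tty.die ends by calling sys.exit, and every name in it (die, build_or_die, create_all) is a stand-in rather than spack code. Because SystemExit derives from BaseException and not Exception, wrapping the call in try/except Exception inside the loop would not be enough while the callee still exits.

```python
import sys


def die(message):
    """Stand-in for tty.die: print the message, then exit the process."""
    print("==> Error: " + message, file=sys.stderr)
    sys.exit(1)


def build_or_die(name, relocatable):
    """Hypothetical builder that aborts on a non-relocatable package."""
    if not relocatable:
        die(name + " should not be relocated")
    return "tarball for " + name


def create_all(packages):
    """Try to build every package; 'except Exception' is the only guard."""
    built = []
    for name, relocatable in packages:
        try:
            built.append(build_or_die(name, relocatable))
        except Exception as e:
            # Never reached for die(): SystemExit derives from
            # BaseException, not Exception.
            print("==> Warning: " + str(e))
    return built


packages = [("parallel-netcdf", False), ("zlib", True)]
try:
    create_all(packages)
    aborted = False
except SystemExit:
    # Caught here only so the sketch can report what happened.
    aborted = True
print("aborted before zlib was processed:", aborted)
```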
Proposed solution
Allow spack.binary_distribution::build_tarball to log any errors and raise the exceptions. Any NoOverwriteException thrown earlier can then simply be ignored by cmd.buildcache::createtarball so that it continues processing dependency specs. Any InstallRootStringException thrown for a non-relocatable package can likewise be ignored to continue processing dependency specs.
diff --git a/lib/spack/spack/binary_distribution.py b/lib/spack/spack/binary_distribution.py
index 46ac7790e..09967b180 100644
--- a/lib/spack/spack/binary_distribution.py
+++ b/lib/spack/spack/binary_distribution.py
@@ -321,20 +321,15 @@ def build_tarball(spec, outdir, force=False, rel=False, unsigned=False,
     # optinally make the paths in the binaries relative to each other
     # in the spack install tree before creating tarball
-    if rel:
-        try:
+    try:
+        if rel:
             make_package_relative(workdir, spec.prefix, allow_root)
-        except Exception as e:
-            shutil.rmtree(workdir)
-            shutil.rmtree(tarfile_dir)
-            tty.die(str(e))
-    else:
-        try:
+        else:
             make_package_placeholder(workdir, spec.prefix, allow_root)
-        except Exception as e:
-            shutil.rmtree(workdir)
-            shutil.rmtree(tarfile_dir)
-            tty.die(str(e))
+    except Exception as e:
+        shutil.rmtree(workdir)
+        shutil.rmtree(tarfile_dir)
+        raise e
     # create compressed tarball of the install prefix
     with closing(tarfile.open(tarfile_path, 'w:gz')) as tar:
         tar.add(name='%s' % workdir,
diff --git a/lib/spack/spack/cmd/buildcache.py b/lib/spack/spack/cmd/buildcache.py
index fe91312c4..5c35fc9f2 100644
--- a/lib/spack/spack/cmd/buildcache.py
+++ b/lib/spack/spack/cmd/buildcache.py
@@ -20,6 +20,8 @@ from spack.spec import Spec, save_dependency_spec_yamls
 from spack.spec_set import CombinatorialSpecSet
 import spack.binary_distribution as bindist
+from spack.binary_distribution import NoOverwriteException
+from spack.relocate import InstallRootStringException
 import spack.cmd.common.arguments as arguments
 from spack.cmd import display_specs
@@ -321,9 +323,14 @@ def createtarball(args):
     for spec in specs:
         tty.msg('creating binary cache file for package %s ' % spec.format())
-        bindist.build_tarball(spec, outdir, args.force, args.rel,
-                              args.unsigned, args.allow_root, signkey,
-                              not args.no_rebuild_index)
+        try:
+            bindist.build_tarball(spec, outdir, args.force, args.rel,
+                                  args.unsigned, args.allow_root, signkey,
+                                  not args.no_rebuild_index)
+        except (NoOverwriteException, InstallRootStringException) as e:
+            tty.warn(str(e))
+        except Exception as e:
+            tty.die(str(e))
 def installtarball(args):
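The control flow the patch aims for can be sketched in isolation. The exception classes and functions below are stand-ins for illustration only, not the spack ones, and the list of skipped packages is an addition that the patch itself does not make.

```python
class NoOverwriteError(Exception):
    """Stand-in for NoOverwriteException (cache file already exists)."""


class InstallRootStringError(Exception):
    """Stand-in for InstallRootStringException (not relocatable)."""


def build_tarball(name, relocatable, already_cached):
    """Hypothetical builder that raises instead of exiting."""
    if already_cached:
        raise NoOverwriteError(name + " is already cached")
    if not relocatable:
        raise InstallRootStringError(name + " should not be relocated")
    return name


def create_tarballs(packages):
    """Cache what can be cached and keep track of what was skipped."""
    cached, skipped = [], []
    for name, relocatable, already_cached in packages:
        try:
            cached.append(build_tarball(name, relocatable, already_cached))
        except (NoOverwriteError, InstallRootStringError) as e:
            # Expected failures: warn and move on to the next package.
            print("==> Warning: " + str(e))
            skipped.append(name)
        # Any other exception propagates and stops the run.
    return cached, skipped


# Same processing order as the patched run shown under "Expected result".
packages = [
    ("parallel-netcdf", False, False),
    ("hdf5", False, False),
    ("netcdf", True, False),
    ("spectrum-mpi", True, False),
    ("zlib", True, False),
]
cached, skipped = create_tarballs(packages)
print("cached: ", cached)
print("skipped:", skipped)
```

Collecting the skipped names makes it possible to print a summary at the end, or to return a non-zero exit status when anything was skipped, so that a partially cached DAG is not mistaken for a complete one.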
Can someone more familiar with the caveats and gotchas regarding binary distribution caches weigh in on whether this is a bad idea?