Best-effort co-concretization for environments with explicit conflicts #27964
@alalazo this is the PR we were discussing earlier
I started reviewing the PR, and left a few questions. At a high level:
- The `spack solve` output is very verbose, with hundreds of optimization criteria
- The association of a process ID and a spec is maybe too loose in the code, and needs to be consolidated in some way
- Having separate optimization criteria for each package poses questions on how to select priorities
I tested the solve time for a few specs and the results are great, i.e. we don't have noticeable performance losses. Solve time for PR #27964 (in seconds):
| | hdf5 | cp2k | trilinos |
|---|---|---|---|
| setup | 3.6892 | 5.6508 | 4.7846 |
| ground | 1.3216 | 2.2394 | 1.9163 |
| solve | 2.4108 | 4.1936 | 2.8291 |
| total | 7.4647 | 12.1481 | 9.5782 |
Solve time for develop (in seconds):
| | hdf5 | cp2k | trilinos |
|---|---|---|---|
| setup | 3.6545 | 5.4729 | 4.6592 |
| ground | 1.0601 | 1.6933 | 1.5260 |
| solve | 2.1859 | 3.7101 | 3.4006 |
| total | 6.9415 | 11.0188 | 9.6327 |
This comparison points out that the new encoding doesn't add much overhead when we fall back to the previous case of having a single process ID.
Tests run on:
- Spack: 0.17.1-710-afa4221598
- Python: 3.8.10
- Platform: linux-ubuntu20.04-icelake
- Concretizer: clingo
`lib/spack/spack/cmd/solve.py` (outdated):

```diff
 result = asp.solve(
     specs, dump=dump, models=models, timers=args.timers, stats=args.stats,
-    reuse=args.reuse,
+    reuse=args.reuse, multi_root=True
```
The solve command is cluttered because of the number of optimization criteria. A typical `spack solve` shows hundreds of lines of minimizations over possible dependencies; this seems a usability issue, similar to a lot of facts being dumped for errors.
Reading the code I see that we now show by default only the criteria that are non-zero, i.e. sub-optimal. Is there an easy way to show output like before in non-debug mode, and add the `number of non-identical *** nodes` criteria only in debug mode?
```diff
 'concretization': {
     'type': 'string',
-    'enum': ['together', 'separately'],
+    'enum': ['together', 'separately', 'together_where_possible'],
```
Minor: wondering if we should come up with a single word, or anyway something shorter, for this concretization mode.
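For reference, this is how the new mode would be selected in an environment's `spack.yaml`, matching the enum value in the diff above (the example specs are taken from the test environments used later in this thread):

```yaml
spack:
  specs:
  - hdf5@1.13.0 ^zlib@1.2.8
  - hdf5@1.8.22
  concretization: together_where_possible
```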
`lib/spack/spack/solver/asp.py` (outdated):

```diff
 self.gen.fact(fn.deprecated_version(pkg.name, v))

-def spec_versions(self, spec):
+def spec_versions(self, idx, spec):
```
In general I think we need to improve docstrings now that the arguments are becoming more involved conceptually. As far as I understand, `idx` here is not arbitrary, but tied to the process ID / root ID? If there is a relation between these two arguments that needs to be respected, can't we package a spec and an idx in an object and send them around together?
The same spec could appear as part of multiple psids, since it could be a dependency of multiple roots.
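One way to consolidate the loose association while still allowing the same spec under multiple psids is to key requests by the pair itself. This is a hypothetical sketch, not the PR's actual code; `SpecRequest` and its fields are illustrative (real code would carry a `spack.spec.Spec`, not a string):

```python
from typing import NamedTuple


class SpecRequest(NamedTuple):
    """Hypothetical pairing of a process/root ID with the spec it refers to,
    so the two travel as one object instead of two loosely coupled arguments."""
    psid: int
    spec: str  # a spack.spec.Spec in real code; a string here for brevity


# The same spec may appear under multiple psids, since it can be a
# dependency of multiple roots:
requests = [
    SpecRequest(0, "hdf5@1.13.0"),
    SpecRequest(0, "zlib@1.2.8"),
    SpecRequest(1, "hdf5@1.8.22"),
    SpecRequest(1, "zlib@1.2.8"),  # same spec, different root
]

# Group specs by psid for per-root processing.
by_psid = {}
for req in requests:
    by_psid.setdefault(req.psid, []).append(req.spec)
```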
Another data point by test-driving the PR with this simple environment:

```yaml
spack:
  specs:
  - hdf5@1.13.0 ^zlib@1.2.8
  - hdf5@1.8.22
  concretization: together_where_possible
```

I obtain:

```console
$ time spack -e . concretize -f
...
real	0m20,139s
user	0m19,811s
sys	0m0,309s
```

Compared to concretizing separately:

```console
$ time spack -e . concretize -f
==> Starting concretization pool with 2 processes
...
real	0m8,780s
user	0m16,109s
sys	0m0,335s
```

There is a 20% increase in user time when we have to unify zlib over 2 specs. Using a slightly more complex environment:

```yaml
spack:
  specs:
  - hdf5@1.13.0 ^zlib@1.2.8
  - hdf5@1.8.21 ^zlib@1.2.11
  - hdf5@1.8.22
  concretization: together_where_possible
```

I have:

```console
$ time spack -e . concretize -f
...
real	1m2,460s
user	1m1,971s
sys	0m0,461s
```

Compared to the separate concretization:

```console
$ time spack -e . concretize -f
...
real	0m8,966s
user	0m24,255s
sys	0m0,514s
```

which is a 150% increase in user time.
@alalazo I have addressed the two behavior bugs you showed. Still working on addressing other review comments.
I tried again the same quick performance test, comparing the solve time for develop with the results for the PR at 78bb86e (merge commit, in seconds). So overall just a small increase after the latest fixes, which I think is acceptable.
I confirm that this environment:

```yaml
spack:
  specs:
  - hdf5@1.13.0 ^zlib@1.2.8 ^xz@5.2.4
  - hdf5@1.8.21 ^zlib@1.2.11
  - hdf5@1.8.22
  - xz@5.2.5
  concretization: together_where_possible
```

now concretizes as expected. I am attaching the spack.lock file as a reference. It is concerning though how long it takes to reach a solution for this environment compared to separate concretization. The concerning part is that there seems to be a great increase moving from a single spec to be unified to 2 specs: now the user time is 154 sec. vs. 27 sec., so more than 5x.

@becker33 I wonder if unifying everything in a single optimization criterion can mitigate this increase, since the search space is collapsed into a single additional dimension. I have no idea if this is just a wrong intuition, or if it can be supported by any work I don't know about yet.
Tried to construct a more complicated environment, where the new optimization criteria are more exercised:

```yaml
spack:
  specs:
  - hdf5@1.13.0 ^zlib@1.2.8 ^xz@5.2.4 ^autoconf@2.71
  - hdf5@1.8.21 ^zlib@1.2.11 ^hwloc@2.6.0
  - hdf5@1.8.22
  - autoconf@2.70
  - xz@5.2.5
  concretization: together_where_possible
```

The overall time to solution doesn't increase much with respect to the simpler environment used before, also compared to concretizing separately.

Trying to concretize the e4s pipeline (as representative of a production environment) on my machine completes on develop, but with this PR it literally hogs memory (in 5 mins. it ramped up to exceed the 32 GB of RAM my machine has). I guess we need to keep an eye on memory too 😓
alalazo left a comment:
I am halfway through the review. So far I have only one minor question, on the output of `spack solve`, which is probably relevant only for developers.

I tried to test drive the PR, and this time the results have always been correct, so 💯 for the fixes on behavior.

I am a bit concerned by the resources that this mode of concretization needs. So far I have been unable to concretize e4s (used as a representative of a production environment) on my machine, due to memory requests.
```python
import spack.solver.asp
specs = [spack.spec.Spec(s) for s in specs]
result = spack.solver.asp.solve(specs, reuse=False, multi_root=True)
```
Shouldn't we skip this test when using the original concretizer?
```python
import spack.solver.asp

specs = [spack.spec.Spec(s) for s in specs]
result = spack.solver.asp.solve(specs, reuse=False, multi_root=True)
```
Same question on skipping the test for concretizer:original
Since the new criteria are above all the build criteria, we need to modify the way we display the output. Originally done by Greg in spack#27964 and cherry-picked to this branch by the co-author of the commit.

Co-authored-by: Massimiliano Culpo <massimiliano.culpo@gmail.com>
Superseded by #28941
Currently, environments can either be concretized fully together or fully separately. This works well for users who create environments for interoperable software and can use `concretizer:unify:true`. It does not allow environments with conflicting software to be concretized for maximal interoperability.

The primary use-case for this is facilities providing system software. Facilities provide multiple MPI implementations, but all software built against a given MPI ought to be interoperable.

This PR adds a concretization option `concretizer:unify:when_possible`. When this option is used, Spack will concretize specs in the environment separately, but will optimize for minimal differences in overlapping packages.

* Add a level of indirection to root specs

This commit introduces the "literal" atom, which comes with a few different "arities". The unary "literal" contains an integer that is the ID of a spec literal. Other "literals" contain information on the requests made by literal ID. For instance `zlib@1.2.11` generates the following facts:

```prolog
literal(0,"root","zlib").
literal(0,"node","zlib").
literal(0,"node_version_satisfies","zlib","1.2.11").
```

This should help with solving large environments "together where possible", since later literals can now be solved together in batches.

* Add a mechanism to relax the number of literals being solved

* Modify spack solve to display the new criteria

Since the new criteria are above all the build criteria, we need to modify the way we display the output. Originally done by Greg in #27964 and cherry-picked to this branch by the co-author of the commit.

Co-authored-by: Massimiliano Culpo <massimiliano.culpo@gmail.com>

* Inject reusable specs into the solve

Instead of coupling the PyclingoDriver() object with spack.config, inject the concrete specs that can be reused. A method-level function takes care of reading from the store and the buildcache.

* spack solve: show output of multi-rounds

* Add tests for best-effort co-concretization

* Enforce having at least a literal being solved

Co-authored-by: Greg Becker <becker33@llnl.gov>
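The literal encoding described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the PR's actual implementation; only the fact names (`root`, `node`, `node_version_satisfies`) and the example output are taken from the description:

```python
def literal_facts(literal_id, name, version=None):
    """Generate ASP 'literal' facts for a requested root spec.

    Illustrative helper only: the unary literal ID ties together all the
    requests made on behalf of one spec literal, so batches of literals
    can later be activated or relaxed as a group.
    """
    facts = [
        'literal({0},"root","{1}").'.format(literal_id, name),
        'literal({0},"node","{1}").'.format(literal_id, name),
    ]
    if version is not None:
        facts.append(
            'literal({0},"node_version_satisfies","{1}","{2}").'.format(
                literal_id, name, version
            )
        )
    return facts


# zlib@1.2.11 as literal 0, matching the example in the description:
for fact in literal_facts(0, "zlib", "1.2.11"):
    print(fact)
```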
Currently, environments can either be concretized fully together or fully separately. This works well for users who create environments for interoperable software and can use `concretization: together`. It does not allow environments with conflicting software to be concretized for maximal interoperability.

The primary use-case for this is facilities providing system software. Facilities provide multiple MPI implementations, but all software built against a given MPI ought to be interoperable.

This PR adds a `concretization` option `together_where_possible`. When this option is used, Spack will concretize specs in the environment separately, but will optimize for minimal differences in overlapping packages.

TODO:

Stretch goal: