pkg: Use filtered_formula to represent dependencies#10918
pkg: Use filtered_formula to represent dependencies#10918rgrinberg merged 19 commits intoocaml:mainfrom
filtered_formula to represent dependencies#10918Conversation
ElectreAAS
left a comment
There was a problem hiding this comment.
Some tests expectedly fail with hash incongruences, but some others fail with this:
- File "x.opam", line 1, characters 0-0:
- Error: Expected formula to be a conjunction of atoms but encountered non-atom
- term 'a | b'
+ Error: Unable to solve dependencies for the following lock directories:
+ Lock directory dune.lock:
+ Can't find all required versions.
+ Selected: x.dev a
+ - a -> (problem)
+ No known implementations at allShouldn't these tests now pass with a proper build?
|
@ElectreAAS The failure you quoted makes sense, because the test in The hash changes are an unfortumate side-effect as the non-local dependency set is calculated from a different representation and bar of implementing the OPAM filters into Dune's condition language and then hashing that there is no good way of preserving the previous hashes (or running a hybrid approach of encoding them one way if they can and another if they cannot be represented using dune-lang conditions). I don't think we guarantee that the hashes are stable between versions of Dune. |
3183e7e to
31229fb
Compare
|
I've replaced the failing test with one that shows that disjunctions in opam files lead to valid lock directory solutions using either side of the disjunction. |
|
Oh nice! Thanks for updating the test |
b3969c7 to
5728b11
Compare
src/dune_pkg/opam_solver.ml
Outdated
| let deps = | ||
| List.map pkg.dependencies ~f:(fun (d : Package_dependency.t) -> d.name) | ||
| in | ||
| let deps = Local_package.For_solver.dependency_names pkg in |
There was a problem hiding this comment.
Isn't this wrong? You should not consider packages that haven't been selected for the reachability check. This would mean that you're including packages in the lock directory that are not needed for the actual build. This is a good thing to add a test for as well.
There was a problem hiding this comment.
Yes, you're completely right, thanks for pointing this out. This code needs to be changed because it uses the input to the solver as the output instead of using the solution that the solver outputs.
I'll add a test that checks that dependencies that don't get picked by the solver don't end up in the lock directory.
There was a problem hiding this comment.
I'm looking what the semantics of reachable is and according to #10814 it is about removing post dependencies (and also post-deps-solving.t is the only test failing if I comment out the reachability test).
Since this is basically about filtering the solver output via the possible deps from the input, how about still collecting all possible packages (since they are potentially reachable) except those that are post and using that for reachability?
That way packages that the solver is not picking will not be part of the lock dir (as the solution just does not contain them) and post packages won't be picked either (as they will be filtered out by the reachability check). This makes post a bit special but post is already a magic variable in opam.
There was a problem hiding this comment.
Added a test showing an unnecessary post dependencies being eliminated.
how about still collecting all possible packages (since they are potentially reachable) except those that are post and using that for reachability?
I think it's just simpler to do this without any hacks. We've already resolved the or deps when writing the dependency list in the lock directory. Might as well just use that here instead of the formula
There was a problem hiding this comment.
In particular, I'm thinking of just using this snippet to evaluate the formula:
Resolve_opam_formula.filtered_formula_to_package_names
~with_test:false
(add_self_to_filter_env opam_package (Solver_env.to_env solver_env))
version_by_package_name
what
We already do this for external packages. No reason why we can't do it for local packages for the reachability check. You might want to wait until #11022 is merged though, as I'm going to move this snippet somewhere where it can be shared.
There was a problem hiding this comment.
Ah, I missed the comment. Ok, I'll try to evaluate the formula instead.
There was a problem hiding this comment.
Looking at the documentation of filtered_formula_to_package_names makes me question whether this is the right thing to use:
Given a map associating package names with the version of that package in a solution, this function evaluates an
OpamFormula.tto a concrete list of package names all contained in the solution with versions which satisfy the formula, or an error if the the version constraints or packages in the formula can't be satisfied with the packages in the solution. If the formula contains disjunctions (the|operator in opam) then there may be multiple sets of packages that satisfy the formula. In this case the choice of which set to return is made arbitrarily.
This sounds to me that if the solver chooses one side of the disjunction and filtered_formula_to_package_names the other, this will lead to all the solver-selected legitimate dependencies being filtered out. Or, if I pass in the result from the solver as the package name map, the function will fail to find a solution.
There was a problem hiding this comment.
This sounds to me that if the solver chooses one side of the disjunction and filtered_formula_to_package_names the other, this will lead to all the solver-selected legitimate dependencies being filtered out. Or, if I pass in the result from the solver as the package name map, the function will fail to find a solution.
If this is an issue, then it's already present with packages from opam repository. Working around it for local packages defined using the opam syntax is going to be an insufficient work around anyway.
But I think this isn't a huge issue. Rephrased less pessimistically, it tell us that solver can find at least one way to satisfy the disjunction. If there's only one way, we will find it. If there's more than one way, we will pick some arbitrary way - but it will still satisfy the formula. We're only planning to execute one build plan using this solution, so we indeed need only one solution.
I agree that It's rather unsatisfying that we can't tell the user there's more than one solution. I would really prefer to tell the user to write more constraints so that we don't have to choose things arbitrarily. But in the end, only one build plan is executed, and therefore only one side of any disjunction needs to be reachable.
|
Looked over the approach again. I agree that emitting a dependency that wasn't selected for the hash check as an example isn't so bad. It's also an improvement to have the dependency hash reflect the formula. There's an issue introduced to the reachability check though. That one should most definitely only consider effective dependency sets. Doing otherwise would include useless packages in the lock dir |
48312d3 to
6848f33
Compare
In my tests this seems to work, as the reachability is tested by filtering over a solver solution, so if the solver doesn't pick it, it isn't ever checked for reachability. That works even in the presence of |
8b2c551 to
4cfe9e4
Compare
src/dune_pkg/dependency_formula.ml
Outdated
| Dune_sexp.List [ Dune_sexp.atom "or"; l; r ] | ||
| ;; | ||
|
|
||
| let rec to_sexp v = |
There was a problem hiding this comment.
A bit of a shakem to introduce this much boilerplate. I just noticed we have some converters in Opam_dyn. How about we try using that instead of the sexp?
There was a problem hiding this comment.
That works too, I've replaced it with converting to Dyn.t and hashing that. I currently hash it via to_string but maybe a nicer solution would be using Dyn.hash and converting the int into a hex digest?
8c479b4 to
64ba323
Compare
ac42c54 to
408bd26
Compare
The `Dependency_set.t` representation can't deal with disjunctions but in most cases that is not even necessary as the set gets turned into a `filtered_formula` again. Thus it might be easier to keep the original representation and implement the necessary dependency set functionality on top of that. Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Rudi Grinberg <me@rgrinberg.com> Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
24f3a62 to
440d328
Compare
972e02d to
ce95b47
Compare
|
LGTM. I just replaced the recursion with some existing opam library functions. |
At the moment when an OPAM package in the local packages we only accept a list of
Atoms, so if it has an disjunction in the dependencies we fail when loading as we attempt to parse the list ofAtoms into aDependency_set.tstructure. This structure only supports a sequence of conjunctions (as that is the only syntax thatdune-projectfiles support at the moment).This PR changes the internal representation of the dependencies to be
OpamTypes.filtered_formulaas when reading OPAM files this is what we get from the OPAM file parser. When we pass the data to the solver it is alsoOpamTpyes.filtered_formulathat we have to encode back to. Thus I suggest keepingOpamTypes.filtered_formulaas dependency specification and compute the information we need from it directly. The information is fairly limited as we only need one example package name for an error message and a hash of the dependencies.The downside of this is that all existing dependency hashes change (see tests), as the representation of the package constraints the hash is calculated over is changing, but I consider this an acceptable compromise.
Fixes #10837