add initial support for universal lock files #3831
Conversation
CodSpeed Performance Report: merging #3831 will degrade performance by 7.66%.
Benchmarks breakdown
Force-pushed from 69d4588 to fe29c7a
Will review this today.
charliermarsh
left a comment
Random question but one that's on my mind from my own work: in crates/uv-resolver/src/pubgrub/dependencies.rs#add_requirements, we call requirement.evaluate_markers(env, std::slice::from_ref(source_extra)). That now always evaluates to true when the marker environment is omitted, so how do we collect different dependencies based on the extra?
// If a package has a marker, add a dependency from it to
// the same package without markers.
Why is this necessary? Can you refresh my memory?
To ensure they resolve to the same version, or something?
I can't quite explain this yet. If I don't do this (and adjust some of the code in resolution/graph.rs to not require marker: None), then I get a few test failures. One of them is this:
Snapshot: compile_constraints_markers
Source: crates/uv/tests/pip_compile.rs:322
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Expression: snapshot
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
-old snapshot
+new results
────────────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
5 5 │ anyio==4.3.0
6 6 │ # via -r requirements.in
7 7 │ idna==3.6
8 8 │ # via anyio
9 │-sniffio==1.3.0
9 │+sniffio==1.3.1
10 10 │ # via
11 11 │ # -c constraints.txt
12 12 │ # anyio
13 13 │
14 14 │ ----- stderr -----
15 │-Resolved 3 packages in [TIME]
15 │+Resolved 4 packages in [TIME]
And there are 3 more failures that look like this:
Snapshot: install_executable
Source: crates/uv/tests/pip_install.rs:1802
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Expression: snapshot
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
-old snapshot
+new results
────────────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 1 │ exit_code: 0
2 2 │ ----- stdout -----
3 3 │
4 4 │ ----- stderr -----
5 │-Resolved 7 packages in [TIME]
5 │+Resolved 8 packages in [TIME]
6 6 │ Downloaded 7 packages in [TIME]
7 7 │ Installed 7 packages in [TIME]
8 8 │ + astroid==3.0.3
9 9 │ + dill==0.3.8
I think I originally added this because I was trying to follow in the footsteps of how extras were handled. But markers aren't exactly like extras, and I'm not sure this sort of dependency makes sense.
Looking at the compile_constraints_markers test more closely, I see that it looks like there is probably something relevant here:
// If constraints are ignored, these will conflict
constraints_txt.write_str("sniffio==1.2.0;python_version<='3.7'")?;
constraints_txt.write_str("sniffio==1.3.0;python_version>'3.7'")?;
So if there is no marker: None "dependency on itself" being added (as is done here in this PR), then I believe the above two constraints will end up being two distinct PubGrubPackage values due to the markers being distinct. Which in turn means the pubgrub constraints won't unify and I think leads to a different resolution? Although I can't fully explain why 1.3.1 is selected. With the dependency on itself, I think that lets pubgrub unify the constraints, which I think is similar to the same kind of effect we achieve with extras?
I'm like 80% confident here. I've updated the comment here to hopefully expose some of that uncertainty:
// If a package has a marker, add a dependency from it to the
// same package without markers.
//
// At time of writing, AG doesn't fully understand why we need
// this, but one explanation is that without it, there is no
// way to connect two different `PubGrubPackage` values with
// the same package name but different markers. With different
// markers, they would be considered wholly distinct packages.
// But this dependency-on-itself-without-markers forces PubGrub
// to unify the constraints across what would otherwise be two
// distinct packages.
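To make that explanation concrete, here is a toy model, not uv's actual types: package identity is a name plus an optional marker string, and version constraints only accumulate on (and thus unify at) the same identity. The `Pkg` type and `collect` function are invented for this sketch.

```rust
use std::collections::HashMap;

// Hypothetical model: a package identity is a name plus an optional
// marker, mirroring how a `PubGrubPackage` carrying a marker is a
// distinct node from the same name without one.
type Pkg = (&'static str, Option<&'static str>);

// Gather version constraints per package identity.
fn collect(constraints: &[(Pkg, &'static str)]) -> HashMap<Pkg, Vec<&'static str>> {
    let mut map: HashMap<Pkg, Vec<&'static str>> = HashMap::new();
    for &(pkg, range) in constraints {
        map.entry(pkg).or_default().push(range);
    }
    map
}

fn main() {
    // Without the markerless self-dependency, the two sniffio
    // constraints live under distinct identities and never meet.
    let distinct = collect(&[
        (("sniffio", Some("python_version<='3.7'")), "==1.2.0"),
        (("sniffio", Some("python_version>'3.7'")), "==1.3.0"),
    ]);
    assert_eq!(distinct.len(), 2);

    // A dependency from each marked identity onto the plain
    // ("sniffio", None) identity forces both constraints onto a single
    // node, where PubGrub can unify them.
    let unified = collect(&[
        (("sniffio", None), "==1.2.0"),
        (("sniffio", None), "==1.3.0"),
    ]);
    assert_eq!(unified.len(), 1);
    assert_eq!(unified[&("sniffio", None)].len(), 2);
}
```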
charliermarsh
left a comment
I think there's another area that isn't covered yet -- something like:
anyio @ https://files.pythonhosted.org/packages/7b/a2/10639a79341f6c019dedc95bd48a4928eed9f1d1197f4c04f546fc7ae0ff/anyio-4.4.0-py3-none-any.whl ; python_version > '3.10'
anyio @ https://files.pythonhosted.org/packages/85/4f/d010eca6914703d8e6be222165d02c3e708ed909cdb2b7af3743667f302e/anyio-4.1.0-py3-none-any.whl ; python_version <= '3.10'
These are two different URLs, but I assume urls::Urls will reject these as conflicting upfront? We might need to make all of those "do work upfront" abstractions also track markers in some way.
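One possible shape for that check, sketched with invented names (`has_conflict`, and a set of Python minor versions standing in for a real `MarkerTree`): two URLs for the same package are only a conflict when their markers can both be true in some environment.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical sketch: track (url, marker) pairs per package and flag
// a conflict only when the markers overlap. A "marker" here is
// approximated by the set of Python minor versions it matches; the
// real check would intersect `MarkerTree`s instead.
fn has_conflict(reqs: &[(&str, &str, BTreeSet<u32>)]) -> bool {
    let mut seen: HashMap<&str, Vec<(&str, BTreeSet<u32>)>> = HashMap::new();
    for (name, url, marker) in reqs {
        if let Some(prior) = seen.get(*name) {
            for (other_url, other_marker) in prior {
                let overlap = marker.intersection(other_marker).next().is_some();
                // Distinct URLs conflict only if some environment
                // satisfies both markers.
                if url != other_url && overlap {
                    return true;
                }
            }
        }
        seen.entry(*name).or_default().push((*url, marker.clone()));
    }
    false
}

fn main() {
    let vers = |r: std::ops::RangeInclusive<u32>| r.collect::<BTreeSet<u32>>();
    // Like the anyio example above: one URL for > 3.10, another for
    // <= 3.10. Disjoint markers, so no conflict.
    let ok = [
        ("anyio", "https://example.invalid/anyio-4.4.0.whl", vers(11..=13)),
        ("anyio", "https://example.invalid/anyio-4.1.0.whl", vers(7..=10)),
    ];
    assert!(!has_conflict(&ok));

    // Overlapping markers with different URLs are a genuine conflict.
    let bad = [
        ("anyio", "https://example.invalid/anyio-4.4.0.whl", vers(8..=13)),
        ("anyio", "https://example.invalid/anyio-4.1.0.whl", vers(7..=10)),
    ];
    assert!(has_conflict(&bad));
}
```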
Force-pushed from a59ac65 to 6dfc44d
Good question! When a marker environment is omitted, the relevant behavior is documented on `evaluate_optional_environment`:

/// Evaluates this marker tree against an optional environment and a
/// possibly empty sequence of extras.
///
/// When an environment is not provided, all marker expressions based on
/// the environment evaluate to `true`. That is, this provides environment
/// independent marker evaluation. In practice, this means only the extras
/// are evaluated when an environment is not provided.
pub fn evaluate_optional_environment(
    &self,
    env: Option<&MarkerEnvironment>,
    extras: &[ExtraName],
) -> bool {

So that means only the extras are evaluated when no environment is provided, which is how dependencies behind a given extra are still collected correctly.
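A minimal sketch of those semantics, using a hypothetical miniature marker type (the `Marker`, `Env`, and `evaluate` names here are illustrative, not uv's actual API): environment expressions default to `true` when no environment is given, while extra expressions are always checked.

```rust
// Hypothetical miniature of environment-independent marker evaluation.
enum Marker {
    // e.g. `python_version > '3.10'`: needs an environment to decide.
    Env(fn(&Env) -> bool),
    // e.g. `extra == "dev"`.
    Extra(&'static str),
    And(Box<Marker>, Box<Marker>),
}

struct Env {
    python_minor: u32,
}

fn evaluate(m: &Marker, env: Option<&Env>, extras: &[&str]) -> bool {
    match m {
        // Without an environment, environment expressions are assumed true.
        Marker::Env(f) => env.map_or(true, |e| f(e)),
        Marker::Extra(name) => extras.contains(name),
        Marker::And(a, b) => evaluate(a, env, extras) && evaluate(b, env, extras),
    }
}

fn main() {
    let m = Marker::And(
        Box::new(Marker::Env(|e| e.python_minor > 10)),
        Box::new(Marker::Extra("dev")),
    );
    // No environment: only the extra is consulted.
    assert!(evaluate(&m, None, &["dev"]));
    assert!(!evaluate(&m, None, &[]));
    // With an environment, both halves matter.
    assert!(!evaluate(&m, Some(&Env { python_minor: 8 }), &["dev"]));
}
```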
While this could be done by callers since the representation of `MarkerTree` is public, they are just annoying enough to do that I think it makes sense to provide them on `MarkerTree` itself. These could also be improved in the future to do even more flattening of conjunctions/disjunctions (or perhaps even more robust simplification). But for now, some basic flattening is good enough. These routines will be used to combine marker expressions when merging forked resolutions.
This makes it so we can pass any function to determine whether an extra is always true or not. For example, `markers.simplify_extras_with(|_| true)` will remove all extras in any marker expression. This wasn't possible to express (without knowing all of the marker names) using the old API, but becomes trivial to express with a predicate function.
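A rough sketch of that predicate-driven API on a toy marker type (the `Marker` enum and `simplify_extras_with` signature here are invented for illustration, not the real `MarkerTree` representation): any `extra == name` expression for which the predicate returns true is replaced by `true` and then folded out of conjunctions.

```rust
// Hypothetical model of predicate-based extra simplification.
#[derive(Debug, Clone, PartialEq)]
enum Marker {
    True,
    Extra(String),
    // An opaque environment expression, e.g. "python_version > '3.10'".
    Expr(String),
    And(Vec<Marker>),
}

fn simplify_extras_with(m: Marker, is_true: &dyn Fn(&str) -> bool) -> Marker {
    match m {
        // An extra the predicate deems always-true becomes `true`.
        Marker::Extra(name) if is_true(&name) => Marker::True,
        Marker::And(children) => {
            // Simplify children and drop `true`s from the conjunction.
            let kept: Vec<Marker> = children
                .into_iter()
                .map(|c| simplify_extras_with(c, is_true))
                .filter(|c| *c != Marker::True)
                .collect();
            match kept.len() {
                0 => Marker::True,
                1 => kept.into_iter().next().unwrap(),
                _ => Marker::And(kept),
            }
        }
        other => other,
    }
}

fn main() {
    let m = Marker::And(vec![
        Marker::Extra("dev".to_string()),
        Marker::Expr("python_version > '3.10'".to_string()),
    ]);
    // `|_| true` removes every extra, as described above.
    let simplified = simplify_extras_with(m, &|_| true);
    assert_eq!(simplified, Marker::Expr("python_version > '3.10'".to_string()));
}
```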
There are still some TODOs/FIXMEs here, but this represents a chunk of the resolver refactoring to enable forking. We don't do any merging of resolutions yet, so crucially, this code is broken when no marker environment is provided. But when a marker environment is provided, this should behave the same as a non-forking resolver. In particular, `get_dependencies_forking` is just `get_dependencies` whenever there's a marker environment.
This changes the constructor to just take an `InMemoryIndex` directly instead of the constituent parts. No real reason other than it seems a little simpler.
This commit is a pretty invasive change that implements the merging of resolutions created by each fork of the resolver. The main idea here is that each `SolveState` is converted into a `Resolution` (a new type) and stored on the heap after its fork completes. When all forks complete, they are all merged into a single `Resolution`. This `Resolution` is then used to build a `ResolutionGraph`.

Construction of `ResolutionGraph` mostly stays the same (despite the gnarly diff due to an indent change) with one exception: the code to extract dependency edges out of PubGrub's state has been moved to `SolveState::into_resolution`. The idea here is that once a fork completes, we extract what we need from the PubGrub state and then throw it away. We store these edges in our own intermediate type which is then converted into petgraph edges in the `ResolutionGraph` constructor.

One interesting change we make here is that our edge data is now a `Version` instead of a `Range<Version>`. I don't think `Range<Version>` was actually being used anywhere, so this seems okay? In any case, I think `Version` here is correct because a resolution corresponds to specific dependencies of each package. Moreover, I didn't see an easy way to make things work with `Range<Version>`. Notably, since we no longer have the guarantee that there is only one version of each package, we need to use `(PackageName, Version)` instead of just `PackageName` for inverted lookups in `ResolutionGraph::from_state`.

Finally, the main resolver loop itself is changed a bit to track all forked resolutions and then merge them at the end. Note that we don't really have any dealings with markers in this commit. We'll get to that in a subsequent commit.
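The merging step could be pictured roughly like this, with invented stand-in types (a `Resolution` keyed by `(name, version)` pairs, not the real type): because the key includes the version, forks that resolved different versions of the same package don't collide.

```rust
use std::collections::BTreeMap;

// Hypothetical shape of the merge: each fork yields a resolution
// mapping (package, version) to its dependency edges, and the union
// of all forks may legitimately contain two versions of one package.
type Resolution = BTreeMap<(String, String), Vec<(String, String)>>;

fn merge(forks: Vec<Resolution>) -> Resolution {
    let mut merged = Resolution::new();
    for fork in forks {
        for (key, edges) in fork {
            // Keys include the version, so distinct versions of the
            // same package land in distinct entries rather than clobber
            // each other, as a name-only key would.
            merged.entry(key).or_default().extend(edges);
        }
    }
    merged
}

fn main() {
    let fork_a: Resolution =
        Resolution::from([(("anyio".into(), "4.4.0".into()), vec![])]);
    let fork_b: Resolution =
        Resolution::from([(("anyio".into(), "4.1.0".into()), vec![])]);
    let merged = merge(vec![fork_a, fork_b]);
    // Both anyio versions survive the merge.
    assert_eq!(merged.len(), 2);
}
```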
This addresses the lack of marker support in prior commits. Specifically, we add them as a new field to `AnnotatedDist`, and from there, they get added to a `Distribution` in a `Lock`.
We significantly regressed performance in some cases because we were cloning the resolver state one more time than we needed to. That doesn't sound like a lot, but in the case where there are no forks, it implies we were cloning the state for every `get_dependencies` called when we shouldn't have been cloning it at all. Avoiding the clone results in somewhat tortured code. This can probably be refactored by moving bits out to a helper routine, but that also seemed non-trivial. So we let this suffice for now.
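The clone-avoidance amounts to something like the following sketch (with an invented `distribute` helper and an instrumented clone counter; not the actual resolver code): clone the state for all but the last fork, and move the original into the final one, so the common non-forking case pays for no clone at all.

```rust
// Hypothetical instrumented stand-in for the resolver's `SolveState`.
struct SolveState {
    // Counts how many clones produced this state; sketch-only.
    clones: usize,
}

impl SolveState {
    fn cloned(&self) -> SolveState {
        SolveState { clones: self.clones + 1 }
    }
}

fn distribute(state: SolveState, forks: usize) -> Vec<SolveState> {
    assert!(forks >= 1);
    let mut out = Vec::with_capacity(forks);
    // Clone for all but the last fork...
    for _ in 0..forks - 1 {
        out.push(state.cloned());
    }
    // ...and move the original into the final one. With `forks == 1`
    // (no fork at all), no clone ever happens.
    out.push(state);
    out
}

fn main() {
    let no_fork = distribute(SolveState { clones: 0 }, 1);
    assert_eq!(no_fork.iter().map(|s| s.clones).sum::<usize>(), 0);

    let forked = distribute(SolveState { clones: 0 }, 3);
    assert_eq!(forked.iter().map(|s| s.clones).sum::<usize>(), 2);
}
```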
charliermarsh
left a comment
Assuming this passes tests, and you feel it's ready, good by me!
Ohh, I was misreading.
This PR implements "resolver forking" in a way that permits us to generate "universal" lock files. The concrete manifestation of this requires omitting a `MarkerEnvironment` when calling the resolver, and as a result, the resolution returned may include multiple versions for the same package.
As a very basic example, consider this `requirements.in` file. And while not the final intended command, one can generate a `uv.lock` with the following command. In particular, notice that in the resulting `uv.lock`, there are two `anyio` distribution entries.

The intent here is that, on install, a `MarkerEnvironment` would be used to do a graph traversal while filtering out distributions whose marker expressions don't match the marker environment. In order to work, this does require that we only permit multiple versions of the same package when their marker expressions have an empty intersection.
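The install-time filtering described here could look roughly like the following sketch, with invented stand-ins (`Locked`, `Env`, and a plain function pointer in place of real marker evaluation): keep only the locked distributions whose marker matches the live environment.

```rust
// Hypothetical locked distribution: a package guarded by a marker.
struct Locked {
    name: &'static str,
    version: &'static str,
    marker: fn(&Env) -> bool,
}

struct Env {
    python_minor: u32,
}

// Walk the lock and keep the distributions matching the environment.
fn select(lock: &[Locked], env: &Env) -> Vec<(&'static str, &'static str)> {
    lock.iter()
        .filter(|d| (d.marker)(env))
        .map(|d| (d.name, d.version))
        .collect()
}

fn main() {
    let lock = [
        Locked { name: "anyio", version: "4.4.0", marker: |e| e.python_minor > 10 },
        Locked { name: "anyio", version: "4.1.0", marker: |e| e.python_minor <= 10 },
    ];
    // Because the two markers have an empty intersection, every
    // environment sees exactly one anyio version.
    assert_eq!(select(&lock, &Env { python_minor: 12 }), vec![("anyio", "4.4.0")]);
    assert_eq!(select(&lock, &Env { python_minor: 9 }), vec![("anyio", "4.1.0")]);
}
```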
This is a draft because there is still some work to be done:
- … have an empty intersection.
- … since this one is already pretty big.
I'm also a little uneasy with how we're handling marker expressions in the `PubGrubPackage`. Careful scrutiny there would be appreciated!

Closes #3358, Closes #3360