Originally reduced from a production application using the `tower` crate - see @fasterthanlime's reduced repro repo for more background (though note that it exhibits several other failure modes as well).
```rust
trait Trait<'a> {
    type A;
    type B;
    type C;
    type D;
    fn method() {}
}

impl<T> Trait<'_> for &'_ T
where
    for<'x> T: Trait<'x, A = (), B = (), C = (), D = ()>,
{
    type A = ();
    type B = ();
    type C = ();
    type D = ();
}

impl Trait<'_> for () {
    type A = ();
    type B = ();
    type C = ();
    type D = ();
}

pub fn main() {
    #[cfg(depth = "7")]
    <&&&&&&&() as Trait>::method();
    #[cfg(depth = "8")]
    <&&&&&&&&() as Trait>::method();
    #[cfg(depth = "9")]
    <&&&&&&&&&() as Trait>::method();
    #[cfg(depth = "10")]
    <&&&&&&&&&&() as Trait>::method();
}
```
The above example currently takes an exponential amount of time to compile, based on the type depth:
```
$ curl -O https://gist.githubusercontent.com/eddyb/cd4221f14fff265280d135ddce5c9712/raw/a17cbf00af1894756d2b1bdd2e838cdd4db2bbe2/proj-exp.rs
$ command time -f 'took %Us' rustc proj-exp.rs --emit=metadata --cfg 'depth = "7"'
took 0.74s
$ command time -f 'took %Us' rustc proj-exp.rs --emit=metadata --cfg 'depth = "8"'
took 2.98s
$ command time -f 'took %Us' rustc proj-exp.rs --emit=metadata --cfg 'depth = "9"'
took 11.92s
$ command time -f 'took %Us' rustc proj-exp.rs --emit=metadata --cfg 'depth = "10"'
took 51.10s
```
With every extra type layer, the time increases by ~4x, and that aligns well with there being 4 associated types.
While this example is a bit silly, it doesn't take more than two associated types (both constrained at once) to cause the issue (though the blowup then only shows up at higher depth, or with a larger constant factor from additional bounds).
And I'm guessing it might be one of the main causes for applications built with the `tower` crate to experience long compile times (its main trait, `Service`, has `Response` and `Error` as two associated types), in large part because that's the kind of codebase this was originally reduced from (as per the note at the top of this issue).
The reason I suspect caching is a combination of factors:
- instrumenting `evaluate_predicate_recursively` found an exponential ramp in terms of duplicates (i.e. the number of times each unique obligation shows up), and many of them were `ProjectionPredicate`s
  - this was done via `-Z self-profile`, though `RUSTC_LOG` might also work (and not require compiler changes)
- many of those `ProjectionPredicate`s had a bound lifetime listed in their `Binder`
- attempting to create a `ProjectionCacheKey` returns `None` iff there are bound variables
  - the comment is a bit worrying, given that it e.g. talks about "escaping regions" but checks for bound ones
  - (as an aside, `ProjectionCacheKey` not holding a `ParamEnv` is probably risky long-term)
  - I've talked to @nikomatsakis and we're unsure if this is a historical artifact - my best guess is that it might have something to do with the fact that normalizing under binders didn't use to work (maybe a micro-opt to not even bother trying to cache those cases? were they that common?)
- replacing the `for<'x> T: Trait<'x, ...` with `T: Trait<'static, ...` removes the exponential slowdown
So the next step was to try always caching (warning: this is actually an unsound quick hack; `ProjectionCacheKey` should be modified to carry a `Binder<ProjectionTy>` instead of a `ProjectionTy`):
```diff
diff --git a/compiler/rustc_trait_selection/src/traits/project.rs b/compiler/rustc_trait_selection/src/traits/project.rs
index b3e7fbb3578..5d5f2f67842 100644
--- a/compiler/rustc_trait_selection/src/traits/project.rs
+++ b/compiler/rustc_trait_selection/src/traits/project.rs
@@ -2123,7 +2123,7 @@ fn from_poly_projection_predicate(
         let infcx = selcx.infcx();
         // We don't do cross-snapshot caching of obligations with escaping regions,
         // so there's no cache key to use
-        predicate.no_bound_vars().map(|predicate| {
+        Some(predicate.skip_binder()).map(|predicate| {
             ProjectionCacheKey::new(
                 // We don't attempt to match up with a specific type-variable state
                 // from a specific call to `opt_normalize_projection_type` - if
```
However, the above hack does not appear to help and we're unsure why - the main other kind of `Predicate` that `evaluate_predicate_recursively` appears to process is `TraitPredicate`, which IIUC is almost always locally cached?
(EDIT: turns out that the caching is more complex than initially assumed and requires further changes - see #99188 (comment) for a success)
Then again, there are other reductions that don't involve `ProjectionPredicate` and still cause their own exponential curves, so there might be several different caching issues.
I primarily wanted to open this issue on its own because there's definitely something weird with `ProjectionCacheKey` (even if a proper fix might be way more involved than just that), and there's a clear correlation between the number of associated types (being constrained) and the exponential curve.
cc @rust-lang/types