Evar substitutions should have an abstract/compact/efficient representation of the identity substitution

I'm becoming more and more convinced that identity substitutions in evars being non-constant-time is the source of the next large wave of performance issues in Coq tactics (the previous "wave" having been related to pervasive evar-normalization and having been mostly fixed by EConstr).  I'm creating this issue in part as a reference, and in part so that we have a place to discuss this particular change rather than having comments scattered across a multitude of issues.

Probably related issues: #8237, #8244, #8245, #9582, #11896, #12487, #12524
Probably tangentially related issues / issues on which related discussion occurred: #4964, #8231, #10206

Here is a collection of some comments about this topic:

------

> > This seems like it might be that evar creation is linear(?) in the size of the context of the evar
>
> An evar is represented in the evar map as a sequent: `context |- ?123 : type`, hence the size of these things is linear in the size of the context.

_Originally posted by @gares in https://github.com/coq/coq/issues/8237#issuecomment-412272316_

------

> One issue is that the evar context is named while the term context might
> just be de Bruijn variables (usually a mix of a named goal context and a
> local de Bruijn context). So there is not only a copy but a translation
> involved. I think that computation is shared as much as possible now,
> though.

_Originally posted by @mattam82 in https://github.com/coq/coq/issues/8237#issuecomment-412273713_

-----

> In Coq evars are equipped with an explicit substitution that has the size of the context, even if this substitution is the identity one. Your proof term, before pose, is
> ```coq
> let H1 := I : True in ... ?1[H1 -> H1, ....]
> ```
> and after pose they are
> ```coq
> let H1 := I : True in ... let Hn := I : True in ?2[H1 -> H1, ...., Hn -> Hn]
> ```
> and in the evar map
> ```coq
> H1 ... |- ?1 := let Hn := I : True in ?2[H1 -> H1, ... Hn -> Hn]
> ```
>
> The explicit substitutions are useful only if some reduction is involved (in this case they are not the identity).
>
> Given the current representation I really don't see how things can be made non-quadratic easily.
> Sure, one could carefully share the previous explicit substitution and just extend it, but I don't think it is easy to do in practice (e.g., they are Arrays IIRC).
>
> In Matita we had a smarter representation for evars applied to the identity substitution, eg
> ```ocaml
> type exp = Id of int | Subst of term list
> ```
>
> Not sure this is the only root of the quadratic complexity, but for sure it is one.

_Originally posted by @gares in https://github.com/coq/coq/issues/8237#issuecomment-412280084_
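
To make the suggestion above concrete, here is a minimal sketch of what a compact identity representation could look like. All names (`term`, `instance`, `expand`, `extend`) are hypothetical and are not Coq's or Matita's actual API; in particular, reading the `int` in `Id` as "identity on the first `n` variables of the evar's context" is an assumption made for illustration.

```ocaml
(* Hypothetical sketch; none of these names come from Coq or Matita. *)
type term =
  | Var of string
  | Evar of int * instance

and instance =
  | Id of int           (* assumed meaning: identity on the first n context variables *)
  | Subst of term list  (* explicit instance, one term per context variable *)

(* First n elements of a list (fewer if the list is shorter). *)
let rec take n = function
  | x :: rest when n > 0 -> x :: take (n - 1) rest
  | _ -> []

(* Recover the explicit instance. Note that this needs the evar's context. *)
let expand (ctx : string list) (inst : instance) : term list =
  match inst with
  | Subst ts -> ts
  | Id n -> List.map (fun x -> Var x) (take n ctx)

(* Extend an instance with a newly introduced hypothesis.
   Constant time for the identity, linear for the explicit form. *)
let extend (inst : instance) (x : string) : instance =
  match inst with
  | Id n -> Id (n + 1)
  | Subst ts -> Subst (ts @ [Var x])
```

Under these assumptions, a sequence of `pose`-like steps would only bump a counter on each new evar instead of copying an instance whose size grows with the context, which is where the quadratic behaviour described above would come from.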

-----

> @gmalecha This wouldn't work, because existential substitutions represent named substitutions whereas esubst stands for de Bruijn substitutions. More generally, the problem here is not so much the fact that the substitution is delayed, because we don't perform any in practice, they are all identity. Rather, it's the fact that the representation is very redundant for substitutions that are mostly identity.

_Originally posted by @ppedrot in https://github.com/coq/coq/issues/8237#issuecomment-414930389_

-----

> This is a pretty naive question whose consequences I don't evaluate well: would it be worth to introduce the long-mentioned abstract representation of the identity substitution on named variables at the same time?

_Originally posted by @herbelin in https://github.com/coq/coq/pull/11896#issuecomment-603422911_

------

> @herbelin My understanding (though I can't now find the comment where @ppedrot told this to me) is that introducing that optimization complicates things somewhat, because then we need access to the environment/context to reconstruct the substitution, so we have to thread that through many more functions.

_Originally posted by @JasonGross in https://github.com/coq/coq/pull/11896#issuecomment-603437656_

------

> @JasonGross:
> > because then we need access to the environment/context to reconstruct the substitution, so we have to thread that through many more functions.
>
> Yes, I think my question indeed reduces to how "many" is the "many more". Is it mainly the functions of the form `occur_var` or `replace_vars`, as well as a few functions about evar unification, or more?

_Originally posted by @herbelin in https://github.com/coq/coq/pull/11896#issuecomment-603441270_

------

> @herbelin adding identity substitutions is much heavier in terms of API. A lot of code needs to be adapted to handle the quotient, and this requires threading additional data. Furthermore it is not even clear how to handle them in some cases, e.g. are we supposed to expand identity substitutions on-the-fly in term iterators? Also, is the quotient 'canonical' in the sense that we forbid identity substitutions to be represented by the non-compact representation? In this case we also need to change the API for `mkEvar`, and so on and so forth.
>
> So, I believe that the current change is much more self-contained than additionally introducing efficient identity representations.

_Originally posted by @ppedrot in https://github.com/coq/coq/pull/11896#issuecomment-603524572_
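
As an illustration of the canonicity question raised above, the following sketch shows a smart constructor that never stores an identity instance in explicit form. It is only a toy model: `mk_evar`, `is_identity`, and the `term`/`instance` types are hypothetical, and the real `mkEvar` has a different signature. The point is that such a constructor has to be given the evar's context.

```ocaml
(* Hypothetical toy model; not Coq's actual term representation. *)
type term =
  | Var of string
  | App of term * term
  | Evar of int * instance

and instance =
  | Identity
  | Explicit of term list

(* Is [ts] exactly the identity instance for the named context [ctx]? *)
let is_identity (ctx : string list) (ts : term list) : bool =
  List.length ctx = List.length ts
  && List.for_all2 (fun x t -> t = Var x) ctx ts

(* Canonical constructor: an identity instance is always stored compactly.
   It cannot be written without access to [ctx]. *)
let mk_evar (ctx : string list) (ev : int) (ts : term list) : term =
  if is_identity ctx ts then Evar (ev, Identity) else Evar (ev, Explicit ts)
```

Note that the check in `is_identity` is itself linear in the context, so in this sketch the gain would come from callers that build `Identity` directly rather than from calling `mk_evar` on an already expanded instance.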

------

> @ppedrot: Indeed, that'd be much more work. Spontaneously, I would lean for a canonical representation. For iterators, maybe having two versions of them, or, maybe, passing a function which tells what to do on the evar instance (I agree, this would be a rather noticeable change, but if the gain is worth...).
>
> My (intuitive) 2p.

_Originally posted by @herbelin in https://github.com/coq/coq/pull/11896#issuecomment-603528926_
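
The "pass a function which tells what to do on the evar instance" idea could look roughly like the sketch below. Again, everything here is hypothetical (`fold`, `on_identity`, `ctx_of`, and this toy `occur_var` are illustrative names, not the existing Coq functions): the iterator delegates the identity case to its caller, and an `occur_var`-style check answers it by looking up the evar's context.

```ocaml
(* Hypothetical toy model; not Coq's actual term representation. *)
type term =
  | Var of string
  | App of term * term
  | Evar of int * instance

and instance =
  | Identity
  | Explicit of term list

(* Fold over the variables of a term. [on_identity] decides what a compact
   identity instance of evar [ev] contributes to the accumulator. *)
let rec fold on_identity f t acc =
  match t with
  | Var x -> f x acc
  | App (a, b) -> fold on_identity f b (fold on_identity f a acc)
  | Evar (ev, Identity) -> on_identity ev acc
  | Evar (_, Explicit ts) ->
    List.fold_left (fun acc t -> fold on_identity f t acc) acc ts

(* Does variable [x] occur in [t]? For identity instances the answer depends
   on the evar's context, supplied here by the assumed lookup [ctx_of]. *)
let occur_var (ctx_of : int -> string list) (x : string) (t : term) : bool =
  fold
    (fun ev acc -> acc || List.mem x (ctx_of ev))
    (fun y acc -> acc || y = x)
    t false
```

This makes the cost discussed in the thread visible: every client of the iterator now has to supply something like `ctx_of`, which is the extra data that would need to be threaded through the API.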

------
