RFC: consistent (fine-grained?) equational lemmas

In this issue I’d like to flesh out and describe our plans for how Lean should generate equational lemmas.

## Status quo

The status quo is unsatisfactory in a few ways, at least according to my current understanding:

* The logic for creating the equational lemmas (in `Lean.Elab.Eqns.mkEqnTypes`) is basically “split until it holds by `rfl`, and only near recursive calls”. This is somewhat hard to predict for users, and may (potentially, not sure) lead to different results depending on how the recursion is implemented.

* Because it splits as little as possible, we may get equational lemmas that have “big” case expressions on the right. When they are used by the simplifier to reduce applications to constructors, the term will become large, only to be made smaller again because of the case-of-constructor. More fine-grained equational lemmas would simplify more efficiently.

* The equational lemmas do not necessarily correspond to the cases of the functional induction theorem, violating the principle of least surprise.

* Equational lemmas do not really exist for non-recursive functions (they are always just the unfolding lemma), but would sometimes be useful there as well. Especially now that we have “lazyily defined theorems” by way of reserved names, 

* Sometimes the equational lemmas are suitable for `dsimp`, sometimes they are not.

* Right now, for structural recursion, first the equational lemmas are generated, and then the unfolding lemma, while for well-founded recursion, first the unfolding lemma is derived from `fix_eq`, and then the equational lemmas follow.

## Proposal (variant A)

### Splitting heuristics

We want uniform equational lemams, so the same heuristics is used for non-recursive, structurally recursive and well-founded recursive lemmas. In particular

* We don’t use `rfl` to stop early.
* We split matches even if there is no recursive call on the right hand side.

This should split more than before, which should imply that where previously an equational lemma was by `rfl`, it will still be by `rfl`, and where possible, we do use `rfl` in the proof to make them `dsimp` lemmas. (TODO: not quite true: there are some `matches` where splitting them introduces assumptions, and breaks the `rfl`-property. Should we not split those?)

This splitting does not necessarily yield consistency with the `foo.induct` lemma. This should be fine as long as the induction cases are more specialized than the equations, i.e. in each induction case there is one equational lemma that will apply.

### Simp API

What does it mean to write `simp [f]`?

Ideally, the user can understand the semantics of `simp [f]` as if `f` was expanded to a list of “normal” theorems, and nothing else behaves differently.

This points to a design where `simp [f]` is equivalent to `simp [f.eq_1, f.eq_2,…]`. If the user wants to unfold `f` more aggressively, revealing the `match`es on the RHS, they can write `simp [f.eq_def]`.

### DSimp interaction

A wrinkle here is that `simp` also calls `dsimp`: How should `dsimp` behave?

Note that right now, `simp [some_lemma]` will use `some_lemma` also in `dsimp` if it happens to be by `rfl`. We can use the same rule her as well: `simp [f]` is `simp [f.eq_1, f.eq_2,…]` and will allow `dsimp` to use those `f.eq_n` that are by `rfl`.

We may want to special case the use of `simp [f.eq_def]`: Here the user is really asking to unfold `f` aggressively, so it seems reasonable to try do that in `dsimp` as much as possible. So even if `f.eq_def` isn’t a `rfl`-theorem (e.g. due to structural recursion), this should enable the use of “smart unfolding”, so that `dsimp` unfolds `f` whenever it can.

## Variants

###  Also splitting `if-then-else`

The above is less bold about making the equational theorems fine-grained and consistent with the induction principle, by not splitting `if`-expressions.

One could go further and do that. This would bring equational theorems into one-to-one correspondance with the induction principle cases, but it has costs:

* Right now `simp [foo]` will use local facts to discharge the side-conditoins of equational lemmas, without using `[*]`. By splitting `if` expressions, the shape of these side-conditions will not be restricted to those generated by the match compiler due to overlaps, but could be arbitrary conditions. We’d need a way to recognize them reliably (not using the shape) and discharge them.

* It might break even more code if that code relies on `foo.eq_<n>` being more coarse-grained. In particular if a function has an `if`-expression on the rhs the equational theorms may be harder to apply.

* What I am most worried about: Will the users expect that? And if they have `foo x`, how can they make progress, without manually writing `cases_on (copy of condition in definition of foo)`? Note that `simp [foo.eq_def]` will make progress, but will likely loop. Would we need 
a `functional split` tactic that looks for `foo x` and then splits `x` as needed? 


### Impact

Add :+1: to [issues you consider important](https://github.com/leanprover/lean4/issues?q=is%3Aissue+is%3Aopen+sort%3Areactions-%2B1-desc). If others benefit from the changes in this proposal being added, please ask them to add :+1: to it.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: consistent (fine-grained?) equational lemmas #3983

Status quo

Proposal (variant A)

Splitting heuristics

Simp API

DSimp interaction

Variants

Also splitting `if-then-else`

Impact

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

RFC: consistent (fine-grained?) equational lemmas #3983

Description

Status quo

Proposal (variant A)

Splitting heuristics

Simp API

DSimp interaction

Variants

Also splitting if-then-else

Impact

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Also splitting `if-then-else`