RFC: change variable inclusion mechanism

The current Lean 4 mechanism for including variables is the following: When starting the proof of a theorem (or the body of a definition, it doesn't make a difference), all variables in the context are included and may be used in the proof. When the proof is over, only those variables that have indeed been used are kept for the statement of the theorem (i.e., all the other ones are discarded). This looks nice, but we have faced several issues in mathib 4 because of this:

* if in the context there is `variable {n : ℕ} (hn : 1 < n) (h'n : 2 < n)` and a theorem using `hn` but not `h'n`, then `induction n` will also pull `h'n` in the assumptions of the theorem. So will `subst`. The solution is to `clear h'n` before the induction even though `h'n` should not be in sight anywhere.
* When cleaning up a proof, say by using lemma B instead of lemma A in the proof, new typeclass assumptions may be pulled in by lemma B and weaken the initial statement. All this will go unnoticed (no linter can help here).
* In files which are heavy on variables or typeclass assumptions (say files on vector bundles), the goal view is cluttered by dozens of lines of assumptions which are not relevant for the lemma being currently proved, making it really hard to use.
* Refactors in a file may change the *statements* of theorems in another file without breaking anything: if the proof of a theorem involves a typeclass `classA α`, and in the context there are assumptions `classB α` and `classC α`, both of them implying `classA α`, then the above mechanism will only keep `classB α` or `classC α`, and the one which will be kept depends on the details of the path chosen by typeclass inference. Therefore, adjusting instance priorities somewhere else may change which one will appear in the assumptions of the theorem. 

No linter can guard against this lack of robustness. Therefore, most mathlib folks have been convinced that a less clever but more predictable behavior would be an improvement over the current situation. After several discussions on Zulip, the following scheme has been suggested:
* Include for a lemma or a def all the variables that are mentioned explicitly in the statement of the lemma (not its body, even if it is in term mode, contrary to Lean 3)
* Also include all the typeclass assumptions about these variables which are in the context (named or unnamed, contrary to Lean 3)
* No automatic removal of unused typeclass assumptions at the end of the proof: Better let a linter signal it, and let the user fix it by hand (to avoid non-robustness issues pointed out in the discussion above).
* Add a mechanism to force inclusion of variables that are in the context but are not explicit in the statement. Two variations have been suggested here. Both of them are local to a theorem or a definition, as the global `include / omit` mechanism in Lean 3 is felt by many people as a major source of non-readability. Assume that `hf : ...` and `hg : ...` are variables in the context.
  - (a) Allow `include hf hg in` before the statement of a theorem, to force inclusion of `hf` and `hg` (with the binders they have in the variables list, and at their position in the variables list). This could also be used to adust binder types, with `include (hf) {hg} in` for instance. Optional: also allow `omit hf in` to remove a named typeclass assumptions `hf` that would otherwise be included.
  - (b) Allow `theorem foo ... (hf) ... {hg} ...` to force inclusion of `hf` and `hg`, adjusting their binders and their position in the variables list. The interplay of this syntax with autoparams is not completely clear, and what one should do if other variables depend on `hf` and `hg` and appear before them in the list should also be clarified.

A poll was organized on Zulip (https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/automatic.20inclusion.20of.20variables.20in.20mathlib.204/near/386620103), proposing either to keep the Lean 4 behavior, or go back to the Lean 3 behavior, or use the above scheme with one of the two suggestions (a) or (b). No-one opted for the current Lean 4 behavior nor the Lean 3 behavior. 12 people voted only for (a), 4 only for (b), and 4 for (a) or (b). 



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: change variable inclusion mechanism #2452

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

RFC: change variable inclusion mechanism #2452

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions