Skip to content

RFC: change variable inclusion mechanism #2452

@sgouezel

Description

@sgouezel

The current Lean 4 mechanism for including variables is the following: When starting the proof of a theorem (or the body of a definition, it doesn't make a difference), all variables in the context are included and may be used in the proof. When the proof is over, only those variables that have indeed been used are kept for the statement of the theorem (i.e., all the other ones are discarded). This looks nice, but we have faced several issues in mathib 4 because of this:

  • if in the context there is variable {n : ℕ} (hn : 1 < n) (h'n : 2 < n) and a theorem using hn but not h'n, then induction n will also pull h'n in the assumptions of the theorem. So will subst. The solution is to clear h'n before the induction even though h'n should not be in sight anywhere.
  • When cleaning up a proof, say by using lemma B instead of lemma A in the proof, new typeclass assumptions may be pulled in by lemma B and weaken the initial statement. All this will go unnoticed (no linter can help here).
  • In files which are heavy on variables or typeclass assumptions (say files on vector bundles), the goal view is cluttered by dozens of lines of assumptions which are not relevant for the lemma being currently proved, making it really hard to use.
  • Refactors in a file may change the statements of theorems in another file without breaking anything: if the proof of a theorem involves a typeclass classA α, and in the context there are assumptions classB α and classC α, both of them implying classA α, then the above mechanism will only keep classB α or classC α, and the one which will be kept depends on the details of the path chosen by typeclass inference. Therefore, adjusting instance priorities somewhere else may change which one will appear in the assumptions of the theorem.

No linter can guard against this lack of robustness. Therefore, most mathlib folks have been convinced that a less clever but more predictable behavior would be an improvement over the current situation. After several discussions on Zulip, the following scheme has been suggested:

  • Include for a lemma or a def all the variables that are mentioned explicitly in the statement of the lemma (not its body, even if it is in term mode, contrary to Lean 3)
  • Also include all the typeclass assumptions about these variables which are in the context (named or unnamed, contrary to Lean 3)
  • No automatic removal of unused typeclass assumptions at the end of the proof: Better let a linter signal it, and let the user fix it by hand (to avoid non-robustness issues pointed out in the discussion above).
  • Add a mechanism to force inclusion of variables that are in the context but are not explicit in the statement. Two variations have been suggested here. Both of them are local to a theorem or a definition, as the global include / omit mechanism in Lean 3 is felt by many people as a major source of non-readability. Assume that hf : ... and hg : ... are variables in the context.
    • (a) Allow include hf hg in before the statement of a theorem, to force inclusion of hf and hg (with the binders they have in the variables list, and at their position in the variables list). This could also be used to adust binder types, with include (hf) {hg} in for instance. Optional: also allow omit hf in to remove a named typeclass assumptions hf that would otherwise be included.
    • (b) Allow theorem foo ... (hf) ... {hg} ... to force inclusion of hf and hg, adjusting their binders and their position in the variables list. The interplay of this syntax with autoparams is not completely clear, and what one should do if other variables depend on hf and hg and appear before them in the list should also be clarified.

A poll was organized on Zulip (https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/automatic.20inclusion.20of.20variables.20in.20mathlib.204/near/386620103), proposing either to keep the Lean 4 behavior, or go back to the Lean 3 behavior, or use the above scheme with one of the two suggestions (a) or (b). No-one opted for the current Lean 4 behavior nor the Lean 3 behavior. 12 people voted only for (a), 4 only for (b), and 4 for (a) or (b).

Metadata

Metadata

Assignees

No one assigned

    Labels

    Mathlib4 high prioMathlib4 high priority issueP-highWe will work on this issueRFCRequest for comments

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions