Skip to content

RFC: tweak structure instance elaboration to avoid un-needed eta expansion #2451

@mattrobball

Description

@mattrobball

Currently, we have the following behavior for structure instance elaboration

structure A where 

structure B extends A where 

def a : A := sorry 

def b : B := { a with } 
#print b 
-- def b' : B :=
-- let src := a;
-- { toA := { } } <- There is an extra eta expansion for the `toA` field: 

I believe it would be advisable to change this to the following behavior

structure A where 

structure B extends A where 

def a : A := sorry 

def b : B := { a with } 
#print b 
-- def b' : B :=
-- let src := a;
-- { toA := src } <- There is no eta expansion for the `toA` field: 

The extra eta expansion is due to the following: the function mkProjStx? will not apply when the arguments are _ `A `toA as `toA is not a field of `A.

The change in behavior can be achieved by adding two lines to the function addMissingFields here:

if let some stx := s.source.explicit.find? (·.structName == substructName)|>.map (·.stx) then
  addField (FieldVal.term stx)

This accounts for the case where the projection is exactly to the type of an explicit source.

User Experience: How does this feature improve the user experience?

There is no extra eta expansion when constructing fields which are also fields of the explicit sources. This change makes the behavior more uniform and therefore reduces mental overhead for the user.

Furthermore, there is a performance penalty for the extra eta expansion. While small for a single instance, things can snowball when the pattern is nested. The expanded term forces unfolding during unification when otherwise it would not be necessary.

Beneficiaries: Which Lean users and projects do benefit most from this feature/change?

The largest benefit is for the libraries with deeply nested type class hierarchies, so Mathlib.

In leanprover-community/mathlib4#6759 I benchmarked Mathlib with a toolchain containing this change plus some complementary changes directly to Mathlib. The changes were overall very positive including a 4% reduction to total CPU instructions and dramatic improvements in many files.

Furthermore, at least one example of a serious timeout in algebraic hierarchy presented on Zulip has been completely eliminated.

Community Feedback: Have you sought feedback or insights from other Lean users?

This arose from multiple discussions on Zulip about performance problems in Mathlib, especially in the algebraic hierarchy. It bubbled up already in a separate issue #2387.

Maintainability: Will this change streamline code maintenance or simplify its structure?

Given the simplicity of the change, I would not expect additional maintenance burden.

I have a branch on my fork with this change plus a simple test (as above) to document the behavior and am ready to submit a PR. Looking forward to hearing feedback!

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions